The attached patch reworked the engine framework based on the following
three ideas.
* A new server API to obtain socket file descriptor.
The get_socket_fd() server API allows engines to know properties of
connection socket.
* New APIs to retrieve length of key/data
The item_get_nkey() and item_get_ndata() engine APIs replaced direct
references to item->nkey and item->nbytes.
* Flags to show what features are provided with the engine module.
The 'features' field of ENGINE_HANDLE enables to inform the core
memcached (or intermediation module, like bucket) features of the
loaded engine module.
Any comments are welcome.
Thanks,
(2010/04/01 15:46), KaiGai Kohei wrote:
> Hello,
>
> As we discussed before, I focus on access control facilities in memcached.
> In the "memcached and access control" thread, I was suggested to implement
> it as an engine module rather than core features.
> Except for a few matters, it seems to be a feasible approach.
>
> The issue about remove() method was resolved. (Thanks!)
>
> However, we still have matters which can prevent access control feature
> as an engine module. So, I'd like to discuss the following ideas.
>
> * A new server API to obtain socket file descriptor.
>
> SELinux provides an API to retrieve security context of the peer process
> for the given socket file descriptor. (See, getpeercon(3))
> It allows server processes to know privilege of the client process, and
> we can utilize the privilege for access control decision inside server
> process.
> Right now, conn->sfd is socket file descriptor of the session. The "conn"
> data is delivered to engine module as an opaque "cookie". In actually, we
> can fetch the socket file descriptor based on the knowledge of internal
> format of the structure. But it seems to me basically unpreferable.
>
> I'd like to add a new server API, as follows:
>
> int (*get_socket_file_descriptor)(const void *cookie);
>
>
> * New APIs to retrieve length of key/data
>
> Right now, the item structure has nbytes (uint32_t) and nkey (uint16_t)
> fields,
> and the core memcached can see the length of key/data without any
> interposition
> of engine modules.
> Encapsulation of the references to nbytes/nkey is an awaited feature for me,
> because it makes possible to store security properties of access control
> stuff,
> such as ownership or security context.
>
> I plan the upcoming selinux module intermediate between the core memcached
> and the secondary module (such as default_engine.so). For example, when get()
> method is called, selinux's get() shall be invoked at first. Then, it checks
> user's privilege and calls the secondary module if allowed, like bucket
> engine.
>
> I think the most portable approach to store security properties of items is
> to inject these properties as a part of data field prior to invocation of
> the secondary module.
> For example, if user tries to store a pair of {key:"aaa", value:"foovarbaz"},
> the memcached invokes the primary engine module with this key-value pair.
> Assume the default security context of the item object in this case is
> "classified". So, SELinux module shall inject into the value, then it calls
> the secondary module. The secondary module see the modified key-value pair,
> as if {key:"aaa", value:"classified\0foovarbaz"} is given.
> ^^^^^^^^^^^^
> I believe this approach will work with various kind of secondary modules,
> because they can see just a bit longer data than the original.
> The get_item_data() method gives the primary module a chance to fix up
> pointer of the data field. For example, SELinux can return the address
> next to the first '\0' in the returned data field from the secondary
> module.
>
> However, the memcached refers nbytes and nkey field of item structure without
> any method calls, so it misunderstand length of the data, because item->nbytes
> contains total length of the security context and actual data.
>
> So, I'd like to add the following two engine APIs:
>
> uint16_t (*item_get_nkey)(const item *item);
>
> uint32_t (*item_get_ndata)(const item *item);
>
> It allows the primary module (selinux) to return the modified length of the
> data, being consistent with item_get_data() method.
>
> In my personal opinion, I don't think nkey and nbytes are necessary fields
> in the common item structure. References to them should be encapsulated to
> the engine module.
>
>
> * Flags to show what features are provided with the engine module.
>
> Right now, when the primary module (like selinux or bucket) load secondary
> engine modules, there is no way to know what features are provided with
> the secondary modules.
>
> For example, if the secondary module does not have persistent storage support,
> the selinux module wants to store the security attribute of the item in the
> security identifier form rather than flat text form, from the performance
> perspective.
>
> I'd like to add a "flags" member within engine_interface structure, to inform
> the caller
> Perhasp, we can consider the following flags right now?
>
> ENGINE_FEATURE_PSEUDO = 0x0001 /* set, if module does not manage items
> actually */
> ENGINE_FEATURE_STORAGE = 0x0002 /* set, if item can be stored in
> persistent storage */
> :
>
> * Isn't settings.engine.v1->xxxx() every time ugly?
>
> Right now, the core memcached code calls engine methods using function pointer
> every time. In my sense, it should be wrapped with a thin layer, like:
>
> static inline ENGINE_ERROR_CODE
> engine_initialize(ENGINE_HANDLE* handle, const char* config_str)
> {
> if (settings.engine.v0.interface == 1)
> return settings.engine.v1->initialize(handle, config_str);
> else
> return ENGINE_EINVAL;
> }
>
> Here is no functional differences, but it enables to keep code clean and
> to handle v2, v3 or later version easily.
>
>
> If we need a patch, I'll submit it later.
> Any opinions?
>
> Thanks,
--
KaiGai Kohei <[email protected]>
--
To unsubscribe, reply using "remove me" as the subject.
diff --git a/default_engine.c b/default_engine.c
index 34788a2..43ac61c 100644
--- a/default_engine.c
+++ b/default_engine.c
@@ -84,7 +84,8 @@ ENGINE_ERROR_CODE create_instance(uint64_t interface,
struct default_engine default_engine = {
.engine = {
.interface = {
- .interface = 1
+ .interface = 1,
+ .features = 0,
},
.get_info = default_get_info,
.initialize = default_initialize,
@@ -102,7 +103,9 @@ ENGINE_ERROR_CODE create_instance(uint64_t interface,
.item_get_cas = item_get_cas,
.item_set_cas = item_set_cas,
.item_get_key = item_get_key,
+ .item_get_nkey = item_get_nkey,
.item_get_data = item_get_data,
+ .item_get_ndata = item_get_ndata,
.item_get_clsid = item_get_clsid
},
.server = *api,
@@ -453,11 +456,21 @@ const char* item_get_key(const item* item)
return ret;
}
+uint16_t item_get_nkey(const item *item)
+{
+ return item->nkey;
+}
+
char* item_get_data(const item* item)
{
return ((char*)item_get_key(item)) + item->nkey;
}
+uint32_t item_get_ndata(const item *item)
+{
+ return item->nbytes;
+}
+
uint8_t item_get_clsid(const item* item)
{
return 0;
diff --git a/default_engine.h b/default_engine.h
index e6da3f2..669a627 100644
--- a/default_engine.h
+++ b/default_engine.h
@@ -124,6 +124,8 @@ struct default_engine {
char* item_get_data(const item* item);
const char* item_get_key(const item* item);
+uint16_t item_get_nkey(const item *item);
+uint32_t item_get_ndata(const item *item);
void item_set_cas(item* item, uint64_t val);
uint64_t item_get_cas(const item* item);
uint8_t item_get_clsid(const item* item);
diff --git a/include/memcached/engine.h b/include/memcached/engine.h
index 8f9ca0f..96f20dc 100644
--- a/include/memcached/engine.h
+++ b/include/memcached/engine.h
@@ -86,6 +86,14 @@ extern "C" {
ON_SWITCH_CONN = 3, /**< Processing a different connection on this thread. */
} ENGINE_EVENT_TYPE;
+ /**
+ * Flags to inform features of the loaded engine module.
+ */
+ typedef enum {
+ ENGINE_HAS_SECONDARY = 0x01, /* It has secondary modules. */
+ ENGINE_HAS_PERSISTENT = 0x02, /* Is has persistent storage support. */
+ } ENGINE_FEATURE_SPECS;
+
#define MAX_ENGINE_EVENT_TYPE 3
/**
@@ -163,6 +171,7 @@ extern "C" {
#endif
typedef struct engine_interface {
uint64_t interface; /**< The version number on the engine structure */
+ uint64_t features; /**< mask of the ENGINE_FEATURE_FLAGS */
} ENGINE_HANDLE;
/**
@@ -216,6 +225,19 @@ extern "C" {
void *(*get_engine_specific)(const void *cookie);
/**
+ * Retrieve socket file descriptor of the session for the given
+ * cookie. It allows engine modules to obtain the properties of
+ * connection, but we don't recommend to update status of the
+ * socket file descriptor retrieved.
+ *
+ * @param cookie The cookie provided by the frontend
+ *
+ * @return the socket file descriptor of the session for the given
+ * cookie, or -1 on any errors.
+ */
+ int (*get_socket_fd)(const void *cookie);
+
+ /**
* Get the server's version number.
*
* @return the server's version number
@@ -532,11 +554,21 @@ extern "C" {
const char* (*item_get_key)(const item *item);
/**
+ * Get length of the key from an item
+ */
+ uint16_t (*item_get_nkey)(const item *item);
+
+ /**
* Get the data from an item.
*/
char* (*item_get_data)(const item *item);
/**
+ * Get length of the data from an item
+ */
+ uint32_t (*item_get_ndata)(const item *item);
+
+ /**
* Get an item's class ID.
*/
uint8_t (*item_get_clsid)(const item* item);
diff --git a/memcached.c b/memcached.c
index 2081aff..d4e7a9f 100644
--- a/memcached.c
+++ b/memcached.c
@@ -999,8 +999,13 @@ static void complete_nread_ascii(conn *c) {
item *it = c->item;
const void *key = settings.engine.v1->item_get_key(it);
+ const char *data = settings.engine.v1->item_get_data(it);
+#ifdef ENABLE_DTRACE /* to keep compiler quiet */
+ uint16_t nkey = settings.engine.v1->item_get_nkey(it);
+#endif
+ uint32_t nbytes = settings.engine.v1->item_get_ndata(it);
- if (strncmp(settings.engine.v1->item_get_data(it) + it->nbytes - 2, "\r\n", 2) != 0) {
+ if (strncmp(data + nbytes - 2, "\r\n", 2) != 0) {
out_string(c, "CLIENT_ERROR bad data chunk");
} else {
ENGINE_ERROR_CODE ret = settings.engine.v1->store(settings.engine.v0, c,
@@ -1009,27 +1014,27 @@ static void complete_nread_ascii(conn *c) {
#ifdef ENABLE_DTRACE
switch (c->store_op) {
case OPERATION_ADD:
- MEMCACHED_COMMAND_ADD(c->sfd, key, it->nkey,
- (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+ MEMCACHED_COMMAND_ADD(c->sfd, key, nkey,
+ (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
break;
case OPERATION_REPLACE:
- MEMCACHED_COMMAND_REPLACE(c->sfd, key, it->nkey,
- (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+ MEMCACHED_COMMAND_REPLACE(c->sfd, key, nkey,
+ (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
break;
case OPERATION_APPEND:
- MEMCACHED_COMMAND_APPEND(c->sfd, key, it->nkey,
- (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+ MEMCACHED_COMMAND_APPEND(c->sfd, key, nkey,
+ (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
break;
case OPERATION_PREPEND:
- MEMCACHED_COMMAND_PREPEND(c->sfd, key, it->nkey,
- (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+ MEMCACHED_COMMAND_PREPEND(c->sfd, key, nkey,
+ (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
break;
case OPERATION_SET:
- MEMCACHED_COMMAND_SET(c->sfd, key, it->nkey,
- (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+ MEMCACHED_COMMAND_SET(c->sfd, key, nkey,
+ (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
break;
case OPERATION_CAS:
- MEMCACHED_COMMAND_CAS(c->sfd, key, it->nkey, it->nbytes, c->cas);
+ MEMCACHED_COMMAND_CAS(c->sfd, key, nkey, nbytes, c->cas);
break;
}
#endif
@@ -1369,11 +1374,17 @@ static void complete_update_bin(conn *c) {
assert(c != NULL);
item *it = c->item;
+#ifdef ENABLE_DTRACE
+ const char *key = settings.engine.v1->item_get_key(it);
+ uint16_t nkey = settings.engine.v1->item_get_nkey(it);
+#endif /* to keep compiler quiet */
+ char *data = settings.engine.v1->item_get_data(it);
+ uint32_t nbytes = settings.engine.v1->item_get_ndata(it);
/* We don't actually receive the trailing two characters in the bin
* protocol, so we're going to just set them here */
- *(settings.engine.v1->item_get_data(it) + it->nbytes - 2) = '\r';
- *(settings.engine.v1->item_get_data(it) + it->nbytes - 1) = '\n';
+ data[nbytes - 2] = '\r';
+ data[nbytes - 1] = '\n';
ENGINE_ERROR_CODE ret = c->aiostat;
c->aiostat = ENGINE_SUCCESS;
@@ -1385,24 +1396,24 @@ static void complete_update_bin(conn *c) {
#ifdef ENABLE_DTRACE
switch (c->cmd) {
case OPERATION_ADD:
- MEMCACHED_COMMAND_ADD(c->sfd, settings.engine.v1->item_get_key(it), it->nkey,
- (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+ MEMCACHED_COMMAND_ADD(c->sfd, key, nkey,
+ (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
break;
case OPERATION_REPLACE:
- MEMCACHED_COMMAND_REPLACE(c->sfd, settings.engine.v1->item_get_key(it), it->nkey,
- (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+ MEMCACHED_COMMAND_REPLACE(c->sfd, key, nkey,
+ (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
break;
case OPERATION_APPEND:
- MEMCACHED_COMMAND_APPEND(c->sfd, settings.engine.v1->item_get_key(it), it->nkey,
- (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+ MEMCACHED_COMMAND_APPEND(c->sfd, key, nkey,
+ (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
break;
case OPERATION_PREPEND:
- MEMCACHED_COMMAND_PREPEND(c->sfd, settings.engine.v1->item_get_key(it), it->nkey,
- (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+ MEMCACHED_COMMAND_PREPEND(c->sfd, key, nkey,
+ (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
break;
case OPERATION_SET:
- MEMCACHED_COMMAND_SET(c->sfd, settings.engine.v1->item_get_key(it), it->nkey,
- (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+ MEMCACHED_COMMAND_SET(c->sfd, key, nkey,
+ (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
break;
}
#endif
@@ -1472,7 +1483,7 @@ static void process_bin_get(conn *c) {
case ENGINE_SUCCESS:
/* the length has two unnecessary bytes ("\r\n") */
keylen = 0;
- bodylen = sizeof(rsp->message.body) + (it->nbytes - 2);
+ bodylen = sizeof(rsp->message.body) + settings.engine.v1->item_get_ndata(it) - 2;
STATS_HIT(c, get, key, nkey);
@@ -1492,8 +1503,9 @@ static void process_bin_get(conn *c) {
}
/* Add the data minus the CRLF */
- add_iov(c, settings.engine.v1->item_get_data(it), it->nbytes - 2);
- conn_set_state(c, conn_mwrite);
+ add_iov(c, settings.engine.v1->item_get_data(it),
+ settings.engine.v1->item_get_ndata(it) - 2);
+ conn_set_state(c, conn_mwrite);
/* Remember this command so we can garbage collect it later */
c->item = it;
break;
@@ -2473,8 +2485,6 @@ static void process_bin_flush(conn *c) {
}
static void process_bin_delete(conn *c) {
- item *it;
-
protocol_binary_request_delete* req = binary_get_request(c);
char* key = binary_get_key(c);
@@ -3020,7 +3030,12 @@ static inline void process_get_command(conn *c, token_t *tokens, size_t ntokens,
}
if (it) {
- assert(memcmp(settings.engine.v1->item_get_data(it) + it->nbytes - 2, "\r\n", 2) == 0);
+ const char *key = settings.engine.v1->item_get_key(it);
+ char *data = settings.engine.v1->item_get_data(it);
+ uint16_t nkey = settings.engine.v1->item_get_nkey(it);
+ uint32_t nbytes = settings.engine.v1->item_get_ndata(it);
+
+ assert(memcmp(data + nbytes - 2, "\r\n", 2) == 0);
if (i >= c->isize) {
item **new_list = realloc(c->ilist, sizeof(item *) * c->isize * 2);
@@ -3043,7 +3058,7 @@ static inline void process_get_command(conn *c, token_t *tokens, size_t ntokens,
int suffix_len = snprintf(suffix, SUFFIX_SIZE,
" %u %u\r\n",
it->flags,
- it->nbytes - 2);
+ nbytes - 2);
/*
* Construct the response. Each hit adds three elements to the
@@ -3055,8 +3070,8 @@ static inline void process_get_command(conn *c, token_t *tokens, size_t ntokens,
if (return_cas)
{
- MEMCACHED_COMMAND_GET(c->sfd, settings.engine.v1->item_get_key(it), it->nkey,
- it->nbytes, settings.engine.v1->item_get_cas(it));
+ MEMCACHED_COMMAND_GET(c->sfd, key, nkey, nbytes,
+ settings.engine.v1->item_get_cas(it));
char *cas = get_suffix_buffer(c);
if (cas == NULL) {
@@ -3068,10 +3083,10 @@ static inline void process_get_command(conn *c, token_t *tokens, size_t ntokens,
" %"PRIu64"\r\n",
settings.engine.v1->item_get_cas(it));
if (add_iov(c, "VALUE ", 6) != 0 ||
- add_iov(c, settings.engine.v1->item_get_key(it), it->nkey) != 0 ||
+ add_iov(c, key, nkey) != 0 ||
add_iov(c, suffix, suffix_len - 2) != 0 ||
add_iov(c, cas, cas_len) != 0 ||
- add_iov(c, settings.engine.v1->item_get_data(it), it->nbytes) != 0)
+ add_iov(c, data, nbytes) != 0)
{
settings.engine.v1->release(settings.engine.v0, c, it);
break;
@@ -3079,12 +3094,12 @@ static inline void process_get_command(conn *c, token_t *tokens, size_t ntokens,
}
else
{
- MEMCACHED_COMMAND_GET(c->sfd, settings.engine.v1->item_get_key(it), it->nkey,
- it->nbytes, settings.engine.v1->item_get_cas(it));
+ MEMCACHED_COMMAND_GET(c->sfd, key, nkey, nbytes,
+ settings.engine.v1->item_get_cas(it));
if (add_iov(c, "VALUE ", 6) != 0 ||
- add_iov(c, settings.engine.v1->item_get_key(it), it->nkey) != 0 ||
+ add_iov(c, key, nkey) != 0 ||
add_iov(c, suffix, suffix_len) != 0 ||
- add_iov(c, settings.engine.v1->item_get_data(it), it->nbytes) != 0)
+ add_iov(c, data, nbytes) != 0)
{
settings.engine.v1->release(settings.engine.v0, c, it);
break;
@@ -3094,8 +3109,7 @@ static inline void process_get_command(conn *c, token_t *tokens, size_t ntokens,
if (settings.verbose > 1) {
settings.extensions.logger->log(EXTENSION_LOG_DEBUG, c,
- ">%d sending key %s\n", c->sfd,
- settings.engine.v1->item_get_key(it));
+ ">%d sending key %s\n", c->sfd, key);
}
/* item_get() has incremented it->refcount for us */
@@ -3214,7 +3228,7 @@ static void process_update_command(conn *c, token_t *tokens, const size_t ntoken
c->item = it;
c->ritem = settings.engine.v1->item_get_data(it);
- c->rlbytes = it->nbytes;
+ c->rlbytes = settings.engine.v1->item_get_ndata(it);
c->store_op = store_op;
conn_set_state(c, conn_nread);
break;
@@ -3311,7 +3325,6 @@ static void process_arithmetic_command(conn *c, token_t *tokens, const size_t nt
static void process_delete_command(conn *c, token_t *tokens, const size_t ntokens) {
char *key;
size_t nkey;
- item *it;
assert(c != NULL);
@@ -4760,6 +4773,11 @@ static void *get_engine_specific(const void *cookie) {
return c->engine_storage;
}
+static int get_socket_fd(const void *cookie) {
+ conn *c = (conn *)cookie;
+ return c->sfd;
+}
+
static void *new_independent_stats(void) {
int ii;
struct independent_stats *independent_stats = calloc(sizeof(independent_stats) + sizeof(struct thread_stats) * settings.num_threads, 1);
@@ -4925,6 +4943,7 @@ static void *get_server_api(server_api_t interface)
.get_auth_data = get_auth_data,
.store_engine_specific = store_engine_specific,
.get_engine_specific = get_engine_specific,
+ .get_socket_fd = get_socket_fd,
.server_version = get_server_version,
.hash = hash,
.realtime = realtime,