The attached patch reworked the engine framework based on the following
three ideas.

* A new server API to obtain socket file descriptor.
The get_socket_fd() server API allows engines to know properties of
connection socket.

* New APIs to retrieve length of key/data
The item_get_nkey() and item_get_ndata() engine APIs replaced direct
references to item->nkey and item->nbytes.

* Flags to show what features are provided with the engine module.
The 'features' field of ENGINE_HANDLE enables to inform the core
memcached (or intermediation module, like bucket) features of the
loaded engine module.

Any comments are welcome.

Thanks,

(2010/04/01 15:46), KaiGai Kohei wrote:
> Hello,
> 
> As we discussed before, I focus on access control facilities in memcached.
> In the "memcached and access control" thread, I was suggested to implement
> it as an engine module rather than core features.
> Except for a few matters, it seems to be a feasible approach.
> 
> The issue about remove() method was resolved. (Thanks!)
> 
> However, we still have matters which can prevent access control feature
> as an engine module. So, I'd like to discuss the following ideas.
> 
> * A new server API to obtain socket file descriptor.
> 
> SELinux provides an API to retrieve security context of the peer process
> for the given socket file descriptor. (See, getpeercon(3))
> It allows server processes to know privilege of the client process, and
> we can utilize the privilege for access control decision inside server
> process.
> Right now, conn->sfd is socket file descriptor of the session. The "conn"
> data is delivered to engine module as an opaque "cookie". In actually, we
> can fetch the socket file descriptor based on the knowledge of internal
> format of the structure. But it seems to me basically unpreferable.
> 
> I'd like to add a new server API, as follows:
> 
>    int (*get_socket_file_descriptor)(const void *cookie);
> 
> 
> * New APIs to retrieve length of key/data
> 
> Right now, the item structure has nbytes (uint32_t) and nkey (uint16_t) 
> fields,
> and the core memcached can see the length of key/data without any 
> interposition
> of engine modules.
> Encapsulation of the references to nbytes/nkey is an awaited feature for me,
> because it makes possible to store security properties of access control 
> stuff,
> such as ownership or security context.
> 
> I plan the upcoming selinux module intermediate between the core memcached
> and the secondary module (such as default_engine.so). For example, when get()
> method is called, selinux's get() shall be invoked at first. Then, it checks
> user's privilege and calls the secondary module if allowed, like bucket 
> engine.
> 
> I think the most portable approach to store security properties of items is
> to inject these properties as a part of data field prior to invocation of
> the secondary module.
> For example, if user tries to store a pair of {key:"aaa", value:"foovarbaz"},
> the memcached invokes the primary engine module with this key-value pair.
> Assume the default security context of the item object in this case is
> "classified". So, SELinux module shall inject into the value, then it calls
> the secondary module. The secondary module see the modified key-value pair,
> as if {key:"aaa", value:"classified\0foovarbaz"} is given.
>                           ^^^^^^^^^^^^
> I believe this approach will work with various kind of secondary modules,
> because they can see just a bit longer data than the original.
> The get_item_data() method gives the primary module a chance to fix up
> pointer of the data field. For example, SELinux can return the address
> next to the first '\0' in the returned data field from the secondary
> module.
> 
> However, the memcached refers nbytes and nkey field of item structure without
> any method calls, so it misunderstand length of the data, because item->nbytes
> contains total length of the security context and actual data.
> 
> So, I'd like to add the following two engine APIs:
> 
>    uint16_t (*item_get_nkey)(const item *item);
> 
>    uint32_t (*item_get_ndata)(const item *item);
> 
> It allows the primary module (selinux) to return the modified length of the
> data, being consistent with item_get_data() method.
> 
> In my personal opinion, I don't think nkey and nbytes are necessary fields
> in the common item structure. References to them should be encapsulated to
> the engine module.
> 
> 
> * Flags to show what features are provided with the engine module.
> 
> Right now, when the primary module (like selinux or bucket) load secondary
> engine modules, there is no way to know what features are provided with
> the secondary modules.
> 
> For example, if the secondary module does not have persistent storage support,
> the selinux module wants to store the security attribute of the item in the
> security identifier form rather than flat text form, from the performance
> perspective.
> 
> I'd like to add a "flags" member within engine_interface structure, to inform
> the caller
> Perhasp, we can consider the following flags right now?
> 
>    ENGINE_FEATURE_PSEUDO  = 0x0001  /* set, if module does not manage items 
> actually */
>    ENGINE_FEATURE_STORAGE = 0x0002  /* set, if item can be stored in 
> persistent storage */
>        :
> 
> * Isn't settings.engine.v1->xxxx() every time ugly?
> 
> Right now, the core memcached code calls engine methods using function pointer
> every time. In my sense, it should be wrapped with a thin layer, like:
> 
>    static inline ENGINE_ERROR_CODE
>    engine_initialize(ENGINE_HANDLE* handle, const char* config_str)
>    {
>        if (settings.engine.v0.interface == 1)
>            return settings.engine.v1->initialize(handle, config_str);
>        else
>            return ENGINE_EINVAL;
>    }
> 
> Here is no functional differences, but it enables to keep code clean and
> to handle v2, v3 or later version easily.
> 
> 
> If we need a patch, I'll submit it later.
> Any opinions?
> 
> Thanks,


-- 
KaiGai Kohei <[email protected]>


-- 
To unsubscribe, reply using "remove me" as the subject.
diff --git a/default_engine.c b/default_engine.c
index 34788a2..43ac61c 100644
--- a/default_engine.c
+++ b/default_engine.c
@@ -84,7 +84,8 @@ ENGINE_ERROR_CODE create_instance(uint64_t interface,
    struct default_engine default_engine = {
       .engine = {
          .interface = {
-            .interface = 1
+            .interface = 1,
+            .features = 0,
          },
          .get_info = default_get_info,
          .initialize = default_initialize,
@@ -102,7 +103,9 @@ ENGINE_ERROR_CODE create_instance(uint64_t interface,
          .item_get_cas = item_get_cas,
          .item_set_cas = item_set_cas,
          .item_get_key = item_get_key,
+	 .item_get_nkey = item_get_nkey,
          .item_get_data = item_get_data,
+	 .item_get_ndata = item_get_ndata,
          .item_get_clsid = item_get_clsid
       },
       .server = *api,
@@ -453,11 +456,21 @@ const char* item_get_key(const item* item)
     return ret;
 }
 
+uint16_t item_get_nkey(const item *item)
+{
+    return item->nkey;
+}
+
 char* item_get_data(const item* item)
 {
     return ((char*)item_get_key(item)) + item->nkey;
 }
 
+uint32_t item_get_ndata(const item *item)
+{
+    return item->nbytes;
+}
+
 uint8_t item_get_clsid(const item* item)
 {
     return 0;
diff --git a/default_engine.h b/default_engine.h
index e6da3f2..669a627 100644
--- a/default_engine.h
+++ b/default_engine.h
@@ -124,6 +124,8 @@ struct default_engine {
 
 char* item_get_data(const item* item);
 const char* item_get_key(const item* item);
+uint16_t item_get_nkey(const item *item);
+uint32_t item_get_ndata(const item *item);
 void item_set_cas(item* item, uint64_t val);
 uint64_t item_get_cas(const item* item);
 uint8_t item_get_clsid(const item* item);
diff --git a/include/memcached/engine.h b/include/memcached/engine.h
index 8f9ca0f..96f20dc 100644
--- a/include/memcached/engine.h
+++ b/include/memcached/engine.h
@@ -86,6 +86,14 @@ extern "C" {
         ON_SWITCH_CONN = 3,     /**< Processing a different connection on this thread. */
     } ENGINE_EVENT_TYPE;
 
+    /**
+     * Flags to inform features of the loaded engine module.
+     */
+    typedef enum {
+        ENGINE_HAS_SECONDARY	= 0x01,	/* It has secondary modules. */
+        ENGINE_HAS_PERSISTENT	= 0x02,	/* Is has persistent storage support. */
+    } ENGINE_FEATURE_SPECS;
+
     #define MAX_ENGINE_EVENT_TYPE 3
 
     /**
@@ -163,6 +171,7 @@ extern "C" {
 #endif
     typedef struct engine_interface {
         uint64_t interface; /**< The version number on the engine structure */
+        uint64_t features;	/**< mask of the ENGINE_FEATURE_FLAGS */
     } ENGINE_HANDLE;
 
     /**
@@ -216,6 +225,19 @@ extern "C" {
         void *(*get_engine_specific)(const void *cookie);
 
         /**
+         * Retrieve socket file descriptor of the session for the given
+         * cookie. It allows engine modules to obtain the properties of
+         * connection, but we don't recommend to update status of the
+         * socket file descriptor retrieved.
+         *
+         * @param cookie The cookie provided by the frontend
+         *
+         * @return the socket file descriptor of the session for the given
+         *         cookie, or -1 on any errors.
+         */
+        int (*get_socket_fd)(const void *cookie);
+
+        /**
          * Get the server's version number.
          *
          * @return the server's version number
@@ -532,11 +554,21 @@ extern "C" {
         const char* (*item_get_key)(const item *item);
 
         /**
+         * Get length of the key from an item
+         */
+        uint16_t (*item_get_nkey)(const item *item);
+
+        /**
          * Get the data from an item.
          */
         char* (*item_get_data)(const item *item);
 
         /**
+         * Get length of the data from an item
+         */
+        uint32_t (*item_get_ndata)(const item *item);
+
+        /**
          * Get an item's class ID.
          */
         uint8_t (*item_get_clsid)(const item* item);
diff --git a/memcached.c b/memcached.c
index 2081aff..d4e7a9f 100644
--- a/memcached.c
+++ b/memcached.c
@@ -999,8 +999,13 @@ static void complete_nread_ascii(conn *c) {
 
     item *it = c->item;
     const void *key = settings.engine.v1->item_get_key(it);
+    const char *data = settings.engine.v1->item_get_data(it);
+#ifdef ENABLE_DTRACE	/* to keep compiler quiet */
+    uint16_t nkey = settings.engine.v1->item_get_nkey(it);
+#endif
+    uint32_t nbytes = settings.engine.v1->item_get_ndata(it);
 
-    if (strncmp(settings.engine.v1->item_get_data(it) + it->nbytes - 2, "\r\n", 2) != 0) {
+    if (strncmp(data + nbytes - 2, "\r\n", 2) != 0) {
         out_string(c, "CLIENT_ERROR bad data chunk");
     } else {
         ENGINE_ERROR_CODE ret = settings.engine.v1->store(settings.engine.v0, c,
@@ -1009,27 +1014,27 @@ static void complete_nread_ascii(conn *c) {
 #ifdef ENABLE_DTRACE
         switch (c->store_op) {
         case OPERATION_ADD:
-            MEMCACHED_COMMAND_ADD(c->sfd, key, it->nkey,
-                                  (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+            MEMCACHED_COMMAND_ADD(c->sfd, key, nkey,
+                                  (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
             break;
         case OPERATION_REPLACE:
-            MEMCACHED_COMMAND_REPLACE(c->sfd, key, it->nkey,
-                                      (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+            MEMCACHED_COMMAND_REPLACE(c->sfd, key, nkey,
+                                      (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
             break;
         case OPERATION_APPEND:
-            MEMCACHED_COMMAND_APPEND(c->sfd, key, it->nkey,
-                                     (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+            MEMCACHED_COMMAND_APPEND(c->sfd, key, nkey,
+                                     (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
             break;
         case OPERATION_PREPEND:
-            MEMCACHED_COMMAND_PREPEND(c->sfd, key, it->nkey,
-                                      (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+            MEMCACHED_COMMAND_PREPEND(c->sfd, key, nkey,
+                                      (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
             break;
         case OPERATION_SET:
-            MEMCACHED_COMMAND_SET(c->sfd, key, it->nkey,
-                                  (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+            MEMCACHED_COMMAND_SET(c->sfd, key, nkey,
+                                  (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
             break;
         case OPERATION_CAS:
-            MEMCACHED_COMMAND_CAS(c->sfd, key, it->nkey, it->nbytes, c->cas);
+            MEMCACHED_COMMAND_CAS(c->sfd, key, nkey, nbytes, c->cas);
             break;
         }
 #endif
@@ -1369,11 +1374,17 @@ static void complete_update_bin(conn *c) {
     assert(c != NULL);
 
     item *it = c->item;
+#ifdef ENABLE_DTRACE
+    const char *key = settings.engine.v1->item_get_key(it);
+    uint16_t nkey = settings.engine.v1->item_get_nkey(it);
+#endif	/* to keep compiler quiet */
+    char *data = settings.engine.v1->item_get_data(it);
+    uint32_t nbytes = settings.engine.v1->item_get_ndata(it);
 
     /* We don't actually receive the trailing two characters in the bin
      * protocol, so we're going to just set them here */
-    *(settings.engine.v1->item_get_data(it) + it->nbytes - 2) = '\r';
-    *(settings.engine.v1->item_get_data(it) + it->nbytes - 1) = '\n';
+    data[nbytes - 2] = '\r';
+    data[nbytes - 1] = '\n';
 
     ENGINE_ERROR_CODE ret = c->aiostat;
     c->aiostat = ENGINE_SUCCESS;
@@ -1385,24 +1396,24 @@ static void complete_update_bin(conn *c) {
 #ifdef ENABLE_DTRACE
     switch (c->cmd) {
     case OPERATION_ADD:
-        MEMCACHED_COMMAND_ADD(c->sfd, settings.engine.v1->item_get_key(it), it->nkey,
-                              (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+        MEMCACHED_COMMAND_ADD(c->sfd, key, nkey,
+                              (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
         break;
     case OPERATION_REPLACE:
-        MEMCACHED_COMMAND_REPLACE(c->sfd, settings.engine.v1->item_get_key(it), it->nkey,
-                                  (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+        MEMCACHED_COMMAND_REPLACE(c->sfd, key, nkey,
+                                  (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
         break;
     case OPERATION_APPEND:
-        MEMCACHED_COMMAND_APPEND(c->sfd, settings.engine.v1->item_get_key(it), it->nkey,
-                                 (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+        MEMCACHED_COMMAND_APPEND(c->sfd, key, nkey,
+                                 (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
         break;
     case OPERATION_PREPEND:
-        MEMCACHED_COMMAND_PREPEND(c->sfd, settings.engine.v1->item_get_key(it), it->nkey,
-                                 (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+        MEMCACHED_COMMAND_PREPEND(c->sfd, key, nkey,
+                                 (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
         break;
     case OPERATION_SET:
-        MEMCACHED_COMMAND_SET(c->sfd, settings.engine.v1->item_get_key(it), it->nkey,
-                              (ret == ENGINE_SUCCESS) ? it->nbytes : -1, c->cas);
+        MEMCACHED_COMMAND_SET(c->sfd, key, nkey,
+                              (ret == ENGINE_SUCCESS) ? nbytes : -1, c->cas);
         break;
     }
 #endif
@@ -1472,7 +1483,7 @@ static void process_bin_get(conn *c) {
     case ENGINE_SUCCESS:
         /* the length has two unnecessary bytes ("\r\n") */
         keylen = 0;
-        bodylen = sizeof(rsp->message.body) + (it->nbytes - 2);
+        bodylen = sizeof(rsp->message.body) + settings.engine.v1->item_get_ndata(it) - 2;
 
         STATS_HIT(c, get, key, nkey);
 
@@ -1492,8 +1503,9 @@ static void process_bin_get(conn *c) {
         }
 
         /* Add the data minus the CRLF */
-        add_iov(c, settings.engine.v1->item_get_data(it), it->nbytes - 2);
-        conn_set_state(c, conn_mwrite);
+        add_iov(c, settings.engine.v1->item_get_data(it),
+                   settings.engine.v1->item_get_ndata(it) - 2);
+         conn_set_state(c, conn_mwrite);
         /* Remember this command so we can garbage collect it later */
         c->item = it;
         break;
@@ -2473,8 +2485,6 @@ static void process_bin_flush(conn *c) {
 }
 
 static void process_bin_delete(conn *c) {
-    item *it;
-
     protocol_binary_request_delete* req = binary_get_request(c);
 
     char* key = binary_get_key(c);
@@ -3020,7 +3030,12 @@ static inline void process_get_command(conn *c, token_t *tokens, size_t ntokens,
             }
 
             if (it) {
-                assert(memcmp(settings.engine.v1->item_get_data(it) + it->nbytes - 2, "\r\n", 2) == 0);
+                const char *key = settings.engine.v1->item_get_key(it);
+                char *data = settings.engine.v1->item_get_data(it);
+                uint16_t nkey = settings.engine.v1->item_get_nkey(it);
+                uint32_t nbytes = settings.engine.v1->item_get_ndata(it);
+
+                assert(memcmp(data + nbytes - 2, "\r\n", 2) == 0);
 
                 if (i >= c->isize) {
                     item **new_list = realloc(c->ilist, sizeof(item *) * c->isize * 2);
@@ -3043,7 +3058,7 @@ static inline void process_get_command(conn *c, token_t *tokens, size_t ntokens,
                 int suffix_len = snprintf(suffix, SUFFIX_SIZE,
                                           " %u %u\r\n",
                                           it->flags,
-                                          it->nbytes - 2);
+                                          nbytes - 2);
 
                 /*
                  * Construct the response. Each hit adds three elements to the
@@ -3055,8 +3070,8 @@ static inline void process_get_command(conn *c, token_t *tokens, size_t ntokens,
 
                 if (return_cas)
                 {
-                  MEMCACHED_COMMAND_GET(c->sfd, settings.engine.v1->item_get_key(it), it->nkey,
-                                        it->nbytes, settings.engine.v1->item_get_cas(it));
+                  MEMCACHED_COMMAND_GET(c->sfd, key, nkey, nbytes,
+                                        settings.engine.v1->item_get_cas(it));
 
                   char *cas = get_suffix_buffer(c);
                   if (cas == NULL) {
@@ -3068,10 +3083,10 @@ static inline void process_get_command(conn *c, token_t *tokens, size_t ntokens,
                                             " %"PRIu64"\r\n",
                                             settings.engine.v1->item_get_cas(it));
                   if (add_iov(c, "VALUE ", 6) != 0 ||
-                      add_iov(c, settings.engine.v1->item_get_key(it), it->nkey) != 0 ||
+                      add_iov(c, key, nkey) != 0 ||
                       add_iov(c, suffix, suffix_len - 2) != 0 ||
                       add_iov(c, cas, cas_len) != 0 ||
-                      add_iov(c, settings.engine.v1->item_get_data(it), it->nbytes) != 0)
+                      add_iov(c, data, nbytes) != 0)
                       {
                           settings.engine.v1->release(settings.engine.v0, c, it);
                           break;
@@ -3079,12 +3094,12 @@ static inline void process_get_command(conn *c, token_t *tokens, size_t ntokens,
                 }
                 else
                 {
-                  MEMCACHED_COMMAND_GET(c->sfd, settings.engine.v1->item_get_key(it), it->nkey,
-                                        it->nbytes, settings.engine.v1->item_get_cas(it));
+                  MEMCACHED_COMMAND_GET(c->sfd, key, nkey, nbytes,
+                                        settings.engine.v1->item_get_cas(it));
                   if (add_iov(c, "VALUE ", 6) != 0 ||
-                      add_iov(c, settings.engine.v1->item_get_key(it), it->nkey) != 0 ||
+                      add_iov(c, key, nkey) != 0 ||
                       add_iov(c, suffix, suffix_len) != 0 ||
-                      add_iov(c, settings.engine.v1->item_get_data(it), it->nbytes) != 0)
+                      add_iov(c, data, nbytes) != 0)
                       {
                           settings.engine.v1->release(settings.engine.v0, c, it);
                           break;
@@ -3094,8 +3109,7 @@ static inline void process_get_command(conn *c, token_t *tokens, size_t ntokens,
 
                 if (settings.verbose > 1) {
                     settings.extensions.logger->log(EXTENSION_LOG_DEBUG, c,
-                             ">%d sending key %s\n", c->sfd,
-                             settings.engine.v1->item_get_key(it));
+                             ">%d sending key %s\n", c->sfd, key);
                 }
 
                 /* item_get() has incremented it->refcount for us */
@@ -3214,7 +3228,7 @@ static void process_update_command(conn *c, token_t *tokens, const size_t ntoken
 
         c->item = it;
         c->ritem = settings.engine.v1->item_get_data(it);
-        c->rlbytes = it->nbytes;
+        c->rlbytes = settings.engine.v1->item_get_ndata(it);
         c->store_op = store_op;
         conn_set_state(c, conn_nread);
         break;
@@ -3311,7 +3325,6 @@ static void process_arithmetic_command(conn *c, token_t *tokens, const size_t nt
 static void process_delete_command(conn *c, token_t *tokens, const size_t ntokens) {
     char *key;
     size_t nkey;
-    item *it;
 
     assert(c != NULL);
 
@@ -4760,6 +4773,11 @@ static void *get_engine_specific(const void *cookie) {
     return c->engine_storage;
 }
 
+static int get_socket_fd(const void *cookie) {
+    conn *c = (conn *)cookie;
+    return c->sfd;
+}
+
 static void *new_independent_stats(void) {
     int ii;
     struct independent_stats *independent_stats = calloc(sizeof(independent_stats) + sizeof(struct thread_stats) * settings.num_threads, 1);
@@ -4925,6 +4943,7 @@ static void *get_server_api(server_api_t interface)
         .get_auth_data = get_auth_data,
         .store_engine_specific = store_engine_specific,
         .get_engine_specific = get_engine_specific,
+        .get_socket_fd = get_socket_fd,
         .server_version = get_server_version,
         .hash = hash,
         .realtime = realtime,

Reply via email to