A few ideas of engine framework

KaiGai Kohei Wed, 31 Mar 2010 23:47:34 -0700

Hello,

As we discussed before, I am focusing on access control facilities in memcached.
In the "memcached and access control" thread, it was suggested that I implement
them as an engine module rather than as core features.
Except for a few matters, that seems to be a feasible approach.


The issue about remove() method was resolved. (Thanks!)

However, a few matters still prevent the access control feature from being
implemented as an engine module. So, I'd like to discuss the following ideas.

* A new server API to obtain socket file descriptor.

SELinux provides an API to retrieve the security context of the peer process
for a given socket file descriptor (see getpeercon(3)).
It lets a server process learn the privilege of the client process, and
we can use that privilege for access control decisions inside the server
process.
Right now, conn->sfd is the socket file descriptor of the session. The "conn"
data is delivered to the engine module as an opaque "cookie". In practice, we
could fetch the socket file descriptor by relying on knowledge of the internal
layout of the structure, but that does not seem desirable to me.

I'd like to add a new server API, as follows:

  int (*get_socket_file_descriptor)(const void *cookie);
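
To make the idea concrete, here is a minimal sketch of how such a callback
could keep the cookie opaque on the engine side. The names conn_sketch,
server_api_sketch, core_get_socket_fd and engine_peer_fd are hypothetical
and only for illustration; they are not the existing memcached structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the core's internal connection structure. */
struct conn_sketch {
    int sfd;    /* socket file descriptor of the session */
};

/* Hypothetical subset of the server API table handed to an engine. */
typedef struct {
    int (*get_socket_file_descriptor)(const void *cookie);
} server_api_sketch;

/* Core side: the only place that knows the layout behind the cookie. */
static int core_get_socket_fd(const void *cookie)
{
    const struct conn_sketch *c = cookie;
    return c != NULL ? c->sfd : -1;
}

static const server_api_sketch server_api = {
    .get_socket_file_descriptor = core_get_socket_fd,
};

/* Engine side: the cookie stays opaque, only the callback is used.
 * An SELinux engine would pass the returned fd to getpeercon(3). */
static int engine_peer_fd(const server_api_sketch *api, const void *cookie)
{
    assert(api != NULL && api->get_socket_file_descriptor != NULL);
    return api->get_socket_file_descriptor(cookie);
}
```

With this shape, the engine never needs to include the definition of the
connection structure.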


* New APIs to retrieve length of key/data

Right now, the item structure has nbytes (uint32_t) and nkey (uint16_t) fields,
and the core memcached reads the length of the key/data directly, without any
interposition by engine modules.
I would very much like references to nbytes/nkey to be encapsulated,
because that makes it possible to store security properties used for access
control, such as ownership or a security context, together with the item.

I plan for the upcoming selinux module to sit between the core memcached
and a secondary module (such as default_engine.so). For example, when the get()
method is called, selinux's get() is invoked first. It then checks the user's
privilege and calls the secondary module if access is allowed, as the bucket
engine does.
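
The stacking described above could look roughly like the sketch below. All
of the sk_* and toy_* names, the error codes, and the may_read policy hook
are hypothetical; the real engine interface has a different get() signature,
and a real policy check would query SELinux rather than read an int.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical error codes, for this sketch only. */
typedef enum { SK_SUCCESS, SK_KEY_ENOENT, SK_EACCESS } sk_error;

/* Hypothetical, heavily reduced engine handle. */
typedef struct sk_engine {
    sk_error (*get)(struct sk_engine *self, const void *cookie,
                    const char *key, const char **value);
    struct sk_engine *secondary;                 /* next engine in the stack */
    int (*may_read)(const void *cookie, const char *key);  /* policy hook */
} sk_engine;

/* Primary get(): check the privilege first, then delegate. */
static sk_error primary_get(sk_engine *self, const void *cookie,
                            const char *key, const char **value)
{
    if (!self->may_read(cookie, key))
        return SK_EACCESS;
    return self->secondary->get(self->secondary, cookie, key, value);
}

/* Toy secondary engine that knows a single key. */
static sk_error toy_get(sk_engine *self, const void *cookie,
                        const char *key, const char **value)
{
    (void)self;
    (void)cookie;
    if (strcmp(key, "aaa") != 0)
        return SK_KEY_ENOENT;
    *value = "foovarbaz";
    return SK_SUCCESS;
}

/* Toy policy: the cookie points to an int, nonzero means "allowed". */
static int toy_may_read(const void *cookie, const char *key)
{
    (void)key;
    return cookie != NULL && *(const int *)cookie != 0;
}
```

The secondary engine is never reached when the check fails, so it does not
need to know anything about access control.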

I think the most portable way to store the security properties of items is
to inject them as part of the data field before invoking the secondary
module.
For example, if a user tries to store the pair {key:"aaa", value:"foovarbaz"},
memcached invokes the primary engine module with this key-value pair.
Assume the default security context of the item object in this case is
"classified". The SELinux module injects it into the value and then calls
the secondary module. The secondary module sees the modified key-value pair,
as if {key:"aaa", value:"classified\0foovarbaz"} is given.
                         ^^^^^^^^^^^^
I believe this approach will work with various kinds of secondary modules,
because they simply see data that is a bit longer than the original.
The item_get_data() method gives the primary module a chance to fix up
the pointer to the data field. For example, SELinux can return the address
just past the first '\0' in the data field returned by the secondary
module.

However, memcached reads the nbytes and nkey fields of the item structure
directly, without any method calls, so it gets the length of the data wrong:
item->nbytes holds the combined length of the security context and the
actual data.

So, I'd like to add the following two engine APIs:

  uint16_t (*item_get_nkey)(const item *item);

  uint32_t (*item_get_ndata)(const item *item);

They allow the primary module (selinux) to return the adjusted length of the
data, consistent with the item_get_data() method.
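
As a rough sketch of the whole round trip, the code below prepends the
security context to the value and then hides it again through a pair of
accessors. The item_sketch type and the wrap_value, label_span and sketch_*
functions are hypothetical names for illustration; the real item structure
and engine methods differ.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal item: only the fields this sketch needs. */
typedef struct {
    uint32_t nbytes;    /* stored length: label + '\0' + payload */
    char *data;
} item_sketch;

/* Build "<label>\0<payload>" before handing the item to the secondary. */
static int wrap_value(item_sketch *it, const char *label,
                      const void *value, uint32_t nvalue)
{
    size_t nlabel = strlen(label) + 1;      /* include the '\0' separator */

    it->data = malloc(nlabel + nvalue);
    if (it->data == NULL)
        return -1;
    memcpy(it->data, label, nlabel);
    if (nvalue > 0)
        memcpy(it->data + nlabel, value, nvalue);
    it->nbytes = (uint32_t)(nlabel + nvalue);
    return 0;
}

/* Bytes occupied by the label and its separator; 0 if no separator. */
static uint32_t label_span(const item_sketch *it)
{
    const char *end = memchr(it->data, '\0', it->nbytes);
    return end != NULL ? (uint32_t)(end - it->data) + 1 : 0;
}

/* What the primary module would return for the data pointer ... */
static const char *sketch_item_get_data(const item_sketch *it)
{
    return it->data + label_span(it);
}

/* ... and for the data length, so the two stay consistent. */
static uint32_t sketch_item_get_ndata(const item_sketch *it)
{
    return it->nbytes - label_span(it);
}
```

For the example above, wrap_value() with "classified" and "foovarbaz" stores
20 bytes, while the accessors report the original 9 bytes of payload.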

In my personal opinion, nkey and nbytes are not necessary fields in the
common item structure. References to them should be encapsulated in the
engine module.


* Flags to show what features are provided with the engine module.

Right now, when a primary module (like selinux or bucket) loads secondary
engine modules, it has no way to know what features the secondary modules
provide.

For example, if the secondary module does not support persistent storage,
the selinux module would prefer, for performance reasons, to store the
security attribute of the item in security identifier form rather than flat
text form.

I'd like to add a "flags" member to the engine_interface structure, to inform
the caller of the features the module provides.
Perhaps we can consider the following flags for now?

  ENGINE_FEATURE_PSEUDO  = 0x0001  /* set, if module does not manage items actually */
  ENGINE_FEATURE_STORAGE = 0x0002  /* set, if item can be stored in persistent storage */
      :
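
A primary module could then test these bits when it loads a secondary
module. The sketch below assumes the two flag values proposed above; the
engine_info_sketch type, the label_format enum and choose_label_format()
are hypothetical. Choosing text form when persistent storage is present is
my assumption, on the grounds that security identifiers are only meaningful
within a running process.

```c
#include <stdint.h>

/* Flag values as proposed in this mail. */
enum {
    ENGINE_FEATURE_PSEUDO  = 0x0001,
    ENGINE_FEATURE_STORAGE = 0x0002
};

/* Hypothetical holder for the proposed "flags" member. */
typedef struct {
    uint32_t flags;
} engine_info_sketch;

/* Hypothetical choice of how the security attribute is stored. */
typedef enum { LABEL_AS_SID, LABEL_AS_TEXT } label_format;

/* Without persistent storage, the compact identifier form is enough. */
static label_format choose_label_format(const engine_info_sketch *secondary)
{
    return (secondary->flags & ENGINE_FEATURE_STORAGE) ? LABEL_AS_TEXT
                                                       : LABEL_AS_SID;
}
```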

* Isn't settings.engine.v1->xxxx() every time ugly?

Right now, the core memcached code calls engine methods through function
pointers every time. In my view, these calls should be wrapped in a thin
layer, like:

  static inline ENGINE_ERROR_CODE
  engine_initialize(ENGINE_HANDLE* handle, const char* config_str)
  {
      if (settings.engine.v0.interface == 1)
          return settings.engine.v1->initialize(handle, config_str);
      else
          return ENGINE_EINVAL;
  }

There is no functional difference, but it keeps the code clean and makes it
easy to handle v2, v3, or later versions.


If we need a patch, I'll submit it later.
Any opinions?

Thanks,
-- 
KaiGai Kohei <[email protected]>

