On 4. apr. 2010, at 17.37, KaiGai Kohei wrote:
>
>>> $ git diff origin/reworks_1 origin/reworks_2
>>> -> It adds item_get_nkey() and item_get_ndata() engine APIs, to inject
>>> security attribute as a part of values by intermediation modules
>>> (such as bucket or selinux).
>>>
>>
>> What is the primary motivation for doing this? I don't see why backends
>> would "dynamically change" these values for an item. The reason I added
>> a function to get the key and data was because one could imagine that
>> they could be stored in different locations (or memory-mapped data areas)...
>> CAS is called through the API to allow the CAS to be optional in the backend
>> if you don't want to waste 8 bytes per item... From what I've seen earlier,
>> you chose to store your security information as a textual string after the
>> key? You could still do that, but then let nkey contain the number of bytes
>> in the key, and keep the other information somewhere else.
>
> What I am trying to do is to store the security information as a part of the
> value for the secondary modules, rather than just after the key.
> In this approach, the secondary module doesn't need any special treatment for
> items with security information, because it just stores the given value
> as is.
>
> As someone pointed out before, I don't think it is a good idea to handle
> the security information specially and independently from existing keys and
> values, because it is unclear whether the secondary modules pay attention to
> these security properties.
> If the secondary module sees the security information as just a part of the
> value, it will be handled correctly. If not, it is just a bug in the secondary
> one. So, I want to modify the value when it is delivered from the primary
> module to the secondary one, and want to split up the item when it is
> delivered from the primary module to the memcached core.
>
> The existing item_get_data() allows the primary module to modify the pointer
> to the data field, although that might be a new use of it, but we cannot
> adjust the length of the data field right now.
> The purpose of item_get_ndata() is to let the primary module inject its
> security information transparently to both the core memcached and the
> secondary modules.
>
>   User
>   |  ^
>   |  |  {key = "abcd", value = "foovarbaz"}
>   v  |
>   Memcached
>   |  ^
>   |  |  {key = "abcd", nkey=4, value = "foovarbaz", nbytes=9}
>   v  |
>   Primary engine module
>   |  ^
>   |  |  {key = "abcd", nkey=4, value = "secret\0foovarbaz", nbytes=16}
>   v  |                                  ^^^^^^^^ ... transparently injected
>   Secondary engine module <----> [its item storage]
>
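Just to make the diagram above concrete, here is a minimal sketch in C of what
the primary module would be doing: prepend "label\0" to the value on the way
down, and split it off again on the way up. The names inject_label() and
split_label() are made up for illustration and are not part of the engine
interface, and a real module would work on items rather than flat buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an engine API): write "<label>\0<value>" into out.
 * Returns the number of bytes stored, or 0 if out is too small. */
size_t inject_label(const char *label, const void *value, size_t nbytes,
                    char *out, size_t outcap)
{
    assert(label != NULL && out != NULL);
    size_t nlabel = strlen(label) + 1;      /* label plus its NUL terminator */
    if (nlabel + nbytes > outcap) {
        return 0;
    }
    memcpy(out, label, nlabel);
    if (nbytes > 0) {
        memcpy(out + nlabel, value, nbytes);
    }
    return nlabel + nbytes;
}

/* Hypothetical helper: locate the user-visible value behind the label.
 * Returns 0 on success, -1 if the stored bytes contain no NUL separator. */
int split_label(const char *stored, size_t stored_nbytes,
                const char **value, size_t *nbytes)
{
    assert(stored != NULL && value != NULL && nbytes != NULL);
    const char *nul = memchr(stored, '\0', stored_nbytes);
    if (nul == NULL) {
        return -1;
    }
    *value = nul + 1;                       /* first byte after the separator */
    *nbytes = stored_nbytes - (size_t)(*value - stored);
    return 0;
}
```

With the values from the diagram, injecting "secret" in front of the 9-byte
"foovarbaz" gives 16 stored bytes, and splitting returns the original 9.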
I'm still having a hard time seeing how this will work in practice... How would
you request the object from the underlying engine? The key would be longer and
contain data you don't know about? Or are you saying that engines have to call
the function to internally determine the length of their keys for lookup?
Personally I don't think it is a good idea to store the security attribute on a
per-item basis. You are going to waste a _lot_ of memory (we made CAS optional
so that users could save 8 bytes per item; this is going to be more). From a
memory usage perspective I guess it's better to do something like Dustin's
bucket engine, and store items with the same label in a separate container. The
drawback with this is that you have to search all the containers the connection
dominates to find the object. In theory this could be a _lot_ of containers,
but I would guess that in most setups it would most likely be a handful...
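
A rough sketch in C of the lookup I have in mind, assuming one container per
label and a list of the containers a connection dominates. The struct and
function names are invented for the example (this is not how the bucket engine
is written), and the linear scan inside a container stands in for a real hash
table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative types only: one container holds all items with the same label,
 * so no per-item security attribute is stored. */
struct entry {
    const char *key;
    const char *value;
};

struct container {
    const char *label;
    const struct entry *entries;
    size_t nentries;
};

/* Look up a key inside a single container (linear scan for brevity). */
const char *container_get(const struct container *c, const char *key)
{
    assert(c != NULL && key != NULL);
    for (size_t i = 0; i < c->nentries; i++) {
        if (strcmp(c->entries[i].key, key) == 0) {
            return c->entries[i].value;
        }
    }
    return NULL;
}

/* Search every container the connection dominates, in order.
 * The cost grows with the number of dominated containers. */
const char *lookup_dominated(const struct container *const *dominated,
                             size_t ncontainers, const char *key)
{
    for (size_t i = 0; i < ncontainers; i++) {
        const char *value = container_get(dominated[i], key);
        if (value != NULL) {
            return value;
        }
    }
    return NULL;
}
```

A connection that dominates only a "public" container never sees keys that
live in a "secret" one, while a connection dominating both pays for two
lookups in the worst case.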
>
>>> $ git diff origin/reworks_3 origin/reworks_4
>>> -> It replaces settings.engine.v1->xxx(...) invocations by wrapper
>>> functions.
>>>
>>
>> The source code does not follow the same coding standard as the rest
>> of memcached...
>
> Does the coding standard mean tab, indent, case arc and others?
> If so, I'll fix them up to match the rest of memcached.
>
Look at how the rest of the source code is formatted..
> Or, are you saying the wrapper functions are not standard practice in
> memcached, and so unnecessary?
But I don't see how it increases the readability of the source ;-) I would guess
that when we are going to support more interface protocols we will need to
modify more than just the invocation of the function (new conditions etc.). But
who knows..
Cheers,
Trond