Thanks for your feedback, Dustin, and for the opportunity to explain my work.
I wanted to start with something that works and later adapt it based on
user feedback, if people use it.

The logic for virtual keys is implemented entirely in Lua, so it is
trivial to move from get to set or add, etc., without touching the C code.
I thought about JSON objects too, but the current implementation uses
Lua tables as data objects. I found some code which does very fast
serialization and de-serialization of Lua tables. If required, JSON
support can easily be added, as JSON maps fairly well to Lua tables.
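
The table-to-JSON mapping is straightforward to picture. Here is a minimal
Python sketch (a stand-in for the Lua code, which isn't shown here) of
round-tripping a nested, table-like data object through a JSON wire format;
the object contents are made up for illustration:

```python
import json

# A nested dict stands in for a Lua table used as a data object.
obj = {"user": "rohitk", "tags": ["cache", "lua"], "hits": 42}

# Lua tables with string keys map naturally onto JSON objects,
# and array-like tables onto JSON lists.
wire = json.dumps(obj)
restored = json.loads(wire)

assert restored == obj
```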

Script loading and compilation take time, hence the decision to have a
static set of scripts. In the future I would perhaps expose the "script
set" as a virtual key so that it can be manipulated from the client side
with regular commands.
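
The "script set as a virtual key" idea could look roughly like this. The
command shapes and the registry layout below are hypothetical sketches of
the design, not Cacheismo's actual API:

```python
# Hypothetical registry: script name -> script source (or compiled chunk).
scripts = {}

def handle(command, key, value=None):
    """Dispatch regular get/set commands on a virtual 'script:' namespace."""
    if key.startswith("script:"):
        name = key.split(":", 1)[1]
        if command == "set":
            scripts[name] = value          # upload or replace a script
            return "STORED"
        if command == "get":
            return scripts.get(name)       # fetch the script source back
    return "ERROR"

# Regular set/get commands now manipulate the script set.
handle("set", "script:counter", "function(...) ... end")
assert handle("get", "script:counter") == "function(...) ... end"
```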

thanks!
rohitk

On Wed, Sep 28, 2011 at 2:32 PM, Dustin <[email protected]> wrote:

>
> On Wednesday, September 28, 2011 1:34:03 AM UTC-7, Rohit Karlupia wrote:
>
>
>>     That is true. Here is why it doesn't impact cacheismo so much:
>>         -  Valid buffer sizes are fixed (16 + n*16, up to 4096). It is
>>            not possible to allocate more than 4KB of contiguous memory.
>>         -  Objects are stored in a "list of buffers" instead of a
>>            contiguous block of memory.
>>         -  The allocator provides a relaxedMalloc function which can
>>            return less memory than what you asked for. The item storage
>>            code depends on relaxedMalloc, not malloc.
>>        Thus if we need to store a 4KB item, we can use 3KB+1KB or
>>        2KB+2KB or 2048+1024+512+512, etc. Currently the code works with
>>        an array of buffers, but I guess it should be easy to change that
>>        to a linked list. This also limits the max size of an item stored
>>        in the cache.
>>
>>      So yes, this allocator also fragments, but it doesn't matter much.
>>      We can still use the fragmented memory, though at a higher memory
>>      management cost.
>>
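
The splitting scheme described above (a 4KB item stored as, e.g.,
2048+1024+512+512) can be sketched like this. `relaxed_malloc` here is a
toy stand-in for the allocator described, assuming it simply hands back the
largest fixed buffer class it can currently satisfy:

```python
# Fixed buffer classes, largest first; nothing above 4KB is contiguous.
BUFFER_SIZES = [4096, 2048, 1024, 512, 256, 128, 64, 32, 16]

def relaxed_malloc(wanted, available_sizes):
    """Toy stand-in: return the largest buffer class <= wanted that the
    allocator can currently satisfy (possibly less than requested)."""
    for size in BUFFER_SIZES:
        if size <= wanted and size in available_sizes:
            return size
    return None

def allocate_item(item_size, available_sizes):
    """Build a 'list of buffers' covering item_size, as the item store does."""
    chunks, remaining = [], item_size
    while remaining > 0:
        got = relaxed_malloc(remaining, available_sizes)
        if got is None:
            # Round the tail up to the smallest class that covers it.
            got = min(s for s in BUFFER_SIZES if s >= remaining)
        chunks.append(got)
        remaining -= got
    return chunks

# With no 4KB buffer free, a 4KB item still fits as smaller pieces.
chunks = allocate_item(4096, {2048, 1024, 512})
assert sum(chunks) >= 4096 and len(chunks) > 1
```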
>
>   Oh that's neat.  We added support for that in the engine API, but nobody
> had written a scatter/gather allocator so we limited the maximum number of
> segments to 1 (since that's all we've done).
>
>
>>      As I explained earlier, cacheismo doesn't work with contiguous
>> objects (at most 4KB, if available). So what matters is that we have 1MB
>> free, not how fragmented it is. What it does ensure is that whatever
>> items are freed, 13 bytes or 1300 bytes or 13KB, they are the least
>> useful bytes to cache.
>>
>
>   That makes sense.  In this case, it's still almost 81,000 items to be
> evicted down the LRU vs. 1 while you're making room.
>
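
The ~81,000 figure presumably comes from freeing one 1MB slab's worth of
space by evicting small items; assuming 13-byte items as in the example
above, the arithmetic is:

```python
# Evicting enough 13-byte items to free 1MB of space:
items = (1024 * 1024) // 13
assert 80000 < items < 82000  # roughly 81,000 evictions vs. 1
```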
>
>>> I thought those were the defaults and that most people would use it
>>> that way.
>>>
>>
>   You can see the default with memcached -vv
>
>
>>> At some time in the future, when the code is mature enough.
>>>
>>
>   I've done some similar work in an experimental membase branch, though I
> only used a binary protocol extension for scripting (though it does have
> some pretty neat properties, IMO).
>
>   Adding text protocol support is *almost* easy, but I find the binary
> protocol significantly easier to interact with.
>
>
>>> Those are valid script keys. Why not "script set:new:mykey"?
>>>
>> Most memcached clients would need to be changed to support a new
>> "script" keyword. Hence the decision to overload the get command.
>>
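
Overloading get means the key itself carries the command. A hypothetical
sketch of that parsing (the colon-separated `verb:args` shape follows the
"script set:new:mykey" example above, not Cacheismo's exact syntax) — which
also shows why colons can't appear in the values, as noted below:

```python
def parse_virtual_key(key):
    """Split an overloaded get key like 'set:new:mykey' into a script
    verb and its arguments (colon-separated, so values can't contain ':')."""
    parts = key.split(":")
    return parts[0], parts[1:]

verb, args = parse_virtual_key("set:new:mykey")
assert verb == "set"
assert args == ["new", "mykey"]
```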
>
>   It seems like you'll hit limitations pretty quickly.  For example, it
> looks like you can't have colons or whitespace in the values you operate on.
>  If you allow space separated arguments, you'll end up with your parameters
> reordered, of course.  You'd want something that looks more like the "set"
> command to push scripts in and out.  Might as well just call it "script" at
> that point.  You'll want to push in complex objects and stuff.  In my case,
> I could submit batch jobs, upload scripts dynamically, pass in server-side
> CAS collision resolution functions on JSON data coming from the client,
> etc...
>
>
>> Cacheismo is not faster than memcached; memcached was about 20% faster.
>> Cacheismo had a better hit rate.
>>
>
>   Ah, so this is specifically due to your tighter memory packing.  Sorry,
> that was a bit unclear.
>
>
>> These are the raw numbers....
>>
>
>   ...thanks, always good to see the details.
>
