Re: Memcached implementation analysis and comparison with Redis

dormando Mon, 16 Feb 2015 23:38:21 -0800

> On Tuesday, 17 February 2015 03:38:55 UTC+7, Dormando wrote:
>
>       Again, in actual benchmarks I've not been able to prove them to be a
>       problem. In one of the gist links I provided before I show an "all miss"
>       case, which acquires/releases a bucket lock but does not have the overhead
>       of processing the value. In those cases it was able to process over 50
>       million keys per second. The internals don't tend to be the slow part
>       anymore.
>
> Seems that that was with 32 threads. Hence throughput per thread is very
> roughly 1.5 mln keys/sec.
> That means 600-700 ns / op latency. (Or about 300 ns, if that was with 16
> threads.)
> Maybe it's not the major part of the total thousands of ns an average
> Memcached op takes now, but this is a considerable amount to optimize.
> An uncontended spin lock should take only dozens of ns.
> Also, if it just queries an empty table, the cache is uncontended and the
> memory of the mutex structures is not evicted as quickly as under normal
> conditions. So such a test tends to show faster mutex ops than they
> actually are.


I don't care at all about optimizing up from 1.5m keys/sec per thread.
When fetching real values it got up to 20m keys/sec for 32 threads. That's
more than you can possibly need for a few more years until hardware gets
cheaper.
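
For reference, the per-thread latency figures being argued over are just the
aggregate rate divided by the thread count, then inverted. A minimal sketch of
that arithmetic (ns_per_op is a name made up here for illustration, not
anything in memcached):

```c
/* Hypothetical helper for illustration only.
 * Average time one thread spends per key, in nanoseconds, given the
 * aggregate keys/sec across all threads and the number of threads. */
static double ns_per_op(double total_keys_per_sec, int threads)
{
    double per_thread = total_keys_per_sec / threads;  /* keys/sec per thread */
    return 1e9 / per_thread;                           /* ns per key */
}
```

With the numbers from this thread, ns_per_op(50e6, 32) gives 640 ns for the
all-miss case (320 ns if it had been 16 threads), and ns_per_op(20e6, 32)
gives 1600 ns when fetching real values.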

There's honestly nobody even asking for better performance... it's just an
area of academic study for a lot of people (see one of the dozens of
papers on modifying memcached).

>
> What about optimizing snprintf()?
> - For slab class, we know the highest digit of the value length
> - hand-written itoa()s instead of snprintf()?
> - Optimize / precompute most probable combinations of flags in decimal repr
> - If values are constantly sized (or is that hardly the case for memcached?)
>   or fall into one of several size classes, an ultra thin hash table with
>   precomputed value lengths should help

Optimizing the snprintf could help write speed a chunk. There are a few good
approaches to it. It's low priority for me (connection work and better
slab rebalancing would be more useful to an end user), but good patches
are welcome.
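
To make the hand-written itoa() idea concrete, here is a rough sketch of what
such a patch might look like. It assumes the text being formatted is a flags
value and a value length, both in decimal, followed by CRLF. The function
names and buffer sizes are made up for illustration and are not taken from the
memcached source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the decimal form of v into buf and returns the
 * number of characters written. buf must hold at least 11 bytes (10 digits
 * for UINT32_MAX plus the terminating NUL). */
static size_t u32_to_dec(uint32_t v, char *buf)
{
    char tmp[10];
    size_t n = 0;

    do {                              /* produce digits, least significant first */
        tmp[n++] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);

    for (size_t i = 0; i < n; i++)    /* reverse them into the caller's buffer */
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}

/* Hypothetical stand-in for snprintf(buf, len, " %u %u\r\n", flags, nbytes).
 * buf must hold at least 25 bytes: 2 spaces, 2 * 10 digits, CRLF and NUL.
 * Returns the length of the string written, not counting the NUL. */
static size_t make_suffix(uint32_t flags, uint32_t nbytes, char *buf)
{
    char *p = buf;

    *p++ = ' ';
    p += u32_to_dec(flags, p);
    *p++ = ' ';
    p += u32_to_dec(nbytes, p);
    *p++ = '\r';
    *p++ = '\n';
    *p = '\0';
    return (size_t)(p - buf);
}
```

Whether this actually beats snprintf() by a useful margin would need to be
measured on a real write-heavy workload. The sketch only shows that the format
parsing and varargs handling can be skipped.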

The best thing anyone can do is help test the branches I'm actually
working on. If I can't release these then we don't move forward and
nothing happens at all :/

We're not sponsored like redis is. I've only ever lost money on this
venture.
