> Hello,
>
> I'm currently evaluating the performance of a memcached server using
> several client workloads.
> I have a question about multi-get implementation in binary protocol.
> As I know, in ascii protocol, we can send multiple keys in a single request 
> packet to implement multi-get.
>
> But, in a binary protocol, it seems that we should send multiple request 
> packets (one request packet per key) to implement multi-get.
> Even if we send multiple getQ requests and then a get for the last key, we
> only save response packets for cache misses.
> If I understand correctly, multi-get in binary protocol cannot reduce the 
> number of request packets, and
> it also cannot reduce the number of response packets if hit-ratio is very 
> high (like 99% get hit).
>
> If the performance bottleneck is on the network side rather than the CPU, I
> think reducing the number of packets is still very important,
> but I don't understand why the binary protocol doesn't care about this.
> Did I miss something?

You're right, it sucks. I was never happy with it, but haven't had time to
add adjustments to the protocol for this. Note that with .19 some
inefficiencies in the protocol were lifted, and most network cards are
fast enough for most situations, even if it's one packet per response (and
large enough responses get split into multiple packets anyway).

The reason why this was done is for latency and streaming of responses:

- In ascii multiget, I can send 10,000 keys, but then I'm forced to wait for
the server to look up all of the keys before it sends its responses. The
wait isn't typically very long, but there's some latency to it.

- In binary multiget, responses are sent back more or less as the server
receives the requests from the network. This reduces the time until you
start seeing responses, regardless of how large your multiget is. This is
useful if you have a kind of client which can start processing responses in
a streaming fashion. It potentially reduces the total time to render your
response, since you can keep the CPU busy unmarshalling responses instead
of sleeping.
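
To make the difference in request shape concrete, here is a rough sketch of
what a client puts on the wire in each case. The helper names are made up
for illustration; the 24-byte header layout and the GetK/GetKQ opcodes are
as I understand them from the binary protocol spec. This is not how any
particular client library does it.

```python
import struct

# Binary request header, 24 bytes: magic, opcode, key length, extras length,
# data type, vbucket/reserved, total body length, opaque, CAS.
HEADER = struct.Struct("!BBHBBHIIQ")
REQ_MAGIC = 0x80
OP_GETK = 0x0C   # get, key echoed in the response; always answered
OP_GETKQ = 0x0D  # quiet variant; the server stays silent on a miss


def ascii_multiget(keys):
    """One request line carrying every key."""
    return b"get " + b" ".join(keys) + b"\r\n"


def binary_request(opcode, key, opaque=0):
    """One binary request: header plus key, no extras and no value."""
    header = HEADER.pack(REQ_MAGIC, opcode, len(key), 0, 0, 0,
                         len(key), opaque, 0)
    return header + key


def binary_multiget(keys):
    """GetKQ for every key but the last, then GetK to terminate the batch."""
    *quiet, last = keys
    parts = [binary_request(OP_GETKQ, k, i) for i, k in enumerate(quiet)]
    parts.append(binary_request(OP_GETK, last, len(quiet)))
    return b"".join(parts)


if __name__ == "__main__":
    keys = [b"user:1", b"user:2", b"user:3"]
    print(len(ascii_multiget(keys)), "bytes in the ascii request")
    print(len(binary_multiget(keys)), "bytes in the binary requests")
```

Each binary key costs a full 24-byte header, but nothing stops a client from
concatenating the requests into one buffer and handing it to the kernel in
a single write, as binary_multiget() does here. How that buffer is split
into TCP segments is then up to the network stack.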

However, it should have some tunables: one where it does at least one
write per complete packet (TCP_CORK'ed, or similar), and one where it
buffers up to some size. In my tests I can get ascii multiget up to 16.2
million keys/sec, but (with the fixes in .19) binprot caps out at 4.6
million and spends all of its time calling sendmsg(). Most people need far,
far less than that, so binprot as is should be okay, though.
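
The "buffers up to some size" tunable could look roughly like the sketch
below. This is not memcached code: BufferedWriter, max_buffered and the
send callable are hypothetical names, and the only point is to show how
coalescing small responses cuts the number of send calls.

```python
class BufferedWriter:
    """Coalesce small responses and hand them to `send` in larger chunks.

    `send` is any callable taking bytes (for example a socket's sendall);
    `max_buffered` is the hypothetical tunable: flush once this many bytes
    have accumulated.
    """

    def __init__(self, send, max_buffered=1400):
        self._send = send
        self._max = max_buffered
        self._buf = bytearray()
        self.calls = 0  # number of times `send` was invoked

    def write(self, response):
        self._buf += response
        if len(self._buf) >= self._max:
            self.flush()

    def flush(self):
        # Must also be called at the end of a batch to push out the tail.
        if self._buf:
            self._send(bytes(self._buf))
            self.calls += 1
            self._buf.clear()


if __name__ == "__main__":
    sent = []
    writer = BufferedWriter(sent.append, max_buffered=100)
    for _ in range(10):
        writer.write(b"x" * 30)  # ten small 30-byte responses
    writer.flush()
    print(writer.calls, "send calls for 10 responses")
```

The trade-off is the one described above: a bigger buffer means fewer
syscalls and fewer packets, but the first response reaches the client
later. On Linux the kernel can do similar coalescing if the socket has
TCP_CORK set, which is the other tunable mentioned.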

The code isn't too friendly to this and there are other higher priority
things I'd like to get done sooner. The relatively small number of people
who do 500,000+ requests per second in binprot (they're almost always
ascii at that scale) is the other reason.
