> Hello,
>
> I'm currently trying to evaluate the performance of a memcached server
> using several client workloads, and I have a question about the multi-get
> implementation in the binary protocol.
>
> As I understand it, in the ascii protocol we can send multiple keys in a
> single request packet to implement multi-get. But in the binary protocol,
> it seems we have to send multiple request packets (one request packet per
> key). Even if we send multiple getq requests and then a get for the last
> key, we only save response packets on cache misses. If I understand
> correctly, multi-get in the binary protocol cannot reduce the number of
> request packets, and it also cannot reduce the number of response packets
> when the hit ratio is very high (e.g. a 99% get hit rate).
>
> If the performance bottleneck is on the network side rather than on the
> CPU, I think reducing the number of packets is still very important, and I
> don't understand why the binary protocol doesn't address this. Did I miss
> something?
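The two request shapes the question contrasts can be sketched roughly like this. This is a minimal illustration only: the 24-byte request header layout and the get/getq opcodes follow the memcached binary protocol, but the helper names (`bin_request`, `binary_multiget`, `ascii_multiget`) are made up for this sketch.

```python
import struct

# Binary-protocol request opcodes: quiet get (no response on miss) and plain get.
GETQ, GET = 0x09, 0x00

def bin_request(opcode, key):
    """Build one 24-byte binary-protocol request header plus the key.

    Header fields: magic, opcode, key length, extras length, data type,
    vbucket, total body length, opaque, CAS.
    """
    key = key.encode()
    header = struct.pack(">BBHBBHIIQ",
                         0x80,        # request magic
                         opcode,
                         len(key),    # key length
                         0, 0, 0,     # extras len, data type, vbucket
                         len(key),    # total body length (key only)
                         0, 0)        # opaque, CAS
    return header + key

def binary_multiget(keys):
    """Binary multi-get: pipeline getq for all but the last key, then a
    plain get so the server flushes responses. One packet per key."""
    packets = [bin_request(GETQ, k) for k in keys[:-1]]
    packets.append(bin_request(GET, keys[-1]))
    return b"".join(packets)

def ascii_multiget(keys):
    """Ascii multi-get: every key travels in a single request line."""
    return ("get " + " ".join(keys) + "\r\n").encode()
```

Note how the ascii request stays one small line no matter how many keys are batched, while the binary request grows by a full header per key — which is the asymmetry the question is about.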
You're right, it sucks. I was never happy with it, but haven't had time to add adjustments to the protocol for this. To note, with .19 some inefficiencies in the protocol were lifted, and most network cards are fast enough for most situations, even if it's one packet per response (and large enough responses split into multiple packets anyway).

The reason it was done this way is latency and streaming of responses:

- In an ascii multiget, I can send 10,000 keys, but then I'm forced to wait for the server to look up all of the keys before it sends its responses. That latency typically isn't very high, but it's there.
- In a binary multiget, the responses are sent back more or less as the server produces them. This reduces the latency until you start seeing responses, regardless of how large your multiget is. That's useful if you have the kind of client which can start processing responses in a streaming fashion; it can potentially reduce the total time to render your response, since you keep the CPU busy unmarshalling responses instead of sleeping.

However, it should have some tunables: one where it does at least one write per complete packet (TCP_CORK'ed, or similar), and one where it buffers responses up to some size. In my tests I can get ascii multiget up to 16.2 million keys/sec, but (with the fixes in .19) binprot caps out at 4.6m and spends all of its time calling sendmsg(). Most people need far, far less than that, so binprot as-is should be okay.

The code isn't too friendly to this, and there are other, higher priority things I'd like to get done sooner. The relatively small number of people who do 500,000+ requests per second in binprot (they're almost always on ascii at that scale) is the other reason.
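The streaming consumption described above — processing each binary-protocol response as it arrives instead of waiting for the whole batch — can be sketched like this. The 24-byte response header layout follows the binary protocol; `stream_responses` and the `sock_read` callable are hypothetical names, and `sock_read(n)` is assumed to return exactly `n` bytes (or empty on EOF).

```python
import struct

def stream_responses(sock_read):
    """Yield (opaque, value) pairs one at a time as complete binary-protocol
    responses arrive, so the caller can unmarshal while the server is still
    sending. Quiet-get misses produce no response; the terminating plain
    get's response ends the run."""
    while True:
        header = sock_read(24)
        if len(header) < 24:
            return  # connection closed
        (_magic, opcode, keylen, extlen, _dtype, _status,
         bodylen, opaque, _cas) = struct.unpack(">BBHBBHIIQ", header)
        body = sock_read(bodylen) if bodylen else b""
        # Body layout: extras, then key (if any), then the value.
        yield opaque, body[extlen + keylen:]
        if opcode == 0x00:  # plain GET: the server has flushed everything
            return
```

Each response is handed to the caller as soon as its header and body are read, which is what lets a streaming client overlap its own processing with the server's lookups.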