On Nov 8, 2007, at 0:23, Tomash Brechko wrote:

But this doesn't make the scheme more flexible, because this way I
can't use INCR to both update the value _and_ refresh the entry.
Whatever predetermined approach you choose, you close the door on
other possible uses.

I didn't say it was more flexible. I was just saying it's flexible enough.

It would make sense to have a separate command for updating the flags or expiration of a record if that is really of interest.

The strong desire to keep the commands small and orthogonal was expressed at the first meeting, where we were going over the binary protocol.

        No, this get mechanism works fine, and I pipeline everything heavily
with great success.  You can get values back in any order and you are
notified when all of the results are available.

This depends on how you look at it.  I mean _sequential_ pipelining
(as pipelines actually work), while you are talking about batch
processing.  With sequential pipelining, I push requests and fetch
results, and since there's a direct one-to-one correspondence between
request and response, I don't need any additional logic on the client
side.  I.e., if I have a list of keys, I can push them to the server
and fetch the results in order.  I don't have to keep a hash on the
client to decide where a particular result belongs.

I fail to see what I'm missing. As far as I can tell, you're describing what I already do. See my write-up on client optimization and let me know what I'm missing:

        http://bleu.west.spy.net/~dustin/projects/memcached/optimization.html

Note that in my client I can issue several distinct requests and wait (blocking or not) for the results in any order I feel like.

In the text protocol, a get with several keys only returns the hits and an end marker. The idea is that if you're issuing that request, you're probably going to hand some kind of dictionary structure back to whatever asked for it.
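For example, here's a minimal hypothetical exchange (only "foo" exists on the server, so the two misses simply don't show up in the reply):

        get foo bar baz
        VALUE foo 0 3
        abc
        END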

In the binary protocol, the concept of a "multi-get" was removed in favor of a "quiet get" (no response on a miss) plus a noop. You achieve the same effect, *or* your client can decide it does want a NAK for every miss. You could also replace the last request with a non-quiet get to optimize out the noop if you wanted.
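As a rough sketch of that pattern (Python; the 24-byte header framing and the GETQ/NOOP opcodes here follow the layout as it was later written down, and the multi_get/read_exact helpers are made up for illustration, so don't take any of it as the final wire format):

  # Rough sketch only: the opcodes and 24-byte header below follow the
  # binary protocol as later documented (GETQ=0x09, NOOP=0x0a), which may
  # differ in detail from the draft being discussed here.
  import socket
  import struct

  HEADER = struct.Struct(">BBHBBHIIQ")  # magic, opcode, keylen, extlen,
                                        # datatype, status, bodylen, opaque, cas
  REQ_MAGIC, GETQ, NOOP = 0x80, 0x09, 0x0A

  def read_exact(sock, n):
      data = b""
      while len(data) < n:
          chunk = sock.recv(n - len(data))
          if not chunk:
              raise IOError("connection closed before the batch finished")
          data += chunk
      return data

  def multi_get(sock, keys):
      # Pipeline one quiet get per key, then a noop to mark the end of the batch.
      buf = b""
      for opaque, key in enumerate(keys):
          k = key.encode()
          buf += HEADER.pack(REQ_MAGIC, GETQ, len(k), 0, 0, 0, len(k), opaque, 0) + k
      buf += HEADER.pack(REQ_MAGIC, NOOP, 0, 0, 0, 0, 0, 0xFFFFFFFF, 0)
      sock.sendall(buf)

      # Misses are silent; the noop response tells us the whole batch is done.
      found = {}
      while True:
          (_, opcode, _, extlen, _, status,
           bodylen, opaque, _) = HEADER.unpack(read_exact(sock, HEADER.size))
          body = read_exact(sock, bodylen)
          if opcode == NOOP:
              return found
          if status == 0:
              found[keys[opaque]] = body[extlen:]  # strip the 4 flag bytes (extras)

  # e.g.: multi_get(socket.create_connection(("localhost", 11211)), ["a", "b"])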

In *both* cases, I don't see how I could pipeline any more than I am today.

A get across a couple of thousand keys is a one-line response in the
case where none exist (or a one-message response in the binary
protocol).

I'd rather optimize for the "found" case.  Suppose your request has a
large number of keys, and only the last one matches.  The client has to
wait till the very end (batch mode), while with pipelining it could
start processing the not-found entries right away.

Ah, well, in the general case there's no processing to do for not-found keys. If I've optimized several threads' requests together in such a way that I could theoretically know that all of the requests for one of them have been satisfied, then I could send that one off sooner.

I would suspect, however, that the time difference is negligible. A multi-get is generally considered faster than a series of individual gets in the text protocol, and they're just barely different in the binary protocol (to the point where I could simply change what my multi-get implementation does to measure the difference).

Another advantage of a flexible text protocol is that once it's there,
you don't have to update all the text clients (Perl, PHP, etc.) when
you add a new parameter to some command, given that they have the
means to send an arbitrary text request.  I.e., it will always be

 $memcached->set($key, $val, @params);

not

 $memcached->new_cas_command(...);

I'm not sure that's a huge advantage. You have to know you're doing a CAS, and they'd both probably be implemented as:

  $memcached->send_cmd(...);

        anyway.

        It's a given that the current protocol isn't perfect.  That's
why we made a new one.  You should complain about that one more.  :)

BTW, is there a description of this binary protocol?

There's not a very good one anywhere. doc/binary-protocol-plan.txt has preliminary documentation that somewhat explains the spirit of the protocol, but it hasn't been updated since more of the details were agreed upon at the second meeting. There are, however, a couple of implementations you can read that should help you understand how the protocol works in practice:

The initial test client and server code I wrote after the first meeting (and have kept up-to-date since then) is probably the best reference that exists at the moment:

        http://hg.west.spy.net/hg/python/memcached-test/


My latest memcached binary server tree is available here (tree auto-updated whenever I push my patch stack):

        http://hg.west.spy.net/hg/hacks/memcached-binary-full/archive/tip.tar.gz


        My Java client has a pretty solid binary protocol implementation:

        http://hg.west.spy.net/hg/memcached/

--
Dustin Sallings


