Re: Alternative Binary Protocol idea for memcached.

Clint Webb Wed, 20 Feb 2008 02:06:31 -0800

I wasn't expecting such a quick reply.  Good point about allowing multiple
protocols.  I might pull out some of my old code and see how easy it is to
drop in.

I thought I'd give a little background on myself and this protocol style. I
used to work in the controls and automation industry.  If you've ever
checked your luggage into an airport, or sent anything through USPS, UPS or
FedEx, or bought anything online from Amazon, B&N and other large online
stores, or from Walmart, Kmart, Target, etc... then your product has likely
had some experience with my coding somewhere along the line.

I developed this protocol for a tiny little side-project for one of the
above mentioned companies.  It was basically a connector that took
information from a large number of different systems and passed it to
another.  The requirements changed a lot, so I developed something that
could be fast, but had to be flexible.  If I added a feature to the server,
I didn't want to be forced to update all the clients as well.   Also, some
of the clients were tiny little embedded controllers, so it had to be pretty
simple too.

This solution was VERY fast, as all the commands were a single byte and could
be easily mapped to an array of callback routines.   This protocol also had
to run on a real-time system, so we had to ensure that all operations
performed in a predictable fashion.
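
As a rough illustration of the "array of callback routines" idea (not the
original code), the sketch below keeps one handler slot per possible command
byte.  The opcode values and handler names are hypothetical; the post does
not list the real ones.

```python
# Hypothetical opcodes, chosen only for this example.
CMD_NOOP = 0
CMD_CLEAR = 1


def on_noop(state, param):
    return "noop"


def on_clear(state, param):
    # Forget everything accumulated on this connection so far.
    state.clear()
    return "cleared"


def on_unknown(state, param):
    # Commands this server does not know are reported, not fatal.
    return "unknown"


# One slot per possible command byte, so lookup is a single index operation
# whose cost does not depend on how many commands are defined.
HANDLERS = [on_unknown] * 256
HANDLERS[CMD_NOOP] = on_noop
HANDLERS[CMD_CLEAR] = on_clear


def dispatch(cmd, state, param=None):
    """Call the handler registered for a one-byte command."""
    return HANDLERS[cmd](state, param)
```

Adding a command then means filling in one more slot; nothing else in the
dispatch path changes.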

I separated the commands by their parameter type.  0-31 had no parameter.
32-63 had a single-byte parameter.  64-95 had a 32-bit parameter.  96-127
had a single-byte parameter which was the length of a stream of data to
follow (short-string).  128-159 had a 32-bit integer that was the length of
a stream of data to follow.   These were our 5 different parameter types.  A
command could only have one parameter.
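
The ranges above could be decoded roughly as follows.  This is a minimal
sketch, not the original implementation: the function names are made up, and
it assumes unsigned values in network (big-endian) byte order, which the
description does not specify.  Command bytes above 159 are treated as
unassigned here.

```python
import struct


def param_kind(cmd):
    """Map a command byte to its parameter type using the ranges above."""
    if not 0 <= cmd <= 159:
        raise ValueError("command %d is outside the described ranges" % cmd)
    if cmd <= 31:
        return "none"
    if cmd <= 63:
        return "byte"
    if cmd <= 95:
        return "int32"
    if cmd <= 127:
        return "short_string"
    return "long_string"


def read_command(buf, pos=0):
    """Decode one command starting at buf[pos].

    Returns (cmd, param, next_pos), or None if the buffer does not yet
    hold the complete command and more data must be read first.
    """
    if pos >= len(buf):
        return None
    cmd = buf[pos]
    kind = param_kind(cmd)
    pos += 1
    if kind == "none":
        return cmd, None, pos
    # Assumption: big-endian, unsigned; the post does not say.
    if kind in ("byte", "short_string"):
        width, fmt = 1, "B"
    else:
        width, fmt = 4, ">I"
    if len(buf) - pos < width:
        return None
    (value,) = struct.unpack_from(fmt, buf, pos)
    pos += width
    if kind in ("byte", "int32"):
        return cmd, value, pos
    # For the two string types, value is the length of the data that follows.
    if len(buf) - pos < value:
        return None
    return cmd, bytes(buf[pos:pos + value]), pos + value
```

Because the parameter type follows from the command byte alone, the reader
can frame a command it has never heard of and still hand it to a handler.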

This way, the IO handling code could retrieve all the necessary data and
then pass it off to a callback routine for that command.

Each socket connection would have a set of buffers set aside for the
incoming data.  In this case we would want a buffer set aside to hold the
key and value data.

To speed up processing and ensure that the minimum data set has been
provided, we used a 32-bit (or was it 64-bit?) word as flags.   Each
operation would set or clear one or more flags.   So when a GO command is
received, it can quickly determine what 'action' needs to take place, and
which 'parameters' have been provided.
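
One way to picture the flags word: each command sets a bit, and GO only has
to do a couple of bit tests.  The bit assignments and the SET/GET action
names below are hypothetical, chosen just to make the sketch concrete.

```python
# Hypothetical flag bits; the post does not name the real ones.
F_KEY = 1 << 0     # a key has been supplied
F_VALUE = 1 << 1   # a value has been supplied
F_EXPIRY = 1 << 2  # an expiry has been supplied (optional)
F_SET = 1 << 8     # action: store
F_GET = 1 << 9     # action: fetch

ACTION_MASK = F_SET | F_GET

# Minimum parameters each action needs before GO may run it.
REQUIRED = {
    F_SET: F_KEY | F_VALUE,
    F_GET: F_KEY,
}


def on_go(flags):
    """Decide what to do when GO arrives, using only bit tests."""
    action = flags & ACTION_MASK
    if action not in REQUIRED:
        return "error: no single action selected"
    needed = REQUIRED[action]
    if (flags & needed) != needed:
        return "error: missing parameter"
    return "set" if action == F_SET else "get"
```

Optional parameters such as expiry simply show up as extra bits; they never
affect whether the minimum set is satisfied.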

If we ran out of room having to handle more than 256 commands, we would use
a command of 0xFF which would expect that the next byte would be another
command (from a set different to the first).  I never actually implemented
it though.  The most commands I ever used was about 100 or so.
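
Since this escape was never implemented, the following is only a sketch of
how it might look.  The function name and the table numbering are
assumptions made for the example.

```python
ESCAPE = 0xFF


def read_opcode(buf, pos=0):
    """Return (table, opcode, next_pos), or None if more bytes are needed.

    table 0 is the primary command set; table 1 is the second set that is
    reached by prefixing the command byte with 0xFF.
    """
    if pos >= len(buf):
        return None
    if buf[pos] != ESCAPE:
        return 0, buf[pos], pos + 1
    if pos + 1 >= len(buf):
        # The escape byte arrived but the real command has not yet.
        return None
    return 1, buf[pos + 1], pos + 2
```

Commands in the primary set still cost one byte; only the rarely used
extended ones pay for the extra prefix.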

I can't imagine that a variable-length structured protocol could be much
faster than that.  Still the emphasis of this protocol is not so much on
speed, but on flexibility to add functionality to the protocol (by adding
commands) without breaking existing clients (and without having to handle
multiple versions of the protocol).

The 'noreply' stuff that I have seen around the list could probably benefit
from this protocol.  I haven't looked closely enough at the CAS stuff yet,
but I suspect that would be easy to implement too.

Also, those who want to shave off a few extra bytes in their client have
the option of sending a request that only includes the bits they want.  If
you don't care about expiry, leave it out; same with flags, tags, cas id's,
and anything else.  Plus you can streamline some of your requests by not
using the CLEAR command, and re-using the state.
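
To make the byte savings concrete, here is a hedged sketch of a client-side
encoder.  The opcode numbers are hypothetical (picked only to fall in the
parameter-type ranges described earlier), the big-endian byte order is an
assumption, and the SET-then-GO sequence is one reading of the scheme rather
than a defined wire format.

```python
import struct

# Hypothetical opcodes, placed in the range matching their parameter type.
CMD_CLEAR, CMD_SET, CMD_GO = 1, 2, 3  # 0-31: no parameter
CMD_EXPIRY = 64                       # 64-95: 32-bit parameter
CMD_KEY = 96                          # 96-127: short string (1-byte length)
CMD_VALUE = 128                       # 128-159: long string (32-bit length)


def encode_set(key, value, expiry=None, clear=True):
    """Build a store request, including only the parts the caller wants."""
    out = bytearray()
    if clear:
        out.append(CMD_CLEAR)
    out.append(CMD_SET)
    # bytes() raises ValueError if the key is longer than 255 bytes.
    out += bytes([CMD_KEY, len(key)]) + key
    if expiry is not None:
        out += struct.pack(">BI", CMD_EXPIRY, expiry)
    out += struct.pack(">BI", CMD_VALUE, len(value)) + value
    out.append(CMD_GO)
    return bytes(out)
```

With a 2-byte key and a 5-byte value, the full form (CLEAR plus expiry) comes
to 22 bytes, while the lean form without either is 16.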

Dang, if I had a little more time on my hands right now, I'd be really
tempted to implement it.   I don't actually have a *need* for this protocol
in memcached; it has been purely an intellectual itch ever since I saw
people complaining about the existing protocols being difficult to expand.

On Feb 20, 2008 4:14 PM, Dustin Sallings <[EMAIL PROTECTED]> wrote:

>
> On Feb 19, 2008, at 22:20, Clint Webb wrote:
>
> > I know a considerable amount of thought and work has gone into the
> > existing flavour of the binary protocol, and I don't expect that work
> > to be discarded, I'm really only mentioning this new concept now as
> > an alternative for the future if we ever find the current binary
> > protocol to be too restrictive and inflexible.  And something to
> > think about, or even use elsewhere.
>
>        This is certainly interesting.  The first step of doing the binary
> protocol implementation was to create a change that allowed multiple
> protocols to coexist.  It would be possible to implement this to run
> in parallel with the existing protocols in the 1.3 codebase.
>
>        Intuitively, it doesn't seem as efficient to process as what we've
> got now, but I like being proven wrong, so I'd welcome another front-
> end protocol.  :)
>
>        Of course, I wrote the majority of the current binary protocol code
> about six months ago, so I'd really like to at least have one in more
> people's hands.
>
> --
> Dustin Sallings
>
>

-- 
"Be excellent to each other"
