RPC KeyValue encoding

lars hofhansl Sat, 01 Sep 2012 21:04:21 -0700

In 0.96 we changing the wire protocol to use protobufs.

While we're at it, I am wondering whether we can optimize a few things:



1. A Put or Delete can send many KeyValues, all of which have the same row key 
and many will likely have the same column family.
2. Likewise a Scan result or Get is for a single row. Each KV will again will 
have the same row key and many will have the same column family.
3. The client and server do not need to share the same KV implementation as 
long as they are (de)serialized the same. KVs on the server will be backed by a 
shared larger byte[] (the block reads from disk), the KVs in the memstore will 
probably have the same implementation (to use slab, but maybe even here it 
would be benificial to store the row key and CF separately and share between KV 
where possible). Client KVs on the other hand could share a row key and or 
column family.

This would require a KeyValue interface and two different implementations; one 
backed by a byte[] another that stores the pieces separately. Once that is done 
one could even envision KVs backed by a byte buffer.

Both (de)serialize the same, so when the server serializes the KVs it would 
send the row key first, then the CF, then column, TS, finally followed by the 
value. The client could deserialize this and directly reuse the shared part in 
its KV implementation.
That has the potentially to siginificantly cut down client/server network IO 
and save memory on the client, especially with wide columns.

Turning KV into an interface is a major undertaking. Would it be worth the 
effort? Or maybe the RPC should just be compressed?


We'd have to do that before 0.96.0 (I think), because even protobuf would not 
provide enough flexibility to make such a change later - which incidentally 
leads to another discussion about whether client and server should do an 
initial handshake to detect each others version, but that is a different story.


-- Lars

RPC KeyValue encoding

Reply via email to