[protobuf] using compression on protobuf messages

sheila miguez Tue, 22 Jun 2010 10:54:29 -0700

I'm relatively inexperienced with this, and would appreciate advice
and criticism.


After investigating gzip as a possible algorithm to use with our
messages, we want to try lzo. There's a Java implementation of lzo
being used by the Hadoop community. Gzip has around 120 msec latency
for medium-size messages, but we occasionally see messages of up to
9MB, and gzip performs poorly on those: around 4 seconds on the
machines we tested on. That's unacceptable. lzo does better in unit
tests so far, but I haven't run a live experiment yet.

I'm curious to know if anyone else has been experimenting with this,
and I'm also curious to get criticism on how I'm using the compression
library.

When I have a message to compress, I know the size of the byte array
stream buffer to allocate. I allocate it and then call writeTo to
write the message into it. Is there anything I should do other than
this, given a message? writeTo should be pretty performant, yes? When
I measure it in unit tests, the speed is pretty good. I would like to
experiment with ways to tweak things as much as possible. For
decompression, the current idea is to guess a typical compression
ratio for the kind of data we pass back and forth, and use that as a
heuristic for how much to allocate up front.
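
To make the question concrete, here is a minimal sketch of the
approach described above. It is not our actual code. It assumes
java.util.zip's gzip streams as a stand-in for the lzo codec (whose
API is not shown here), and it takes an already-serialized byte array
in place of a real message. The class name CompressionSketch and the
expectedRatio parameter are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper; gzip stands in for lzo because it ships with the JDK.
final class CompressionSketch {

    /** Compresses serialized message bytes into a pre-sized buffer. */
    static byte[] compress(byte[] serialized) {
        // The uncompressed size is known up front, so use it as the initial
        // capacity. The buffer still grows on its own if that is too small.
        ByteArrayOutputStream sink =
                new ByteArrayOutputStream(Math.max(32, serialized.length));
        try (GZIPOutputStream gzip = new GZIPOutputStream(sink)) {
            // With a real protobuf message, message.writeTo(gzip) would go
            // here, with the capacity taken from message.getSerializedSize().
            gzip.write(serialized);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /**
     * Decompresses bytes, sizing the output buffer from a guessed ratio.
     * expectedRatio is an assumed tuning knob (uncompressed / compressed).
     */
    static byte[] decompress(byte[] compressed, int expectedRatio) {
        // Clamp in long arithmetic so a large input cannot overflow int.
        long guess = Math.max(32L, (long) compressed.length * expectedRatio);
        int capacity = (int) Math.min(guess, Integer.MAX_VALUE - 8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream(capacity);
        byte[] chunk = new byte[8192];
        try (GZIPInputStream gzip =
                new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            int n;
            // A wrong guess only costs a buffer resize, never correctness.
            while ((n = gzip.read(chunk)) != -1) {
                sink.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }
}
```

Calling CompressionSketch.decompress(CompressionSketch.compress(bytes), 4)
should return the original bytes. The ratio of 4 is only a placeholder;
the real value would have to come from measuring our own data.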

It seems fairly obvious to me to do things the way I am doing, but I
do not like to assume that I know what I'm doing at this point.


-- 
sheila

