Jim,

The compression ratio is going to be very data dependent. If you can
compress to 1/4 of the original size, that's pretty good. At LinkedIn, we
batch and compress up to 200 messages at a time using gzip. The compressed
data is about 1/3 of the original size.
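For anyone wanting to sanity-check this on their own data, here's a quick
sketch comparing per-message vs. batched gzip on small XML documents. The
XML shape, field names, and 200-message batch size are made up for
illustration, not taken from any real payload:

```python
import gzip

# Fabricated ~800-byte XML event, roughly the size range Jim describes.
def make_doc(i):
    return (
        f"<event><seq>{i}</seq><type>page-view</type>"
        f"<user>user-{i % 50}</user><ts>2012-08-02T17:39:{i % 60:02d}Z</ts>"
        f"<note>{'lorem ipsum dolor sit amet ' * 25}</note></event>"
    ).encode("utf-8")

docs = [make_doc(i) for i in range(200)]

# Per-message: each document gzipped individually, ratio over the whole set.
per_msg = sum(len(gzip.compress(d)) for d in docs) / sum(len(d) for d in docs)

# Batched: all 200 documents gzipped together as one message set.
batch = b"".join(docs)
batched = len(gzip.compress(batch)) / len(batch)

print(f"per-message compressed/original: {per_msg:.3f}")
print(f"batched     compressed/original: {batched:.3f}")
```

Batching wins because gzip can exploit redundancy across messages (repeated
tags, field names, timestamps) and pays the header overhead once instead of
once per message.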

Thanks,

Jun

On Thu, Aug 2, 2012 at 5:39 PM, James A. Robinson <jim.robin...@stanford.edu> wrote:

> Hi folks,
>
> We've got a system where we're pushing small XML documents, produced
> as part of an event stream, through Kafka to another service.  Each of
> these messages tends to be only around 600 to 900 bytes in length.
>
> I was wondering if any of you had statistics on the average
> compression ratio for a given message format you use, when the
> publisher is configured to compress Kafka messages using gzip?
>
> I'm expecting that the compression ratio won't be very high if Kafka
> is compressing each individual message (versus compressing entire
> message sets). In our test we were seeing a compression ratio of
> perhaps 25%, and I think that's about what I'd expect for per-message
> compression.
>
> Jim
>
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> James A. Robinson                       jim.robin...@stanford.edu
> Stanford University HighWire Press      http://highwire.stanford.edu/
> +1 650 7237294 (Work)                   +1 650 7259335 (Fax)
>