[ 
https://issues.apache.org/jira/browse/KAFKA-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158408#comment-14158408
 ] 

Theo Hultberg commented on KAFKA-1493:
--------------------------------------

If you're looking for a standard way to frame LZ4 blocks, there doesn't seem 
to be one, but Cassandra uses a 4-byte field for the uncompressed length and 
no checksum 
(https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/compress/LZ4Compressor.java).

I've seen varints used in other projects too, but in my opinion they're a 
pain to implement compared to just using an int, and for very little benefit. 
The drawbacks of a fixed int are that small messages use a few more bytes, 
and that you can't handle compressed chunks of over about two gigabytes (the 
limit of a signed 32-bit length).
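To make the trade-off concrete, here's a minimal sketch of the Cassandra-style framing I mean: a single fixed 4-byte big-endian uncompressed-length field in front of the compressed bytes, with no checksum. This is my own illustration, not the actual Cassandra or Kafka code, and the class/method names are made up; the payload here just stands in for LZ4-compressed data.

```java
import java.nio.ByteBuffer;

public class LengthPrefixedFraming {
    // Prepend a fixed 4-byte big-endian uncompressed-length header.
    // Unlike a varint, this is trivial to read and write, but always
    // costs 4 bytes and caps the length at Integer.MAX_VALUE (~2 GB).
    static byte[] frame(int uncompressedLength, byte[] compressed) {
        ByteBuffer buf = ByteBuffer.allocate(4 + compressed.length);
        buf.putInt(uncompressedLength); // big-endian by default
        buf.put(compressed);
        return buf.array();
    }

    // Read the uncompressed length back out of the header.
    static int uncompressedLength(byte[] frame) {
        return ByteBuffer.wrap(frame).getInt();
    }

    public static void main(String[] args) {
        byte[] payload = {1, 2, 3}; // stand-in for LZ4 output
        byte[] framed = frame(1000, payload);
        System.out.println(framed.length);              // 7 (4-byte header + 3)
        System.out.println(uncompressedLength(framed)); // 1000
    }
}
```

With a varint, a length like 1000 would fit in 2 bytes instead of 4, which is where the "one or two bytes more" drawback comes from.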

Sorry for jumping into the discussion out of the blue; I just stumbled upon 
this while looking through the issues for 0.8.2. I have very little 
experience with the Kafka codebase, but I'm the author of the Ruby driver for 
Cassandra and I recognized the issue. I hope this was helpful and that I 
didn't completely miss the point.

> Use a well-documented LZ4 compression format and remove redundant LZ4HC option
> ------------------------------------------------------------------------------
>
>                 Key: KAFKA-1493
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1493
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 0.8.2
>            Reporter: James Oliver
>            Assignee: James Oliver
>            Priority: Blocker
>             Fix For: 0.8.2
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
