[
https://issues.apache.org/jira/browse/KAFKA-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158408#comment-14158408
]
Theo Hultberg commented on KAFKA-1493:
--------------------------------------
If you're looking for a standard way to handle LZ4, there doesn't seem to be
one, but Cassandra uses a 4-byte field for the uncompressed length and no
checksum
(https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/compress/LZ4Compressor.java);
a sketch of that framing is below. I've seen varints used in other projects
too, but in my opinion they're a pain to implement compared to just using an
int, and for very little benefit. The drawbacks of a fixed int are that small
messages use two or three bytes more, and that you can't handle compressed
chunks of over a couple of gigabytes.
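For concreteness, here's a minimal sketch of that length-prefixed framing,
assuming the lz4-java (net.jpountz.lz4) library that Cassandra uses; the class
and method names are illustrative, not Cassandra's actual code:

{code:java}
import java.nio.ByteBuffer;
import net.jpountz.lz4.LZ4Compressor;
import net.jpountz.lz4.LZ4Factory;
import net.jpountz.lz4.LZ4FastDecompressor;

public class Lz4Framing {
    private static final LZ4Factory FACTORY = LZ4Factory.fastestInstance();

    /** Compress data into a block prefixed with its 4-byte uncompressed length. */
    public static byte[] frame(byte[] data) {
        LZ4Compressor compressor = FACTORY.fastCompressor();
        byte[] compressed = new byte[compressor.maxCompressedLength(data.length)];
        int compressedLength =
            compressor.compress(data, 0, data.length, compressed, 0, compressed.length);
        ByteBuffer framed = ByteBuffer.allocate(4 + compressedLength);
        framed.putInt(data.length); // 4-byte big-endian uncompressed length, no checksum
        framed.put(compressed, 0, compressedLength);
        return framed.array();
    }

    /** Read the length prefix, then decompress exactly that many bytes. */
    public static byte[] unframe(byte[] framed) {
        int uncompressedLength = ByteBuffer.wrap(framed).getInt();
        byte[] restored = new byte[uncompressedLength];
        LZ4FastDecompressor decompressor = FACTORY.fastDecompressor();
        decompressor.decompress(framed, 4, restored, 0, uncompressedLength);
        return restored;
    }
}
{code}

A varint prefix would shave those two or three bytes off small messages, but
you pay for it with a shift-and-mask loop on both the read and write paths.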
Sorry for jumping into the discussion out of the blue; I just stumbled upon
this while looking through the issues for 0.8.2. I have very little experience
with the Kafka codebase, but I'm the author of the Ruby driver for Cassandra,
and I recognized the issue. I hope this was helpful and that I didn't
completely miss the point.
> Use a well-documented LZ4 compression format and remove redundant LZ4HC option
> ------------------------------------------------------------------------------
>
> Key: KAFKA-1493
> URL: https://issues.apache.org/jira/browse/KAFKA-1493
> Project: Kafka
> Issue Type: Improvement
> Affects Versions: 0.8.2
> Reporter: James Oliver
> Assignee: James Oliver
> Priority: Blocker
> Fix For: 0.8.2
>
>