[ 
https://issues.apache.org/jira/browse/KAFKA-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032832#comment-14032832
 ] 

James Oliver commented on KAFKA-1493:
-------------------------------------

Snappy's block compression format (default block size: 32 KB) is:
snappy codec header: 8-byte magic header, version [4-byte integer], min 
compatible version [4-byte integer]
compressed block 1: compressed data size [4-byte integer], compressed data
compressed block 2
...
Notable limitation: no checksum
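
For illustration, the framing described above can be sketched roughly as follows. This is only a sketch under assumptions: zlib stands in for Snappy (it is in the Python standard library and, unlike the framing itself, carries its own checksum), and the MAGIC bytes, version numbers, big-endian byte order, and the function names write_framed/read_framed are placeholders, not taken from any real codec.

```python
import struct
import zlib

# Placeholder values: NOT taken from a real codec, for illustration only.
MAGIC = b"\x82SNAPPY\x00"          # assumed 8-byte magic header
VERSION = 1                        # assumed version
MIN_COMPATIBLE_VERSION = 1         # assumed minimum compatible version
BLOCK_SIZE = 32 * 1024             # 32 KB default block size from the comment


def write_framed(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Header, then a sequence of [4-byte size][compressed data] blocks."""
    out = bytearray(MAGIC)
    out += struct.pack(">ii", VERSION, MIN_COMPATIBLE_VERSION)
    for start in range(0, len(data), block_size):
        # zlib is a stand-in compressor here, not Snappy.
        block = zlib.compress(data[start:start + block_size])
        out += struct.pack(">i", len(block))
        out += block
    return bytes(out)


def read_framed(framed: bytes) -> bytes:
    """Inverse of write_framed: check the header, then decode block by block."""
    if framed[:8] != MAGIC:
        raise ValueError("bad magic header")
    _version, min_compatible = struct.unpack_from(">ii", framed, 8)
    if min_compatible > VERSION:
        raise ValueError("stream requires a newer reader")
    pos = 16
    chunks = []
    while pos < len(framed):
        (size,) = struct.unpack_from(">i", framed, pos)
        pos += 4
        chunks.append(zlib.decompress(framed[pos:pos + size]))
        pos += size
    return b"".join(chunks)
```

Note that the reader only ever needs one block in memory at a time, which is the property the block format buys.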

If I understand the proposed format correctly, this is what you're suggesting:
uncompressed data size [n-byte varint], compressed data
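
As I read it, that layout would look something like the sketch below. Assumptions: the varint is the usual little-endian base-128 encoding (7 bits per byte, high bit set on every byte except the last), zlib again stands in for the real compressor, and the names pack_single_block/unpack_single_block are made up for this example.

```python
import zlib


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if n < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)          # last byte, high bit clear
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0):
    """Return (value, position just past the varint)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def pack_single_block(data: bytes) -> bytes:
    # [uncompressed size as varint][compressed data]; zlib is a stand-in.
    return encode_varint(len(data)) + zlib.compress(data)


def unpack_single_block(buf: bytes) -> bytes:
    size, pos = decode_varint(buf)
    data = zlib.decompress(buf[pos:])
    if len(data) != size:
        raise ValueError("uncompressed size mismatch")
    return data
```

Here the whole message is compressed and decompressed in one call, so both sides have to hold the entire payload in memory.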

While I would expect that compressing an entire message as a single block 
gives a better compression ratio than compressing smaller chunks, doing so 
for larger messages is going to cause serious performance problems.
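
One way to check the ratio side of this trade-off on real payloads is a quick measurement like the one below. It is a rough sketch only: zlib stands in for Snappy/LZ4, so the absolute numbers will differ, and compressed_sizes is a hypothetical helper name. The 4 bytes added per chunk model the per-block size field.

```python
import zlib


def compressed_sizes(data: bytes, block_size: int = 32 * 1024) -> tuple:
    """Return (size as one block, size as framed chunks) in bytes."""
    # Whole message compressed in a single call.
    whole = len(zlib.compress(data))
    # Same message compressed chunk by chunk, plus a 4-byte size per chunk.
    chunked = sum(
        4 + len(zlib.compress(data[start:start + block_size]))
        for start in range(0, len(data), block_size)
    )
    return whole, chunked


if __name__ == "__main__":
    sample = b"".join(b"key=%d,value=%d\n" % (i, i * i) for i in range(50000))
    whole, chunked = compressed_sizes(sample)
    print("single block:", whole, "bytes; chunked:", chunked, "bytes")
```

This says nothing about the memory or latency cost of the single-block approach, which would have to be measured separately with the actual codec.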

> Use a well-documented LZ4 compression format and remove redundant LZ4HC option
> ------------------------------------------------------------------------------
>
>                 Key: KAFKA-1493
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1493
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: James Oliver
>             Fix For: 0.8.2
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)
