[ 
https://issues.apache.org/jira/browse/KAFKA-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15234514#comment-15234514
 ] 

ASF GitHub Bot commented on KAFKA-3160:
---------------------------------------

GitHub user dpkp opened a pull request:

    https://github.com/apache/kafka/pull/1212

    KAFKA-3160: Fix LZ4 Framing

    This contribution is my original work and I license the work under Apache 2.0.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dpkp/kafka KAFKA-3160

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/1212.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1212
    
----
commit b64e5f9f054131ae7bf6b9a10be861f5fb0caeab
Author: Dana Powers <dana.pow...@gmail.com>
Date:   2016-04-11T04:35:43Z

    Update KafkaLZ4Block* implementation to 1.5.1 framing spec
    
     - update spec to 1.5.1; remove dictID
     - fix frame descriptor HC check (dont include magic bytes)
     - dont require HC validation on input by default
     - add useBrokenHC boolean for output compatibility
     - nominal support for contentChecksum / contentSize flags

commit f1380d0e5f6e1e9d7b48a9cff3fbcd13b7a5fe3f
Author: Dana Powers <dana.pow...@gmail.com>
Date:   2016-04-11T05:35:31Z

    KAFKA-3160: use LZ4 v1.5.1 framing for all v1 messages; keep old framing for v0 messages

----


> Kafka LZ4 framing code miscalculates header checksum
> ----------------------------------------------------
>
>                 Key: KAFKA-3160
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3160
>             Project: Kafka
>          Issue Type: Bug
>          Components: compression
>    Affects Versions: 0.8.2.0, 0.8.2.1, 0.9.0.0, 0.8.2.2, 0.9.0.1
>            Reporter: Dana Powers
>            Assignee: Magnus Edenhill
>              Labels: compatibility, compression, lz4
>
> KAFKA-1493 partially implements the LZ4 framing specification, but it 
> calculates the header checksum incorrectly. As a result, 
> KafkaLZ4BlockInputStream raises an error 
> [IOException(DESCRIPTOR_HASH_MISMATCH)] if a client sends *correctly* framed 
> LZ4 data, and KafkaLZ4BlockOutputStream generates incorrectly framed LZ4 
> data, so clients decoding LZ4 messages from Kafka always receive 
> incorrectly framed data.
>
> Specifically, the current implementation includes the 4-byte MagicNumber in 
> the checksum, which is incorrect according to the LZ4 frame format:
> http://cyan4973.github.io/lz4/lz4_Frame_format.html
>
> Third-party clients that use off-the-shelf LZ4 framing find that brokers 
> reject their messages as having a corrupt checksum, so non-Java clients 
> currently have to 'fix up' LZ4 packets to work around the broken checksum.
>
> Magnus first identified this issue in librdkafka; kafka-python has the same 
> problem.
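
For readers who want to see the difference concretely, here is a minimal Python sketch of the two header checksum calculations described above. It is an illustration only, not the code from the pull request. The function names (`xxh32`, `spec_hc`, `legacy_kafka_hc`) and the sample FLG/BD bytes are assumptions chosen for the example. The hash is a small pure-Python XXH32 so the snippet has no third-party dependencies. Per the LZ4 frame format, the HC byte is the second byte of XXH32 (seed 0) over the frame descriptor only.

```python
import struct

# XXH32 prime constants from the xxHash reference algorithm.
P1, P2, P3, P4, P5 = 0x9E3779B1, 0x85EBCA77, 0xC2B2AE3D, 0x27D4EB2F, 0x165667B1
MASK = 0xFFFFFFFF


def _rotl(x, r):
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK


def xxh32(data, seed=0):
    """Pure-Python XXH32 over a bytes object (illustrative, not optimized)."""
    n, i = len(data), 0
    if n >= 16:
        acc = [(seed + P1 + P2) & MASK, (seed + P2) & MASK,
               seed & MASK, (seed - P1) & MASK]
        while i + 16 <= n:
            lanes = struct.unpack_from("<4I", data, i)
            acc = [(_rotl((a + lane * P2) & MASK, 13) * P1) & MASK
                   for a, lane in zip(acc, lanes)]
            i += 16
        h = (_rotl(acc[0], 1) + _rotl(acc[1], 7)
             + _rotl(acc[2], 12) + _rotl(acc[3], 18)) & MASK
    else:
        h = (seed + P5) & MASK
    h = (h + n) & MASK
    while i + 4 <= n:  # remaining 4-byte words
        (word,) = struct.unpack_from("<I", data, i)
        h = (_rotl((h + word * P3) & MASK, 17) * P4) & MASK
        i += 4
    while i < n:  # remaining single bytes
        h = (_rotl((h + data[i] * P5) & MASK, 11) * P1) & MASK
        i += 1
    # Final avalanche.
    h ^= h >> 15
    h = (h * P2) & MASK
    h ^= h >> 13
    h = (h * P3) & MASK
    h ^= h >> 16
    return h


# LZ4 frame magic number 0x184D2204, stored little-endian on the wire.
MAGIC = struct.pack("<I", 0x184D2204)


def spec_hc(descriptor):
    """HC as the frame format defines it: hash the descriptor bytes only."""
    return (xxh32(descriptor) >> 8) & 0xFF


def legacy_kafka_hc(descriptor):
    """HC as the issue describes the old behavior: magic bytes are hashed too."""
    return (xxh32(MAGIC + descriptor) >> 8) & 0xFF


# Assumed sample descriptor: FLG = version 01 + block independence (0x60),
# BD = 64 KB max block size (0x40), no optional content size field.
descriptor = bytes([0x60, 0x40])

print("spec HC:   0x%02x" % spec_hc(descriptor))
print("legacy HC: 0x%02x" % legacy_kafka_hc(descriptor))
```

A client that needs to talk to an unpatched broker would have to emit the second value, while a standard LZ4 library emits the first. This is the 'fix up' step mentioned in the description.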



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
