[ 
https://issues.apache.org/jira/browse/KAFKA-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dana Powers updated KAFKA-3160:
-------------------------------
    Description: 
KAFKA-1493 partially implements the LZ4 framing specification, but it 
incorrectly calculates the header checksum. This causes 
KafkaLZ4BlockInputStream to raise an error 
[IOException(DESCRIPTOR_HASH_MISMATCH)] if a client sends *correctly* framed 
LZ4 data. It also causes KafkaLZ4BlockOutputStream to generate incorrectly 
framed LZ4 data, which means clients decoding LZ4 messages from Kafka will 
always receive incorrectly framed data.

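Specifically, the current implementation includes the 4-byte MagicNumber in the 
checksum input. Per the LZ4 frame format specification, the header checksum 
covers only the frame descriptor (FLG, BD, and any optional fields), not the 
MagicNumber:
http://cyan4973.github.io/lz4/lz4_Frame_format.html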

Third-party clients that attempt to use off-the-shelf LZ4 framing find that 
brokers reject messages as having a corrupt checksum. So currently non-Java 
clients must 'fixup' LZ4 frames to deal with the broken checksum.
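
To illustrate the difference, here is a rough Python sketch of both checksum 
variants and of the kind of 'fixup' a client might apply. It is not taken from 
librdkafka, kafka-python, or the Kafka source. The function names 
(spec_hc, legacy_kafka_hc, to_legacy_kafka) are made up for this example, 
XXH32 is written out by hand so the snippet has no dependencies, and the sketch 
assumes the header checksum byte is the only thing that needs to change.

```python
import struct

LZ4_MAGIC = b"\x04\x22\x4d\x18"  # 0x184D2204, little-endian on the wire
_M = 0xFFFFFFFF
_P1, _P2, _P3, _P4, _P5 = (0x9E3779B1, 0x85EBCA77, 0xC2B2AE3D,
                           0x27D4EB2F, 0x165667B1)


def _rotl(x, r):
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & _M


def xxh32(data, seed=0):
    """Pure-Python XXH32 (hand-written for this sketch)."""
    n, i = len(data), 0
    if n >= 16:
        acc = [(seed + _P1 + _P2) & _M, (seed + _P2) & _M,
               seed & _M, (seed - _P1) & _M]
        while i + 16 <= n:
            for k, lane in enumerate(struct.unpack_from("<4I", data, i)):
                acc[k] = (_rotl((acc[k] + lane * _P2) & _M, 13) * _P1) & _M
            i += 16
        h = sum(_rotl(a, r) for a, r in zip(acc, (1, 7, 12, 18))) & _M
    else:
        h = (seed + _P5) & _M
    h = (h + n) & _M
    while i + 4 <= n:  # remaining 4-byte words
        (word,) = struct.unpack_from("<I", data, i)
        h = (_rotl((h + word * _P3) & _M, 17) * _P4) & _M
        i += 4
    while i < n:  # remaining single bytes
        h = (_rotl((h + data[i] * _P5) & _M, 11) * _P1) & _M
        i += 1
    h ^= h >> 15
    h = (h * _P2) & _M
    h ^= h >> 13
    h = (h * _P3) & _M
    h ^= h >> 16
    return h


def descriptor_end(frame):
    """Offset of the HC byte: magic (4) + FLG + BD + optional fields."""
    flg = frame[4]
    end = 6
    if flg & 0x08:  # content size field present (8 bytes)
        end += 8
    if flg & 0x01:  # dictionary ID field present (4 bytes)
        end += 4
    return end


def _hc(data):
    """HC is the second byte of XXH32(data) with seed 0."""
    return (xxh32(data, 0) >> 8) & 0xFF


def spec_hc(frame):
    """Checksum per the frame spec: descriptor only, magic excluded."""
    return _hc(frame[4:descriptor_end(frame)])


def legacy_kafka_hc(frame):
    """Checksum as described in this issue: magic number included."""
    return _hc(frame[:descriptor_end(frame)])


def to_legacy_kafka(frame):
    """Rewrite the HC byte of a correctly framed LZ4 frame (the 'fixup')."""
    if frame[:4] != LZ4_MAGIC:
        raise ValueError("not an LZ4 frame")
    end = descriptor_end(frame)
    if frame[end] != spec_hc(frame):
        raise ValueError("header checksum does not match the spec")
    return frame[:end] + bytes([legacy_kafka_hc(frame)]) + frame[end + 1:]
```

A client producing to an affected broker would pass each frame from its LZ4 
library through something like to_legacy_kafka() before sending, and apply the 
reverse substitution to frames it fetches before handing them to the library.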

Magnus first identified this issue in librdkafka; kafka-python has the same 
problem.

  was:
KAFKA-1493 partially implements the LZ4 framing specification, but it 
incorrectly calculates the header checksum. This causes 
KafkaLZ4BlockInputStream to raise an error 
[IOException(DESCRIPTOR_HASH_MISMATCH)] if a client sends *correctly* framed 
LZ4 data. It also causes the Kafka broker to always return incorrectly framed 
LZ4 data to clients.

Specifically, the current implementation includes the 4-byte MagicNumber in the 
checksum, which is incorrect.
http://cyan4973.github.io/lz4/lz4_Frame_format.html

Third-party clients that attempt to use off-the-shelf LZ4 framing find that 
brokers reject messages as having a corrupt checksum. So currently non-Java 
clients must 'fixup' LZ4 packets to deal with the broken checksum.

Magnus first identified this issue in librdkafka; kafka-python has the same 
problem.


> Kafka LZ4 framing code miscalculates header checksum
> ----------------------------------------------------
>
>                 Key: KAFKA-3160
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3160
>             Project: Kafka
>          Issue Type: Bug
>          Components: compression
>    Affects Versions: 0.8.2.0, 0.8.2.1, 0.9.0.0, 0.8.2.2, 0.9.0.1
>            Reporter: Dana Powers
>            Assignee: Magnus Edenhill
>              Labels: compatibility, compression, lz4
>
> KAFKA-1493 partially implements the LZ4 framing specification, but it 
> incorrectly calculates the header checksum. This causes 
> KafkaLZ4BlockInputStream to raise an error 
> [IOException(DESCRIPTOR_HASH_MISMATCH)] if a client sends *correctly* framed 
> LZ4 data. It also causes KafkaLZ4BlockOutputStream to generate incorrectly 
> framed LZ4 data, which means clients decoding LZ4 messages from Kafka will 
> always receive incorrectly framed data.
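> Specifically, the current implementation includes the 4-byte MagicNumber in 
> the checksum input. Per the LZ4 frame format specification, the header 
> checksum covers only the frame descriptor (FLG, BD, and any optional 
> fields), not the MagicNumber:
> http://cyan4973.github.io/lz4/lz4_Frame_format.html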
> Third-party clients that attempt to use off-the-shelf LZ4 framing find that 
> brokers reject messages as having a corrupt checksum. So currently non-Java 
> clients must 'fixup' LZ4 frames to deal with the broken checksum.
> Magnus first identified this issue in librdkafka; kafka-python has the same 
> problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
