Vikash Mishra created KAFKA-15160:
-------------------------------------

             Summary: Message bytes duplication in Kafka headers when compression is enabled
                 Key: KAFKA-15160
                 URL: https://issues.apache.org/jira/browse/KAFKA-15160
             Project: Kafka
          Issue Type: Bug
          Components: clients, compression, consumer
    Affects Versions: 3.3.2, 3.2.3
            Reporter: Vikash Mishra
         Attachments: java heap dump.png, wireshark-min.png

I created a Spring Kafka consumer using @KafkaListener.
During this, I encountered a scenario where, when data is compressed (with any 
codec, e.g. snappy or gzip) and consumed by the consumer, the heap dump shows a 
"byte[]" in the record headers occupying the same amount of memory as the 
message value.
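
For context, the listener is wired up roughly like the minimal sketch below (topic name, group id, and class name are illustrative, not taken from the actual application):

{code:java}
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class CompressedTopicListener {

    // Plain listener; decompression happens entirely inside the Kafka
    // client library before the record reaches this method.
    @KafkaListener(topics = "compressed-topic", groupId = "heap-dump-test")
    public void onMessage(ConsumerRecord<String, String> record) {
        // An empty body is enough; the extra bytes show up in the heap
        // dump regardless of what the listener does with the record.
    }
}
{code}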

This behavior is seen only when compressed data is consumed; it does not occur 
with uncompressed data.

I tried capturing Kafka's messages with Wireshark; it shows the expected data 
size coming from the Kafka server and no extra bytes in the headers. So this is 
definitely something in the Kafka client. Spring performs no compression-related 
processing; that functionality is handled entirely inside the Kafka client 
library.
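
One way to cross-check this from the client side is to sum the header bytes of consumed records using the plain Java consumer. A minimal sketch, assuming a local broker and the illustrative topic name from above; note that if the duplication is only visible as retained heap (e.g. a shared backing buffer) rather than as actual header values, only the heap dump will reveal it:

{code:java}
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.header.Header;

public class HeaderSizeCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "header-size-check");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("auto.offset.reset", "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("compressed-topic"));
            ConsumerRecords<String, String> records =
                    consumer.poll(Duration.ofSeconds(5));
            for (ConsumerRecord<String, String> rec : records) {
                long headerBytes = 0;
                for (Header h : rec.headers()) {
                    headerBytes += (h.value() == null ? 0 : h.value().length);
                }
                // On a topic produced without headers, header bytes rivaling
                // the value size would point at client-side duplication.
                System.out.printf("value=%d bytes, headers=%d bytes%n",
                        rec.value().length(), headerBytes);
            }
        }
    }
}
{code}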

Attached are screenshots of the heap dump and the Wireshark capture.

This seems like a critical issue, as the in-memory message size almost doubles, 
impacting consumer memory and performance. It looks as though the actual 
message value is being copied into the headers.

*To Reproduce*
 # Produce compressed data on any topic (see the producer sketch after this list).
 # Create a simple consumer consuming from the above-created topic.
 # Capture heap dump.
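
A minimal sketch of step 1, assuming a local broker; the topic name, payload size, and serializers are illustrative:

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class CompressedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Any supported codec reproduces the behavior; snappy and gzip
        // were both observed in this report.
        props.put("compression.type", "snappy");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // A payload large enough to stand out in a heap dump.
            String payload = "x".repeat(10_000);
            for (int i = 0; i < 1_000; i++) {
                producer.send(new ProducerRecord<>("compressed-topic", payload));
            }
        }
    }
}
{code}

While the consumer from step 2 is polling, the heap dump in step 3 can be captured with, for example, {{jmap -dump:live,format=b,file=heap.hprof <consumer-pid>}}.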

*Expected behavior*

Headers should not hold bytes consuming memory equivalent to the message value.



