[
https://issues.apache.org/jira/browse/KAFKA-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vikash Mishra updated KAFKA-15160:
----------------------------------
Attachment: dump-compressed-data-.7z
> Message bytes duplication in Kafka headers when compression is enabled
> ----------------------------------------------------------------------
>
> Key: KAFKA-15160
> URL: https://issues.apache.org/jira/browse/KAFKA-15160
> Project: Kafka
> Issue Type: Bug
> Components: clients, compression, consumer
> Affects Versions: 3.2.3, 3.3.2
> Reporter: Vikash Mishra
> Assignee: Phuc Hong Tran
> Priority: Critical
> Attachments: dump-compressed-data-.7z, java heap dump.png,
> wireshark-min.png
>
>
> I created a Spring Kafka consumer using @KafkaListener.
> When data is produced with compression enabled (any codec, e.g.
> snappy/gzip) and consumed by this consumer, a heap dump shows a byte[]
> in the record headers occupying the same amount of memory as the message
> value.
> This behavior is seen only when compressed data is consumed; it does not
> occur with uncompressed data.
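> For context, a minimal sketch of the consumer; the topic name
> "test-topic" and group id "test-group" are hypothetical placeholders:
> {code:java}
> import org.springframework.kafka.annotation.KafkaListener;
> import org.springframework.stereotype.Component;
>
> @Component
> public class TestConsumer {
>
>     // Plain listener; decompression happens inside the Kafka client,
>     // not in Spring or in application code.
>     @KafkaListener(topics = "test-topic", groupId = "test-group")
>     public void listen(String message) {
>         // No-op: just consume records so they show up in the heap dump
>     }
> }
> {code}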
> Capturing the traffic with Wireshark shows the correct data size coming
> from the Kafka server and no extra bytes in the headers, so this is
> definitely something in the Kafka client. Spring does nothing with
> compression; that functionality is handled entirely inside the Kafka
> client library.
> Screenshots of the heap dump and the Wireshark capture are attached.
> This seems like a critical issue, as the in-memory message size nearly
> doubles, impacting consumer memory and performance. It looks as though
> the actual message value is being copied into the headers.
> *To Reproduce*
> # Produce compressed data on any topic (see the producer sketch after
> this list).
> # Create a simple consumer that consumes from that topic.
> # Capture a heap dump (e.g. with jmap) and inspect the retained byte[]
> instances.
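> A minimal sketch of step 1, assuming a broker on localhost:9092 and the
> same hypothetical topic "test-topic"; snappy is used here, but any codec
> reproduces the behavior:
> {code:java}
> import java.util.Properties;
> import org.apache.kafka.clients.producer.KafkaProducer;
> import org.apache.kafka.clients.producer.ProducerConfig;
> import org.apache.kafka.clients.producer.ProducerRecord;
> import org.apache.kafka.common.serialization.StringSerializer;
>
> public class CompressedProducer {
>     public static void main(String[] args) {
>         Properties props = new Properties();
>         props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
>         props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
>         props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
>         // Compression on the producer side is what triggers the
>         // duplication seen on the consumer side.
>         props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy");
>
>         try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
>             // ~1 KiB values make the duplication easy to spot in the
>             // heap dump (String.repeat requires Java 11+).
>             String payload = "x".repeat(1024);
>             for (int i = 0; i < 100_000; i++) {
>                 producer.send(new ProducerRecord<>("test-topic", payload));
>             }
>         }
>     }
> }
> {code}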
> *Expected behavior*
> Headers should not retain bytes consuming memory equivalent to the
> message value.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)