divijvaidya opened a new pull request, #13135:
URL: https://github.com/apache/kafka/pull/13135

   This covers two JIRAs https://issues.apache.org/jira/browse/KAFKA-14632 and 
https://issues.apache.org/jira/browse/KAFKA-14633 
   
   ## Background 
   ![Screenshot 2023-01-19 at 18 27 
45](https://user-images.githubusercontent.com/71267/213521204-bb3228ed-7d21-4e07-a520-697ea6fcc0ed.png)
   Currently, we use two intermediate buffers while handling decompressed data (one of size 2 KB and another of size 16 KB). These buffers are (de)allocated once per batch.
   
   The impact of this was visible in a flamegraph analysis of a compressed workload, where 75% of the CPU time during `appendAsLeader()` was spent in `ValidateMessagesAndAssignOffsets`.
   
   ![Screenshot 2023-01-20 at 10 41 
08](https://user-images.githubusercontent.com/71267/213664252-389eaf3d-b8aa-465b-b010-db1024663d6f.png)
   
   
   ## Change
   With this PR:
   1. we reduce the number of intermediate buffers from 2 to 1, which eliminates one point of data copy. Note that the removed copy occurred in chunks of 2 KB at a time, multiple times per batch. This is achieved by getting rid of `BufferedInputStream` and moving to `DataInputStream`. This change has only been made for `zstd` and `gzip`.
   2. we use a thread-local buffer pool for both buffers involved in decompression. This change impacts all compression types.
   3. we pushed the skipping of key/value logic to 
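   As a rough illustration of items 1 and 2 (this is a hypothetical sketch, not the actual Kafka code; the class and method names are invented), the decompression path wraps the decompressor directly in a `DataInputStream` and reads into a per-thread scratch buffer instead of allocating one per batch:

   ```java
   import java.io.ByteArrayInputStream;
   import java.io.ByteArrayOutputStream;
   import java.io.DataInputStream;
   import java.io.IOException;
   import java.io.UncheckedIOException;
   import java.util.zip.GZIPInputStream;
   import java.util.zip.GZIPOutputStream;

   public class ReusableDecompressionDemo {

       // Hypothetical thread-local scratch buffer (~16 KB), allocated once per
       // thread rather than once per batch.
       private static final ThreadLocal<byte[]> SCRATCH =
               ThreadLocal.withInitial(() -> new byte[16 * 1024]);

       // Decompress a GZIP payload. The decompressor is wrapped directly in a
       // DataInputStream -- no intermediate BufferedInputStream, so there is no
       // extra 2 KB-at-a-time copy between the decompressor and the reader.
       public static byte[] decompress(byte[] compressed) {
           byte[] scratch = SCRATCH.get();
           ByteArrayOutputStream out = new ByteArrayOutputStream();
           try (DataInputStream in = new DataInputStream(
                   new GZIPInputStream(new ByteArrayInputStream(compressed)))) {
               int n;
               while ((n = in.read(scratch, 0, scratch.length)) != -1) {
                   out.write(scratch, 0, n);
               }
           } catch (IOException e) {
               throw new UncheckedIOException(e);
           }
           return out.toByteArray();
       }

       public static byte[] compress(byte[] raw) {
           ByteArrayOutputStream out = new ByteArrayOutputStream();
           try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
               gz.write(raw);
           } catch (IOException e) {
               throw new UncheckedIOException(e);
           }
           return out.toByteArray();
       }
   }
   ```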
   
   After the change, the above buffer allocation looks as follows:
   ![Screenshot 2023-01-19 at 18 28 
14](https://user-images.githubusercontent.com/71267/213525653-917ac5ee-810a-435e-bf84-c97d6b76005e.png)
   
   ## Results
   After this change, a JMH benchmark for `ValidateMessagesAndAssignOffsets` demonstrated 10-50% higher throughput across all compression types. The improvement is most prominent when thread-cached buffer pools are used, with only a 1-2% regression in some limited scenarios.
   
   When buffer pools are not used (NO_CACHING in the results), we observed GZIP performing up to 10% better in some cases, with a 1-4% regression in some other scenarios. Overall, without the buffer pools, the upside of this change is limited to single-digit improvements in certain scenarios.
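   To make the cached vs NO_CACHING comparison concrete, here is a toy sketch (not the PR's JMH code; names are hypothetical) of the two allocation strategies being compared -- a fresh 16 KB buffer on every call versus a reused thread-local one. Both paths produce identical output; only allocation behaviour differs:

   ```java
   import java.io.ByteArrayInputStream;
   import java.io.ByteArrayOutputStream;
   import java.io.IOException;
   import java.io.InputStream;
   import java.io.UncheckedIOException;
   import java.util.zip.GZIPInputStream;
   import java.util.zip.GZIPOutputStream;

   public class CachingComparisonDemo {

       private static final ThreadLocal<byte[]> POOLED =
               ThreadLocal.withInitial(() -> new byte[16 * 1024]);

       // useCache=true models the thread-local buffer pool; useCache=false
       // models the NO_CACHING baseline that allocates 16 KB on every call.
       public static byte[] gunzip(byte[] compressed, boolean useCache) {
           byte[] buf = useCache ? POOLED.get() : new byte[16 * 1024];
           ByteArrayOutputStream out = new ByteArrayOutputStream();
           try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
               int n;
               while ((n = in.read(buf, 0, buf.length)) != -1) {
                   out.write(buf, 0, n);
               }
           } catch (IOException e) {
               throw new UncheckedIOException(e);
           }
           return out.toByteArray();
       }

       public static byte[] gzip(byte[] raw) {
           ByteArrayOutputStream out = new ByteArrayOutputStream();
           try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
               gz.write(raw);
           } catch (IOException e) {
               throw new UncheckedIOException(e);
           }
           return out.toByteArray();
       }
   }
   ```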
   
   
   Detailed results from the JMH benchmark are available here: [benchmark-jira.xlsx](https://github.com/apache/kafka/files/10465049/benchmark-jira.xlsx)
   
   
   ## Testing
   - Sanity testing using the existing unit tests to ensure that correctness is not impacted.
   - JMH benchmarks for all compression types to ensure that no compression type regressed.
   
   

