divijvaidya opened a new pull request, #13135:
URL: https://github.com/apache/kafka/pull/13135

   This covers two JIRAs https://issues.apache.org/jira/browse/KAFKA-14632 and 
https://issues.apache.org/jira/browse/KAFKA-14633 
   
   ## Background 
   ![Screenshot 2023-01-19 at 18 27 
45](https://user-images.githubusercontent.com/71267/213521204-bb3228ed-7d21-4e07-a520-697ea6fcc0ed.png)
   Currently, we use two intermediate buffers while handling decompressed data (one of size 2 KB and another of size 16 KB). These buffers are (de)allocated once per batch.
   
   The impact of this was visible in a flamegraph analysis of a compressed workload, where 75% of the CPU time during `appendAsLeader()` was spent in `ValidateMessagesAndAssignOffsets`.
   
   ![Screenshot 2023-01-20 at 10 41 
08](https://user-images.githubusercontent.com/71267/213664252-389eaf3d-b8aa-465b-b010-db1024663d6f.png)
   
   
   ## Change
   With this PR:
   1. we reduce the number of intermediate buffers from 2 to 1, which eliminates one point of data copy. Note that the removed copy occurred in chunks of 2 KB at a time, multiple times per batch. This is achieved by getting rid of `BufferedInputStream` and moving to `DataInputStream`. This change has only been made for `zstd` and `gzip`.
   2. we use a thread-local buffer pool for both buffers involved in decompression. This change impacts all compression types.
   3. we pushed the skipping of key/value logic to 
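   As a rough illustration of items 1 and 2 (this is a hypothetical sketch, not the actual Kafka code; the class and method names are invented), the decompression path wraps the decompressor directly in a `DataInputStream` and reads into a per-thread scratch buffer instead of allocating one per batch:

   ```java
   import java.io.ByteArrayInputStream;
   import java.io.ByteArrayOutputStream;
   import java.io.DataInputStream;
   import java.io.IOException;
   import java.io.UncheckedIOException;
   import java.util.zip.GZIPInputStream;
   import java.util.zip.GZIPOutputStream;

   public class ReusableDecompressionDemo {

       // Hypothetical thread-local scratch buffer (~16 KB), allocated once per
       // thread rather than once per batch.
       private static final ThreadLocal<byte[]> SCRATCH =
               ThreadLocal.withInitial(() -> new byte[16 * 1024]);

       // Decompress a GZIP payload. The decompressor is wrapped directly in a
       // DataInputStream -- no intermediate BufferedInputStream, so there is no
       // extra 2 KB-at-a-time copy between the decompressor and the reader.
       public static byte[] decompress(byte[] compressed) {
           byte[] scratch = SCRATCH.get();
           ByteArrayOutputStream out = new ByteArrayOutputStream();
           try (DataInputStream in = new DataInputStream(
                   new GZIPInputStream(new ByteArrayInputStream(compressed)))) {
               int n;
               while ((n = in.read(scratch, 0, scratch.length)) != -1) {
                   out.write(scratch, 0, n);
               }
           } catch (IOException e) {
               throw new UncheckedIOException(e);
           }
           return out.toByteArray();
       }

       public static byte[] compress(byte[] raw) {
           ByteArrayOutputStream out = new ByteArrayOutputStream();
           try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
               gz.write(raw);
           } catch (IOException e) {
               throw new UncheckedIOException(e);
           }
           return out.toByteArray();
       }
   }
   ```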
   
   After the change, the above buffer allocation looks as follows:
   ![Screenshot 2023-01-19 at 18 28 
14](https://user-images.githubusercontent.com/71267/213525653-917ac5ee-810a-435e-bf84-c97d6b76005e.png)
   
   ## Results
   After this change, a JMH benchmark for `ValidateMessagesAndAssignOffsets` demonstrated 10-50% higher throughput across all compression types. The improvement is most prominent when thread-cached buffer pools are used, with only a 1-2% regression in some limited scenarios.
   
   When buffer pools are not used (NO_CACHING in the results), we observed GZIP performing up to 10% better in some cases, with a 1-4% regression in some other scenarios. Overall, without the buffer pools, the upside of this change is limited to single-digit improvements in certain scenarios.
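   To make the cached vs NO_CACHING comparison concrete, here is a toy sketch (not the PR's JMH code; names are hypothetical) of the two allocation strategies being compared -- a fresh 16 KB buffer on every call versus a reused thread-local one. Both paths produce identical output; only allocation behaviour differs:

   ```java
   import java.io.ByteArrayInputStream;
   import java.io.ByteArrayOutputStream;
   import java.io.IOException;
   import java.io.InputStream;
   import java.io.UncheckedIOException;
   import java.util.zip.GZIPInputStream;
   import java.util.zip.GZIPOutputStream;

   public class CachingComparisonDemo {

       private static final ThreadLocal<byte[]> POOLED =
               ThreadLocal.withInitial(() -> new byte[16 * 1024]);

       // useCache=true models the thread-local buffer pool; useCache=false
       // models the NO_CACHING baseline that allocates 16 KB on every call.
       public static byte[] gunzip(byte[] compressed, boolean useCache) {
           byte[] buf = useCache ? POOLED.get() : new byte[16 * 1024];
           ByteArrayOutputStream out = new ByteArrayOutputStream();
           try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
               int n;
               while ((n = in.read(buf, 0, buf.length)) != -1) {
                   out.write(buf, 0, n);
               }
           } catch (IOException e) {
               throw new UncheckedIOException(e);
           }
           return out.toByteArray();
       }

       public static byte[] gzip(byte[] raw) {
           ByteArrayOutputStream out = new ByteArrayOutputStream();
           try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
               gz.write(raw);
           } catch (IOException e) {
               throw new UncheckedIOException(e);
           }
           return out.toByteArray();
       }
   }
   ```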
   
   
   Detailed results from the JMH benchmark are available here: [benchmark-jira.xlsx](https://github.com/apache/kafka/files/10465049/benchmark-jira.xlsx)
   
   
   ## Testing
   - Sanity testing using the existing unit tests to ensure that correctness is not impacted.
   - JMH benchmarks for all compression types to ensure that no compression type regressed.
   
   

