ijuma opened a new pull request #9229: URL: https://github.com/apache/kafka/pull/9229
Use a caching `BufferSupplier` per request handler thread so that decompression buffers are cached if supported by the underlying `CompressionType`. This reduces allocations significantly for LZ4 when the number of partitions is high. The decompression buffer size is typically 64 KB, so a produce request with 1000 partitions results in 64 MB of allocations even if each produce batch is small (likely when there are so many partitions).

I did a quick local producer perf test with 5000 partitions, 1 KB record size, 1 broker, lz4 and ~0.5 for the producer compression rate metric.

Before this change:
> 20000000 records sent, 346314.349535 records/sec (330.27 MB/sec), 148.33 ms avg latency, 2267.00 ms max latency, 115 ms 50th, 383 ms 95th, 777 ms 99th, 1514 ms 99.9th.

After this change:
> 20000000 records sent, 431956.113259 records/sec (411.95 MB/sec), 117.79 ms avg latency, 1219.00 ms max latency, 99 ms 50th, 295 ms 95th, 440 ms 99th, 662 ms 99.9th.

That's a 25% throughput improvement, and p999 latency was cut to less than half (in this test). A sketch of the buffer reuse pattern follows the checklist below.

TODO: Remove default arguments

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
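
For reference, a minimal sketch of the reuse pattern this change enables. This is not the PR's diff: the loop, class name, and buffer size are illustrative, while `BufferSupplier.create()`, `BufferSupplier.NO_CACHING`, `get`, and `release` are existing utilities in `org.apache.kafka.common.utils`.

```java
import java.nio.ByteBuffer;
import org.apache.kafka.common.utils.BufferSupplier;

public final class DecompressionBufferReuseSketch {

    public static void main(String[] args) {
        // BufferSupplier.create() returns a caching supplier, while
        // BufferSupplier.NO_CACHING allocates a fresh buffer on every get().
        // The caching supplier is meant to be confined to a single handler thread.
        BufferSupplier cachingSupplier = BufferSupplier.create();

        // Simulate handling many produce batches on the same handler thread.
        for (int batch = 0; batch < 1000; batch++) {
            // With the caching supplier, the same 64 KB scratch buffer is handed
            // back each iteration; with NO_CACHING this loop would allocate
            // 1000 * 64 KB, matching the 64 MB figure described above.
            ByteBuffer scratch = cachingSupplier.get(64 * 1024);
            try {
                // ... decompress one record batch using `scratch` ...
            } finally {
                cachingSupplier.release(scratch); // return it to the per-thread cache
            }
        }
    }
}
```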