novosibman opened a new pull request, #13782: URL: https://github.com/apache/kafka/pull/13782
Trunk version of the initial change https://github.com/apache/kafka/pull/13768, which targets branch "3.4".

Key difference from the branch change: this version passes in and uses the existing `scheduler`, which is already used for flushing large segment logs and indices. In all cases the snapshot's fileChannel is kept open when it is passed to other threads for flushing and closing, so this change removes the try-with-resources.

Related issue: https://issues.apache.org/jira/browse/KAFKA-9693

The repeating latency spikes during Kafka log segment rolling still reproduce on the latest versions, including kafka_2.13-3.4.0. It was found that flushing the Kafka snapshot file during segment rolling blocks the producer request handling thread for some time.

Offloading the flush operation reproduced the latency improvement in kafka_2.13-3.6.0-snapshot. I used the single-node test configuration available on my side:

kafka_2.13-3.6.0-snapshot - trunk version

kafka_2.13-3.6.0-snapshot-fix - trunk version with the provided change

partitions=10 # rolling roughly every 52 seconds
![image](https://github.com/apache/kafka/assets/6793713/6f71a515-36d2-4d10-a577-6a8712c2dbf0)

partitions=100 # rolling events roughly every 8.5 minutes:
![image](https://github.com/apache/kafka/assets/6793713/a7780840-75e2-4fca-b1a6-7fa17cec702c)
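
For readers who want the general shape of the offloading idea, here is a minimal, self-contained sketch. It is not the code from this pull request: the class `AsyncSnapshotFlush` and the method `writeThenFlushAsync` are hypothetical names, and a plain `ExecutorService` stands in for Kafka's `scheduler`. The point it illustrates is that the calling thread only writes the bytes, while the still-open channel is handed to another thread that performs the blocking `force` and then closes the channel.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical illustration only; names are not taken from the Kafka code base.
final class AsyncSnapshotFlush {

    // Writes the data on the calling thread, then hands the open channel to
    // the executor, which forces it to disk and closes it.
    static Future<?> writeThenFlushAsync(Path file, ByteBuffer data, ExecutorService flusher)
            throws IOException {
        // No try-with-resources here: the channel must outlive this method,
        // because another thread will flush and close it.
        FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING);
        try {
            while (data.hasRemaining()) {
                channel.write(data);
            }
        } catch (IOException | RuntimeException e) {
            channel.close(); // the write failed, so ownership was never handed off
            throw e;
        }
        try {
            return flusher.submit(() -> {
                // The background task now owns the channel and closes it when done.
                try (FileChannel ch = channel) {
                    ch.force(true); // the potentially slow, blocking disk flush
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (RejectedExecutionException e) {
            channel.close(); // the executor refused the task, so close here
            throw e;
        }
    }
}
```

A caller can keep the returned `Future` to learn when the data is durable, or ignore it if only request latency matters. The trade-off is that a crash between the write and the deferred flush may leave the file incompletely persisted, so this pattern only suits files that can be rebuilt or validated on recovery.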