wernerdv opened a new pull request, #17443: URL: https://github.com/apache/kafka/pull/17443
`NotLeaderOrFollowerException` is thrown here: https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/RemoteLeaderEndPoint.scala#L188

The current fix is to catch and ignore `NotLeaderOrFollowerException`.

Local benchmark result:

```
./jmh-benchmarks/jmh.sh ReplicaFetcherThreadBenchmark
running gradlew :jmh-benchmarks:clean :jmh-benchmarks:shadowJar

> Configure project :
Starting build with version 4.0.0-SNAPSHOT (commit id 94c7ede7) using Gradle 8.10, Java 17 and Scala 2.13.15
Build properties: ignoreFailures=false, maxParallelForks=6, maxScalacThreads=6, maxTestRetries=0

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

For more on this, please refer to https://docs.gradle.org/8.10/userguide/command_line_interface.html#sec:command_line_warnings in the Gradle documentation.

BUILD SUCCESSFUL in 22s
96 actionable tasks: 23 executed, 73 up-to-date
gradle build done
running JMH with args: ReplicaFetcherThreadBenchmark
# JMH version: 1.37
# VM version: JDK 17.0.12, OpenJDK 64-Bit Server VM, 17.0.12+7-Ubuntu-1ubuntu222.04
# VM invoker: /usr/lib/jvm/java-17-openjdk-amd64/bin/java
# VM options: <none>
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 5 iterations, 10 s each
# Measurement: 15 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.apache.kafka.jmh.fetcher.ReplicaFetcherThreadBenchmark.testFetcher
# Parameters: (partitionCount = 100)

# Run progress: 0,00% complete, ETA 00:13:20
# Fork: 1 of 1
# Warmup Iteration 1: [2024-10-10 13:36:57,811] WARN The new 'consumer' rebalance protocol is only supported in KRaft cluster with the new group coordinator.
(kafka.server.KafkaConfig:70)
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
1929,906 ns/op
# Warmup Iteration 2: 1860,040 ns/op
# Warmup Iteration 3: 1879,765 ns/op
# Warmup Iteration 4: 1884,042 ns/op
# Warmup Iteration 5: 1875,712 ns/op
Iteration 1: 1877,666 ns/op
Iteration 2: 1885,357 ns/op
Iteration 3: 1876,356 ns/op
Iteration 4: 1874,775 ns/op
Iteration 5: 1875,129 ns/op
Iteration 6: 1872,721 ns/op
Iteration 7: 1876,337 ns/op
Iteration 8: 1890,266 ns/op
Iteration 9: 1870,369 ns/op
Iteration 10: 1885,525 ns/op
Iteration 11: 1989,414 ns/op
Iteration 12: 1912,892 ns/op
Iteration 13: 1922,298 ns/op
Iteration 14: 1902,687 ns/op
Iteration 15: 1906,352 ns/op

Result "org.apache.kafka.jmh.fetcher.ReplicaFetcherThreadBenchmark.testFetcher":
  1894,543 ±(99.9%) 32,917 ns/op [Average]
  (min, avg, max) = (1870,369, 1894,543, 1989,414), stdev = 30,790
  CI (99.9%): [1861,626, 1927,460] (assumes normal distribution)

# JMH version: 1.37
# VM version: JDK 17.0.12, OpenJDK 64-Bit Server VM, 17.0.12+7-Ubuntu-1ubuntu222.04
# VM invoker: /usr/lib/jvm/java-17-openjdk-amd64/bin/java
# VM options: <none>
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 5 iterations, 10 s each
# Measurement: 15 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.apache.kafka.jmh.fetcher.ReplicaFetcherThreadBenchmark.testFetcher
# Parameters: (partitionCount = 500)

# Run progress: 25,00% complete, ETA 00:10:12
# Fork: 1 of 1
# Warmup Iteration 1: [2024-10-10 13:40:22,069] WARN The new 'consumer' rebalance protocol is only supported in KRaft cluster with the new group coordinator.
(kafka.server.KafkaConfig:70)
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
8464,782 ns/op
# Warmup Iteration 2: 8192,703 ns/op
# Warmup Iteration 3: 8162,707 ns/op
# Warmup Iteration 4: 8122,797 ns/op
# Warmup Iteration 5: 8169,713 ns/op
Iteration 1: 8057,133 ns/op
Iteration 2: 8053,061 ns/op
Iteration 3: 8077,125 ns/op
Iteration 4: 8039,068 ns/op
Iteration 5: 8024,524 ns/op
Iteration 6: 8035,134 ns/op
Iteration 7: 8013,353 ns/op
Iteration 8: 8018,225 ns/op
Iteration 9: 8021,750 ns/op
Iteration 10: 8053,567 ns/op
Iteration 11: 8047,978 ns/op
Iteration 12: 8515,976 ns/op
Iteration 13: 8523,523 ns/op
Iteration 14: 8521,076 ns/op
Iteration 15: 8524,231 ns/op

Result "org.apache.kafka.jmh.fetcher.ReplicaFetcherThreadBenchmark.testFetcher":
  8168,382 ±(99.9%) 236,112 ns/op [Average]
  (min, avg, max) = (8013,353, 8168,382, 8524,231), stdev = 220,860
  CI (99.9%): [7932,269, 8404,494] (assumes normal distribution)

# JMH version: 1.37
# VM version: JDK 17.0.12, OpenJDK 64-Bit Server VM, 17.0.12+7-Ubuntu-1ubuntu222.04
# VM invoker: /usr/lib/jvm/java-17-openjdk-amd64/bin/java
# VM options: <none>
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 5 iterations, 10 s each
# Measurement: 15 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.apache.kafka.jmh.fetcher.ReplicaFetcherThreadBenchmark.testFetcher
# Parameters: (partitionCount = 1000)

# Run progress: 50,00% complete, ETA 00:06:56
# Fork: 1 of 1
# Warmup Iteration 1: [2024-10-10 13:43:53,904] WARN The new 'consumer' rebalance protocol is only supported in KRaft cluster with the new group coordinator.
(kafka.server.KafkaConfig:70)
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
16887,223 ns/op
# Warmup Iteration 2: 16481,102 ns/op
# Warmup Iteration 3: 16141,360 ns/op
# Warmup Iteration 4: 16114,730 ns/op
# Warmup Iteration 5: 16072,493 ns/op
Iteration 1: 15944,404 ns/op
Iteration 2: 16098,280 ns/op
Iteration 3: 15944,495 ns/op
Iteration 4: 16056,134 ns/op
Iteration 5: 15999,214 ns/op
Iteration 6: 16086,102 ns/op
Iteration 7: 16064,142 ns/op
Iteration 8: 16058,817 ns/op
Iteration 9: 16059,667 ns/op
Iteration 10: 16082,960 ns/op
Iteration 11: 16037,771 ns/op
Iteration 12: 15971,635 ns/op
Iteration 13: 15983,740 ns/op
Iteration 14: 15946,546 ns/op
Iteration 15: 16033,504 ns/op

Result "org.apache.kafka.jmh.fetcher.ReplicaFetcherThreadBenchmark.testFetcher":
  16024,494 ±(99.9%) 58,477 ns/op [Average]
  (min, avg, max) = (15944,404, 16024,494, 16098,280), stdev = 54,699
  CI (99.9%): [15966,017, 16082,971] (assumes normal distribution)

# JMH version: 1.37
# VM version: JDK 17.0.12, OpenJDK 64-Bit Server VM, 17.0.12+7-Ubuntu-1ubuntu222.04
# VM invoker: /usr/lib/jvm/java-17-openjdk-amd64/bin/java
# VM options: <none>
# Blackhole mode: compiler (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 5 iterations, 10 s each
# Measurement: 15 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.apache.kafka.jmh.fetcher.ReplicaFetcherThreadBenchmark.testFetcher
# Parameters: (partitionCount = 5000)

# Run progress: 75,00% complete, ETA 00:03:32
# Fork: 1 of 1
# Warmup Iteration 1: [2024-10-10 13:47:35,385] WARN The new 'consumer' rebalance protocol is only supported in KRaft cluster with the new group coordinator.
(kafka.server.KafkaConfig:70)
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
90793,927 ns/op
# Warmup Iteration 2: 87362,350 ns/op
# Warmup Iteration 3: 86543,760 ns/op
# Warmup Iteration 4: 86226,549 ns/op
# Warmup Iteration 5: 87073,001 ns/op
Iteration 1: 87169,335 ns/op
Iteration 2: 87936,322 ns/op
Iteration 3: 87299,357 ns/op
Iteration 4: 88002,498 ns/op
Iteration 5: 86985,872 ns/op
Iteration 6: 87270,992 ns/op
Iteration 7: 86482,991 ns/op
Iteration 8: 86770,086 ns/op
Iteration 9: 85933,541 ns/op
Iteration 10: 85948,712 ns/op
Iteration 11: 87414,002 ns/op
Iteration 12: 87037,239 ns/op
Iteration 13: 87365,601 ns/op
Iteration 14: 87367,632 ns/op
Iteration 15: 87196,675 ns/op

Result "org.apache.kafka.jmh.fetcher.ReplicaFetcherThreadBenchmark.testFetcher":
  87078,724 ±(99.9%) 640,395 ns/op [Average]
  (min, avg, max) = (85933,541, 87078,724, 88002,498), stdev = 599,026
  CI (99.9%): [86438,329, 87719,119] (assumes normal distribution)

# Run complete. Total time: 00:16:42

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial experiments, perform baseline and negative tests that provide experimental control, make sure the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts. Do not assume the numbers tell you what you want them to tell.

NOTE: Current JVM experimentally supports Compiler Blackholes, and they are in use. Please exercise extra caution when trusting the results, look into the generated code to check the benchmark still works, and factor in a small probability of new VM bugs. Additionally, while comparisons between different JVMs are already problematic, the performance difference caused by different Blackhole modes can be very significant. Please make sure you use the consistent Blackhole mode for comparisons.

Benchmark                                  (partitionCount)  Mode  Cnt      Score     Error  Units
ReplicaFetcherThreadBenchmark.testFetcher               100  avgt   15   1894,543 ±  32,917  ns/op
ReplicaFetcherThreadBenchmark.testFetcher               500  avgt   15   8168,382 ± 236,112  ns/op
ReplicaFetcherThreadBenchmark.testFetcher              1000  avgt   15  16024,494 ±  58,477  ns/op
ReplicaFetcherThreadBenchmark.testFetcher              5000  avgt   15  87078,724 ± 640,395  ns/op
JMH benchmarks done
```

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
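
For illustration only, the catch-and-ignore approach described above can be sketched in Java. This is not the code from this pull request: the exception class below is a local stand-in for `org.apache.kafka.common.errors.NotLeaderOrFollowerException` so the example compiles without Kafka on the classpath, and `logStartOffset`, `partitionsToFetch`, and the partition names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class CatchAndIgnoreSketch {

    // Local stand-in (assumption) for Kafka's NotLeaderOrFollowerException.
    static class NotLeaderOrFollowerException extends RuntimeException {
        NotLeaderOrFollowerException(String message) {
            super(message);
        }
    }

    // Hypothetical lookup: throws when this broker no longer hosts the partition.
    static long logStartOffset(String partition, Set<String> hosted) {
        if (!hosted.contains(partition)) {
            throw new NotLeaderOrFollowerException("not hosted: " + partition);
        }
        return 0L;
    }

    // Collects the partitions to fetch, skipping any whose lookup throws.
    static List<String> partitionsToFetch(List<String> candidates, Set<String> hosted) {
        List<String> result = new ArrayList<>();
        for (String partition : candidates) {
            try {
                logStartOffset(partition, hosted);
                result.add(partition);
            } catch (NotLeaderOrFollowerException e) {
                // Catch and ignore: skip this partition and continue with the rest.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "t-1" is not hosted, so it is skipped without failing the whole pass.
        System.out.println(partitionsToFetch(List.of("t-0", "t-1", "t-2"), Set.of("t-0", "t-2")));
    }
}
```

In this sketch a partition that is no longer hosted is simply left out of the result, so one stale partition does not abort the pass over the remaining ones.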
