oertl opened a new pull request, #20359: URL: https://github.com/apache/kafka/pull/20359

This PR optimizes the Murmur2 hash computation. Instead of processing bytes individually, the proposed change uses a `VarHandle` to load 4 bytes at once from the given `data` array, leading to a significant speedup.

The first commit extends the unit tests for better coverage. The second commit introduces a JMH benchmark for measuring the speedup. The third commit changes the Murmur2 implementation itself.

The benchmark results (executed locally on my notebook) before and after this change are shown below. For example, `TEST_CASE_1_64` means that the input size varies randomly from 1 to 64 bytes, making branch prediction difficult. Another example, `TEST_CASE_4_4`, varies the input size from 4 to 4 bytes, thus effectively providing arrays with a constant length of 4 bytes, which could be beneficial for branch prediction. All test cases considered show a speedup, ranging from roughly 1.3x (`TEST_CASE_4_4`) to roughly 1.8x (`TEST_CASE_64_64`).

**Before:**
```
Benchmark                   (testCase)         Mode  Cnt    Score   Error  Units
Murmur2Benchmark.hashBytes  TEST_CASE_1_4      avgt   20    5.613 ± 0.110  us/op
Murmur2Benchmark.hashBytes  TEST_CASE_1_16     avgt   20   10.646 ± 0.086  us/op
Murmur2Benchmark.hashBytes  TEST_CASE_1_64     avgt   20   24.313 ± 0.371  us/op
Murmur2Benchmark.hashBytes  TEST_CASE_1_256    avgt   20   77.239 ± 0.651  us/op
Murmur2Benchmark.hashBytes  TEST_CASE_4_4      avgt   20    7.939 ± 0.061  us/op
Murmur2Benchmark.hashBytes  TEST_CASE_16_16    avgt   20   15.769 ± 0.180  us/op
Murmur2Benchmark.hashBytes  TEST_CASE_64_64    avgt   20   40.739 ± 0.617  us/op
Murmur2Benchmark.hashBytes  TEST_CASE_256_256  avgt   20  141.524 ± 1.599  us/op
```

**After:**
```
Benchmark                   (testCase)         Mode  Cnt    Score   Error  Units
Murmur2Benchmark.hashBytes  TEST_CASE_1_4      avgt   20    3.515 ± 0.072  us/op
Murmur2Benchmark.hashBytes  TEST_CASE_1_16     avgt   20    7.918 ± 0.102  us/op
Murmur2Benchmark.hashBytes  TEST_CASE_1_64     avgt   20   16.257 ± 0.254  us/op
Murmur2Benchmark.hashBytes  TEST_CASE_1_256    avgt   20   46.919 ± 0.590  us/op
Murmur2Benchmark.hashBytes  TEST_CASE_4_4      avgt   20    6.277 ± 0.137  us/op
Murmur2Benchmark.hashBytes  TEST_CASE_16_16    avgt   20    9.768 ± 0.138  us/op
Murmur2Benchmark.hashBytes  TEST_CASE_64_64    avgt   20   22.950 ± 0.289  us/op
Murmur2Benchmark.hashBytes  TEST_CASE_256_256  avgt   20   81.898 ± 1.405  us/op
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
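
For readers unfamiliar with the technique described in the PR, here is a minimal sketch of loading 4 bytes at once through a `VarHandle` rather than assembling each 32-bit word from individual bytes. It is not the code from the PR: the class name `Murmur2Sketch`, the method names, and the helpers `mix` and `finish` are hypothetical, and the seed and mixing constants are assumed to match the usual Murmur2 variant. The only library calls used are `MethodHandles.byteArrayViewVarHandle` and `VarHandle.get`, both available since Java 9.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.util.Random;

// Illustrative sketch only; all names here are hypothetical, not taken from the PR.
final class Murmur2Sketch {
    // Views a byte[] as little-endian ints. Plain get() accepts unaligned byte offsets.
    private static final VarHandle INT_LE =
            MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.LITTLE_ENDIAN);

    private static final int SEED = 0x9747b28c; // assumed seed
    private static final int M = 0x5bd1e995;    // assumed Murmur2 mixing constant
    private static final int R = 24;            // assumed Murmur2 shift

    // Baseline: builds each 32-bit word from four separate byte loads.
    public static int hashBytewise(byte[] data) {
        int h = SEED ^ data.length;
        int end = data.length & ~3; // largest multiple of 4 not exceeding the length
        for (int i = 0; i < end; i += 4) {
            int k = (data[i] & 0xff)
                    | ((data[i + 1] & 0xff) << 8)
                    | ((data[i + 2] & 0xff) << 16)
                    | ((data[i + 3] & 0xff) << 24);
            h = mix(h, k);
        }
        return finish(h, data, end);
    }

    // Variant: reads the same little-endian word with a single VarHandle load.
    public static int hashWordwise(byte[] data) {
        int h = SEED ^ data.length;
        int end = data.length & ~3;
        for (int i = 0; i < end; i += 4) {
            int k = (int) INT_LE.get(data, i);
            h = mix(h, k);
        }
        return finish(h, data, end);
    }

    // Folds one 32-bit word into the running hash.
    private static int mix(int h, int k) {
        k *= M;
        k ^= k >>> R;
        k *= M;
        h *= M;
        return h ^ k;
    }

    // Handles the 0 to 3 trailing bytes, then applies the final avalanche.
    private static int finish(int h, byte[] data, int tailStart) {
        switch (data.length - tailStart) {
            case 3:
                h ^= (data[tailStart + 2] & 0xff) << 16;
                // fall through
            case 2:
                h ^= (data[tailStart + 1] & 0xff) << 8;
                // fall through
            case 1:
                h ^= data[tailStart] & 0xff;
                h *= M;
                break;
            default:
                break;
        }
        h ^= h >>> 13;
        h *= M;
        h ^= h >>> 15;
        return h;
    }

    public static void main(String[] args) {
        // Checks that both variants agree on random inputs of every length from 0 to 64.
        Random random = new Random(42);
        for (int len = 0; len <= 64; len++) {
            byte[] data = new byte[len];
            random.nextBytes(data);
            if (hashBytewise(data) != hashWordwise(data)) {
                throw new AssertionError("mismatch at length " + len);
            }
        }
        System.out.println("byte-wise and word-wise hashes agree for lengths 0..64");
    }
}
```

The two variants differ only in how the word `k` is obtained, so any speedup would come from replacing four loads, three shifts, and three ORs with one load. The sketch makes no performance claim of its own; measuring that is what the JMH benchmark in the PR is for.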