Viktor Somogyi-Vass created KAFKA-10650:
-------------------------------------------
Summary: Use Murmur3 hashing instead of MD5 in SkimpyOffsetMap
Key: KAFKA-10650
URL: https://issues.apache.org/jira/browse/KAFKA-10650
Project: Kafka
Issue Type: Improvement
Components: core
Reporter: Viktor Somogyi-Vass
Assignee: Viktor Somogyi-Vass
The use of MD5 was uncovered while testing Kafka for FIPS (Federal
Information Processing Standards) verification.
While MD5 isn't a FIPS incompatibility here, as it isn't used for cryptographic
purposes, I spent some time on it because it isn't ideal either. MD5 is a
relatively fast cryptographic hash algorithm, but there are much better performing
algorithms for hash tables, which is how it's used in SkimpyOffsetMap.
By applying Murmur3 (which is already implemented in Streams) I could achieve a 3x
faster {{put}} operation, and overall segment cleaning sped up by 30%, while
preserving the same collision rate (both performed within 0.0015 - 0.007,
with a median mostly around 0.004).
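
To illustrate the idea, here is a minimal, self-contained sketch of mapping a record key to a hash-table slot with MD5 versus Murmur3. It is not the actual patch: the class and method names ({{OffsetMapHashSketch}}, {{md5Slot}}, {{murmurSlot}}) are hypothetical, the Murmur3 here is the public 32-bit x86 variant written out inline rather than the Streams implementation, and SkimpyOffsetMap's real slot layout and probing are not reproduced.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: compares two ways of turning a key into a table slot.
final class OffsetMapHashSketch {

    // MD5 digest of the key (16 bytes), as a cryptographic-hash baseline.
    static byte[] md5(byte[] key) {
        try {
            return MessageDigest.getInstance("MD5").digest(key);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Slot derived from the first 4 bytes of the MD5 digest (an assumption
    // made for this sketch, not necessarily how the broker picks a slot).
    static int md5Slot(byte[] key, int slots) {
        int h = ByteBuffer.wrap(md5(key)).getInt();
        return Math.floorMod(h, slots);
    }

    // MurmurHash3 x86 32-bit, following the public reference algorithm.
    static int murmur3_32(byte[] data, int seed) {
        final int c1 = 0xcc9e2d51;
        final int c2 = 0x1b873593;
        int h = seed;
        int len = data.length;
        int nblocks = len / 4;

        // Body: mix in 4-byte little-endian blocks.
        for (int i = 0; i < nblocks; i++) {
            int k = (data[4 * i] & 0xff)
                    | ((data[4 * i + 1] & 0xff) << 8)
                    | ((data[4 * i + 2] & 0xff) << 16)
                    | ((data[4 * i + 3] & 0xff) << 24);
            k *= c1;
            k = Integer.rotateLeft(k, 15);
            k *= c2;
            h ^= k;
            h = Integer.rotateLeft(h, 13);
            h = h * 5 + 0xe6546b64;
        }

        // Tail: mix in the remaining 1 to 3 bytes, if any.
        int tail = nblocks * 4;
        int k = 0;
        int rem = len & 3;
        if (rem == 3) k ^= (data[tail + 2] & 0xff) << 16;
        if (rem >= 2) k ^= (data[tail + 1] & 0xff) << 8;
        if (rem >= 1) {
            k ^= (data[tail] & 0xff);
            k *= c1;
            k = Integer.rotateLeft(k, 15);
            k *= c2;
            h ^= k;
        }

        // Finalization: fold in the length and avalanche the bits.
        h ^= len;
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Slot derived from the Murmur3 hash of the key.
    static int murmurSlot(byte[] key, int slots) {
        return Math.floorMod(murmur3_32(key, 0), slots);
    }

    public static void main(String[] args) {
        byte[] key = "some-record-key".getBytes(StandardCharsets.UTF_8);
        int slots = 1024; // illustrative table size
        System.out.println("md5 slot:    " + md5Slot(key, slots));
        System.out.println("murmur slot: " + murmurSlot(key, slots));
    }
}
```

Both functions spread keys over the table; the difference the ticket is about is the cost per key, since Murmur3 needs only a handful of integer multiplies and rotates per 4-byte block.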
--
This message was sent by Atlassian Jira
(v8.3.4#803005)