tommyettinger commented on issue #1409: URL: https://github.com/apache/incubator-fury/issues/1409#issuecomment-2028576591
OK, I've been thinking about this for a while, and tinkering a lot with different options. I was surprised to find that at least one version of cuckoo hashing really can perform quite well (though not as dramatically better on containsKey() performance), as long as it never encounters two different keys with the same exact hashCode(). I modified an existing Apache v2-licensed cuckoo hash map, reducing allocations as much as possible, and the performance gains are considerable on some operations: ``` POPULATE IdentityCuckooMap : 198676.832 ns/op JDK IdentityHashMap: 508930.787 ns/op CONTAINS IdentityCuckooMap : 1023.728 ns/op JDK IdentityHashMap: 1418.562 ns/op COPY IdentityCuckooMap : 28171.380 ns/op JDK IdentityHashMap: 302847.095 ns/op ITERATE IdentityCuckooMap : 27954.503 ns/op JDK IdentityHashMap: 149939.553 ns/op ``` The catch here is that if a cuckoo hash "fails," it can enter an infinite loop or allocate memory forever, etc. Very bad worst-case properties. However, the internals of this particular cuckoo-hashed map and the existing FuryObjectMap are quite similar, and I'm working on an approach that will "flip" from using cuckoo hashing as long as it is viable, to using linear probing as FuryObjectMap currently does, if fully-colliding keys are present (or some other situation would force a rehash). I'm not sure what kind of speed penalty the flipping will impose; it is possible branch prediction will figure out that fully-colliding keys are extremely rare, and that would mean the cost would be very low relative to the above numbers. But, the code is larger for each method, which might be a problem for inlining. I'll have to benchmark and see... I know the JDK HashMap can also change its implementation to use a TreeMap-like Red-Black Tree if it encounters excessive collisions on Comparable k eys, so this isn't an unprecedented model. I'm testing the FlipMap code in [my jdkgdxds repo](https://github.com/tommyettinger/jdkgdxds/blob/master/src/test/java/com/github/tommyettinger/ds/test/FlipMap.java) for correctness, and I'll be benchmarking it using what should be similar code in [my assorted-benchmarks repo](https://github.com/tommyettinger/assorted-benchmarks/blob/master/jmh/src/main/java/de/heidelberg/pvs/container_bench/FlipMap.java). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
