I wrote: "most lookups are failed lookups", but oops! 291e3 unique / 4.8e6 total 
= 6% ==> 94% of lookups are successful. So the `memcmp` that only happens 
after hash codes match actually happens almost all the time, and my particular 
data set is more like 16B/(32B+9B) = 39% cached, not 50%. This doesn't really 
alter the rest of my analysis, though, since 39% and 50% are pretty close.
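Spelled out, just to double-check myself (all figures are the ones above, nothing
new):

```nim
# Re-doing the arithmetic above; the numbers come straight from this thread.
let unique = 291e3
let total  = 4.8e6
echo "hit rate:        ", 1.0 - unique / total   # ~0.94 -> 94% successful
echo "cached fraction: ", 16.0 / (32.0 + 9.0)    # ~0.39 -> 39%, not 50%
```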

A corollary of this is that (for that style of data set) Robin Hood hashing 
w/linear probing would not help. RH can help for failed lookups because probe 
sequences for missing items become about 50% as long (once the search key's hash 
code is > the table entry's hash code, you know the key is missing and where to 
put it if inserting). It costs a little work { O(1.15x) } to keep entries along 
probe sequences sorted by hash code. So RH only pays off if you care about 
worst-case times or have a workload dominated by failed lookups over long probe 
sequences. At 56% table load (291./512) the sequences aren't long unless the hash 
function is weak relative to the data (not the case for me unless all 4 functions 
were equally weak).
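
In case the early-exit point is unclear, here is a minimal sketch of an RH lookup 
with linear probing. This is not the Nim stdlib's code; it uses the common 
probe-distance formulation (which gives the same early exit as keeping runs 
ordered by hash code) and assumes `hcode == 0` marks an empty slot:

```nim
type
  Slot[K] = object
    hcode: int    # 0 = empty slot (an assumption for this sketch)
    key: K

proc dist(hcode, slot, mask: int): int =
  ## how far `slot` sits from this entry's home position (wraps around)
  (slot - (hcode and mask)) and mask

proc rhFind[K](t: seq[Slot[K]]; key: K; hcode: int): int =
  ## returns the slot index, or -1 as soon as the key cannot be present
  let mask = t.len - 1               # table size assumed to be a power of two
  var i = hcode and mask
  var d = 0
  while true:
    if t[i].hcode == 0:
      return -1                      # empty slot: definite miss
    if dist(t[i].hcode, i, mask) < d:
      return -1                      # resident entry is "richer": key absent,
                                     # and this is where an insert would land
    if t[i].hcode == hcode and t[i].key == key:
      return i                       # hash codes match, then full compare
    i = (i + 1) and mask
    inc d
```

The point is the second `return -1`: on a miss you stop roughly halfway along the 
probe run instead of scanning to the next empty slot.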

Anyway, RH is about the only algorithmic speed-up I think could help in the 
poorly cached, large-table case, but only for very low hit rate data. It would 
not be crazy to add a `defined(useRobinHood)` check in the Nim hash table impl 
(or even a per-instance run-time flag), since whether RH helps or hurts is so 
dependent upon workload.
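
Roughly, the switch could look like this (names like `useRobinHood`, `Slot`, and 
`findSlot` are just for illustration, reusing the sketch above; the real stdlib 
internals differ):

```nim
proc findSlot[K](data: seq[Slot[K]]; key: K; hcode: int): int =
  when defined(useRobinHood):
    result = rhFind(data, key, hcode)   # early-exit probing as sketched above
  else:
    # plain linear probing: scan until the key or an empty slot turns up
    let mask = data.len - 1
    var i = hcode and mask
    result = -1
    while data[i].hcode != 0:
      if data[i].hcode == hcode and data[i].key == key:
        return i
      i = (i + 1) and mask
```

A per-instance run-time flag would just replace the `when` with an ordinary `if` 
on a field of the table, at the cost of a branch per lookup.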
