Oh, and in the [adix](https://github.com/c-blake/adix) git history, I do have a pseudo-random probing impl, along with one using an idea from @timothee: start with linear probing and switch to pseudo-random probing only after some number of cache lines. Those used to be called `otset.nim` and `ottab.nim` and are more or less exactly the Python 3.6 dict impl. I decided this whole general idea space was a distraction from better designs and just deleted it.
The Python algo may be ok for the _very_ limited "language-symbol-table" use cases Python mostly applies it to (where it really should not be doing run-time hash lookups in the first place, IMO). It is just obviously bad for very large tables. It "doubles up" the already more-indirect-than-necessary old Python pseudo-random design: each step in the collision chain must first fetch the index out of the hash-table part and then fetch the hash code out of the real data part. So, a cold-cache, poorly predicted lookup with a collision search depth of 4 could cost 8 DIMM memory load latencies of 60..120 ns each. A simple way to improve this by 2x, used in `lptabz`, is to store some bits of the hash code alongside the index in the "hash-table-like part". If that part also uses linear probing, a lookup then reduces to just 2 loads. That is still 2x more than the 1-load "unordered" variant, but you get compactness & insertion order in exchange (as well as optimal performance when iterating over elements).
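To make the load-counting concrete, here is a toy Python model of the compact ("indexed") layout with the partial-hash trick described above. All names are invented for illustration; this is a sketch of the idea, not the actual CPython or adix code. It keeps a sparse hash-table part (`idx`) and a dense, insertion-ordered data part (`entries`); each `idx` slot stores a few hash bits next to the entry index, so a probe whose stored bits mismatch is rejected without ever touching `entries` (i.e. without the second memory load).

```python
HBITS_MASK = (1 << 16) - 1  # keep 16 bits of the hash next to the index

class HashCachedTable:
    """Toy compact table: sparse index part + dense insertion-ordered entries.

    Fixed capacity for simplicity (no resize); capacity must be a power of
    two and exceed the number of items, or probing will loop forever.
    """

    def __init__(self, cap=16):
        self.idx = [None] * cap      # sparse "hash-table-like part": (hbits, j)
        self.entries = []            # dense "real data part": (hash, key, val)

    def _probe(self, h):
        mask = len(self.idx) - 1
        i = h & mask
        while True:                  # linear probing, as in lptabz's index part
            yield i
            i = (i + 1) & mask

    def put(self, key, val):
        h = hash(key)
        hb = h & HBITS_MASK
        for i in self._probe(h):
            slot = self.idx[i]
            if slot is None:         # empty slot: append to dense part
                self.idx[i] = (hb, len(self.entries))
                self.entries.append((h, key, val))
                return
            shb, j = slot
            if shb != hb:
                continue             # rejected with NO load from entries
            eh, ek, _ = self.entries[j]   # "2nd load": the real data part
            if eh == h and ek == key:
                self.entries[j] = (h, key, val)  # update, keep position
                return

    def get(self, key):
        h = hash(key)
        hb = h & HBITS_MASK
        for i in self._probe(h):
            slot = self.idx[i]
            if slot is None:
                raise KeyError(key)
            shb, j = slot
            if shb != hb:
                continue             # collision skipped in the index part alone
            eh, ek, ev = self.entries[j]  # load entries only on a likely match
            if eh == h and ek == key:
                return ev
```

Without the stored hash bits, every probe in the collision chain would need both loads (index slot, then entry), which is where the 2x factor in the paragraph above comes from; with them, only the final, probably-matching probe pays for the second load, and iteration is just a scan of `entries` in insertion order.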
