Re: AA rehash threshold

deadalnix via Digitalmars-d Tue, 18 Nov 2014 18:51:55 -0800

On Wednesday, 19 November 2014 at 02:16:36 UTC, Steven
Schveighoffer wrote:

From all I remember about hash tables, you are supposed tominimize the number of collisions when adding/looking up data.This helps keep the O(1) lookup property intact.


That is the theory, but the real world is sometime different. It
depends on how you resolve collisions. If you do it via some kind
of linked list, then yes, you must remove all collisions.

But if you do it by putting the element in a slot close to its
actual position, then thing gets different. In that case, you
don't want to minimize collision, but to minimize the distance to
the object, in order for it to be in the cache line.

After all, unless your hash table is very small, the first hit is
most likely a cache miss, meaning ~300 cycles. at this point the
cache line is in L1, with a 3 cycle access time. So accessing
several element in the same cache line is often preferable.

You may want to read about robin hood hash table for a simple
example of this. These hash tables are fast even with a very high
collision rate.

Re: AA rehash threshold

Reply via email to