On  7.08.2010 00:21, chromatic wrote:
On Friday 06 August 2010 at 13:23, Paul C wrote:

As an aside, is this doubling in size each time considered a good thing?
Seems okay up to about 64 or so, but then the table gets big fast. I'm sure
everyone knows the trick of using a vector of sizes and representing the
size in the hash as an index into the vector. The cool thing is that the
vector can be customized.

Doubling isn't ideal, that's true.  I'd like to see an exponential backoff, but
I have too many suspicions that our hashes only work right now if the bucket
size is a power of two.

-- c

Yes, your suspicions are correct.

Currently, if we have a hash of M buckets, we construct a bit mask:

hash->mask = M-1

When we look for the bucket in which to put/get a key, we use something like this:

bucket_index = (hash_fn(key)) & hash->mask

An alternative approach that supports arbitrary values of M would be:

bucket_index = (hash_fn(key)) % M

The rationale given for doing it this way (masking and a power-of-two bucket count) is that it makes expanding the hash table cheaper. I am not so sure this is true, because I don't have a solid understanding of the current expand/resize algorithm.

luben


_______________________________________________
http://lists.parrot.org/mailman/listinfo/parrot-dev