On 7.08.2010 00:21, chromatic wrote:
On Friday 06 August 2010 at 13:23, Paul C wrote:
As an aside, is this doubling in size each time considered a good thing?
Seems okay up to about 64 or so, but then the table gets big fast. I'm sure
everyone knows the trick of using a vector of sizes and representing the
size in the hash as an index into the vector. The cool thing is that the
vector can be customized.
Doubling isn't ideal, that's true. I'd like to see an exponential backoff, but
I have too many suspicions that our hashes only work right now if the bucket
size is a power of two.
-- c
Yes, your doubts are right.
Currently, if we have a hash if M buckets, we construct a bit mask
hash->mask = M-1
When we look for the bucket to put/get a key we use something like this:
bucket_index = (hash_fn(key)) & hash->mask
Alternative approach that supports arbitrary values for M will be:
bicket_index = (hash_fn(key)) % M
The claim why it is like that (masking and power of 2 buckets) is that
this makes hash table expanding cheaper. A am not so sure that this is
true because I have not solid understanding of the current expand/resize
algorithm.
luben
_______________________________________________
http://lists.parrot.org/mailman/listinfo/parrot-dev