Re: [Pharo-project] Perfect hash [was: Re: HashTable vs Dictionary]

Andres Valloud Thu, 22 Oct 2009 13:00:03 -0700

My point would be that either:

* the hashed storage is always of size 1000 (or greater), so if you 
store a few objects in the hashed collection it wastes a lot of space;

* the hashed storage changes size to match the objects stored in it, but 
looking at f(x) mod p risks reintroducing the collisions you wanted to 
avoid in the first place.

So, to not waste space in this particular scenario, then either N should 
be small or basically all N objects should be stored in the hashed 
collection.  If N is small, then why not set f(x) to produce values 
between 1 and N, and then use an array?  And if N is large and the 
utilization of the hashed collection is close to 100% of N, then why not 
also use an array in that case?  In the end, really nice perfect hash 
functions point the finger at alternatives that may obviate using hashed 
collections.  Of course, your mileage may vary...

Andres.

Martin McClure wrote:
> Andres Valloud wrote:
>   
>> I thought I'd point out a subtle potential problem.  Assume N = 1000, 
>> f(x_j) = j-1, and so k = 1 because 0 <= f(x_j) < N because 0 < j <= 
>> 1000.  Now, store x_{7i} in a Set, for all i such that 1 <= 7i <= 1000.  
>> Clearly, f(x) is a perfect hash function as defined above.  However, if 
>> the size of the table used by the set is a multiple of 7, performance 
>> will be terrible.
>>
>>     
>
> Huh? If your hash function produces integers between 0 and 999, your 
> table size must be at least 1000, otherwise you store past the end of 
> your table.
>
> Regards,
>
> -Martin
>
>   

_______________________________________________
Pharo-project mailing list
[email protected]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

Re: [Pharo-project] Perfect hash [was: Re: HashTable vs Dictionary]

Reply via email to