My point would be that either:
* the hashed storage is always of size 1000 (or greater), so if you
store a few objects in the hashed collection it wastes a lot of space;
* the hashed storage changes size to match the objects stored in it, but
looking at f(x) mod p risks reintroducing the collisions you wanted to
avoid in the first place.
So, to not waste space in this particular scenario, then either N should
be small or basically all N objects should be stored in the hashed
collection. If N is small, then why not set f(x) to produce values
between 1 and N, and then use an array? And if N is large and the
utilization of the hashed collection is close to 100% of N, then why not
also use an array in that case? In the end, really nice perfect hash
functions point the finger at alternatives that may obviate using hashed
collections. Of course, your mileage may vary...
Andres.
Martin McClure wrote:
> Andres Valloud wrote:
>
>> I thought I'd point out a subtle potential problem. Assume N = 1000,
>> f(x_j) = j-1, and so k = 1 because 0 <= f(x_j) < N because 0 < j <=
>> 1000. Now, store x_{7i} in a Set, for all i such that 1 <= 7i <= 1000.
>> Clearly, f(x) is a perfect hash function as defined above. However, if
>> the size of the table used by the set is a multiple of 7, performance
>> will be terrible.
>>
>>
>
> Huh? If your hash function produces integers between 0 and 999, your
> table size must be at least 1000, otherwise you store past the end of
> your table.
>
> Regards,
>
> -Martin
>
>
_______________________________________________
Pharo-project mailing list
[email protected]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project