I don't know the history of that method, but I can think of reasons
why no mixing could be beneficial. I don't know whether they hold up
in practice. For example, if you frequently store keys with values
0..N-1 in a big-enough map, and perhaps often access the keys in
order, then the identity hash tends to give you dense, linear access
to the table.

If the map is big enough that collisions are rare, then the negative
side effects you get from chaining are small and rare.
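
To make that concrete, here is a rough sketch (not Mahout's actual
table code). It assumes a power-of-two table indexed by the low bits
of the hash, which is my assumption and not necessarily what the
Mahout maps do. The class name and the helpers slot(),
sequentialSteps() and collisions() are made up for illustration; the
multiplier is the one from the commented-out line in the quoted
snippet.

```java
import java.util.HashSet;
import java.util.Set;

class HashMixSketch {
    // Identity "hash", as in the quoted snippet: no mixing at all.
    static int identityHash(int value) {
        return value;
    }

    // Multiplicative mixing with the constant from the commented-out line.
    static int multiplicativeHash(int value) {
        return value * 0x278DDE6D;
    }

    // Assumed layout: power-of-two capacity, slot taken from the low bits.
    static int slot(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    static int hash(int key, boolean mix) {
        return mix ? multiplicativeHash(key) : identityHash(key);
    }

    // Count how often key+1 lands in the slot right after key's slot,
    // for keys 0..n-1 visited in order (a crude measure of linear access).
    static int sequentialSteps(int n, int capacity, boolean mix) {
        int steps = 0;
        for (int key = 0; key + 1 < n; key++) {
            int here = slot(hash(key, mix), capacity);
            int next = slot(hash(key + 1, mix), capacity);
            if (next == here + 1) {
                steps++;
            }
        }
        return steps;
    }

    // Count keys in 0..n-1 that land in a slot that is already taken.
    static int collisions(int n, int capacity, boolean mix) {
        Set<Integer> used = new HashSet<>();
        int count = 0;
        for (int key = 0; key < n; key++) {
            if (!used.add(slot(hash(key, mix), capacity))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int n = 1000;
        int capacity = 1024;
        System.out.println("identity: sequential steps = "
            + sequentialSteps(n, capacity, false)
            + ", collisions = " + collisions(n, capacity, false));
        System.out.println("mixed:    sequential steps = "
            + sequentialSteps(n, capacity, true)
            + ", collisions = " + collisions(n, capacity, true));
    }
}
```

With 1000 dense keys in 1024 slots, the identity hash walks the table
in order and never collides, while the mixed hash jumps around the
table on every step. Whether that locality matters in practice would
need measuring.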

I haven't thought deeply about it and don't have evidence it's better
or worse but there may be a reason.

On Wed, Jun 5, 2013 at 1:49 PM, Dawid Weiss
<[email protected]> wrote:
>> But that's absolutely weird. The mixing function should take care of that.
>
> Sure, unless you don't have any... This is what's currently in Mahout
> (look closely at the first line!):
>
>   public static int hash(int value) {
>     return value;
>
>     //return value * 0x278DDE6D; // see org.apache.mahout.math.jet.random.engine.DRand
>
>     /*
>     value &= 0x7FFFFFFF; // make it >=0
>     int hashCode = 0;
>     do hashCode = 31*hashCode + value%10;
>     while ((value /= 10) > 0);
>
>     return 28629151*hashCode; // spread even further; h*31^5
>     */
>   }
>
> So there is no redistributing of keys. In my opinion this is a bug
> that should be addressed.
>
> Dawid
