<http://computinglife.wordpress.com/2008/11/20/why-do-hash-functions-use-prime-numbers/> discusses
the theory in detail.
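
As a rough illustration of the "too linear" case mentioned in the quoted
message, here is a minimal Python sketch. It assumes the partition formula
used by Hadoop's HashPartitioner, (hashCode & Integer.MAX_VALUE) %
numReduceTasks. The key hashes (all multiples of 4) and the helper names
partition and reducer_load are made up for the example.

```python
from collections import Counter

def partition(key_hash, num_reducers):
    # Same arithmetic as HashPartitioner: clear the sign bit, then take
    # the remainder modulo the number of reducers.
    return (key_hash & 0x7FFFFFFF) % num_reducers

def reducer_load(key_hashes, num_reducers):
    # Count how many keys land on each reducer index.
    counts = Counter(partition(h, num_reducers) for h in key_hashes)
    return [counts.get(r, 0) for r in range(num_reducers)]

# Hypothetical "too linear" hash: every hash value is a multiple of 4.
hashes = [4 * i for i in range(1000)]

load_8 = reducer_load(hashes, 8)  # 8 shares the factor 4 with the stride
load_7 = reducer_load(hashes, 7)  # 7 is prime, shares no factor with 4

print(load_8)  # only reducers 0 and 4 receive keys
print(load_7)  # keys are spread almost evenly
```

With 8 reducers only two of them get any work, while with 7 every reducer
gets about the same share. What matters is that the reducer count shares no
common factor with the stride of the hash values; a prime count is just an
easy way to make that likely. If the hash function already mixes its bits
well, the reducer count makes little difference.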

On Sun, Oct 24, 2010 at 7:30 AM, Shi Yu <[email protected]> wrote:

> There is a suggestion to set the number of reducers to the prime number
> closest to the number of nodes, and the number of mappers to the prime
> number closest to several times the number of nodes in the cluster. But
> there is also a saying that "There is no need for the number of reduces to
> be prime. The only thing it helps is if you are using the HashPartitioner
> and your key's hash function is too linear. In practice, you usually want
> to use 99% of your reduce capacity of the cluster."
>
> Could anyone explain the theory behind the prime number and the hash
> function here?
>
> Shi
>
>
