You may also see Knuth's "The Art of Computer Programming".  I remember that 
there is a discussion about prime number and hash function. (It should be in 
Volume 3 Chapter 6.  There is a section about hashing.  Sorry that I don't have 
the book with me and can't give you the page numbers.)


Nicholas



________________________________
From: aniket ray <[email protected]>
To: [email protected]
Sent: Mon, October 25, 2010 12:12:16 PM
Subject: Re: Prime number of reduces vs. linear hash function

http://computinglife.wordpress.com/2008/11/20/why-do-hash-functions-use-prime-numbers/


<http://computinglife.wordpress.com/2008/11/20/why-do-hash-functions-use-prime-numbers/>discusses

the theory in detail.

On Sun, Oct 24, 2010 at 7:30 AM, Shi Yu <[email protected]> wrote:

> There is a suggestion to set the number of reducers to a prime number
> closest to the number of nodes and number of mappers a prime number closest
> to several times the number of nodes in the cluster. But there is also
> saying that "There is no need for the number of reduces to be prime. The
> only thing it helps is if you are using the HashPartitioner and your key's
> hash function is too linear. In practice, you usually want to use 99% of
> your reduce capacity of the cluster."
>
> Could anyone explain what is the theory behind the prime number and the
> hash function here?
>
> Shi
>
>

Reply via email to