Prime number of reduces vs. linear hash function

Shi Yu Sat, 23 Oct 2010 19:00:39 -0700

There is a suggestion to set the number of reducers to the prime number closest to the number of nodes, and the number of mappers to the prime number closest to several times the number of nodes in the cluster. But there is also a saying that "There is no need for the number of reduces to be prime. The only thing it helps is if you are using the HashPartitioner and your key's hash function is too linear. In practice, you usually want to use 99% of your reduce capacity of the cluster."

Could anyone explain the theory behind the prime number and the hash function here?

Shi
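
For illustration, here is a minimal Python sketch of what "too linear" could mean in the quoted remark. It assumes the partition rule used by Hadoop's HashPartitioner, (hashCode & Integer.MAX_VALUE) % numReduceTasks, and a made-up key set whose hash codes are all multiples of 4. The helper names (partition, reducers_used) and the stride of 4 are hypothetical and chosen only for this example.

```python
from collections import Counter

INT_MAX = 0x7FFFFFFF  # same value as Java's Integer.MAX_VALUE


def partition(hash_code: int, num_reduces: int) -> int:
    """Mimic HashPartitioner: clear the sign bit, then take the remainder."""
    return (hash_code & INT_MAX) % num_reduces


def reducers_used(hash_codes, num_reduces):
    """Count how many keys land on each reduce partition."""
    return Counter(partition(h, num_reduces) for h in hash_codes)


# Hypothetical "too linear" hash: every key hashes to a multiple of 4.
linear_hashes = [4 * i for i in range(1000)]

used_8 = reducers_used(linear_hashes, 8)  # 8 shares the factor 4 with the stride
used_7 = reducers_used(linear_hashes, 7)  # 7 is prime, so it shares no factor

print("8 reduces, partitions hit:", sorted(used_8))
print("7 reduces, partitions hit:", sorted(used_7))
```

Under these assumptions, 8 reduces leave most partitions empty because the stride and the reduce count share a common factor, while 7 reduces receive a nearly even share each. A prime count is one simple way to avoid a shared factor; any count with no factor in common with the stride would spread the keys just as well, which fits the remark that a prime is not strictly needed.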
