I'm not quite sure what you mean by the Shuffle algorithm but briefly
for each (key,value) pair, a hash value is computed and then the
record is sent to a node by taking the modulo of that hash. So if you
have n reducers, the record goes to reducer # : hash(key) % n.

The sorting algorithm is most probably a variation of the external
sorting algorithm used for sorting objects that don't fit in the
memory. See:

http://en.wikipedia.org/wiki/External_sorting

for more details.

Jim

On Wed, Apr 28, 2010 at 1:16 PM, Dan Fundatureanu
<[email protected]> wrote:
> What is the algorithm used in "Shuffle and Sort" step?
>

Reply via email to