Hadoop streaming: How is data distributed from mappers to reducers?

Nipun Saggar Sun, 23 Aug 2009 05:10:28 -0700

Hi all,

I have recently started using Hadoop streaming. From the documentation, I
understand that by default, the part of each mapper output line up to the
first tab becomes the key and the rest of the line becomes the value. Is
there a shuffle (sort) phase between the mappers and the reducers? More
specifically, is it correct to assume that all mapper output with the same
key will go to the same reducer?
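
To make the question concrete, here is a minimal Python sketch of the
splitting rule described above, together with the assumption being asked
about. The function names are hypothetical and this is not Hadoop's own
code: pick_reducer only illustrates that if the reducer is chosen as a
deterministic function of the key alone, equal keys must land on the same
reducer.

```python
import zlib


def split_key_value(line, sep="\t"):
    """Split one mapper output line at the first separator.

    Text before the first tab is the key and the rest is the value.
    A line with no tab is treated as all key with an empty value.
    """
    key, _, value = line.rstrip("\n").partition(sep)
    return key, value


def pick_reducer(key, num_reducers):
    """Hypothetical partitioner: a stable hash of the key, modulo the
    number of reducers. Illustrative only, not Hadoop's implementation."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


if __name__ == "__main__":
    mapper_output = ["apple\t1", "banana\t1", "apple\t1\textra"]
    for line in mapper_output:
        key, value = split_key_value(line)
        print(key, repr(value), "-> reducer", pick_reducer(key, 4))
```

Under that assumption, both "apple" lines are assigned to the same reducer
index no matter which mapper emitted them.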


Thanks,
Nipun
