Hi Mohit,

What would be the advantage? Reducers in most cases read data from all the mappers. Even if the mappers were to write to HDFS, a reducer would still need to read data from other datanodes across the cluster.
Prashant

On Apr 4, 2012, at 9:55 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:

> On Wed, Apr 4, 2012 at 8:42 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> Hi Mohit,
>>
>> On Thu, Apr 5, 2012 at 5:26 AM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
>>> I am going through the chapter "How MapReduce Works" and have some
>>> confusion:
>>>
>>> 1) The description of the mapper below says that reducers get the output
>>> file using an HTTP call. But the description under "The Reduce Side"
>>> doesn't specifically say whether it's copied using HTTP. So, first
>>> confusion: is the output copied from mapper -> reducer or from
>>> reducer -> mapper? And second, is the call http:// or hdfs://?
>>
>> The flow is as simple as this:
>> 1. For a map+reduce job, a map task completes after writing all its
>> partitions down to the tasktracker's local filesystem (under the
>> mapred.local.dir directories).
>> 2. Reducers fetch completion locations from events at the JobTracker, and
>> query the TaskTracker there to provide the specific partition each one
>> needs, which is done over the TaskTracker's HTTP service (port 50060).
>>
>> So to clear things up: the map doesn't send its output to the reduce, nor
>> does the reduce ask the actual map task. It is the tasktracker itself that
>> makes the bridge here.
>>
>> Note, however, that in Hadoop 2.0 the transfer via the ShuffleHandler is
>> done over Netty connections, which is much faster and more reliable.
>>
>>> 2) My understanding was that mapper output gets written to HDFS, since
>>> I've seen part-m-00000 files in HDFS. If mapper output is written to HDFS,
>>> then shouldn't reducers simply read it from HDFS instead of making HTTP
>>> calls to the tasktrackers' locations?
>>
>> A map-only job usually writes out to HDFS directly (no sorting is done,
>> because no reducer is involved). If the job is a map+reduce one, the
>> default output is collected on the local filesystem for partitioning and
>> sorting at the map end, and eventually grouping at the reduce end.
>> Basically: data you want to send from mapper to reducer goes to the local
>> FS so that multiple actions can be performed on it; other data may go
>> directly to HDFS.
>>
>> Reducers are currently scheduled fairly randomly, but yes, their
>> scheduling could be improved for certain scenarios. However, if you are
>> suggesting that map partitions ought to be written to HDFS itself (with
>> replication or without), I don't see performance improving. Note that
>> the partitions aren't merely written but need to be sorted as well (at
>> either end). Doing that requires the ability to spill frequently (because
>> we don't have infinite memory to do it all in RAM), and doing such a
>> thing on HDFS would only mean a slowdown.
>>
> Thanks for clearing up my doubts. In this case I was merely suggesting that
> if the mapper output (the merged output at the end, or the shuffle output)
> were stored in HDFS, then reducers could just retrieve it from HDFS instead
> of asking the tasktracker for it. Once the reducer threads read it, they
> could continue to work locally.
>
>> I hope this helps clear some things up for you.
>>
>> --
>> Harsh J
>>
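
For readers following the flow Harsh describes, here is a minimal sketch of how the map side decides which reduce partition a record belongs to, mirroring the idea behind Hadoop's default HashPartitioner. It uses the org.apache.hadoop.mapreduce API; the class name SketchHashPartitioner is made up for illustration, and the snippet is an assumption-laden sketch rather than the project's actual implementation.

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    // Illustrative sketch: every map-side record is assigned to one of
    // numReduceTasks partitions, and each partition is spilled to the
    // tasktracker's local disk for the matching reducer to fetch later.
    public class SketchHashPartitioner extends Partitioner<Text, IntWritable> {
      @Override
      public int getPartition(Text key, IntWritable value, int numReduceTasks) {
        // Mask off the sign bit so the result is non-negative, then take the
        // modulus over the number of reducers (same idea as HashPartitioner).
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
      }
    }

Because each reducer owns exactly one partition number, it has to pull that slice from every mapper's local output, which is why the fetch happens over the TaskTracker's HTTP service rather than through HDFS.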