[ http://issues.apache.org/jira/browse/HADOOP-195?page=all ]
Sameer Paranjpye updated HADOOP-195:
------------------------------------
Attachment: parallel-copiers.txt
Here's a patch that ought to address a lot of the concerns with the copy phase.
The reduce task now launches a small number of threads (default 5) that fetch
map outputs in parallel. The difference between this and the parallel RPC is
that the parallel fetches are independent of each other: as soon as one copy
finishes, the next one starts. The parallel RPC sends N requests and waits for
them all to complete before sending the next N.
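To make the difference concrete, here is a minimal sketch of the two
scheduling styles. It is not taken from the patch: the class name, the method
names and the fetch() stand-in are all hypothetical, and a real copier would
do network I/O where fetch() just builds a string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; none of these names come from the patch.
class ParallelCopierSketch {

    // Stand-in for copying one map output (assumption: real code does I/O here).
    static String fetch(String mapId) {
        return "output-of-" + mapId;
    }

    // Waits for one copy and converts checked exceptions to an unchecked one.
    static String await(Future<String> f) {
        try {
            return f.get();
        } catch (Exception e) {
            throw new IllegalStateException("copy failed", e);
        }
    }

    // Independent copiers: all copies go onto one queue served by a fixed
    // pool, so a thread picks up its next copy as soon as its current one ends.
    static List<String> fetchIndependently(List<String> mapIds, int copiers) {
        ExecutorService pool = Executors.newFixedThreadPool(copiers);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String id : mapIds) {
                Callable<String> copy = () -> fetch(id);
                pending.add(pool.submit(copy));
            }
            List<String> outputs = new ArrayList<>();
            for (Future<String> f : pending) {
                outputs.add(await(f));
            }
            return outputs;
        } finally {
            pool.shutdown();
        }
    }

    // Batched style, for contrast: send n requests, wait for all n to
    // complete, and only then send the next n.
    static List<String> fetchInBatches(List<String> mapIds, int n) {
        ExecutorService pool = Executors.newFixedThreadPool(n);
        try {
            List<String> outputs = new ArrayList<>();
            for (int start = 0; start < mapIds.size(); start += n) {
                int end = Math.min(start + n, mapIds.size());
                List<Future<String>> batch = new ArrayList<>();
                for (String id : mapIds.subList(start, end)) {
                    Callable<String> copy = () -> fetch(id);
                    batch.add(pool.submit(copy));
                }
                // Barrier: the slowest copy in the batch holds up the next batch.
                for (Future<String> f : batch) {
                    outputs.add(await(f));
                }
            }
            return outputs;
        } finally {
            pool.shutdown();
        }
    }
}
```

Both methods return the same outputs in the same order. They differ only in
scheduling: in the batched version one slow copy leaves the other n-1 threads
idle until it finishes, while in the independent version those threads keep
working through the queue.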
I ran Owen's benchmark with this code, sorting 2000GB on 200 nodes with 10
parallel fetchers per reduce, for a total runtime of 8 hours 53 minutes. The
(map+copy) finished in approximately 2.5 hours, so the bulk of the time was
spent in the reduce phase. We saw 41 failures during sort and reduce, which
pushed out the total runtime.
Tasktracker latency causing pings and progress reports from the child to fail
appears to be the biggest concern at this point.
The copy phase could be sped up further by improving the hit rate at the
beginning of the copy. At this time a reduce probes for a random subset of the
map outputs and fetches as many as are available. The fetcher threads are
mostly idle early on, and they don't need to be. Having the job tracker furnish
a list of maps that have completed since the last query from a tasktracker
would go a long way towards addressing this.
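One possible shape for such a "completed since the last query" interface is
sketched below. This is only an illustration of the idea, not a proposed job
tracker API: the class, its methods and the cursor convention are all
hypothetical. The tracker appends to a log of completed maps, and each caller
remembers how many entries it has already seen.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the job tracker's actual interface.
class CompletedMapLog {
    // Append-only record of map ids, in completion order.
    private final List<String> completed = new ArrayList<>();

    // Called when a map task finishes.
    synchronized void mapCompleted(String mapId) {
        completed.add(mapId);
    }

    // Returns the maps completed at or after position fromIndex in the log.
    // The caller advances its own cursor by the size of the returned list.
    synchronized List<String> completedSince(int fromIndex) {
        int from = Math.max(0, Math.min(fromIndex, completed.size()));
        return new ArrayList<>(completed.subList(from, completed.size()));
    }
}
```

A caller would keep an integer cursor, ask for completedSince(cursor), hand
the returned map ids to its fetcher threads, and then add the size of the
result to the cursor. Each query returns only maps that are known to be
finished, so no probe is wasted on an output that does not exist yet.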
The number of parallel copiers per reduce is controlled by the config variable
"mapred.reduce.parallel.copiers"; the default is 5.
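As a rough illustration of the lookup-with-default behaviour, the sketch below
uses java.util.Properties as a stand-in for the job configuration. That
stand-in is an assumption made to keep the example self-contained, and the
class and method names are hypothetical. Only the key name and the default of
5 come from the patch description.

```java
import java.util.Properties;

// Hypothetical sketch; Properties stands in for the real job configuration.
class CopierConfigSketch {
    static final String KEY = "mapred.reduce.parallel.copiers";
    static final int DEFAULT_COPIERS = 5;

    // Returns the configured number of copiers, or the default when unset.
    static int parallelCopiers(Properties conf) {
        String raw = conf.getProperty(KEY);
        if (raw == null) {
            return DEFAULT_COPIERS;
        }
        return Integer.parseInt(raw.trim());
    }
}
```

The benchmark run described above would correspond to setting the key to 10.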
> transfer map output transfer with http instead of rpc
> -----------------------------------------------------
>
> Key: HADOOP-195
> URL: http://issues.apache.org/jira/browse/HADOOP-195
> Project: Hadoop
> Type: Improvement
> Components: mapred
> Versions: 0.2
> Reporter: Owen O'Malley
> Assignee: Owen O'Malley
> Fix For: 0.3
> Attachments: MapFileSimulator.java, data-transfer-chart.pdf,
> mapfilesimulator-big.txt, mapfilesimulator-sort2.txt, netstat.log,
> netstat.xls, parallel-copiers.txt
>
> The map output should be transferred via http instead of rpc, because rpc
> is very slow for this application and the timeout behavior is suboptimal.
> (server sends data and client ignores it because it took more than 10
> seconds to be received.)