[
https://issues.apache.org/jira/browse/HADOOP-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467717
]
Doug Cutting commented on HADOOP-939:
-------------------------------------
I suspect that most of the performance gains to be had by declaring input to be
sorted can also be had by using heuristics that also speed things when input is
only nearly sorted. (By "nearly sorted" I mean things like merging a set of
updates into a sorted database, e.g., the crawl db update task in Nutch.)
Eric Baldeschwieler proposed a simple model for MapReduce performance. If you
assume that disks can read and write at 100MB/s, and that nodes can talk within
rack at 100MB/s (Gb/s) and to nodes in another rack at 10MB/s, then a MapReduce
requires the following number of seconds per 100MB. (Note that this assumes
various sort optimizations that are already in progress, where map outputs are
buffered and sorted before they're spilled to the local disk on map nodes, and
reduce inputs are buffered and merged before they're spilled to the local disk
on the reduce node, so that, in many cases, reduce can proceed without an
explicit sort stage but simply by merging a set of already sorted input files
from the local disk.)
a. 1 read input data from local drive on map node
[ map ]
b. 1 write batches of sorted output data to temporary file on map node
c. 10 shuffle batches of sorted data to reduce node
d. 1 write batches of sorted data to reduce node
[ reduce]
e. 1 write one copy of output locally
f. 2 transfer and write one copy to another node on the same rack
g. 11 transfer and write one copy to an off-rack node
So the total is 27s/100MB. Only two of those are really sort-specific, (b) and
(d). 14 (more than half) are unavoidable.
The biggest chunk of fat to go after for pre-sorted input is (c). This can be
eliminated if maps can be placed near reduces. For example, tasktrackers might
report the size of each partition they're generating and the jobtracker might
use this to schedule reduces on racks which already have a lot of their input.
> No-sort optimization
> --------------------
>
> Key: HADOOP-939
> URL: https://issues.apache.org/jira/browse/HADOOP-939
> Project: Hadoop
> Issue Type: New Feature
> Components: mapred
> Environment: all
> Reporter: Doug Judd
>
> There should be a way to tell the mapred framework that the output of the
> map() phase will already be sorted. The Reduce phase can just merge the
> intermediate files together without sorting.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.