[
https://issues.apache.org/jira/browse/MAPREDUCE-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Schmidtke updated MAPREDUCE-6923:
----------------------------------------
Attachment: MAPREDUCE-6923.00.patch
Initial patch against {{trunk}} that sizes the buffer in {{FadvisedFileRegion}}
to the minimum of {{shuffleBufferSize}} and {{trans}}.
> YARN Shuffle I/O for small partitions
> -------------------------------------
>
> Key: MAPREDUCE-6923
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6923
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Environment: Observed in Hadoop 2.7.3 and later (judging from the
> source code of later versions), on Ubuntu 16.04.
> Reporter: Robert Schmidtke
> Assignee: Robert Schmidtke
> Attachments: MAPREDUCE-6923.00.patch
>
>
> When a job configuration results in each reducer reading only a small
> partition from each mapper (e.g. 65 kilobytes in my setup: a
> [TeraSort|https://github.com/apache/hadoop/blob/branch-2.7.3/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/terasort/TeraSort.java]
> of 256 gigabytes using 2048 mappers and 2048 reducers), and one sets
> {code:xml}
> <property>
> <name>mapreduce.shuffle.transferTo.allowed</name>
> <value>false</value>
> </property>
> {code}
> then the default setting of
> {code:xml}
> <property>
> <name>mapreduce.shuffle.transfer.buffer.size</name>
> <value>131072</value>
> </property>
> {code}
> results in almost 100% read overhead during the shuffle in YARN, because
> 128K are read for each 65K needed.
> I propose a fix in
> [FadvisedFileRegion.java|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/FadvisedFileRegion.java#L114]
> as follows:
> {code:java}
> ByteBuffer byteBuffer = ByteBuffer.allocate(Math.min(this.shuffleBufferSize,
> trans > Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) trans));
> {code}
> An example of this change is
> [here|https://github.com/apache/hadoop/compare/branch-2.7.3...robert-schmidtke:adaptive-shuffle-buffer].
> This sets the shuffle buffer size to the minimum of the shuffle buffer size
> specified in the configuration (128K by default) and the actual partition
> size (65K on average in my setup). In my benchmarks this reduced the read
> overhead in YARN from about 100% (255 additional gigabytes as described
> above) to about 18% (45 additional gigabytes). The runtime of the job
> remained the same in my setup.
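
For illustration, the sizing rule above can be sketched as a small standalone Java program. This is only a sketch, not the patch itself: the class {{AdaptiveShuffleBuffer}} and the methods {{bufferSize}} and {{worstCaseOverheadPercent}} are hypothetical names, and the overhead estimate assumes a simplified model in which every read fills the whole buffer. That model reproduces the "almost 100%" figure for the default setting, but it does not explain the remaining 18% measured after the fix.

```java
import java.nio.ByteBuffer;

// Hypothetical helper class for illustration; not part of Hadoop.
final class AdaptiveShuffleBuffer {

    // Clamp the configured buffer size to the number of bytes still to be
    // transferred, guarding the long-to-int narrowing as in the proposed fix.
    static int bufferSize(int shuffleBufferSize, long trans) {
        int remaining = trans > Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) trans;
        return Math.min(shuffleBufferSize, remaining);
    }

    // Simplified model (an assumption): every read fills the whole buffer,
    // so the bytes read are the partition size rounded up to a multiple of
    // the buffer size. Returns the extra bytes read as a percentage of the
    // bytes actually needed.
    static double worstCaseOverheadPercent(int bufferSize, long partitionBytes) {
        long reads = (partitionBytes + bufferSize - 1) / bufferSize; // ceiling division
        long bytesRead = reads * bufferSize;
        return 100.0 * (bytesRead - partitionBytes) / partitionBytes;
    }

    public static void main(String[] args) {
        int configured = 131072;      // default mapreduce.shuffle.transfer.buffer.size
        long partition = 65L * 1024;  // roughly the 65K partitions from the description

        int adaptive = bufferSize(configured, partition);
        ByteBuffer byteBuffer = ByteBuffer.allocate(adaptive);

        System.out.printf("fixed buffer:    %d bytes, modelled overhead %.1f%%%n",
                configured, worstCaseOverheadPercent(configured, partition));
        System.out.printf("adaptive buffer: %d bytes, modelled overhead %.1f%%%n",
                byteBuffer.capacity(), worstCaseOverheadPercent(adaptive, partition));
    }
}
```

Partitions larger than the configured size still use the configured size, so the behaviour only changes for partitions smaller than {{mapreduce.shuffle.transfer.buffer.size}}.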