[
https://issues.apache.org/jira/browse/HADOOP-13169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15505078#comment-15505078
]
Hudson commented on HADOOP-13169:
---------------------------------
SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10462 (See
[https://builds.apache.org/job/Hadoop-trunk-Commit/10462/])
HADOOP-13169. Randomize file list in SimpleCopyListing. Contributed by
(cnauroth: rev 98bdb5139769eb55893971b43b9c23da9513a784)
* (edit)
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpConstants.java
* (edit)
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/SimpleCopyListing.java
* (edit)
hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestCopyListing.java
> Randomize file list in SimpleCopyListing
> ----------------------------------------
>
> Key: HADOOP-13169
> URL: https://issues.apache.org/jira/browse/HADOOP-13169
> Project: Hadoop Common
> Issue Type: Improvement
> Components: tools/distcp
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Priority: Minor
> Fix For: 2.8.0
>
> Attachments: HADOOP-13169-branch-2-001.patch,
> HADOOP-13169-branch-2-002.patch, HADOOP-13169-branch-2-003.patch,
> HADOOP-13169-branch-2-004.patch, HADOOP-13169-branch-2-005.patch,
> HADOOP-13169-branch-2-006.patch, HADOOP-13169-branch-2-007.patch,
> HADOOP-13169-branch-2-008.patch, HADOOP-13169-branch-2-009.patch,
> HADOOP-13169-branch-2-010.patch
>
>
> When copying files to S3, some mappers can run into S3 partition hotspots,
> depending on the file listing. This is more visible when data is copied from
> a Hive warehouse with many partitions (e.g., date partitions). In such cases,
> some of the tasks tend to be much slower than others. It would be good to
> randomize the file paths that are written out in SimpleCopyListing to avoid
> this issue.
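
The idea in the description above can be illustrated with a minimal sketch. This is not the committed patch: the class name ListingShuffleSketch, the method shuffled, the seed parameter, and the example paths are all hypothetical, and the sketch assumes the listing fits in memory as a list of path strings. It only shows how shuffling a listing spreads adjacent date-partition paths apart, so that consecutive entries are less likely to share a key prefix.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper for illustration only; not part of DistCp.
class ListingShuffleSketch {

  // Returns a shuffled copy of the listing; the input list is left untouched.
  // A fixed seed makes the order reproducible, which helps when testing.
  static List<String> shuffled(List<String> paths, long seed) {
    List<String> copy = new ArrayList<>(paths);
    Collections.shuffle(copy, new Random(seed));
    return copy;
  }

  public static void main(String[] args) {
    // Build a listing that is sorted by date partition, as a directory walk
    // over a partitioned warehouse table would typically produce it.
    List<String> listing = new ArrayList<>();
    for (int day = 1; day <= 3; day++) {
      for (int part = 0; part < 2; part++) {
        listing.add(String.format(
            "/warehouse/t/dt=2016-09-%02d/part-%05d", day, part));
      }
    }
    // Print the same entries in randomized order.
    for (String path : shuffled(listing, 42L)) {
      System.out.println(path);
    }
  }
}
```

Shuffling a copy rather than the input keeps the original order available for comparison. For a listing too large to hold in memory, the shuffle would have to be applied to bounded chunks instead.
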
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)