distcp can generate uneven map task assignments
-----------------------------------------------
Key: MAPREDUCE-1059
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1059
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: distcp
Reporter: Aaron Kimball
Assignee: Aaron Kimball
Attachments: MAPREDUCE-1059.patch
distcp writes out a SequenceFile listing the source files to transfer, along
with their sizes. Map tasks are created over spans of this file; each span
represents the files that one mapper should transfer. In practice, some
transfer loads yield many empty map tasks, while a few tasks perform the bulk
of the work.
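
The description above does not say why some spans end up empty. The toy model
below is one way to picture it, under an assumption that is not stated in this
issue: a reader for a span can only begin at a boundary marker that falls
inside its span, and it keeps reading records until the next marker. The
function name, offsets, and marker positions are all hypothetical and are not
taken from the distcp source or the attached patch.

```python
from bisect import bisect_right


def assign_records(record_offsets, marker_offsets, span_bounds):
    """Count how many listing records each span would read (toy model).

    Assumption (not from the issue): a record is read by the span that
    contains the last boundary marker at or before that record.
    span_bounds holds the sorted span edges, e.g. [0, 250, 500] describes
    the spans [0, 250) and [250, 500).
    """
    markers = sorted(marker_offsets)
    counts = [0] * (len(span_bounds) - 1)
    for offset in record_offsets:
        i = bisect_right(markers, offset) - 1
        if i < 0:
            continue  # no marker precedes this record; nobody reads it here
        # Find the span that contains the governing marker.
        span = min(bisect_right(span_bounds, markers[i]) - 1, len(counts) - 1)
        counts[span] += 1
    return counts


# Ten hypothetical records, 100 bytes apart, split into four equal spans.
records = list(range(0, 1000, 100))
bounds = [0, 250, 500, 750, 1000]

# A single marker at the start: the first span reads every record.
sparse = assign_records(records, [0], bounds)

# At least one marker inside each span: the work spreads out.
dense = assign_records(records, [0, 300, 500, 800], bounds)
```

In this model, the number of empty tasks depends on how many spans contain no
marker at all, which is one plausible reading of the behavior reported here.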