Zheng Shao created HADOOP-13975:
-----------------------------------

             Summary: Allow DistCp to use MultiThreadedMapper
                 Key: HADOOP-13975
                 URL: https://issues.apache.org/jira/browse/HADOOP-13975
             Project: Hadoop Common
          Issue Type: New Feature
          Components: tools/distcp
    Affects Versions: 3.0.0-alpha1
            Reporter: Zheng Shao
            Assignee: Zheng Shao
            Priority: Minor

Although DistCp lets users control parallelism through the number of mappers, it is sometimes desirable to run fewer mappers with more threads per mapper. Since DistCp is network bound (either by throughput or, more frequently, by the latency of creating connections and of opening, reading, writing, and closing files), this can make each mapper much more efficient. Threads inside one mapper can also share many resources, which saves memory and connections to the NameNode.
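
To illustrate the idea of several copy threads inside one worker, here is a minimal sketch. It is not DistCp code: it stands in the local filesystem (`java.nio.file`) for HDFS and a plain `java.util.concurrent` thread pool for the mapper's threads. The class name `ThreadedCopySketch` and the method `copyAll` are hypothetical names chosen for this example.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only: one "mapper" process, many copy threads.
public class ThreadedCopySketch {

    /** Copies every source file into targetDir using a fixed pool of worker threads. */
    static int copyAll(List<Path> sources, Path targetDir, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one copy task per file; the tasks overlap their I/O waits.
            List<Future<Path>> pending = new ArrayList<>();
            for (Path src : sources) {
                pending.add(pool.submit(() ->
                        Files.copy(src, targetDir.resolve(src.getFileName()),
                                StandardCopyOption.REPLACE_EXISTING)));
            }
            int copied = 0;
            for (Future<Path> f : pending) {
                f.get(); // rethrows a copy failure wrapped in ExecutionException
                copied++;
            }
            return copied;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path srcDir = Files.createTempDirectory("src");
        Path dstDir = Files.createTempDirectory("dst");
        List<Path> sources = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            sources.add(Files.write(srcDir.resolve("part-" + i),
                    ("data " + i).getBytes(StandardCharsets.UTF_8)));
        }
        System.out.println("files copied: " + copyAll(sources, dstDir, 4));
    }
}
```

All threads here live in one JVM, so they share its heap and whatever client objects the process holds. That is the kind of sharing the description has in mind when it mentions saving memory and NameNode connections.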