Wei-Chiu Chuang created HDFS-14574:
--------------------------------------

             Summary: [distcp] Add ability to increase the replication factor for fileList.seq
                 Key: HDFS-14574
                 URL: https://issues.apache.org/jira/browse/HDFS-14574
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: distcp
            Reporter: Wei-Chiu Chuang

distcp creates fileList.seq with the default replication factor of 3. On large clusters running distcp jobs with thousands of mappers, three replicas of the file listing are not enough, because the DataNodes holding those replicas easily reach their maximum number of xceivers.

It looks like we could add a distcp option and use it to set the replication factor when creating the sequence file writer:
[https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/SimpleCopyListing.java#L517-L521]
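
To make the xceiver pressure concrete, here is a rough back-of-the-envelope sketch in plain Java. It is not distcp code. It assumes the worst case where every mapper opens the listing at the same moment and that reads are spread evenly across replicas, which real HDFS replica selection does not guarantee. The class name, the method names, and the numbers in main (20,000 mappers, a budget of 1,000 xceivers per DataNode for listing reads) are all hypothetical.

```java
public final class ListingReplicationEstimate {
    private ListingReplicationEstimate() {}

    /**
     * Worst-case concurrent readers per replica, assuming every mapper
     * opens the listing at once and reads spread evenly (a simplification).
     */
    public static long readersPerReplica(long mappers, int replication) {
        if (mappers < 0 || replication < 1) {
            throw new IllegalArgumentException("mappers >= 0 and replication >= 1 required");
        }
        return (mappers + replication - 1) / replication; // ceiling division
    }

    /**
     * Smallest replication factor that keeps readers per replica within
     * the given xceiver budget, capped at Short.MAX_VALUE because HDFS
     * replication factors are shorts.
     */
    public static int minReplication(long mappers, long xceiverBudget) {
        if (mappers < 0 || xceiverBudget < 1) {
            throw new IllegalArgumentException("mappers >= 0 and xceiverBudget >= 1 required");
        }
        long needed = (mappers + xceiverBudget - 1) / xceiverBudget; // ceiling division
        return (int) Math.max(1, Math.min(needed, Short.MAX_VALUE));
    }

    public static void main(String[] args) {
        long mappers = 20_000;      // hypothetical job size
        long xceiverBudget = 1_000; // hypothetical per-DataNode budget for listing reads

        System.out.println("readers per replica at replication 3: "
                + readersPerReplica(mappers, 3));
        System.out.println("minimum replication for budget: "
                + minReplication(mappers, xceiverBudget));
    }
}
```

With these made-up inputs, replication 3 leaves about 6,667 readers per replica, while a factor of 20 would bring that down to the assumed budget. A value like this is what the proposed distcp option could carry into the writer, for example through the existing SequenceFile.Writer.replication(short) option, though the option name and wiring are left open here.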