Zoran Dimitrijevic created HADOOP-11827:
-------------------------------------------
Summary: Speed-up distcp buildListing() using threadpool
Key: HADOOP-11827
URL: https://issues.apache.org/jira/browse/HADOOP-11827
Project: Hadoop Common
Issue Type: Improvement
Components: tools/distcp
Affects Versions: 3.0.0
Reporter: Zoran Dimitrijevic
Assignee: Zoran Dimitrijevic
Priority: Minor

For very large source trees on S3, distcp takes a long time to build the file listing (this happens in client code, before the mappers start). For a dataset I used (1.5M files, 50K dirs), it took 65 minutes before my fix in HADOOP-11785 and 36 minutes after the fix.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
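
To illustrate the general idea named in the summary (listing a directory tree with a thread pool rather than a single thread), here is a minimal sketch. It is not the patch attached to this issue and does not use the distcp or Hadoop FileSystem APIs: it assumes a local filesystem accessed through java.nio.file, and the class name ParallelListing, the method listFiles, and the thread-count parameter are all hypothetical. Each directory is listed in its own task, so slow per-directory list calls (as on S3) can overlap.

```java
import java.io.IOException;
import java.nio.file.DirectoryIteratorException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not the HADOOP-11827 implementation.
public class ParallelListing {

    /** Lists all non-directory entries under root, one pool task per directory. */
    public static List<Path> listFiles(Path root, int numThreads)
            throws IOException, InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        Queue<Path> files = new ConcurrentLinkedQueue<>();
        Queue<Exception> errors = new ConcurrentLinkedQueue<>();
        AtomicInteger pending = new AtomicInteger(0);   // directories not yet fully listed
        CountDownLatch done = new CountDownLatch(1);    // released when pending drops to 0
        try {
            submit(root, pool, files, errors, pending, done);
            done.await();
        } finally {
            pool.shutdown();
        }
        if (!errors.isEmpty()) {
            throw new IOException("Listing failed", errors.peek());
        }
        return new ArrayList<>(files);
    }

    private static void submit(Path dir, ExecutorService pool, Queue<Path> files,
                               Queue<Exception> errors, AtomicInteger pending,
                               CountDownLatch done) {
        // Count the directory before queueing it, so pending cannot reach 0 early.
        pending.incrementAndGet();
        pool.execute(() -> {
            try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
                for (Path entry : entries) {
                    // NOFOLLOW_LINKS avoids looping through symlink cycles.
                    if (Files.isDirectory(entry, LinkOption.NOFOLLOW_LINKS)) {
                        submit(entry, pool, files, errors, pending, done);
                    } else {
                        files.add(entry);
                    }
                }
            } catch (IOException | DirectoryIteratorException e) {
                errors.add(e);
            } finally {
                if (pending.decrementAndGet() == 0) {
                    done.countDown();
                }
            }
        });
    }

    public static void main(String[] args) throws Exception {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> files = listFiles(root, 8);
        System.out.println("Found " + files.size() + " files under " + root);
    }
}
```

The order of the returned list is not deterministic, since directories are listed concurrently. A real listing builder would probably need to sort or otherwise normalize the result before writing it out.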