[
https://issues.apache.org/jira/browse/HADOOP-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535403#comment-14535403
]
Zoran Dimitrijevic commented on HADOOP-1540:
--------------------------------------------
#5: we were experiencing performance issues with large numbers of files only
because of RPCs to either the NameNode or S3. Filtering each file name locally
against a small number of compiled regex or glob rules should not be a big deal,
especially since it's optional. For example, sorting the big file list, which we
already do now, is much more expensive.
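
To illustrate why the local filtering cost should be small, here is a minimal
sketch of the idea, not the code in the attached patches. The class name
ExcludeFilterSketch and its methods are hypothetical. It assumes the exclude
rules are plain Java regular expressions that must match the whole path string.
Each rule is compiled once, so the per-file work is a few in-memory matches
with no RPC involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the filter class from the HADOOP-1540 patches.
public class ExcludeFilterSketch {
    // Rules are compiled once up front, not once per file.
    private final List<Pattern> excludes = new ArrayList<>();

    public ExcludeFilterSketch(List<String> regexes) {
        for (String regex : regexes) {
            excludes.add(Pattern.compile(regex));
        }
    }

    // Returns false if any exclude rule matches the entire path string.
    public boolean shouldCopy(String path) {
        for (Pattern p : excludes) {
            if (p.matcher(path).matches()) {
                return false;
            }
        }
        return true;
    }

    // Keeps only the paths that no exclude rule matches, preserving order.
    public List<String> filter(List<String> paths) {
        List<String> kept = new ArrayList<>();
        for (String path : paths) {
            if (shouldCopy(path)) {
                kept.add(path);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> rules = new ArrayList<>();
        rules.add(".*/_temporary/.*"); // assumed example rule
        rules.add(".*\\.tmp");         // assumed example rule
        ExcludeFilterSketch filter = new ExcludeFilterSketch(rules);

        List<String> paths = new ArrayList<>();
        paths.add("/data/part-0");
        paths.add("/data/_temporary/part-1");
        paths.add("/data/scratch.tmp");
        System.out.println(filter.filter(paths));
    }
}
```

With k rules and n files this is O(n * k) local matches, which for a handful
of rules is linear in the listing size, whereas sorting the same listing is
O(n log n) comparisons.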
Thank you for your patch!
> distcp should support an exclude list
> -------------------------------------
>
> Key: HADOOP-1540
> URL: https://issues.apache.org/jira/browse/HADOOP-1540
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Affects Versions: 2.6.0
> Reporter: Senthil Subramanian
> Assignee: Rich Haase
> Priority: Minor
> Labels: BB2015-05-TBR, patch
> Attachments: HADOOP-1540.003.patch, HADOOP-1540.004.patch,
> HADOOP-1540.005.patch, HADOOP-1540.006.patch
>
>
> There should be a way to ignore specific paths (e.g., those that have already
> been copied over under the current srcPath).