[ 
https://issues.apache.org/jira/browse/HDFS-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

J.Andreina updated HDFS-8234:
-----------------------------
    Attachment: HDFS-8234.3.patch

Attaching rebased patch.
Please review.

> DistributedFileSystem and Globber should apply PathFilter early
> ---------------------------------------------------------------
>
>                 Key: HDFS-8234
>                 URL: https://issues.apache.org/jira/browse/HDFS-8234
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Rohini Palaniswamy
>            Assignee: J.Andreina
>              Labels: newbie
>         Attachments: HDFS-8234.1.patch, HDFS-8234.2.patch, HDFS-8234.3.patch
>
>
> HDFS-985 added partial listing to listStatus to avoid fetching all entries of
> a large directory in one go. However, when listStatus(Path p, PathFilter f) is
> called, the filter is applied only after all the entries have been fetched, so
> a big list is still constructed on the client side. It would be more efficient
> if DistributedFileSystem.listStatusInternal() applied the PathFilter itself, so
> DistributedFileSystem should override listStatus(Path f, PathFilter filter)
> and apply the PathFilter early.
> Globber.java also applies its filter after calling listStatus. It should
> instead call listStatus with the PathFilter.
> {code}
> FileStatus[] children = listStatus(candidate.getPath());
> .........
> for (FileStatus child : children) {
>   // Set the child path based on the parent path.
>   child.setPath(new Path(candidate.getPath(),
>       child.getPath().getName()));
>   if (globFilter.accept(child.getPath())) {
>     newCandidates.add(child);
>   }
> }
> {code}
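
The difference between filtering late and filtering early can be sketched without any Hadoop dependency. The following is only an illustration under stated assumptions, not the code in the attached patch: the class EarlyFilterSketch, the helper fetchBatch, and the use of plain strings and java.util.function.Predicate in place of FileStatus and PathFilter are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: strings stand in for FileStatus entries and
// Predicate<String> stands in for PathFilter.
class EarlyFilterSketch {

    // Hypothetical stand-in for one partial-listing call: returns at most
    // batchSize entries of the directory, starting at index start.
    static List<String> fetchBatch(List<String> dir, int start, int batchSize) {
        int end = Math.min(start + batchSize, dir.size());
        return dir.subList(start, end);
    }

    // Late filtering: every entry is accumulated on the client first,
    // and the filter runs only over the complete list.
    static List<String> listThenFilter(List<String> dir, int batchSize,
                                       Predicate<String> filter) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> all = new ArrayList<>();
        for (int start = 0; start < dir.size(); start += batchSize) {
            all.addAll(fetchBatch(dir, start, batchSize));
        }
        List<String> accepted = new ArrayList<>();
        for (String name : all) {
            if (filter.test(name)) {
                accepted.add(name);
            }
        }
        return accepted;
    }

    // Early filtering: each batch is filtered as it arrives, so only
    // accepted entries are ever retained on the client.
    static List<String> filterWhileListing(List<String> dir, int batchSize,
                                           Predicate<String> filter) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> accepted = new ArrayList<>();
        for (int start = 0; start < dir.size(); start += batchSize) {
            for (String name : fetchBatch(dir, start, batchSize)) {
                if (filter.test(name)) {
                    accepted.add(name);
                }
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<String> dir = Arrays.asList(
                "part-00000", "_SUCCESS", "part-00001", "_logs", "part-00002");
        Predicate<String> visible = name -> !name.startsWith("_");
        System.out.println(filterWhileListing(dir, 2, visible));
    }
}
```

Both methods return the same entries; the early variant simply never holds the rejected ones, which is the saving the issue description asks for when the directory is large and the filter is selective.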



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
