[ 
https://issues.apache.org/jira/browse/HADOOP-13371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360718#comment-16360718
 ] 

ASF GitHub Bot commented on HADOOP-13371:
-----------------------------------------

Github user steveloughran commented on the issue:

    https://github.com/apache/hadoop/pull/204
  
    wontfix. S3guard is needed for consistency, and as it delivers the speedup 
we need at the same time, making traumatic changes to the core code is hard to 
justify right now


> S3A globber to use bulk listObject call over recursive directory scan
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-13371
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13371
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs, fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>
> HADOOP-13208 produces O(1) listing of directory trees in 
> {{FileSystem.listStatus}} calls, but doesn't do anything for 
> {{FileSystem.globStatus()}}, which uses a completely different codepath, one 
> which does a selective recursive scan by pattern matching as it goes down, 
> filtering out those patterns which don't match. Cost is 
> O(matching-directories) + cost of examining the files.
> It should be possible to do the glob status listing in S3A not through the 
> filtered treewalk, but through a list + filter operation. This would be an 
> O(files) lookup *before any filtering took place*.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

Reply via email to