[ 
https://issues.apache.org/jira/browse/HADOOP-16801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota resolved HADOOP-16801.
---------------------------------
    Resolution: Fixed

> S3Guard queries S3 with recursive file listings
> -----------------------------------------------
>
>                 Key: HADOOP-16801
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16801
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Mustafa Iman
>            Assignee: Mustafa Iman
>            Priority: Minor
>         Attachments: HADOOP-aws-no-prefetch.prelim.patch
>
>
> S3Guard does not respect an authoritative metadata store when listFiles is 
> called with recursive=true. It queries S3 even when the given directory tree 
> is one level deep with no nested directories and the parent directory listing 
> is authoritative. S3Guard should check the listings in the given directory 
> tree for authoritativeness and skip the S3 query when every listing in the 
> tree is marked as authoritative in the metadata table (given that the 
> metadata store is configured to be authoritative).
> Below is a description of how the current code works:
> S3AFileSystem#listFiles with the recursive option queries S3 even when the 
> directory listing is authoritative. A FileStatusListingIterator is created 
> with the entries from the metadata store 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Listing.java#L126]
>  . However, FileStatusListingIterator has an ObjectListingIterator that 
> prefetches from S3 regardless of whether the listing is authoritative. We 
> observed this behavior when using DynamoDBMetadataStore.
> I suppressed the unnecessary S3 calls by providing a dumb listing iterator to 
> the listFiles call in the provided patch. Obviously this is not a solution; 
> it only demonstrates the source of the problem.
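
The check proposed in the description can be sketched roughly as follows. This is
a minimal, self-contained illustration and not the S3Guard code or the committed
fix: AuthoritativeTreeCheck, DirListing, put and listFilesFromCache are
hypothetical names, and the metadata table is modeled as a plain in-memory map.
The idea is to walk the cached tree and give up (so the caller falls back to S3)
as soon as any directory listing is missing or not authoritative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical in-memory model for illustration; not the S3Guard API. */
final class AuthoritativeTreeCheck {

    /** One cached directory listing, as a metadata store might hold it. */
    static final class DirListing {
        final boolean authoritative;
        final List<String> childDirs; // full paths of sub-directories
        final List<String> files;     // full paths of files

        DirListing(boolean authoritative, List<String> childDirs, List<String> files) {
            this.authoritative = authoritative;
            this.childDirs = childDirs;
            this.files = files;
        }
    }

    /** Stand-in for the metadata table: directory path -> cached listing. */
    private final Map<String, DirListing> store = new HashMap<>();

    void put(String dir, DirListing listing) {
        store.put(dir, listing);
    }

    /**
     * Recursively lists files under root using only cached listings.
     * Returns Optional.empty() if any directory in the tree is missing or
     * not authoritative, which signals that S3 would have to be queried.
     */
    Optional<List<String>> listFilesFromCache(String root) {
        List<String> out = new ArrayList<>();
        return collect(root, out) ? Optional.of(out) : Optional.empty();
    }

    private boolean collect(String dir, List<String> out) {
        DirListing listing = store.get(dir);
        if (listing == null || !listing.authoritative) {
            return false; // cannot trust the cache for this subtree
        }
        out.addAll(listing.files);
        for (String child : listing.childDirs) {
            if (!collect(child, out)) {
                return false;
            }
        }
        return true;
    }
}
```

In this sketch a one-level tree with an authoritative parent listing is answered
entirely from the map, which is the case the description says currently still
triggers an S3 query.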



