[
https://issues.apache.org/jira/browse/HADOOP-16801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17026567#comment-17026567
]
Hudson commented on HADOOP-16801:
---------------------------------
SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17920 (See
[https://builds.apache.org/job/Hadoop-trunk-Commit/17920/])
HADOOP-16801. S3Guard listFiles will not query S3 if all listings are (github:
rev 5977360878e6780bd04842c8a2156f9848e1d088)
* (edit) hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestDynamoDBMetadataStoreAuthoritativeMode.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/DeleteOperation.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3Guard.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/ImportOperation.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/MetadataStoreListFilesIterator.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
> S3Guard listFiles will not query S3 if all listings are authoritative
> ---------------------------------------------------------------------
>
> Key: HADOOP-16801
> URL: https://issues.apache.org/jira/browse/HADOOP-16801
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.3.0
> Reporter: Mustafa Iman
> Assignee: Mustafa Iman
> Priority: Minor
> Attachments: HADOOP-aws-no-prefetch.prelim.patch
>
>
> S3Guard does not respect an authoritative metadata store when listFiles is
> called with recursive=true. It queries S3 even when the given directory tree
> is one level deep with no nested directories and the parent directory
> listing is authoritative. S3Guard should check the listings in the given
> directory tree for authoritativeness and not query S3 when all listings in
> the tree are marked as authoritative in the metadata table (given that the
> metadata store is configured to be authoritative).
> Below is a description of how the current code works:
> S3AFileSystem#listFiles with the recursive option queries S3 even when the
> directory listing is authoritative. A FileStatusListingIterator is created
> with the given entries from the metadata store
> [https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Listing.java#L126].
> However, FileStatusListingIterator has an ObjectListingIterator that
> prefetches from S3 regardless of whether the listing is authoritative. We
> observed this behavior when using DynamoDBMetadataStore.
> I suppressed the unnecessary S3 calls by passing a dummy listing iterator to
> the listFiles call in the attached patch. This is not a real fix; it only
> demonstrates the source of the problem.
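
The check proposed in the description (walk the directory tree and skip S3 only when every listing in it is marked authoritative) might look roughly like the sketch below. This is not the code from the commit: AuthoritativeTreeCheck, DirListing, and isTreeAuthoritative are hypothetical names, and a plain in-memory map stands in for the metadata store.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these types are part of the real S3Guard API.
class AuthoritativeTreeCheck {

    /** One cached directory listing: its child directory paths and an authoritative flag. */
    static final class DirListing {
        final boolean authoritative;
        final List<String> childDirs;

        DirListing(boolean authoritative, List<String> childDirs) {
            this.authoritative = authoritative;
            this.childDirs = childDirs;
        }
    }

    /**
     * Returns true only if every directory listing reachable from root is present
     * in the store and marked authoritative. A missing or non-authoritative
     * listing means the caller would still have to query S3.
     * Assumes the listings form a tree (no cycles).
     */
    static boolean isTreeAuthoritative(Map<String, DirListing> store, String root) {
        List<String> pending = new ArrayList<>();
        pending.add(root);
        while (!pending.isEmpty()) {
            String dir = pending.remove(pending.size() - 1);
            DirListing listing = store.get(dir);
            if (listing == null || !listing.authoritative) {
                return false;
            }
            pending.addAll(listing.childDirs);
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, DirListing> store = new HashMap<>();
        store.put("/data", new DirListing(true, List.of("/data/a")));
        store.put("/data/a", new DirListing(true, List.of()));
        // Every listing is authoritative, so S3 could be skipped.
        System.out.println(isTreeAuthoritative(store, "/data"));

        store.put("/data/a", new DirListing(false, List.of()));
        // One nested listing is not authoritative, so S3 must still be queried.
        System.out.println(isTreeAuthoritative(store, "/data"));
    }
}
```

In this sketch a single non-authoritative or missing listing anywhere in the tree forces the fallback to S3, which matches the "all listings in the tree" condition in the description.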