Mustafa Iman created HADOOP-16801:
-------------------------------------
Summary: S3Guard queries S3 with recursive file listings
Key: HADOOP-16801
URL: https://issues.apache.org/jira/browse/HADOOP-16801
Project: Hadoop Common
Issue Type: Bug
Components: tools
Reporter: Mustafa Iman
Attachments: HADOOP-aws-no-prefetch.prelim.patch
S3AFileSystem#listFiles with the recursive option queries S3 even when the
directory listing is authoritative. FileStatusListingIterator is created with
the entries supplied by the metadata store
[https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Listing.java#L126].
However, FileStatusListingIterator wraps an ObjectListingIterator that
prefetches from S3 regardless of whether the listing is authoritative. We
observed this behavior when using DynamoDBMetadataStore.
In the attached patch I suppressed the unnecessary S3 calls by passing a dumb
listing iterator to the listFiles call. This is not a proposed fix; it only
demonstrates the source of the problem.
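
To make the pattern concrete, here is a minimal, self-contained sketch of the
behavior described above. It is not the code in Listing.java: FakeObjectStore,
EagerObjectListing, ListingIterator and PrefetchSketch are hypothetical names
invented for illustration, and the "store" only counts LIST requests. The
sketch assumes that the object listing requests its first batch in its
constructor, which is how the prefetch appears to behave.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Entry point: compares LIST call counts for an eager and an empty source. */
class PrefetchSketch {
    public static void main(String[] args) {
        List<String> authoritative = Arrays.asList("dir/a", "dir/b");

        FakeObjectStore eagerStore = new FakeObjectStore();
        new ListingIterator(authoritative, new EagerObjectListing(eagerStore, "dir/"));
        System.out.println("LIST calls with eager source: " + eagerStore.listCalls);

        FakeObjectStore idleStore = new FakeObjectStore();
        Iterator<List<String>> empty = Collections.emptyIterator();
        new ListingIterator(authoritative, empty);
        System.out.println("LIST calls with empty source: " + idleStore.listCalls);
    }
}

/** Hypothetical stand-in for the S3 client; it only counts LIST requests. */
class FakeObjectStore {
    int listCalls = 0;

    List<String> listObjects(String prefix) {
        listCalls++;
        return Collections.singletonList(prefix + "from-s3");
    }
}

/** Object listing that requests its first batch in the constructor. */
class EagerObjectListing implements Iterator<List<String>> {
    private List<String> pending;

    EagerObjectListing(FakeObjectStore store, String prefix) {
        // The remote call happens here, before any caller asks for a result.
        this.pending = store.listObjects(prefix);
    }

    @Override
    public boolean hasNext() {
        return pending != null;
    }

    @Override
    public List<String> next() {
        if (pending == null) {
            throw new NoSuchElementException();
        }
        List<String> batch = pending;
        pending = null;
        return batch;
    }
}

/** Serves the provided (metadata store) entries first, then the object listing. */
class ListingIterator implements Iterator<String> {
    private final Iterator<String> provided;
    private final Iterator<List<String>> source;
    private Iterator<String> batch = Collections.emptyIterator();

    ListingIterator(List<String> providedEntries, Iterator<List<String>> source) {
        this.provided = providedEntries.iterator();
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        if (provided.hasNext() || batch.hasNext()) {
            return true;
        }
        // Pull further batches from the object listing until one is non-empty.
        while (source.hasNext()) {
            batch = source.next().iterator();
            if (batch.hasNext()) {
                return true;
            }
        }
        return false;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return provided.hasNext() ? provided.next() : batch.next();
    }
}
```

In this sketch the eager source contacts the store as soon as it is
constructed, even though the provided entries are already complete, while the
empty source never does. A real fix would presumably need to decide, based on
whether the listing is authoritative, if an object listing should be created
at all.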
--
This message was sent by Atlassian Jira
(v8.3.4#803005)