Mustafa Iman created HADOOP-16801:
-------------------------------------

             Summary: S3Guard queries S3 with recursive file listings
                 Key: HADOOP-16801
                 URL: https://issues.apache.org/jira/browse/HADOOP-16801
             Project: Hadoop Common
          Issue Type: Bug
          Components: tools
            Reporter: Mustafa Iman
         Attachments: HADOOP-aws-no-prefetch.prelim.patch

S3AFileSystem#listFiles with the recursive option queries S3 even when the directory listing is authoritative. FileStatusListingIterator is created with the given entries from the metadata store [https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Listing.java#L126]. However, FileStatusListingIterator has an ObjectListingIterator that prefetches from S3 regardless of whether the listing is authoritative. We observed this behavior when using DynamoDBMetadataStore.

In the attached patch, I suppressed the unnecessary S3 calls by passing a dumb listing iterator to the listFiles call. Obviously this is not a solution; it only demonstrates the source of the problem.
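
To illustrate the pattern described in this report, here is a minimal, self-contained sketch. It is not the hadoop-aws code: every name in it (PrefetchSketch, FakeStore, EagerListing, LazyListing, listFiles, remoteCalls) is hypothetical. It assumes the prefetch amounts to issuing the first list request when the iterator is constructed, and contrasts that with an iterator that defers the request until it is first consumed.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical names throughout; none of these classes exist in hadoop-aws.
class PrefetchSketch {

    /** Stand-in for the remote store: counts how many list requests were issued. */
    static final class FakeStore {
        final AtomicInteger listCalls = new AtomicInteger();

        List<String> list(String prefix) {
            listCalls.incrementAndGet();
            return List.of(prefix + "/a", prefix + "/b");
        }
    }

    /** Issues the list request in the constructor (the assumed prefetch behavior). */
    static final class EagerListing implements Iterator<String> {
        private final Iterator<String> page;

        EagerListing(FakeStore store, String prefix) {
            this.page = store.list(prefix).iterator(); // remote call happens here
        }

        @Override public boolean hasNext() { return page.hasNext(); }
        @Override public String next() { return page.next(); }
    }

    /** Defers the list request until the iterator is first consumed. */
    static final class LazyListing implements Iterator<String> {
        private final Supplier<Iterator<String>> source;
        private Iterator<String> page; // null until first use

        LazyListing(FakeStore store, String prefix) {
            this.source = () -> store.list(prefix).iterator();
        }

        private Iterator<String> page() {
            if (page == null) {
                page = source.get(); // remote call happens only on demand
            }
            return page;
        }

        @Override public boolean hasNext() { return page().hasNext(); }
        @Override public String next() { return page().next(); }
    }

    /** Returns cached entries; consults the remote iterator only if the cache is not authoritative. */
    static List<String> listFiles(List<String> cached, boolean authoritative, Iterator<String> remote) {
        List<String> out = new ArrayList<>(cached);
        if (!authoritative) {
            remote.forEachRemaining(e -> { if (!out.contains(e)) out.add(e); });
        }
        return out;
    }

    /** Runs one listing and reports how many remote list requests it caused. */
    static int remoteCalls(boolean eager, boolean authoritative) {
        FakeStore store = new FakeStore();
        List<String> cached = List.of("dir/a", "dir/b");
        Iterator<String> remote = eager
                ? new EagerListing(store, "dir")
                : new LazyListing(store, "dir");
        listFiles(cached, authoritative, remote);
        return store.listCalls.get();
    }

    public static void main(String[] args) {
        System.out.println("eager, authoritative: " + remoteCalls(true, true));  // prints 1
        System.out.println("lazy, authoritative:  " + remoteCalls(false, true)); // prints 0
    }
}
```

In this sketch the eager iterator costs one remote request even though the authoritative listing never reads from it, while the lazy variant costs none. Whether deferring the first request is the right fix for the real iterator is a separate question that the sketch does not settle.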