[
https://issues.apache.org/jira/browse/HADOOP-13926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950435#comment-15950435
]
Aaron Fabbri commented on HADOOP-13926:
---------------------------------------
This is a good start, thank you for rebasing. I think this still needs:
1. To handle {{listFiles(recursive=true)}}.
2. To merge S3 and MetadataStore results (as {{listStatus()}} does) for the
non-authoritative case (i.e. "not all directory contents are in MetadataStore").
For the {{recursive=false}} (and {{listLocatedStatus()}}) case, this patch is
almost there, except it needs to handle the non-authoritative case, where we
have to merge MetadataStore output with the S3 iterator. I can think of a simple
algorithm for that case (until we add paging for MetadataStore): make a
{{Set}} which is a copy of the DirListingMetadata entries; as you return S3
iterator results, remove those paths from the {{Set}}. When the S3 iterator is
exhausted, return the remaining entries in the {{Set}}.
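A minimal sketch of that Set-based merge, to make the idea concrete. This is not code from the patch: the class name {{MergingListingIterator}}, the {{pathOf}} key function and the use of plain strings as paths are hypothetical, chosen to keep the example free of Hadoop dependencies. It also assumes that when both sources contain a path, the S3 entry is the one returned, and it ignores deleted (tombstone) entries.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.function.Function;

/**
 * Hypothetical illustration of the non-paged merge: wraps an S3 listing
 * iterator and appends MetadataStore entries that S3 did not return.
 */
final class MergingListingIterator<T> implements Iterator<T> {
  private final Iterator<T> s3Iterator;
  private final Function<T, String> pathOf;
  // Copy of the MetadataStore entries, keyed by path, not yet seen from S3.
  private final Map<String, T> remaining = new LinkedHashMap<>();
  // Created lazily, once the S3 iterator is exhausted.
  private Iterator<T> leftovers;

  MergingListingIterator(Iterator<T> s3Iterator,
                         Collection<T> metadataEntries,
                         Function<T, String> pathOf) {
    this.s3Iterator = s3Iterator;
    this.pathOf = pathOf;
    for (T entry : metadataEntries) {
      remaining.put(pathOf.apply(entry), entry);
    }
  }

  @Override
  public boolean hasNext() {
    if (s3Iterator.hasNext()) {
      return true;
    }
    if (leftovers == null) {
      // S3 is done: whatever is left was only known to the MetadataStore.
      leftovers = new ArrayList<>(remaining.values()).iterator();
    }
    return leftovers.hasNext();
  }

  @Override
  public T next() {
    if (s3Iterator.hasNext()) {
      T entry = s3Iterator.next();
      // Seen in S3, so it must not be returned a second time later.
      remaining.remove(pathOf.apply(entry));
      return entry;
    }
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    return leftovers.next();
  }
}
```

With S3 returning {{a, b}} and the MetadataStore holding {{b, c}}, iterating yields {{a}}, {{b}}, then {{c}}. The whole MetadataStore listing is held in memory, which is why this only works as a stopgap until paging exists.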
For {{recursive=true}} it will be a little trickier. I can think of another
non-paged (non-scalable) algorithm. Later, when we have full directory entry
paging for DirListingMetadata, it will get more interesting. We may have to
introduce some ordering to the S3 iterator to do it efficiently.
To unblock the merge to trunk, how about adding the caveat that S3Guard list
consistency does not support {{listFiles()}} yet? You would simply get S3
results without additional consistency guarantees, and we'd implement
{{listFiles()}} after the merge.
I will be available to work on this soon (I budgeted some time in a week or
two) if that helps.
> S3Guard: Improve listLocatedStatus and listFiles
> ------------------------------------------------
>
> Key: HADOOP-13926
> URL: https://issues.apache.org/jira/browse/HADOOP-13926
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Rajesh Balamohan
> Assignee: Steve Loughran
> Attachments: HADOOP-13926-HADOOP-13345.001.patch,
> HADOOP-13926.wip.proto.branch-13345.1.patch
>
>
> Need to check if {{listLocatedStatus}} can make use of metastore's
> listChildren feature.