[ 
https://issues.apache.org/jira/browse/HADOOP-13926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950435#comment-15950435
 ] 

Aaron Fabbri commented on HADOOP-13926:
---------------------------------------

This is a good start, thank you for rebasing. I think this still needs:

1. To handle {{listFiles(recursive=true)}}.

2. To merge S3 and MetadataStore results (as {{listStatus()}} does) for the 
non-authoritative (i.e. "not all directory contents are in MetadataStore") case.

For the {{recursive=false}} (and {{listLocatedStatus()}}) case, this patch is 
almost there, except that it needs to handle the non-authoritative case, where 
we have to merge MetadataStore output with the S3 iterator.  I can think of a 
simple algorithm for that case (until we add paging for MetadataStore): make a 
{{Set}} that is a copy of the DirListingMetadata entries; as you return S3 
iterator results, remove those paths from the {{Set}}; when the S3 iterator is 
exhausted, return the remaining entries in the {{Set}}.
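
A minimal sketch of that set-based merge, under simplifying assumptions: paths 
are plain strings rather than {{FileStatus}} objects, and the class and method 
names ({{MergedListing}}, {{merge}}) are hypothetical, not part of the patch or 
of the S3A code.  It only illustrates the non-paged idea described above.

```java
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper: merges an S3 listing iterator with MetadataStore entries. */
final class MergedListing {
  private MergedListing() {}

  static Iterator<String> merge(final Iterator<String> s3Results,
                                Collection<String> metadataStoreEntries) {
    // Copy the MetadataStore listing so the caller's collection is not mutated.
    final Set<String> remaining = new LinkedHashSet<>(metadataStoreEntries);

    return new Iterator<String>() {
      // Created lazily, once the S3 iterator has been exhausted.
      private Iterator<String> leftovers;

      @Override
      public boolean hasNext() {
        if (s3Results.hasNext()) {
          return true;
        }
        if (leftovers == null) {
          leftovers = remaining.iterator();
        }
        return leftovers.hasNext();
      }

      @Override
      public String next() {
        if (s3Results.hasNext()) {
          String path = s3Results.next();
          // Already returned from S3, so do not return it again later.
          remaining.remove(path);
          return path;
        }
        if (leftovers == null) {
          leftovers = remaining.iterator();
        }
        // Throws NoSuchElementException when nothing is left.
        return leftovers.next();
      }
    };
  }
}
```

The whole MetadataStore listing is held in memory, which is why this only 
makes sense until paging is available.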

For {{recursive=true}} it will be a little trickier.  I can think of another 
non-paged (non-scalable) algorithm.  Later, when we have full directory entry 
paging for DirListingMetadata it will get more interesting.  We may have to 
introduce some ordering to the S3 iterator to do it efficiently.

To unblock the merge to trunk, how about adding the caveat that S3Guard list 
consistency does not support {{listFiles()}} yet?  You would simply get S3 
results without additional consistency guarantees, and we'd implement 
{{listFiles()}} after the merge.

I will be available to work on this soon (I budgeted some time in a week or 
two) if that helps.


> S3Guard: Improve listLocatedStatus and listFiles
> ------------------------------------------------
>
>                 Key: HADOOP-13926
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13926
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Rajesh Balamohan
>            Assignee: Steve Loughran
>         Attachments: HADOOP-13926-HADOOP-13345.001.patch, 
> HADOOP-13926.wip.proto.branch-13345.1.patch
>
>
> Need to check if {{listLocatedStatus}} can make use of metastore's 
> listChildren feature.



