[ 
https://issues.apache.org/jira/browse/HADOOP-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15394565#comment-15394565
 ] 

Aaron Fabbri commented on HADOOP-13345:
---------------------------------------

Thanks [~cnauroth].  Really cool to compare two independently designed 
solutions.  Thanks for the GCS link, I'll check that out.

I agree we should proceed and collaborate on this.  Feature branch sounds good.

{quote}
The main difference I see is that my work focused more on consistency, with the 
S3 bucket still treated as source of truth, and your work focused more on 
performance. I hadn't tried anything with the DynamoDB lookup completely 
short-circuiting the S3 lookup. I think we can reconcile this though.
{quote}

We tried to make the {{MetadataStore}} interface expressive enough to let 
implementations (both the {{MetadataStore}} implementation and the client code 
that uses it) decide whether the {{MetadataStore}} can be the source of truth 
for directory listings:

- Our {{MetadataStore#listStatus(Path)}} returns a {{CachedDirectory}} which 
contains a flag {{isFullyCached}}.   Implementations may always set that flag 
to false, indicating that the client needs to consult the backing storage as 
well.

- If a client connector wishes to take advantage of the performance benefits, 
it can publish full directory listings to the {{MetadataStore}} via 
{{putListStatus()}} with {{isFullyCached=true}}, and also note the 
{{isFullyCached}} flags on the return values from {{listStatus()}}.  If a 
client connector does not want to deal with two possible sources of truth (e.g. 
to simplify failure cases), it can choose not to publish full listings to the 
{{MetadataStore}}, and to ignore any {{isFullyCached}} flags that are set on 
return from {{MetadataStore#listStatus()}}.
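
To make the two client policies concrete, here is a rough, self-contained sketch. It is not code from either patch. Only the names {{MetadataStore}}, {{CachedDirectory}}, {{isFullyCached}}, {{listStatus}} and {{putListStatus}} come from the description above. The method signatures, the use of {{String}} in place of {{Path}}, the {{BackingStore}} interface, the in-memory store, the {{trustFullListings}} switch, and the choice to merge cached entries with the backing listing are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ListingSketch {

    /** Hypothetical stand-in for the CachedDirectory returned by listStatus(). */
    static final class CachedDirectory {
        final List<String> entries;
        final boolean isFullyCached;

        CachedDirectory(List<String> entries, boolean isFullyCached) {
            this.entries = new ArrayList<>(entries);
            this.isFullyCached = isFullyCached;
        }
    }

    /** Assumed shape of the interface; the real signatures may differ. */
    interface MetadataStore {
        CachedDirectory listStatus(String path); // null when nothing is cached
        void putListStatus(String path, CachedDirectory listing);
    }

    /** Hypothetical stand-in for the S3 listing call. */
    interface BackingStore {
        List<String> list(String path);
    }

    /** Trivial in-memory store, only here so the sketch can run. */
    static final class InMemoryMetadataStore implements MetadataStore {
        private final Map<String, CachedDirectory> dirs = new HashMap<>();

        @Override
        public CachedDirectory listStatus(String path) {
            return dirs.get(path);
        }

        @Override
        public void putListStatus(String path, CachedDirectory listing) {
            dirs.put(path, listing);
        }
    }

    /**
     * Lists a directory under one of the two client policies.
     * trustFullListings = true:  honor isFullyCached and publish full listings.
     * trustFullListings = false: always consult the backing store, ignore the
     *                            flag, and never publish a full listing.
     */
    static List<String> listDirectory(String path, MetadataStore ms,
                                      BackingStore s3, boolean trustFullListings) {
        CachedDirectory cached = ms.listStatus(path);
        if (trustFullListings && cached != null && cached.isFullyCached) {
            // Short-circuit: the metadata store is the source of truth here.
            return new ArrayList<>(cached.entries);
        }
        // Consult the backing store, then add any entries only the cache knows
        // about (an assumed way to cover listings that lag behind writes).
        Set<String> merged = new LinkedHashSet<>(s3.list(path));
        if (cached != null) {
            merged.addAll(cached.entries);
        }
        List<String> result = new ArrayList<>(merged);
        if (trustFullListings) {
            ms.putListStatus(path, new CachedDirectory(result, true));
        }
        return result;
    }
}
```

With {{trustFullListings=true}}, a second listing of the same directory is served from the {{MetadataStore}} without touching the backing store. With {{false}}, every listing goes to the backing store and the flag is never consulted.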



> S3Guard: Improved Consistency for S3A
> -------------------------------------
>
>                 Key: HADOOP-13345
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13345
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/s3
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: HADOOP-13345.prototype1.patch, 
> S3C-ConsistentListingonS3-Design.pdf, S3GuardImprovedConsistencyforS3A.pdf, 
> s3c.001.patch
>
>
> This issue proposes S3Guard, a new feature of S3A, to provide an option for a 
> stronger consistency model than what is currently offered.  The solution 
> coordinates with a strongly consistent external store to resolve 
> inconsistencies caused by the S3 eventual consistency model.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

