[ 
https://issues.apache.org/jira/browse/HADOOP-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15453734#comment-15453734
 ] 

Aaron Fabbri commented on HADOOP-13345:
---------------------------------------

Thank you for the feedback [~eddyxu].

Let me give an example of why I think the {{fullyCached}} or {{isAuthoritative}} 
flag is required on the return value of {{MetadataStore.listChildren()}}.

Assume we have an existing s3 bucket that contains these files:

/a/b/file0
/a/b/file1

Now, assume we start up a Hadoop cluster, with s3guard configured for the 
{{MetadataStore}} to be authoritative, and do the following operations:

create(/a/b/file2)
listStatus(/a/b)

In this case we have to query both the MetadataStore and the s3 backend, as 
the visibility of /a/b/file2 may be subject to eventual consistency.  Also, the 
MetadataStore only knows about /a/b/file2, so the client has to consult s3 to 
learn about file0 and file1.  In the listStatus() above, 
{{MetadataStore.listChildren(/a/b)}} will return {{(("/a/b/file2"), 
isAuthoritative=false)}}, since the MetadataStore did not get a {{put()}} with 
{{isAuthoritative=true}}, nor did it see a {{mkdir(/a/b)}} happen.
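
To make the role of the flag concrete, here is a minimal Java sketch of the decision the client makes. It is not the S3Guard code: {{ListStatusMergeSketch}}, {{Listing}}, and the {{Supplier}} standing in for the s3 LIST call are all hypothetical names chosen for illustration. The only point is that a non-authoritative result forces a merge with the s3 listing, while an authoritative one can be returned as-is.

```java
import java.util.Collection;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.function.Supplier;

class ListStatusMergeSketch {

    /** Hypothetical shape of the listChildren() result: paths plus the flag. */
    static final class Listing {
        final SortedSet<String> children;
        final boolean isAuthoritative;

        Listing(Collection<String> children, boolean isAuthoritative) {
            this.children = new TreeSet<>(children);
            this.isAuthoritative = isAuthoritative;
        }
    }

    /**
     * Returns the directory contents the client should report.
     * s3Lister stands in for the s3 LIST request and is only invoked
     * when the cached listing is not authoritative.
     */
    static SortedSet<String> listStatus(Listing cached,
                                        Supplier<Collection<String>> s3Lister) {
        if (cached.isAuthoritative) {
            return cached.children;          // no s3 round trip needed
        }
        // Union of both views: s3 may not show file2 yet, and the
        // store does not know about file0 and file1.
        SortedSet<String> merged = new TreeSet<>(s3Lister.get());
        merged.addAll(cached.children);
        return merged;
    }

    public static void main(String[] args) {
        // State after create(/a/b/file2): the store knows one file only.
        Listing cached = new Listing(List.of("/a/b/file2"), false);
        // s3 listing that has not caught up with the new file yet.
        Supplier<Collection<String>> s3 =
            () -> List.of("/a/b/file0", "/a/b/file1");

        System.out.println(listStatus(cached, s3));
        // prints [/a/b/file0, /a/b/file1, /a/b/file2]
    }
}
```

Without the flag the client could not tell the one-entry result above apart from a complete listing of a directory that really contains one file.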

Two examples where {{MetadataStore.listChildren()}} would return a result with 
{{isAuthoritative=true}}:

1. 
mkdir(/a/b/c)
create(/a/b/c/fileA)
listStatus(/a/b/c)

Here, since the metadata store saw the creation of /a/b/c, it knows that it has 
observed all creations and deletions inside the /a/b/c directory.

2. Extending the original example:

Existed before cluster startup:
/a/b/file0
/a/b/file1

Then, with the cluster running, we see:
create(/a/b/file2)
listStatus(/a/b)
listStatus(/a/b)

The first call to listStatus(/a/b) will have to fetch the full directory 
contents from s3, since {{MetadataStore.listChildren(/a/b)}} will return 
{{isAuthoritative=false}}.  Once the client gets the full listing from s3, it 
can call {{MetadataStore.put(('/a/b/file0', '/a/b/file1', '/a/b/file2'), 
isAuthoritative=true)}}.  *Now*, the MetadataStore has been told it has the full 
contents of /a/b, and the second call to listStatus(/a/b) above will see the 
MetadataStore return {{(('/a/b/file0', '/a/b/file1', '/a/b/file2'), 
isAuthoritative=true)}}.
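
Both authoritative cases can be sketched with a toy in-memory store. This is only an illustration under my own assumptions: {{ToyMetadataStore}}, {{ToyClient}}, the {{create(dir, file)}} helper, and the {{s3ListCalls}} counter are hypothetical, and the s3 backend is just a map. The real {{MetadataStore}} interface may look different.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

class AuthoritativeListingSketch {

    /** Hypothetical listChildren() result: paths plus the flag. */
    static final class Listing {
        final SortedSet<String> children;
        final boolean isAuthoritative;

        Listing(SortedSet<String> children, boolean isAuthoritative) {
            this.children = children;
            this.isAuthoritative = isAuthoritative;
        }
    }

    /** Toy in-memory store; not the real MetadataStore interface. */
    static final class ToyMetadataStore {
        private final Map<String, SortedSet<String>> children = new HashMap<>();
        private final Set<String> authoritativeDirs = new HashSet<>();

        /** Case 1: the store saw the directory being created, so it is complete. */
        void mkdir(String dir) {
            children.put(dir, new TreeSet<>());
            authoritativeDirs.add(dir);
        }

        /** Records one new file; leaves the directory's flag unchanged. */
        void create(String dir, String file) {
            children.computeIfAbsent(dir, d -> new TreeSet<>()).add(file);
        }

        /** Case 2: isAuthoritative=true means "these are the full contents". */
        void put(String dir, Collection<String> entries, boolean isAuthoritative) {
            if (isAuthoritative) {
                children.put(dir, new TreeSet<>(entries));
                authoritativeDirs.add(dir);
            } else {
                children.computeIfAbsent(dir, d -> new TreeSet<>()).addAll(entries);
            }
        }

        Listing listChildren(String dir) {
            SortedSet<String> known = children.getOrDefault(dir, new TreeSet<>());
            return new Listing(new TreeSet<>(known), authoritativeDirs.contains(dir));
        }
    }

    /** Toy client: consults s3 only when the store is not authoritative. */
    static final class ToyClient {
        final ToyMetadataStore store = new ToyMetadataStore();
        final Map<String, List<String>> s3 = new HashMap<>(); // stand-in backend
        int s3ListCalls = 0;

        SortedSet<String> listStatus(String dir) {
            Listing cached = store.listChildren(dir);
            if (cached.isAuthoritative) {
                return cached.children;
            }
            s3ListCalls++;
            SortedSet<String> merged = new TreeSet<>(s3.getOrDefault(dir, List.of()));
            merged.addAll(cached.children);
            store.put(dir, merged, true); // tell the store it now has everything
            return merged;
        }
    }

    public static void main(String[] args) {
        ToyClient client = new ToyClient();
        client.s3.put("/a/b", List.of("/a/b/file0", "/a/b/file1")); // pre-existing

        client.store.create("/a/b", "/a/b/file2");
        System.out.println(client.listStatus("/a/b") + " s3 calls=" + client.s3ListCalls);
        // prints [/a/b/file0, /a/b/file1, /a/b/file2] s3 calls=1
        System.out.println(client.listStatus("/a/b") + " s3 calls=" + client.s3ListCalls);
        // prints [/a/b/file0, /a/b/file1, /a/b/file2] s3 calls=1

        client.store.mkdir("/a/b/c");
        client.store.create("/a/b/c", "/a/b/c/fileA");
        System.out.println(client.listStatus("/a/b/c") + " s3 calls=" + client.s3ListCalls);
        // prints [/a/b/c/fileA] s3 calls=1
    }
}
```

The counter stays at 1 after the first listing: the second listStatus(/a/b) and the listing of the freshly created /a/b/c are both answered from the store alone.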


> S3Guard: Improved Consistency for S3A
> -------------------------------------
>
>                 Key: HADOOP-13345
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13345
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/s3
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: HADOOP-13345.prototype1.patch, 
> S3C-ConsistentListingonS3-Design.pdf, S3GuardImprovedConsistencyforS3A.pdf, 
> S3GuardImprovedConsistencyforS3AV2.pdf, s3c.001.patch
>
>
> This issue proposes S3Guard, a new feature of S3A, to provide an option for a 
> stronger consistency model than what is currently offered.  The solution 
> coordinates with a strongly consistent external store to resolve 
> inconsistencies caused by the S3 eventual consistency model.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
