Hello Dave,

You are correct that S3A may currently suffer unexpected effects from eventual 
consistency, because S3 can negatively cache the result of the initial HEAD 
request.  In practice, I have never seen any negative consequences from this 
particular aspect of S3 eventual consistency, but in theory the problem is 
possible.

If you are interested in mitigating the effects of S3 eventual consistency, 
then you might be interested in watching development of the S3Guard project, 
tracked in Apache JIRA HADOOP-13345.

https://issues.apache.org/jira/browse/HADOOP-13345

To summarize, we plan to support use of an external store with strong 
consistency guarantees for S3A file system metadata.  In the interaction you 
described, we could consult the consistent metadata store instead of sending a 
HEAD request to S3 to determine if the object already exists.
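
As a rough illustration of that idea, here is a minimal sketch in Java.  It is 
not the S3Guard design or API: ToyObjectStore, ToyMetadataStore, and create are 
hypothetical names invented for this example, and the negative cache is a 
deliberately pessimistic toy model (it never expires) of the caveat you quoted.  
The point is only that the existence check never sends a HEAD to the object 
store before the PUT.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch only; none of these types exist in Hadoop or the AWS SDK. */
class ConsistentCreateSketch {

    /** Toy object store: a HEAD on a missing key is remembered as "absent". */
    static class ToyObjectStore {
        private final Set<String> objects = new HashSet<>();
        private final Set<String> negativeCache = new HashSet<>();
        int headRequests = 0;

        boolean head(String key) {
            headRequests++;
            if (negativeCache.contains(key)) {
                return false;            // stale "not found" answer (worst case: never expires)
            }
            if (objects.contains(key)) {
                return true;
            }
            negativeCache.add(key);      // a miss before the PUT poisons later reads
            return false;
        }

        void put(String key) {
            objects.add(key);
        }
    }

    /** Stand-in for an external metadata store with strong consistency. */
    static class ToyMetadataStore {
        private final Set<String> paths = ConcurrentHashMap.newKeySet();

        boolean exists(String path) {
            return paths.contains(path);
        }

        void record(String path) {
            paths.add(path);
        }
    }

    /**
     * Creates the object unless the metadata store already knows the path.
     * No HEAD request is sent to the object store, so nothing is negatively cached.
     */
    static boolean create(ToyObjectStore s3, ToyMetadataStore meta, String key) {
        if (meta.exists(key)) {
            return false;                // refuse to overwrite
        }
        s3.put(key);
        meta.record(key);
        return true;
    }
}
```

In this toy model, calling head on a key before put leaves every later head 
returning false, whereas going through create leaves the first head after the 
PUT returning true.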

--Chris Nauroth

From: Dave Maughan <davidamaug...@gmail.com>
Date: Thursday, October 6, 2016 at 4:07 AM
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: S3AFileSystem & read-after-write consistency

Hi,

I'm investigating S3's read-after-write consistency model with S3AFileSystem 
and something is not quite clear to me, so I'm hoping someone more 
knowledgeable can clarify it for me.

Amazon state (http://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html):

    "Amazon S3 provides read-after-write consistency for PUTS of new objects in 
your S3 bucket in all regions with one caveat. The caveat is that if you make a 
HEAD or GET request to the key name (to find if the object exists) before 
creating the object, Amazon S3 provides eventual consistency for 
read-after-write".

In S3AFileSystem, the call chain is create -> exists -> getFileStatus -> 
AmazonS3Client.getObjectMetadata (HEAD).

Does this mean that currently, S3AFileSystem cannot take advantage of S3's 
read-after-write consistency?

Thanks
- Dave
