[ https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758432#comment-16758432 ]

Ben Roling commented on HADOOP-16085:
-------------------------------------

Thanks for the thoughts [~fabbri].  I have an initial patch that I will upload 
soon.  The patch stores object versionId and as such only provides 
read-after-overwrite protection if object versioning is enabled.  If object 
versioning is not enabled on the bucket, things would function the same as 
before.

I hadn't really considered storing eTags instead.  I'll look into the 
feasibility of that, since it could remove the dependency on object versioning 
and make the feature more broadly applicable.  I think my organization is 
likely to enable object versioning anyway, but if S3Guard doesn't depend on 
it, more folks may benefit.
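To make the eTag idea concrete, here is a toy sketch (illustrative Python, not 
S3Guard code) of the semantics: S3 supports an If-Match condition on GetObject, 
so a read qualified by the stored eTag either returns the expected bytes or 
fails fast with 412 Precondition Failed.  The `Bucket` class and the 
MD5-as-eTag shortcut are simplifications for illustration only.

```python
# Toy model of the eTag alternative: a GET qualified with If-Match
# either returns the expected bytes or raises, mirroring S3's
# 412 Precondition Failed on an eTag mismatch.

import hashlib


class PreconditionFailed(Exception):
    pass


class Bucket:
    """Simplified stand-in for an S3 bucket without versioning."""

    def __init__(self):
        self._objects = {}  # key -> body

    def put(self, key, body):
        self._objects[key] = body
        # S3 returns an ETag on every PutObject, versioned bucket or
        # not, which is what makes the eTag approach broadly applicable.
        return hashlib.md5(body).hexdigest()

    def get(self, key, if_match=None):
        body = self._objects[key]
        if if_match is not None and hashlib.md5(body).hexdigest() != if_match:
            raise PreconditionFailed(key)
        return body


bucket = Bucket()
etag = bucket.put("part-0000", b"v1")
assert bucket.get("part-0000", if_match=etag) == b"v1"

bucket.put("part-0000", b"v2")  # overwrite invalidates the stored eTag
try:
    bucket.get("part-0000", if_match=etag)
except PreconditionFailed:
    pass  # staleness surfaces as an error rather than wrong data
```

One trade-off worth noting: an eTag-qualified read can only detect that the 
object changed, whereas a versionId-qualified read can still retrieve the 
exact bytes that were written.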

Thanks for your list of considerations.  Here are some responses:
 * The feature is always enabled and adds zero round trips.  versionId was 
already available on the PutObject response, so I'm just capturing and storing 
it.  That's how it works in my patch, anyway.  Do you believe there should be 
a capability to toggle the feature on or off?  I'd welcome feedback on that.
 * There isn't really a conflict resolution policy.  If we have a versionId in 
the metadata, we provide it on the GetObject request.  We either get back the 
version we are looking for or a 404.  I'm guessing 404s won't happen in 
practice (except if the object is deleted outside the context of S3Guard, 
which is outside the scope of this issue).  I assume read-after-overwrite 
inconsistencies in S3 generally happen due to cache hits on the old version, 
but when the version (eTag or versionId) is explicitly specified there should 
be no cache hit, and we would get the same read-after-write consistency S3 
provides on an initial PUT (no overwrite).  Even if I am wrong, the worst case 
is a FileNotFoundException, which is much better than an inconsistent read 
with no error.  Retries could be added on 404, but perhaps only once it is 
proven they are necessary.
 * I'm not trying to protect against a racing-writer issue.  I can add a note 
about that to the documentation.
 * The changes are backward and forward compatible with existing buckets and 
tables.  The new versionId attribute is optional.
 * MetadataStore expiry should be fine.  The versionId is optional; if it 
isn't there, no problem.  The only risk of an inconsistent read-after-overwrite 
is if the metadata is purged more quickly than S3 itself becomes 
read-after-overwrite consistent for the object being read.  I can update the 
documentation to mention this.
 * I guess with regard to HADOOP-15779, there could be a new type of S3Guard 
metadata inconsistency.  If an object is overwritten outside of S3Guard, 
S3Guard will not have the correct eTag or versionId and the reader may end up 
seeing either a 404 or the old content.  Prior to this, the reader would see 
whatever content S3 returns on a GET that is not qualified by eTag or versionId.
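The versionId-qualified read path described above can be modeled with a toy 
sketch (illustrative Python, not the actual patch): a read pinned to the 
versionId captured at write time can only return the intended bytes or raise, 
never stale content.

```python
# Illustrative model of reading by versionId: pinning the read to the
# version recorded at write time rules out a stale read after overwrite.

import uuid


class FileNotFound(Exception):
    pass


class VersionedStore:
    """Toy stand-in for an S3 bucket with versioning enabled."""

    def __init__(self):
        self._versions = {}  # key -> {version_id: body}
        self._latest = {}    # key -> version_id

    def put(self, key, body):
        # Real S3 returns x-amz-version-id on the PutObject response;
        # the patch captures it and records it in the MetadataStore.
        vid = uuid.uuid4().hex
        self._versions.setdefault(key, {})[vid] = body
        self._latest[key] = vid
        return vid

    def get(self, key, version_id=None):
        try:
            versions = self._versions[key]
            if version_id is None:
                return versions[self._latest[key]]
            return versions[version_id]  # exact version or "404"
        except KeyError:
            raise FileNotFound(key)


store = VersionedStore()
v1 = store.put("data.csv", b"old")
v2 = store.put("data.csv", b"new")  # overwrite

# A read qualified by the recorded versionId returns the intended
# bytes or raises (the 404 case); it can never return stale content.
assert store.get("data.csv", version_id=v2) == b"new"
assert store.get("data.csv", version_id=v1) == b"old"
```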

> S3Guard: use object version to protect against inconsistent read after 
> replace/overwrite
> ----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-16085
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16085
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.2.0
>            Reporter: Ben Roling
>            Priority: Major
>
> Currently S3Guard doesn't track S3 object versions.  If a file is written in 
> S3A with S3Guard and then subsequently overwritten, there is no protection 
> against the next reader seeing the old version of the file instead of the new 
> one.
> It seems like the S3Guard metadata could track the S3 object version.  When a 
> file is created or updated, the object version could be written to the 
> S3Guard metadata.  When a file is read, the read out of S3 could be performed 
> by object version, ensuring the correct version is retrieved.
> I don't have a lot of direct experience with this yet, but this is my 
> impression from looking through the code.  My organization is looking to 
> shift some datasets stored in HDFS over to S3 and is concerned about this 
> potential issue as there are some cases in our codebase that would do an 
> overwrite.
> I imagine this idea may have been considered before but I couldn't quite 
> track down any JIRAs discussing it.  If there is one, feel free to close this 
> with a reference to it.
> Am I understanding things correctly?  Is this idea feasible?  Any feedback 
> that could be provided would be appreciated.  We may consider crafting a 
> patch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
