[
https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758577#comment-16758577
]
Steve Loughran commented on HADOOP-16085:
-----------------------------------------
The enemy here is eventual consistency, which is of course the whole reason
S3Guard was needed.
What issues are we worrying about?
# Mixed writers: some going through S3Guard, some not. Even in non-auth mode,
I worry about delete tombstones.
# Failure during a large operation, leaving S3 out of sync with the store.
# Failure during a workflow where one or more GET calls on the second attempt
pick up the old version.
HADOOP-15625 is going to address changes to an open file through etag
comparison, but without the etag being cached in the S3Guard repo, it's not
going to detect inconsistencies between the version expected and the version
read.
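The cached-etag idea can be sketched as follows. This is a toy Python model, not the S3A implementation: the in-memory dicts stand in for the S3 bucket and the S3Guard table, and all names here are hypothetical.

```python
# Toy model: the metadata store caches the etag recorded at write time,
# and a read checks the etag served by "S3" against it.

class StaleReadError(Exception):
    pass

s3 = {}        # key -> (etag, body): stands in for the S3 bucket
metadata = {}  # key -> etag: stands in for the S3Guard table

def write(key, body, etag):
    s3[key] = (etag, body)
    metadata[key] = etag  # cache the etag alongside the metadata entry

def read(key):
    etag, body = s3[key]
    expected = metadata.get(key)
    if expected is not None and etag != expected:
        # An eventually consistent GET returned an older version.
        raise StaleReadError(f"{key}: expected etag {expected}, got {etag}")
    return body

write("data.csv", b"v2", etag="etag-2")
s3["data.csv"] = ("etag-1", b"v1")  # simulate a stale replica serving the old copy
try:
    read("data.csv")
except StaleReadError:
    print("stale read detected")
```

Against real S3 the same check could be pushed server-side, e.g. via a conditional GET with an `If-Match` header, so a mismatched etag fails the request rather than returning stale bytes.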
Personally, I'm kind of reluctant to rely on S3Guard as the sole defence
against this problem.
bq. a re-run of a pipeline stage should always use a new output directory,
If you use the S3A committers for your work with the default mode (insert a
GUID into the filename), then filenames are always unique, and it becomes
impossible to get a read-after-write (RAW) inconsistency. This is essentially
where we are going, along with Apache Iceberg (incubating): rather than jump
through hoop after hoop of workarounds for S3's apparent decision to never
deliver consistent views, come up with data structures which only need one
point of consistency (you need to know the unique filename of the latest
Iceberg file).
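The unique-filename trick is simple enough to show in a few lines. A hedged sketch: `uuid4` here is just one way to generate the GUID, not necessarily what the committers use, and `unique_name` is a hypothetical helper.

```python
import uuid

def unique_name(prefix, suffix):
    # Each attempt writes a fresh object key, so a retry can never
    # collide with (or be shadowed by) an earlier attempt's output.
    return f"{prefix}-{uuid.uuid4()}{suffix}"

a = unique_name("part-0000", ".parquet")
b = unique_name("part-0000", ".parquet")
print(a != b)  # two attempts, two distinct keys
```

Because no object key is ever overwritten, a reader can only ever see a given key's single immutable version; the one remaining point of consistency is learning which key is current.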
Putting that aside, yes, keeping version markers would be good. I like etags
because they are exposed in getFileChecksum(); their flaw is that they can be
very large on massive MPUs (32 bytes per block uploaded).
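To put that size concern in numbers, a back-of-the-envelope calculation, assuming the 32 bytes per uploaded block stated above and S3's documented 10,000-part cap on a multipart upload:

```python
# Worst-case checksum size for a maximal multipart upload.
MAX_PARTS = 10_000        # S3's per-upload part limit
BYTES_PER_BLOCK = 32      # checksum bytes per uploaded block, per the figure above

total = MAX_PARTS * BYTES_PER_BLOCK
print(total)              # 320000 bytes, i.e. ~312.5 KiB per file
```

That is a lot of metadata to store per entry in a table that is billed per byte read and written.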
BTW, if you are worried about how observable eventual consistency is:
generally it's delayed listings rather than stale content. There's a really
good paper with experimental data which measures how often you can observe RAW
inconsistencies: http://www.aifb.kit.edu/images/8/8d/Ic2e2014.pdf
> S3Guard: use object version to protect against inconsistent read after
> replace/overwrite
> ----------------------------------------------------------------------------------------
>
> Key: HADOOP-16085
> URL: https://issues.apache.org/jira/browse/HADOOP-16085
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.2.0
> Reporter: Ben Roling
> Priority: Major
> Attachments: HADOOP-16085_3.2.0_001.patch
>
>
> Currently S3Guard doesn't track S3 object versions. If a file is written in
> S3A with S3Guard and then subsequently overwritten, there is no protection
> against the next reader seeing the old version of the file instead of the new
> one.
> It seems like the S3Guard metadata could track the S3 object version. When a
> file is created or updated, the object version could be written to the
> S3Guard metadata. When a file is read, the read out of S3 could be performed
> by object version, ensuring the correct version is retrieved.
> I don't have a lot of direct experience with this yet, but this is my
> impression from looking through the code. My organization is looking to
> shift some datasets stored in HDFS over to S3 and is concerned about this
> potential issue as there are some cases in our codebase that would do an
> overwrite.
> I imagine this idea may have been considered before but I couldn't quite
> track down any JIRAs discussing it. If there is one, feel free to close this
> with a reference to it.
> Am I understanding things correctly? Is this idea feasible? Any feedback
> that could be provided would be appreciated. We may consider crafting a
> patch.
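The version-pinned read scheme the issue describes could be modelled like this. A hypothetical Python sketch: the dicts stand in for a versioned S3 bucket and the S3Guard table, and real code would store the `VersionId` returned by the PUT and pass it back on the GET (e.g. boto3's `get_object(..., VersionId=...)`).

```python
# Toy model of version-pinned reads: the metadata store records the
# object version written, and reads request exactly that version.

s3_versions = {}  # key -> {version_id: body}: stands in for a versioned bucket
metadata = {}     # key -> version_id: stands in for the S3Guard table
_counter = 0

def put(key, body):
    global _counter
    _counter += 1
    version_id = f"v{_counter}"       # S3 would return this from the PUT
    s3_versions.setdefault(key, {})[version_id] = body
    metadata[key] = version_id        # record it in the metadata store
    return version_id

def get(key):
    # Read by the exact version recorded at write time, so an
    # eventually consistent unversioned GET can't hand back old data.
    version_id = metadata[key]
    return s3_versions[key][version_id]

put("table/data.orc", b"old")
put("table/data.orc", b"new")   # overwrite the same key
print(get("table/data.orc"))    # always the version recorded at write time
```

The design choice this illustrates: instead of hoping a plain GET converges, the reader names the exact version it wants, moving the consistency requirement into the metadata store's single record per key.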
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)