[ https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755292#comment-16755292 ]

Ben Roling commented on HADOOP-16085:
-------------------------------------

[~gabor.bota] thanks for the response.  You are hitting on a different issue 
than I was trying to get at.  The problem I am referencing (assuming I 
understand the system well enough myself) is present even if all parties are 
always using S3Guard with the same config.

Consider a process that writes s3a://my-bucket/foo.txt with content "abc".  
Another process comes along later and overwrites s3a://my-bucket/foo.txt 
with "def".  Finally, a reader process reads s3a://my-bucket/foo.txt.  
S3Guard ensures the reader knows that s3a://my-bucket/foo.txt exists and 
that it will read something, but it does nothing to ensure the reader sees 
"def" rather than the stale "abc" as the content.

That was a contrived example.  One place I see this as likely to occur is 
during failure and retry scenarios of multi-stage ETL pipelines.  Stage 1 
runs, writing to s3a://my-bucket/output, but fails after writing only some 
of the output files that were supposed to go into that directory.  The 
stage is then re-run in overwrite mode: the original output is deleted and 
the job runs again with the same s3a://my-bucket/output target directory.  
This time the run is successful, so the ETL continues to Stage 2, passing 
s3a://my-bucket/output as the input.  When Stage 2 runs, S3Guard ensures it 
sees the correct listing of files within s3a://my-bucket/output, but it 
does nothing to ensure it reads the correct version of each of those files 
if the output happened to vary in any way between the first and second 
executions of Stage 1.

One way to avoid this is to require that a re-run of a pipeline stage 
always use a new output directory, but that is not always practical.
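
For what it's worth, the shape of the fix proposed in the issue description 
below boils down to something like the following.  This is only a sketch: 
VersionTrackingMetadataStore and its methods are hypothetical stand-ins, 
not the real org.apache.hadoop.fs.s3a.s3guard.MetadataStore API.

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the S3Guard metadata store; the real store is
// keyed by path and backed by DynamoDB, but the version-tracking idea is
// the same.
class VersionTrackingMetadataStore {
  private final Map<String, String> versionIdByPath = new ConcurrentHashMap<>();

  // Writer side: after a successful PUT, record the version id S3 returned
  // alongside the rest of the file's metadata.
  void recordWrite(String path, String versionId) {
    versionIdByPath.put(path, versionId);
  }

  // Reader side: S3A would pass this version id into the GetObjectRequest
  // (as in the first sketch), so the reader can never silently observe a
  // pre-overwrite version of the file.
  String versionIdForRead(String path) {
    return versionIdByPath.get(path); // null: nothing tracked, plain GET
  }
}
{code}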

> S3Guard: use object version to protect against inconsistent read after 
> replace/overwrite
> ----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-16085
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16085
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>            Reporter: Ben Roling
>            Priority: Major
>
> Currently S3Guard doesn't track S3 object versions.  If a file is written in 
> S3A with S3Guard and then subsequently overwritten, there is no protection 
> against the next reader seeing the old version of the file instead of the new 
> one.
> It seems like the S3Guard metadata could track the S3 object version.  When a 
> file is created or updated, the object version could be written to the 
> S3Guard metadata.  When a file is read, the read out of S3 could be performed 
> by object version, ensuring the correct version is retrieved.
> I don't have a lot of direct experience with this yet, but this is my 
> impression from looking through the code.  My organization is looking to 
> shift some datasets stored in HDFS over to S3 and is concerned about this 
> potential issue as there are some cases in our codebase that would do an 
> overwrite.
> I imagine this idea may have been considered before but I couldn't quite 
> track down any JIRAs discussing it.  If there is one, feel free to close this 
> with a reference to it.
> Am I understanding things correctly?  Is this idea feasible?  Any feedback 
> that could be provided would be appreciated.  We may consider crafting a 
> patch.



