[ https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755292#comment-16755292 ]
Ben Roling commented on HADOOP-16085:
-------------------------------------

[~gabor.bota] thanks for the response. You are hitting on a different issue than the one I was trying to get at. The problem I am referencing (assuming I understand the system correctly) is present even if all parties always use S3Guard with the same configuration.

Consider a process that writes s3a://my-bucket/foo.txt with content "abc". Another process comes along later and overwrites s3a://my-bucket/foo.txt with "def". Finally, a reader process reads s3a://my-bucket/foo.txt. S3Guard ensures the reader knows that s3a://my-bucket/foo.txt exists and that the reader sees something, but it does nothing to ensure the reader sees "def" rather than "abc" as the content.

That was a contrived example. One place I see this as likely to occur is in failure-and-retry scenarios in multi-stage ETL pipelines. Stage 1 runs, writing to s3a://my-bucket/output, but fails after writing only some of the output files that were supposed to go into that directory. The stage is re-run with the same output directory in overwrite mode, such that the original output is deleted and the job runs again against the same s3a://my-bucket/output target directory. This time the run succeeds, so the pipeline continues to Stage 2, passing s3a://my-bucket/output as the input. When Stage 2 runs, S3Guard ensures it sees the correct listing of files within s3a://my-bucket/output, but it does nothing to ensure it reads the correct version of each of those files if the output happened to vary in any way between the first and second execution of Stage 1.

One way to avoid this is to require that a re-run of a pipeline stage always use a new output directory, but that is not always practical.
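To make the race concrete, here is a minimal sketch (hypothetical toy code, not Hadoop or AWS SDK code) modeling a versioned object store. It shows that a read addressed by key alone can be served a stale version during an inconsistency window, while a read pinned to the versionId recorded at write time always returns the intended bytes, which is the protection this issue proposes S3Guard could provide. The class and parameter names are invented for illustration.

```python
import itertools

class VersionedStore:
    """Toy stand-in for a versioned S3 bucket (illustrative only)."""

    def __init__(self):
        self._versions = {}           # (key, version_id) -> content
        self._latest = {}             # key -> most recent version_id
        self._ids = itertools.count(1)

    def put(self, key, content):
        """Store content under a new version id and return that id,
        as a writer would record it in S3Guard metadata."""
        vid = "v%d" % next(self._ids)
        self._versions[(key, vid)] = content
        self._latest[key] = vid
        return vid

    def get(self, key, version_id=None, stale_version=None):
        """Read by key, optionally pinned to a version id.
        stale_version simulates an eventually consistent read that
        is served an older version when no version id is pinned."""
        if version_id is None:
            version_id = stale_version or self._latest[key]
        return self._versions[(key, version_id)]

store = VersionedStore()
v1 = store.put("foo.txt", "abc")   # first write
v2 = store.put("foo.txt", "def")   # overwrite

# An un-pinned read during an inconsistency window can return stale data:
assert store.get("foo.txt", stale_version=v1) == "abc"

# A read pinned to the version id the writer recorded is always correct:
assert store.get("foo.txt", version_id=v2) == "def"
```

The point of the sketch is only that the writer must capture the version id at put time and the metadata store must hand it back to readers; the actual S3 API supports this via the versionId parameter on GetObject when bucket versioning is enabled.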
> S3Guard: use object version to protect against inconsistent read after
> replace/overwrite
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-16085
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16085
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>            Reporter: Ben Roling
>            Priority: Major
>
> Currently S3Guard doesn't track S3 object versions. If a file is written in
> S3A with S3Guard and then subsequently overwritten, there is no protection
> against the next reader seeing the old version of the file instead of the
> new one.
>
> It seems like the S3Guard metadata could track the S3 object version. When a
> file is created or updated, the object version could be written to the
> S3Guard metadata. When a file is read, the read out of S3 could be performed
> by object version, ensuring the correct version is retrieved.
>
> I don't have a lot of direct experience with this yet, but this is my
> impression from looking through the code. My organization is looking to
> shift some datasets stored in HDFS over to S3 and is concerned about this
> potential issue, as there are some cases in our codebase that would do an
> overwrite.
>
> I imagine this idea may have been considered before, but I couldn't quite
> track down any JIRAs discussing it. If there is one, feel free to close this
> with a reference to it.
>
> Am I understanding things correctly? Is this idea feasible? Any feedback
> that could be provided would be appreciated. We may consider crafting a
> patch.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)