noslowerdna commented on a change in pull request #646: HADOOP-16085: use object version or etags to protect against inconsistent read after replace/overwrite
URL: https://github.com/apache/hadoop/pull/646#discussion_r269754782
##########
File path: hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
##########
@@ -1021,6 +1117,41 @@ java.io.IOException: Invalid region specified "iceland-2":
 The region specified in `fs.s3a.s3guard.ddb.region` is invalid.

+### Error `RemoteFileChangedException`
+
+An exception like the following could occur for a couple of reasons:
+
+* the S3Guard metadata is out of sync with the true S3 metadata. For
+example, the S3Guard DynamoDB table is tracking a different ETag than the ETag
+shown in the exception. This may suggest the object was updated in S3 without
+involvement from S3Guard or there was a transient failure when S3Guard tried to
+write to S3.
+
+* S3 is exhibiting read-after-overwrite eventual consistency. The S3Guard
+metadata was updated with a new ETag during a recent write, but the current read
+is not seeing that ETag due to S3 eventual consistency. This exception prevents
+the reader from an inconsistent read where the reader sees an older version of
+the file.
+
+```
+org.apache.hadoop.fs.s3a.RemoteFileChangedException: open 's3a://my-bucket/test/file.txt':
+ ETag change reported by S3 while reading at position 0.
+ Version 4e886e26c072fef250cfaf8037675405 was unavailable

Review comment:
I think there's ETag/Version mismatch here - shouldn't this say "Version change reported...", or "ETag ... was unavailable"?
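To illustrate the point of the comment, here is a minimal sketch of building the message so that both lines always use the same label. It is not the actual hadoop-aws code: `ChangeMessageSketch`, its `Source` enum, and `describe` are hypothetical names made up for this example, and the message wording is copied from the sample in the diff.

```java
public class ChangeMessageSketch {

    /** Hypothetical stand-in for the kind of identifier being compared. */
    enum Source {
        ETAG("ETag"),
        VERSION_ID("Version");

        final String label;

        Source(String label) {
            this.label = label;
        }
    }

    /**
     * Builds the message text from a single label, so the "change reported"
     * line and the "was unavailable" line cannot disagree.
     * Assumption: wording mirrors the sample exception in the docs, nothing more.
     */
    static String describe(Source source, String uri, long pos, String revisionId) {
        return String.format(
            "open '%s':\n %s change reported by S3 while reading at position %d.\n %s %s was unavailable",
            uri, source.label, pos, source.label, revisionId);
    }

    public static void main(String[] args) {
        // Values taken from the sample exception in the diff.
        System.out.println(describe(Source.ETAG,
            "s3a://my-bucket/test/file.txt", 0L,
            "4e886e26c072fef250cfaf8037675405"));
    }
}
```

With one label feeding both lines, the documented sample would read either "ETag change reported ... ETag ... was unavailable" or "Version change reported ... Version ... was unavailable", never a mix of the two.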
