gallushi commented on a change in pull request #1871:
URL: https://github.com/apache/iceberg/pull/1871#discussion_r538459629



##########
File path: 
core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java
##########
@@ -161,8 +163,24 @@ public void commit(TableMetadata base, TableMetadata 
metadata) {
           "Failed to check if next version exists: %s", finalMetadataFile);
     }
 
-    // this rename operation is the atomic commit operation
-    renameToFinal(fs, tempMetadataFile, finalMetadataFile);
+    if (useAtomicWrite) {
+      // using atomic write
+      try {
+        TableMetadataParser.write(metadata, 
io().newOutputFile(finalMetadataFile.toString()));
+      } catch (RuntimeIOException e) {
+        CommitFailedException cfe = new CommitFailedException(
+                "Failed to commit changes using atomic write: %s", 
finalMetadataFile.toString());

Review comment:
       > How do you know that the `RuntimeIOException` indicates a write 
conflict?
   
   we can't really know that.
   if the write fails due to a conflict, an `IOException` will be thrown in 
`FSDataOutputStream.close(...)`, it will be caught in 
`TableMetadataParser.internalWrite(...)` and a `RuntimeIOException` will be 
thrown.
   it will be tricky to identify that this is due to a conflict, since:
   1) other problems might have caused the `IOException` to be thrown.
   2) the exception text will vary based on underlying FS implementation (for 
example, for IBM COS if using Stocator as the connector, the exception message 
will look something like
    `Caused by: java.io.IOException: saving output gal/test/v1.metadata.json 
com.amazonaws.services.s3.model.AmazonS3Exception: At least one of the 
preconditions you specified did not hold. (Service: Amazon S3; Status Code: 
412; Error Code: PreconditionFailed; Request ID.....`)
   
   > Should we support a conflict exception in the `FileIO` that can be thrown 
instead?
   
   the exception might be thrown by the output stream (e.g., during 
`close(...)`) - so IMHO changing the `FileIO` or even the `OutputFile` won't be 
enough to allow identifying conflicts.
   
   
   
   
   
   
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to