szehon-ho opened a new pull request #4261:
URL: https://github.com/apache/iceberg/pull/4261


   After #3717, all Hive exceptions become CommitStateUnknownException. That is 
because checkCommitStatus no longer returns FAILURE and always returns either 
SUCCESS or UNKNOWN, so the subsequent [specific catch 
blocks](https://github.com/apache/iceberg/blob/master/hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java#L287)
 never run.
   
   For instance, running a CREATE TABLE statement with an invalid name:
   
   > CREATE TABLE `my_table` (id INT) USING ICEBERG
   
   produces this confusing and alarming error message for the user:
   
   ```
   org.apache.iceberg.exceptions.CommitStateUnknownException: `tbl` is not a 
valid object name
   Cannot determine whether the commit was successful or not, the underlying 
data files may or may not be needed. Manual intervention via the Remove Orphan 
Files Action can remove these files when a connection to the Catalog can be 
re-established if the commit was actually unsuccessful.
   Please check to see whether or not your commit was successful before 
retrying this commit. Retrying an already successful operation will result in 
duplicate records or unintentional modifications.
   At this time no files will be deleted including possibly unused manifest 
lists.
   ```
   
   Another side effect is an NPE in the checkCommitStatus() code:
   
   ```
   java.lang.NullPointerException
        at 
org.apache.iceberg.BaseMetastoreTableOperations.lambda$checkCommitStatus$4(BaseMetastoreTableOperations.java:302)
        at 
org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:404)
        at 
org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:214)
        at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:198)
        at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:190)
        at 
org.apache.iceberg.BaseMetastoreTableOperations.checkCommitStatus(BaseMetastoreTableOperations.java:300)
        at 
org.apache.iceberg.hive.HiveTableOperations.doCommit(HiveTableOperations.java:285)
        at 
org.apache.iceberg.BaseMetastoreTableOperations.commit(BaseMetastoreTableOperations.java:127)
        at 
org.apache.iceberg.BaseMetastoreCatalog$BaseMetastoreCatalogTableBuilder.create(BaseMetastoreCatalog.java:165)
        at org.apache.iceberg.catalog.Catalog.createTable(Catalog.java:78)
        at org.apache.iceberg.catalog.Catalog.createTable(Catalog.java:112)
   ```
   
   This change pushes down the original Hive known-failure scenarios 
(user errors) and skips the CommitStateUnknown handling for them. This 
error handling ran before #3717, so the change simply re-activates it.
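
   The control flow described above can be sketched roughly as follows. This is 
an illustration only, not the code in this PR: `CommitFlowSketch`, 
`InvalidNameException`, `ValidationException`, `MetastoreCall` and the 
`doCommit` signature are hypothetical stand-ins, and the real 
HiveTableOperations handles more exception types. The point it shows is that 
known user errors are caught and rethrown before the generic handler that 
consults the commit status.

   ```java
   import java.util.function.Supplier;

   // Hypothetical sketch; none of these types are the real Iceberg or Hive classes.
   class CommitFlowSketch {
     // Stand-in for a metastore error that is known to be a user error.
     static class InvalidNameException extends Exception {
       InvalidNameException(String message) { super(message); }
     }

     // Stand-in for the specific exception surfaced to the user.
     static class ValidationException extends RuntimeException {
       ValidationException(String message, Throwable cause) { super(message, cause); }
     }

     // Stand-in for the "cannot tell whether the commit succeeded" exception.
     static class CommitStateUnknownException extends RuntimeException {
       CommitStateUnknownException(Throwable cause) { super(cause); }
     }

     // Only two outcomes, mirroring the description: there is no FAILURE state.
     enum CommitStatus { SUCCESS, UNKNOWN }

     interface MetastoreCall {
       void run() throws Exception;
     }

     static String doCommit(MetastoreCall persist, Supplier<CommitStatus> checkCommitStatus) {
       try {
         persist.run();
         return "committed";
       } catch (InvalidNameException e) {
         // Known failure: the commit definitely did not happen, so report the
         // specific error and skip the commit-status check entirely.
         throw new ValidationException("Invalid table name: " + e.getMessage(), e);
       } catch (Exception e) {
         // Anything else: the commit may or may not have gone through.
         if (checkCommitStatus.get() == CommitStatus.SUCCESS) {
           return "committed";
         }
         throw new CommitStateUnknownException(e);
       }
     }

     public static void main(String[] args) {
       try {
         doCommit(() -> { throw new InvalidNameException("`tbl` is not a valid object name"); },
             () -> CommitStatus.UNKNOWN);
       } catch (RuntimeException e) {
         System.out.println(e.getClass().getSimpleName() + ": " + e.getMessage());
       }
     }
   }
   ```

   With this ordering, an invalid name surfaces as the specific exception, while 
an unclassified failure still goes through the commit-status check.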
   

