[ 
https://issues.apache.org/jira/browse/HIVE-25563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ádám Szita updated HIVE-25563:
------------------------------
    Description: 
For all Iceberg table operations (select, insert, alter, etc.), Hive tries to 
load the Iceberg table by reading its metadata files.

If these metadata files are missing or inaccessible for any reason, operations 
on such a table hang the user's session for more than 10 minutes. This is 
because the metadata refresh is wrapped in retry logic with exponentially 
increasing intervals and 21 retries by default, as the stack trace of the 
sleeping thread shows:
{code:java}
java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at java.lang.Thread.sleep(Thread.java:340)
        at java.util.concurrent.TimeUnit.sleep(TimeUnit.java:386)
        at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:453)
        at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:214)
        at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:198)
        at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:190)
        at org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:178)
        at org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:160)
        at org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:183)
        at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:94)
        at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:77)
        at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:93)
        at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:106)
        at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:96)
        at org.apache.iceberg.mr.hive.IcebergTableUtil.lambda$getTable$2(IcebergTableUtil.java:69)
        at org.apache.iceberg.mr.hive.IcebergTableUtil$$Lambda$284/1429147768.get(Unknown Source)
        at java.util.Optional.orElseGet(Optional.java:267)
        at org.apache.iceberg.mr.hive.IcebergTableUtil.getTable(IcebergTableUtil.java:66)
        at org.apache.iceberg.mr.hive.HiveIcebergSerDe.initialize(HiveIcebergSerDe.java:105)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:95)
...
 {code}
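To illustrate how a capped exponential backoff with 21 retries can add up to a 
hang of this length, here is a minimal sketch. The class name and the backoff 
parameters (100 ms initial sleep, 60 s cap, scale factor 2.0) are assumptions 
chosen for illustration, not values taken from Iceberg, and the sketch counts 
only time spent sleeping, not the time each failed attempt takes.

```java
import java.time.Duration;

// Illustrative only: BackoffEstimate and its parameters are hypothetical,
// not Iceberg's actual retry implementation or defaults.
final class BackoffEstimate {

    /** Sums the sleep time of a capped exponential backoff over the given number of retries. */
    static long totalSleepMillis(int retries, long minSleepMs, long maxSleepMs, double scale) {
        long total = 0;
        double sleep = minSleepMs;
        for (int i = 0; i < retries; i++) {
            // Each retry sleeps for the current interval, never longer than the cap.
            total += (long) Math.min(sleep, maxSleepMs);
            // The next interval grows by the scale factor, up to the cap.
            sleep = Math.min(sleep * scale, maxSleepMs);
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed parameters: 100 ms initial sleep, 60 s cap, doubling each time.
        long manyRetries = totalSleepMillis(21, 100, 60_000, 2.0);
        long fewRetries = totalSleepMillis(2, 100, 60_000, 2.0);
        System.out.println("21 retries sleep for " + Duration.ofMillis(manyRetries));
        System.out.println("2 retries sleep for  " + Duration.ofMillis(fewRetries));
    }
}
```

Under these assumed parameters, 21 retries sum to 762,300 ms of sleeping 
(about 12.7 minutes), while 2 retries sum to 300 ms.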
We should make the retry count configurable and give it a lower default.
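
One possible shape for such a setting is sketched below. The property key, the 
default of 2, and the RetryConfig class are hypothetical names introduced for 
illustration only; the ticket does not specify the actual configuration key or 
default value.

```java
import java.util.Properties;

// Hypothetical sketch: the key name and default are made up for illustration
// and do not correspond to a real Hive or Iceberg setting.
final class RetryConfig {
    static final String RETRY_KEY = "example.metadata.refresh.max.retries"; // hypothetical key
    static final int DEFAULT_RETRIES = 2; // hypothetical lower default

    /** Returns the configured retry count, or the default if unset, negative, or unparsable. */
    static int metadataRefreshRetries(Properties conf) {
        String raw = conf.getProperty(RETRY_KEY);
        if (raw == null) {
            return DEFAULT_RETRIES;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value >= 0 ? value : DEFAULT_RETRIES;
        } catch (NumberFormatException e) {
            // A malformed value should not bring back a long hang; use the default.
            return DEFAULT_RETRIES;
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println("unset -> " + metadataRefreshRetries(conf));
        conf.setProperty(RETRY_KEY, "5");
        System.out.println("5     -> " + metadataRefreshRetries(conf));
    }
}
```

Falling back to the default on invalid input is a design choice in this sketch; 
failing fast with an error would be an equally reasonable alternative.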

> Iceberg table operations hang a long time if metadata is missing/corrupted
> --------------------------------------------------------------------------
>
>                 Key: HIVE-25563
>                 URL: https://issues.apache.org/jira/browse/HIVE-25563
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Ádám Szita
>            Assignee: Ádám Szita
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
