Riza Suminto created IMPALA-13842:
-------------------------------------

             Summary: Test against sys.impala_query_log should wait until 
coordinator quiesce.
                 Key: IMPALA-13842
                 URL: https://issues.apache.org/jira/browse/IMPALA-13842
             Project: IMPALA
          Issue Type: Bug
          Components: Test
            Reporter: Riza Suminto


We are seeing evidence of HMS deadlock for table sys.impala_query_log in 
downstream build.
{noformat}
I0307 02:24:06.992715 18405 SnapshotProducer.java:422] 
5f4ecf63310cb7f5:e2db615800000000] Committed snapshot 4325155866837015464 
(MergeAppend)
I0307 02:24:07.002012 18405 LoggingMetricsReporter.java:38] 
5f4ecf63310cb7f5:e2db615800000000] Received metrics report: 
CommitReport{tableName=ImpalaHiveCatalog.sys.impala_query_log, 
snapshotId=4325155866837015464, sequenceNumber=9, operation=append, 
commitMetrics=CommitMetricsResult{totalDuration=TimerResult{timeUnit=NANOSECONDS,
 totalDuration=PT0.241321564S, count=1}, attempts=CounterResult{unit=COUNT, 
value=1}, addedDataFiles=CounterResult{unit=COUNT, value=1}, 
removedDataFiles=null, totalDataFiles=CounterResult{unit=COUNT, value=9}, 
addedDeleteFiles=null, addedEqualityDeleteFiles=null, 
addedPositionalDeleteFiles=null, removedDeleteFiles=null, 
removedEqualityDeleteFiles=null, removedPositionalDeleteFiles=null, 
totalDeleteFiles=CounterResult{unit=COUNT, value=0}, 
addedRecords=CounterResult{unit=COUNT, value=1}, removedRecords=null, 
totalRecords=CounterResult{unit=COUNT, value=41}, 
addedFilesSizeInBytes=CounterResult{unit=BYTES, value=26244}, 
removedFilesSizeInBytes=null, totalFilesSizeInBytes=CounterResult{unit=BYTES, 
value=273450}, addedPositionalDeletes=null, removedPositionalDeletes=null, 
totalPositionalDeletes=CounterResult{unit=COUNT, value=0}, 
addedEqualityDeletes=null, removedEqualityDeletes=null, 
totalEqualityDeletes=CounterResult{unit=COUNT, value=0}}, 
metadata={iceberg-version=Apache Iceberg 1.5.2.2025.0.20.0-45 (commit 
f2db4a27a43aaec790af8b9e2ac51f4a50b26260)}}
I0307 02:24:07.002182 18405 CatalogServiceCatalog.java:1327] 
5f4ecf63310cb7f5:e2db615800000000] Added catalog version 2126 in table's 
sys.impala_query_log in-flight events
W0307 02:24:07.053566 18405 Tasks.java:456] 5f4ecf63310cb7f5:e2db615800000000] 
Retrying task after failure: Waiting for lock on table sys.impala_query_log
Java exception follows:
org.apache.iceberg.hive.MetastoreLock$WaitingForLockException: Waiting for lock 
on table sys.impala_query_log
        at 
org.apache.iceberg.hive.MetastoreLock.lambda$acquireLock$1(MetastoreLock.java:217)
        at 
org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
        at 
org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
        at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
        at 
org.apache.iceberg.hive.MetastoreLock.acquireLock(MetastoreLock.java:209)
        at org.apache.iceberg.hive.MetastoreLock.lock(MetastoreLock.java:146)
        at 
org.apache.iceberg.hive.HiveTableOperations.doCommit(HiveTableOperations.java:182)
        at 
org.apache.iceberg.BaseMetastoreTableOperations.commit(BaseMetastoreTableOperations.java:135)
        at 
org.apache.iceberg.BaseTransaction.lambda$commitSimpleTransaction$3(BaseTransaction.java:427)
        at 
org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
        at 
org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
        at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
        at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
        at 
org.apache.iceberg.BaseTransaction.commitSimpleTransaction(BaseTransaction.java:423)
        at 
org.apache.iceberg.BaseTransaction.commitTransaction(BaseTransaction.java:318)
        at 
org.apache.impala.service.CatalogOpExecutor.insertIntoIcebergTable(CatalogOpExecutor.java:7593)
        at 
org.apache.impala.service.CatalogOpExecutor.updateCatalogImpl(CatalogOpExecutor.java:7386)
        at 
org.apache.impala.service.CatalogOpExecutor.updateCatalog(CatalogOpExecutor.java:7227)
        at 
org.apache.impala.service.JniCatalog.lambda$updateCatalog$15(JniCatalog.java:510)
        at 
org.apache.impala.service.JniCatalogOp.lambda$execAndSerialize$1(JniCatalogOp.java:90)
        at org.apache.impala.service.JniCatalogOp.execOp(JniCatalogOp.java:58)
        at 
org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:89)
        at 
org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:100)
        at 
org.apache.impala.service.JniCatalog.execAndSerialize(JniCatalog.java:251)
        at 
org.apache.impala.service.JniCatalog.execAndSerialize(JniCatalog.java:265)
        at 
org.apache.impala.service.JniCatalog.updateCatalog(JniCatalog.java:509)

...

W0307 02:25:49.672634 18405 Tasks.java:456] 5f4ecf63310cb7f5:e2db615800000000] 
Retrying task after failure: Waiting for lock on table 
sys.impala_query_log{noformat}
This might be related to IMPALA-13772 that disable FETCH_ROWS_TIMEOUT_MS option.

During custom cluster test run, impala minicluster is restart between tests. It 
test does not wait until coordinator quiesce (all pending INSERT INTO 
sys.impala_query_log is done), it might be the case that HMS lock is being hold 
when CatalogD terminates.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to