[
https://issues.apache.org/jira/browse/IMPALA-11509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Sherman resolved IMPALA-11509.
-------------------------------------
Fix Version/s: Impala 4.3.0
Resolution: Fixed
> Dropping files of Iceberg during table loading may cause Impalad to stuck in
> infinite loop
> ------------------------------------------------------------------------------------------
>
> Key: IMPALA-11509
> URL: https://issues.apache.org/jira/browse/IMPALA-11509
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 4.1.0
> Reporter: Gabor Kaszab
> Assignee: Andrew Sherman
> Priority: Critical
> Labels: iceberg, impala-iceberg
> Fix For: Impala 4.3.0
>
>
> This issue is very similar to
> https://issues.apache.org/jira/browse/IMPALA-11502. The repro steps are
> almost identical; however, in this case the table's folder should be
> dropped right when the INSERT INTO starts.
> Repro steps:
> 1) Create the Iceberg table:
> {code:java}
> DROP DATABASE IF EXISTS `drop_incomplete_table` CASCADE;
> CREATE DATABASE `drop_incomplete_table`;
> CREATE TABLE drop_incomplete_table.iceberg_tbl (i int) stored as iceberg
> tblproperties('iceberg.catalog'='hadoop.catalog',
> 'iceberg.catalog_location'='/test-warehouse/drop_incomplete_table');
> {code}
> 2) Timing is essential for this step, and it might take a few tries to hit
> the issue. Run the INSERT INTO and drop the HDFS folder at the same time.
> Executing them manually is fine; this doesn't require scripting.
> {code:java}
> INSERT INTO drop_incomplete_table.iceberg_tbl VALUES (1), (2), (3);
> hdfs dfs -rm -r hdfs://localhost:20500/test-warehouse/drop_incomplete_table
> {code}
> You will know you have hit the issue when the Impala shell starts to hang.
> The jstack output of the hanging impalad (not the catalogd) will contain
> this for one of the threads:
> {code:java}
> "Thread-15" #30 prio=5 os_prio=0 tid=0x000000000db2a000 nid=0x56f4 in Object.wait() [0x00007f0e7b59a000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>     at java.lang.Object.wait(Native Method)
>     at org.apache.impala.catalog.ImpaladCatalog.waitForCatalogUpdate(ImpaladCatalog.java:290)
>     - locked <0x0000000724f7cdc0> (a java.lang.Object)
>     at org.apache.impala.analysis.StmtMetadataLoader.loadTables(StmtMetadataLoader.java:229)
>     at org.apache.impala.analysis.StmtMetadataLoader.loadTables(StmtMetadataLoader.java:141)
>     at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:2001)
>     at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1913)
>     at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1737)
>     at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:164)
> {code}
> Initially, Iceberg tables are created as IncompleteTables, and when there is
> a query on the table, they are loaded as IcebergTable. It seems to me that
> when we run the first query after creating the table, with a certain timing
> of dropping the files, we can get into a state where the table appears as a
> "missingTable" in StmtMetadataLoader.loadTables(); however, when a
> prioritized table load is requested, the Catalog says that the table is
> already loaded. As a result, the table always appears as "missingTable" and
> we never get out of the [while
> loop|https://github.com/apache/impala/blob/62e20d1ba842a3f27395251c57dea9850f462fc9/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java#L196]
> in loadTables().
> I managed to repro this using HiveCatalog, but had no luck reproducing it
> with non-Iceberg, traditional Hive tables.
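
The loop described above can be illustrated with a small, self-contained sketch. It is not Impala's code and not the fix that went into 4.3.0: `CatalogView`, `BoundedTableLoader`, and `StuckCatalog` are hypothetical names invented for this example. It only models the reported state, where the frontend keeps seeing the table as missing while a prioritized load request changes nothing, and shows how bounding the number of rounds would turn the hang into an error.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the impalad's view of the catalog; not Impala's API. */
interface CatalogView {
    /** Returns the subset of the requested tables that still look unloaded. */
    Set<String> missingTables(Set<String> requested);

    /** Requests a prioritized load of one table. */
    void prioritizeLoad(String table);
}

final class BoundedTableLoader {
    private BoundedTableLoader() {}

    /**
     * Re-requests loads until nothing is missing, but gives up after
     * maxAttempts rounds instead of spinning forever.
     * Returns the number of rounds that were needed.
     */
    static int loadTables(CatalogView catalog, Set<String> requested, int maxAttempts) {
        Set<String> missing = catalog.missingTables(requested);
        int attempts = 0;
        while (!missing.isEmpty()) {
            if (attempts >= maxAttempts) {
                throw new IllegalStateException(
                    "Tables still missing after " + attempts + " attempts: " + missing);
            }
            attempts++;
            for (String table : missing) {
                catalog.prioritizeLoad(table);
            }
            missing = catalog.missingTables(requested);
        }
        return attempts;
    }
}

/** Simulates the reported state: load requests never change what the frontend sees. */
final class StuckCatalog implements CatalogView {
    int loadRequests = 0;

    @Override
    public Set<String> missingTables(Set<String> requested) {
        // The table always appears as missing on the frontend side.
        return new HashSet<>(requested);
    }

    @Override
    public void prioritizeLoad(String table) {
        // The catalog believes the table is already loaded, so nothing changes.
        loadRequests++;
    }
}
```

With `StuckCatalog`, an unbounded version of this loop would never exit, which matches the hang seen in the jstack output. With the bound in place, `BoundedTableLoader.loadTables(...)` throws an `IllegalStateException` once `maxAttempts` rounds have been used.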