[
https://issues.apache.org/jira/browse/IMPALA-10283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Quanlong Huang resolved IMPALA-10283.
-------------------------------------
Fix Version/s: Impala 4.0
Resolution: Fixed
> IllegalStateException in applying incremental partition updates
> ---------------------------------------------------------------
>
> Key: IMPALA-10283
> URL: https://issues.apache.org/jira/browse/IMPALA-10283
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 4.0
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
> Fix For: Impala 4.0
>
>
> When incremental metadata updates are enabled (the default), catalogd sends
> incremental partition updates based on the last sent table snapshot.
> Coordinators apply these partition updates to their existing table
> snapshots.
> Each partition update is also called a partition instance. Partition
> instances are identified by partition ids. Each partition instance is a
> snapshot of the metadata of a partition. When applying incremental partition
> updates, a Precondition check asserts that the ids of new partition updates
> do not duplicate existing partition ids:
> [https://github.com/apache/impala/blob/3ba8d637cdf38a68e25e573afa8d1d05047df2f6/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L515]
> The motivation for this check is to detect whether catalogd is sending
> duplicate partition updates. However, it can also be hit when the
> coordinator has a newer version of the table than the last sent table
> snapshot in catalogd. This happens when two coordinators both execute DMLs on
> the same table, and the DMLs finish within one catalog topic update time
> window. Note that a coordinator receives a table snapshot from catalogd in
> the response to its DML request. So one of the coordinators will have a table
> version that is lower than the latest version in catalogd but higher than the
> last sent table version in catalogd. When applying incremental partition
> updates on this coordinator, the Precondition check fails. We should
> remove this check to accept this case.
> To reproduce the issue, create a partitioned table and warm up its metadata
> cache by running any query on it (e.g. describe):
> {code:sql}
> create table multi_inserts_tbl (id int) partitioned by (p int);
> desc multi_inserts_tbl;
> {code}
> Run two inserts using two different coordinators in one command:
> {code:java}
> bin/impala-shell.sh -q "insert into multi_inserts_tbl partition (p) values (0, 0)"; \
> bin/impala-shell.sh -i localhost:21051 -q "insert into multi_inserts_tbl partition (p) values (1, 1)"
> {code}
> The following IllegalStateException may then appear in the logs of the first
> coordinator:
> {code:java}
> I1026 14:54:06.127398 11497 ImpaladCatalog.java:224] Adding: TABLE:default.multi_inserts_tbl version: 1495 size: 1464
> I1026 14:54:06.127557 11497 ImpaladCatalog.java:224] Adding: CATALOG_SERVICE_ID version: 1495 size: 60
> I1026 14:54:06.127887 11497 ImpaladCatalog.java:249] Adding 2 partition(s): HDFS_PARTITION:default.multi_inserts_tbl:(p=0,p=1), version=1495, size=(avg=597, min=597, max=597, sum=1194)
> E1026 14:54:06.134311 11497 ImpaladCatalog.java:256] Error adding catalog object: null
> Java exception follows:
> java.lang.IllegalStateException
>     at com.google.common.base.Preconditions.checkState(Preconditions.java:492)
>     at org.apache.impala.catalog.ImpaladCatalog.addTable(ImpaladCatalog.java:515)
>     at org.apache.impala.catalog.ImpaladCatalog.addCatalogObject(ImpaladCatalog.java:325)
>     at org.apache.impala.catalog.ImpaladCatalog.updateCatalog(ImpaladCatalog.java:254)
>     at org.apache.impala.service.FeCatalogManager$CatalogdImpl.updateCatalogCache(FeCatalogManager.java:114)
>     at org.apache.impala.service.Frontend.updateCatalogCache(Frontend.java:378)
>     at org.apache.impala.service.JniFrontend.updateCatalogCache(JniFrontend.java:178)
> {code}
> This makes the first coordinator fail to update the table to the latest
> version:
> {code:java}
> $ bin/impala-shell.sh -q "show partitions multi_inserts_tbl"
> Starting Impala Shell with no authentication using Python 2.7.16
> Warning: live_progress only applies to interactive shell sessions, and is being skipped for now.
> Opened TCP connection to localhost:21050
> Connected to localhost:21050
> Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build 3ba8d637cdf38a68e25e573afa8d1d05047df2f6)
> Query: show partitions multi_inserts_tbl
> +-------+-------+--------+------+--------------+-------------------+--------+-------------------+-------------------------------------------------------------+
> | p     | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | Incremental stats | Location                                                    |
> +-------+-------+--------+------+--------------+-------------------+--------+-------------------+-------------------------------------------------------------+
> | 0     | -1    | 1      | 2B   | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/multi_inserts_tbl/p=0 |
> | Total | -1    | 1      | 2B   | 0B           |                   |        |                   |                                                             |
> +-------+-------+--------+------+--------------+-------------------+--------+-------------------+-------------------------------------------------------------+
> Fetched 2 row(s) in 0.01s
> {code}
> Partition p=1 is missing.
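
The failure mode described in the quoted report can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not Impala's actual code: the class PartitionUpdateSketch, its methods, and the partition ids 10 and 11 are all hypothetical. It shows a coordinator that already holds partition p=0 (received in its own DML response) being sent p=0 and p=1 in the next incremental update, once with a duplicate-id check and once without.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the scenario; none of these names exist in Impala.
class PartitionUpdateSketch {
    // A partition instance: a snapshot of one partition's metadata, identified by its id.
    static final class PartitionInstance {
        final long id;
        final String name;

        PartitionInstance(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // True when the coordinator's table version lies strictly between the version
    // catalogd last sent and catalogd's latest version (the window described in the report).
    static boolean inRaceWindow(long lastSentVersion, long coordinatorVersion, long latestVersion) {
        return lastSentVersion < coordinatorVersion && coordinatorVersion < latestVersion;
    }

    // Applies incremental partition updates to a copy of the coordinator's snapshot.
    // With rejectDuplicateIds set, an already-known partition id is treated as an error,
    // mirroring the effect of the Precondition check. Without it, the update is accepted.
    static Map<Long, PartitionInstance> applyUpdates(
            Map<Long, PartitionInstance> existing,
            List<PartitionInstance> updates,
            boolean rejectDuplicateIds) {
        Map<Long, PartitionInstance> result = new LinkedHashMap<>(existing);
        for (PartitionInstance update : updates) {
            if (rejectDuplicateIds && result.containsKey(update.id)) {
                throw new IllegalStateException("duplicate partition id " + update.id);
            }
            result.put(update.id, update);
        }
        return result;
    }

    public static void main(String[] args) {
        // The first coordinator already has p=0 from the response to its own INSERT.
        Map<Long, PartitionInstance> cached = new LinkedHashMap<>();
        cached.put(10L, new PartitionInstance(10L, "p=0"));

        // catalogd computes the delta against its last sent snapshot, so it sends both.
        List<PartitionInstance> updates = new ArrayList<>();
        updates.add(new PartitionInstance(10L, "p=0"));
        updates.add(new PartitionInstance(11L, "p=1"));

        try {
            applyUpdates(cached, updates, true);
        } catch (IllegalStateException e) {
            System.out.println("strict: " + e.getMessage());
        }
        System.out.println("lenient: " + applyUpdates(cached, updates, false).size() + " partitions");
    }
}
```

In this model the strict variant throws before p=1 is added, which corresponds to the missing partition in the output above, while the lenient variant ends up with both partitions.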