[ 
https://issues.apache.org/jira/browse/IMPALA-10283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang resolved IMPALA-10283.
-------------------------------------
    Fix Version/s: Impala 4.0
       Resolution: Fixed

> IllegalStateException in applying incremental partition updates
> ---------------------------------------------------------------
>
>                 Key: IMPALA-10283
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10283
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 4.0
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>             Fix For: Impala 4.0
>
>
> When incremental metadata updates are enabled (the default), catalogd sends 
> incremental partition updates based on the last sent table snapshot. 
> Coordinators apply these partition updates to their existing table 
> snapshots.
> Each partition update is also called a partition instance. Partition 
> instances are identified by partition ids. Each partition instance is a 
> snapshot of the metadata of a partition. When applying incremental partition 
> updates, there is a Precondition check that assumes new partition updates 
> never reuse existing partition ids: 
> [https://github.com/apache/impala/blob/3ba8d637cdf38a68e25e573afa8d1d05047df2f6/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L515]
> The motivation of this check is to detect whether catalogd is sending 
> duplicated partition updates. However, it can also be hit when the 
> coordinator has a newer version of the table than the last sent table 
> snapshot in catalogd. This happens when two coordinators both execute DMLs on 
> the same table, and the DMLs finish within one catalog topic update time 
> window. Note that a coordinator receives a table snapshot from catalogd in 
> the response to its DML request. So one of the coordinators will have a table 
> version that is lower than the latest version in catalogd but higher than the 
> last sent table version in catalogd. When applying incremental partition 
> updates on this coordinator, the Precondition check will be hit. We should 
> remove this check to accept this case.
> To reproduce the issue, create a partitioned table and warm up its metadata 
> cache by running any query on it (e.g. describe):
> {code:sql}
> create table multi_inserts_tbl (id int) partitioned by (p int);
> desc multi_inserts_tbl;
> {code}
> Run two inserts using two different coordinators in one command:
> {code:java}
> bin/impala-shell.sh -q "insert into multi_inserts_tbl partition (p) values (0, 0)"; bin/impala-shell.sh -i localhost:21051 -q "insert into multi_inserts_tbl partition (p) values (1, 1)"
> {code}
> We may find the following IllegalStateException in the logs of the first 
> coordinator:
> {code:java}
> I1026 14:54:06.127398 11497 ImpaladCatalog.java:224] Adding: TABLE:default.multi_inserts_tbl version: 1495 size: 1464
> I1026 14:54:06.127557 11497 ImpaladCatalog.java:224] Adding: CATALOG_SERVICE_ID version: 1495 size: 60
> I1026 14:54:06.127887 11497 ImpaladCatalog.java:249] Adding 2 partition(s): HDFS_PARTITION:default.multi_inserts_tbl:(p=0,p=1), version=1495, size=(avg=597, min=597, max=597, sum=1194)
> E1026 14:54:06.134311 11497 ImpaladCatalog.java:256] Error adding catalog object: null
> Java exception follows:
> java.lang.IllegalStateException
>         at com.google.common.base.Preconditions.checkState(Preconditions.java:492)
>         at org.apache.impala.catalog.ImpaladCatalog.addTable(ImpaladCatalog.java:515)
>         at org.apache.impala.catalog.ImpaladCatalog.addCatalogObject(ImpaladCatalog.java:325)
>         at org.apache.impala.catalog.ImpaladCatalog.updateCatalog(ImpaladCatalog.java:254)
>         at org.apache.impala.service.FeCatalogManager$CatalogdImpl.updateCatalogCache(FeCatalogManager.java:114)
>         at org.apache.impala.service.Frontend.updateCatalogCache(Frontend.java:378)
>         at org.apache.impala.service.JniFrontend.updateCatalogCache(JniFrontend.java:178)
> {code}
> This leaves the first coordinator unable to update the table to the latest 
> version:
> {code:java}
> $ bin/impala-shell.sh -q "show partitions multi_inserts_tbl"
> Starting Impala Shell with no authentication using Python 2.7.16
> Warning: live_progress only applies to interactive shell sessions, and is being skipped for now.
> Opened TCP connection to localhost:21050
> Connected to localhost:21050
> Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build 3ba8d637cdf38a68e25e573afa8d1d05047df2f6)
> Query: show partitions multi_inserts_tbl
> +-------+-------+--------+------+--------------+-------------------+--------+-------------------+-------------------------------------------------------------+
> | p     | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | Incremental stats | Location                                                    |
> +-------+-------+--------+------+--------------+-------------------+--------+-------------------+-------------------------------------------------------------+
> | 0     | -1    | 1      | 2B   | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/multi_inserts_tbl/p=0 |
> | Total | -1    | 1      | 2B   | 0B           |                   |        |                   |                                                             |
> +-------+-------+--------+------+--------------+-------------------+--------+-------------------+-------------------------------------------------------------+
> Fetched 2 row(s) in 0.01s
> {code}
> Partition p=1 is missing.
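
The version race described in the issue can be illustrated with a small, self-contained Java sketch. It is a toy model under stated assumptions, not Impala's actual code: the class PartitionUpdateSketch, its methods, the partition ids (100, 101) and the table versions (1, 2, 3) are all hypothetical. The strict method imitates a duplicate-id Precondition check; the tolerant method imitates what dropping that check would allow.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a coordinator's cached table. All names here are hypothetical. */
class PartitionUpdateSketch {
  // partition id -> partition name; each id stands for one partition instance.
  final Map<Long, String> partitions = new TreeMap<>();
  long version;

  /** Replaces the cache with a full table snapshot, as in a DML response. */
  void applySnapshot(long newVersion, Map<Long, String> snapshot) {
    partitions.clear();
    partitions.putAll(snapshot);
    version = newVersion;
  }

  /** Strict apply: rejects the whole update if any partition id is already cached. */
  void applyIncrementalStrict(long newVersion, Map<Long, String> updates) {
    for (Long id : updates.keySet()) {
      if (partitions.containsKey(id)) {
        throw new IllegalStateException("Duplicate partition id: " + id);
      }
    }
    partitions.putAll(updates);
    version = newVersion;
  }

  /** Tolerant apply: a partition id that is already cached is simply accepted. */
  void applyIncrementalTolerant(long newVersion, Map<Long, String> updates) {
    partitions.putAll(updates);
    version = newVersion;
  }

  public static void main(String[] args) {
    // Assumed timeline: catalogd last sent version 1 (no partitions).
    // Coordinator A inserts p=0 and gets snapshot version 2 in the DML response.
    PartitionUpdateSketch coordA = new PartitionUpdateSketch();
    coordA.applySnapshot(2, Map.of(100L, "p=0"));

    // Coordinator B then inserts p=1 (version 3). The next topic update is
    // computed against version 1, so it carries both partition instances.
    Map<Long, String> topicUpdate = Map.of(100L, "p=0", 101L, "p=1");

    try {
      coordA.applyIncrementalStrict(3, topicUpdate);
    } catch (IllegalStateException e) {
      // Coordinator A stays at version 2 and never sees p=1.
      System.out.println("strict apply failed: " + e.getMessage()
          + ", cached=" + coordA.partitions.values());
    }

    coordA.applyIncrementalTolerant(3, topicUpdate);
    System.out.println("tolerant apply: version=" + coordA.version
        + ", cached=" + coordA.partitions.values());
  }
}
```

In this model the strict path leaves coordinator A at version 2 with only p=0 cached, which matches the missing p=1 in the show partitions output above, while the tolerant path brings it to the latest version with both partitions.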



