[ 
https://issues.apache.org/jira/browse/IMPALA-14695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18062720#comment-18062720
 ] 

Quanlong Huang commented on IMPALA-14695:
-----------------------------------------

Take an ALTER PARTITION statement as an example:
{code:sql}
alter table functional.alltypes partition(year=2010, month=1) set tblproperties('key'='value')
{code}
From a cold start (i.e. the cache is empty), the coordinator takes 7 
TGetPartialCatalogObjectRequests to fetch the required metadata:
 # Get db list
 # Get table list of "functional" db
 # Get table list of "default" db
 # Get hmsTable of "functional.alltypes" table
 # Get column stats of "functional.alltypes" table
 # Get partition list (name+id) of "functional.alltypes" table
 # Get single partition (id=12)

In practice, we can assume the db list and table lists are already cached. They 
are usually not the cause of InconsistentMetadataFetchException. So the 
coordinator usually sends only the last 4 requests. If the table version changes 
in catalogd during any of these requests, the coordinator throws 
InconsistentMetadataFetchException and redoes the query planning (including 
re-fetching the metadata), which causes the performance regression.
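
To make the retry behavior concrete, here is a minimal sketch of it. It is not Impala code: PlanRetrySketch, InconsistentFetchSketchException, fetchAll() and planWithRetries() are hypothetical names, and each "request" is reduced to observing the table version. The point is only that every extra request is another chance to see a different version and restart planning.

```java
import java.util.function.LongSupplier;

/** Hypothetical stand-in for Impala's InconsistentMetadataFetchException. */
class InconsistentFetchSketchException extends RuntimeException {
  InconsistentFetchSketchException(String msg) { super(msg); }
}

final class PlanRetrySketch {
  private PlanRetrySketch() {}

  /**
   * Simulates numRequests metadata fetches. Each fetch observes the current
   * table version; if any fetch sees a version different from the first one,
   * the whole set of fetches is considered inconsistent.
   */
  static long fetchAll(int numRequests, LongSupplier tableVersion) {
    long first = tableVersion.getAsLong();
    for (int i = 1; i < numRequests; i++) {
      long seen = tableVersion.getAsLong();
      if (seen != first) {
        throw new InconsistentFetchSketchException(
            "table version changed from " + first + " to " + seen);
      }
    }
    return first;
  }

  /** Returns the number of planning attempts needed to get a consistent fetch. */
  static int planWithRetries(int numRequests, LongSupplier tableVersion, int maxRetries) {
    for (int attempt = 1; attempt <= maxRetries + 1; attempt++) {
      try {
        fetchAll(numRequests, tableVersion);
        return attempt;
      } catch (InconsistentFetchSketchException e) {
        // Planning restarts from scratch, re-fetching all the metadata.
      }
    }
    throw new IllegalStateException("still inconsistent after " + maxRetries + " retries");
  }
}
```

With a single request per plan, fetchAll() can never observe two different versions, which is the motivation for merging requests.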

Of the last 4 requests, the first 3 are sent in StmtMetadataLoader before 
analyzing the statement. Code path:
{noformat}
StmtMetadataLoader.getMissingTables() -> LocalDb.getTable() -> LocalTable.load()
  -> LocalTable.loadTableMetadata() -> CatalogdMetaProvider.loadTable()  # Get hmsTable
  -> LocalFsTable.loadColumnStats()
    -> CatalogdMetaProvider.loadTableColumnStatistics()  # Get column stats
    -> LocalFsTable.loadPartitionValueMap() -> CatalogdMetaProvider.loadPartitionList()  # Get partition list
{noformat}
Some points for optimization:
 * We shouldn't always load the column stats, since most DDLs don't need 
them. Note that StmtMetadataLoader is aware of the statement type. There is a 
[TODO|https://github.com/apache/impala/blob/31769a7fb50ae1d6b6d69d366a776df441e00e3a/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java#L150-L154]
 in the code.
 * While fetching the hmsTable, we can also fetch the partition list, 
especially for statements that involve partitions. StmtMetadataLoader will then 
trigger just one TGetPartialCatalogObjectRequest. The difficulty might be in 
CatalogdMetaProvider, which needs a new method that updates two cache items. We 
also need to deal with piggyback requests, e.g. when there is already an 
in-flight request fetching only the hmsTable.
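
A rough sketch of the second point, assuming a single combined response can carry both the hmsTable and the partition list. All names here (CombinedTableFetchSketch, TableAndPartitions, load(), the two cache maps) are hypothetical and do not reflect CatalogdMetaProvider's real cache structure. It only covers callers piggybacking on an in-flight combined request; the harder case of an in-flight request that fetches only the hmsTable is not handled.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class CombinedTableFetchSketch {
  /** Hypothetical combined response: both pieces come from one table version. */
  static final class TableAndPartitions {
    final long tableVersion;
    final String hmsTable;
    final List<String> partitionNames;

    TableAndPartitions(long tableVersion, String hmsTable, List<String> partitionNames) {
      this.tableVersion = tableVersion;
      this.hmsTable = hmsTable;
      this.partitionNames = partitionNames;
    }
  }

  // Two separate cache items that one combined response has to update.
  final Map<String, String> hmsTableCache = new ConcurrentHashMap<>();
  final Map<String, List<String>> partitionListCache = new ConcurrentHashMap<>();

  private final Map<String, CompletableFuture<TableAndPartitions>> inFlight =
      new ConcurrentHashMap<>();
  // Stand-in for the single request sent to catalogd.
  private final Function<String, TableAndPartitions> rpc;

  CombinedTableFetchSketch(Function<String, TableAndPartitions> rpc) {
    this.rpc = rpc;
  }

  /** Loads table and partition list in one request, sharing in-flight loads. */
  CompletableFuture<TableAndPartitions> load(String tableName) {
    CompletableFuture<TableAndPartitions> mine = new CompletableFuture<>();
    CompletableFuture<TableAndPartitions> existing = inFlight.putIfAbsent(tableName, mine);
    if (existing != null) return existing;  // Piggyback on the in-flight request.
    try {
      TableAndPartitions result = rpc.apply(tableName);
      hmsTableCache.put(tableName, result.hmsTable);
      partitionListCache.put(tableName, result.partitionNames);
      mine.complete(result);
    } catch (RuntimeException e) {
      mine.completeExceptionally(e);
    } finally {
      inFlight.remove(tableName, mine);
    }
    return mine;
  }
}
```

The thread that registers the future performs the request itself; any other caller arriving in the meantime gets the same future and waits on it instead of sending a second request.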

Once the 3 requests are merged into one, the coordinator won't hit 
InconsistentMetadataFetchException until the last request (fetching the single 
partition). We can improve this by allowing partition instances from older 
table versions to be used, as long as the partition ids match. This relies on 
the fact that HdfsPartition instances are immutable in catalogd and each has a 
unique partition id (within the same table). The check that needs to be improved is:
[https://github.com/apache/impala/blob/31769a7fb50ae1d6b6d69d366a776df441e00e3a/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java#L533-L543]
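
The relaxation could look roughly like the sketch below. It is an illustration of the idea only and does not mirror the linked code: PartitionReuseSketch, CachedPartition, usableStrict() and usableRelaxed() are hypothetical names, and the strict variant is an assumed simplification of a version-based check.

```java
import java.util.Set;

final class PartitionReuseSketch {
  private PartitionReuseSketch() {}

  /** Hypothetical cached partition entry, tagged with the table version it was fetched at. */
  static final class CachedPartition {
    final long id;
    final long tableVersion;
    final String name;

    CachedPartition(long id, long tableVersion, String name) {
      this.id = id;
      this.tableVersion = tableVersion;
      this.name = name;
    }
  }

  /** Version-based check: any table version change makes the instance unusable. */
  static boolean usableStrict(CachedPartition p, long currentTableVersion) {
    return p.tableVersion == currentTableVersion;
  }

  /**
   * Id-based check: since a partition instance is immutable and its id is unique
   * within the table, an instance is still valid if its id appears in the
   * current partition list, regardless of the table version it came from.
   */
  static boolean usableRelaxed(CachedPartition p, Set<Long> currentPartitionIds) {
    return currentPartitionIds.contains(p.id);
  }
}
```

For example, a partition with id 12 fetched at table version 19032 fails the strict check once the table is at version 19041, but passes the relaxed one as long as id 12 is still in the current partition list.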

This would be helpful when users submit a batch of partition DDLs, each 
modifying a different partition. Each statement will invalidate the table 
(hmsTable) and the partition list. But the new partition list just changes one 
partition id. For other partitions in the list, the partition ids remain the 
same as in the old partition list.
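
As a small illustration of the batch case, the sketch below models a partition list as a name-to-id map and computes which ids the coordinator would still have to fetch after one partition is altered. PartitionListDiffSketch, afterAlter() and idsToFetch() are hypothetical helpers, and the partition names and ids are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class PartitionListDiffSketch {
  private PartitionListDiffSketch() {}

  /**
   * Returns a new partition list (name -> id) in which only the altered
   * partition is replaced by a new instance with a new id.
   */
  static Map<String, Long> afterAlter(
      Map<String, Long> oldList, String alteredPartition, long newId) {
    if (!oldList.containsKey(alteredPartition)) {
      throw new IllegalArgumentException("unknown partition: " + alteredPartition);
    }
    Map<String, Long> newList = new LinkedHashMap<>(oldList);
    newList.put(alteredPartition, newId);
    return newList;
  }

  /** Ids in the new list for which the coordinator has no cached instance. */
  static Set<Long> idsToFetch(Map<String, Long> newList, Set<Long> cachedIds) {
    Set<Long> missing = new TreeSet<>(newList.values());
    missing.removeAll(cachedIds);
    return missing;
  }
}
```

If the cache holds ids 10, 11 and 12 and the first partition is replaced by id 13, only id 13 is missing; the instances for 11 and 12 can be reused even though the table version changed.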

> Fast path for simple partition queries to fetch missing metadata from 
> catalogd in batch
> ---------------------------------------------------------------------------------------
>
>                 Key: IMPALA-14695
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14695
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog, Frontend
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>
> It's a regression of local catalog mode (i.e. catalog-v2) that query planning 
> needs to retry when metadata changes in catalogd side. Example coordinator 
> logs:
> {code:java}
> W0121 11:23:13.707223 317257 CatalogdMetaProvider.java:469] 
> 0f4d14a4e9abc477:7cfdb09400000000] Catalog object TCatalogObject(type:TABLE, 
> catalog_version:19032, table:TTable(db_name:mydb, tbl_name:mytbl)) changed 
> version from 19032 to 19041 while fetching metadata
> W0121 11:23:20.107473 317257 Frontend.java:2127] 
> 0f4d14a4e9abc477:7cfdb09400000000] Retrying plan of query alter table 
> mydb.mytbl partition (p='2268357') set tblproperties('numRows'='1325', 
> 'STATS_GENERATED_VIA_STATS_TASK'='true'): Catalog object 
> TCatalogObject(type:TABLE, catalog_version:19032, table:TTable(db_name:mydb, 
> tbl_name:mytbl)) changed version between accesses. (retry #32 of 40)
> {code}
> For simple queries like REFRESH PARTITION, COMPUTE INCREMENTAL STATS on a 
> single partition, or ALTER TABLE on a single partition, if some metadata is 
> missing in coordinator's local cache, we can consider sending a single batch 
> request to catalogd to fetch all the metadata the query needs, thus to avoid 
> hitting InconsistentMetadataFetchException which requires retries.


