Quanlong Huang created IMPALA-14695:
---------------------------------------
Summary: Fast path for simple partition queries to fetch missing
metadata from catalogd in batch
Key: IMPALA-14695
URL: https://issues.apache.org/jira/browse/IMPALA-14695
Project: IMPALA
Issue Type: Bug
Components: Catalog, Frontend
Reporter: Quanlong Huang
Assignee: Quanlong Huang
It is a regression in local catalog mode (i.e. catalog-v2) that query planning
has to be retried when metadata changes on the catalogd side. Example coordinator logs:
{code:java}
W0121 11:23:13.707223 317257 CatalogdMetaProvider.java:469]
0f4d14a4e9abc477:7cfdb09400000000] Catalog object TCatalogObject(type:TABLE,
catalog_version:19032, table:TTable(db_name:mydb, tbl_name:mytbl)) changed
version from 19032 to 19041 while fetching metadata
W0121 11:23:20.107473 317257 Frontend.java:2127]
0f4d14a4e9abc477:7cfdb09400000000] Retrying plan of query alter table
mydb.mytbl partition (p='2268357') set tblproperties('numRows'='1325',
'STATS_GENERATED_VIA_STATS_TASK'='true'): Catalog object
TCatalogObject(type:TABLE, catalog_version:19032, table:TTable(db_name:mydb,
tbl_name:mytbl)) changed version between accesses. (retry #32 of 40)
{code}
For simple queries such as REFRESH PARTITION, COMPUTE INCREMENTAL STATS on a
single partition, or ALTER TABLE on a single partition, if some metadata is
missing from the coordinator's local cache, we can consider sending a single
batch request to catalogd that fetches all the metadata the query needs. This
avoids hitting InconsistentMetadataFetchException, which requires retries.
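
To illustrate the idea, here is a minimal toy sketch in Java. It is not Impala's
actual code: ToyCatalogd, fetchOneByOne, fetchInBatch and
InconsistentFetchException are hypothetical names made up for this example, and
the "metadata pieces" are plain strings. It only models the assumption that
several separate requests can observe different table versions, while one batch
request is answered from a single version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model only; every name here is hypothetical, not Impala's real API. */
class BatchFetchSketch {
  /** Stand-in for the inconsistent-fetch error that forces a planning retry. */
  static class InconsistentFetchException extends RuntimeException {
    InconsistentFetchException(String msg) { super(msg); }
  }

  /** One reply from the toy catalogd: the table version and the pieces served. */
  static class Reply {
    final long version;
    final List<String> pieces;
    Reply(long version, List<String> pieces) {
      this.version = version;
      this.pieces = pieces;
    }
  }

  /** Toy catalogd with a single table whose version bumps on every change. */
  static class ToyCatalogd {
    private long version = 19032;
    int requestsServed = 0;

    /** Simulates a concurrent DDL/stats update on the table. */
    synchronized void concurrentChange() { version++; }

    /** Serves all requested pieces at one version, within one request. */
    synchronized Reply fetch(List<String> wanted) {
      requestsServed++;
      return new Reply(version, new ArrayList<>(wanted));
    }
  }

  /** Piecemeal path: one request per piece; fails if the version moves. */
  static List<String> fetchOneByOne(
      ToyCatalogd catalogd, List<String> wanted, Runnable betweenRequests) {
    List<String> result = new ArrayList<>();
    long pinned = -1;  // version seen by the first request
    for (String piece : wanted) {
      Reply r = catalogd.fetch(Collections.singletonList(piece));
      if (pinned == -1) pinned = r.version;
      if (r.version != pinned) {
        throw new InconsistentFetchException(
            "version changed from " + pinned + " to " + r.version);
      }
      result.addAll(r.pieces);
      betweenRequests.run();  // hook to simulate changes between requests
    }
    return result;
  }

  /** Fast path: one request for everything, so only one version is observed. */
  static List<String> fetchInBatch(ToyCatalogd catalogd, List<String> wanted) {
    return catalogd.fetch(wanted).pieces;
  }

  public static void main(String[] args) {
    List<String> wanted =
        Arrays.asList("table info", "partition list", "partition p=2268357");
    ToyCatalogd catalogd = new ToyCatalogd();
    try {
      // The table changes after each request, so the second fetch mismatches.
      fetchOneByOne(catalogd, wanted, catalogd::concurrentChange);
    } catch (InconsistentFetchException e) {
      System.out.println("piecemeal fetch needs a retry: " + e.getMessage());
    }
    System.out.println("batch fetch got: " + fetchInBatch(catalogd, wanted));
  }
}
```

In this toy model the piecemeal path fails as soon as the table changes between
two requests, whereas the batch path issues exactly one request and therefore
has no "between accesses" window. Whether the real fix can collect every piece
of metadata such a query needs up front is the open design question of this
ticket.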