Quanlong Huang created IMPALA-13403:
---------------------------------------
Summary: Trivial changes in StorageDescriptor of ALTER_TABLE event
are not enough to decide that file metadata reload can be skipped
Key: IMPALA-13403
URL: https://issues.apache.org/jira/browse/IMPALA-13403
Project: IMPALA
Issue Type: Bug
Components: Catalog
Reporter: Quanlong Huang
Assignee: Sai Hemanth Gantasala
IMPALA-12487 adds an optimization: if an ALTER_TABLE event has only trivial
changes in StorageDescriptor (e.g. removing the optional field
'storedAsSubDirectories'=false, which defaults to false), file metadata reload
is skipped, regardless of what changed in the table properties
(parameters):
{code:java}
boolean skipFileMetadata = false;
if (isFieldSchemaChanged(beforeTable, afterTable) ||
    isTableOwnerChanged(beforeTable.getOwner(), afterTable.getOwner())) {
  skipFileMetadata = true;
} else if (!Objects.equals(beforeTable.getSd(), afterTable.getSd())) {
  if (isTrivialSdPropsChanged(beforeTable.getSd(), afterTable.getSd())) {
    skipFileMetadata = true;
  }
} else if (!isCustomTblPropsChanged(whitelistedTblProperties, beforeTable,
    afterTable)) {
  skipFileMetadata = true;
}
return skipFileMetadata;
{code}
[https://github.com/apache/impala/blob/11396d3146dfa2193420f79ec284f5212f058982/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L1940-L1944]
This is problematic because some HMS clients (e.g. Spark) can modify both the
table properties and the StorageDescriptor in the same ALTER_TABLE event. If
there are non-trivial changes in the table properties (e.g. a 'location'
change), we shouldn't skip reloading file metadata.
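To make the gap concrete, here is a minimal, self-contained sketch. It is not
the actual patch and does not use the real HMS or Impala classes:
TableSnapshot, SkipReloadSketch, skipAsReported, skipWithPropsCheck and the
property name 'example.reload.prop' are all hypothetical. It assumes that a
"trivial" SD change means an optional field going from its explicit default to
unset, and that the whitelisted properties are the ones whose change requires a
reload. The schema/owner branch is left out to keep the focus on the
SD-vs-properties interaction.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical, simplified stand-in for an HMS table; not the real metastore API.
final class TableSnapshot {
  final String location;                 // SD field whose change is non-trivial
  final Boolean storedAsSubDirectories;  // optional SD field; null means unset (defaults to false)
  final Map<String, String> params;      // table properties

  TableSnapshot(String location, Boolean storedAsSubDirectories,
      Map<String, String> params) {
    this.location = location;
    this.storedAsSubDirectories = storedAsSubDirectories;
    this.params = params;
  }
}

final class SkipReloadSketch {
  static boolean sdIdentical(TableSnapshot a, TableSnapshot b) {
    return Objects.equals(a.location, b.location)
        && Objects.equals(a.storedAsSubDirectories, b.storedAsSubDirectories);
  }

  // Assumption: a change is trivial when the location is the same and the
  // optional flag has the same effective value (unset counts as false).
  static boolean isTrivialSdChange(TableSnapshot a, TableSnapshot b) {
    return Objects.equals(a.location, b.location)
        && Boolean.TRUE.equals(a.storedAsSubDirectories)
            == Boolean.TRUE.equals(b.storedAsSubDirectories);
  }

  // Assumption: true if any property listed in reloadProps has a different value.
  static boolean isCustomTblPropsChanged(Set<String> reloadProps,
      TableSnapshot a, TableSnapshot b) {
    for (String key : reloadProps) {
      if (!Objects.equals(a.params.get(key), b.params.get(key))) return true;
    }
    return false;
  }

  // Mirrors the branch structure quoted in this issue: once the SDs differ,
  // table properties are never consulted.
  static boolean skipAsReported(Set<String> reloadProps,
      TableSnapshot before, TableSnapshot after) {
    if (!sdIdentical(before, after)) {
      return isTrivialSdChange(before, after);
    }
    return !isCustomTblPropsChanged(reloadProps, before, after);
  }

  // One possible shape of a fix: a trivial SD change only allows skipping
  // when the table properties also allow it.
  static boolean skipWithPropsCheck(Set<String> reloadProps,
      TableSnapshot before, TableSnapshot after) {
    boolean sdAllowsSkip =
        sdIdentical(before, after) || isTrivialSdChange(before, after);
    return sdAllowsSkip && !isCustomTblPropsChanged(reloadProps, before, after);
  }

  public static void main(String[] args) {
    Set<String> reloadProps = Set.of("example.reload.prop");
    TableSnapshot before = new TableSnapshot("/warehouse/t", Boolean.FALSE,
        Map.of("example.reload.prop", "v1"));
    // Same event drops the optional SD field and changes a relevant property.
    TableSnapshot after = new TableSnapshot("/warehouse/t", null,
        Map.of("example.reload.prop", "v2"));
    System.out.println("as reported: "
        + skipAsReported(reloadProps, before, after));      // as reported: true
    System.out.println("with props check: "
        + skipWithPropsCheck(reloadProps, before, after));  // with props check: false
  }
}
```

In this sketch the trivial SD difference no longer short-circuits the
decision; whether the real fix should be structured this way is open for
discussion.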
CC [~hemanth619], [~VenuReddy], [~csringhofer]