[ https://issues.apache.org/jira/browse/IMPALA-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17912047#comment-17912047 ]
ASF subversion and git services commented on IMPALA-13403:
----------------------------------------------------------
Commit 1f7b9601e5a768c0b2061fef95c750ae74059b84 in impala's branch
refs/heads/master from Sai Hemanth Gantasala
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=1f7b9601e ]
IMPALA-13403: Refactor the checks of skip reloading file metadata for
ALTER_TABLE events
IMPALA-12487 adds an optimization that if an ALTER_TABLE event has
trivial changes in StorageDescriptor (e.g. removing optional field
'storedAsSubDirectories'=false which defaults to false), file
metadata reload will be skipped, no matter what changes are in the
table properties. This is problematic since some HMS clients (e.g.
Spark) could modify both the table properties and StorageDescriptor.
If there is a non-trivial change in table properties (e.g. a 'location'
change), we shouldn't skip reloading file metadata.
Testing:
- Added a unit test to verify this behavior
Change-Id: Ia969dd32385ac5a1a9a65890a5ccc8cd257f4b97
Reviewed-on: http://gerrit.cloudera.org:8080/21971
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Trivial changes in StorageDescriptor of ALTER_TABLE event is not enough to
> decide file metadata reload can be skipped
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: IMPALA-13403
> URL: https://issues.apache.org/jira/browse/IMPALA-13403
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Reporter: Quanlong Huang
> Assignee: Sai Hemanth Gantasala
> Priority: Critical
>
> IMPALA-12487 adds an optimization that if an ALTER_TABLE event has trivial
> changes in StorageDescriptor (e.g. removing optional field
> 'storedAsSubDirectories'=false which defaults to false), file metadata reload
> will be skipped, no matter what changes are in the table properties
> (parameters):
> {code:java}
> boolean skipFileMetadata = false;
> if (isFieldSchemaChanged(beforeTable, afterTable) ||
>     isTableOwnerChanged(beforeTable.getOwner(), afterTable.getOwner())) {
>   skipFileMetadata = true;
> } else if (!Objects.equals(beforeTable.getSd(), afterTable.getSd())) {
>   if (isTrivialSdPropsChanged(beforeTable.getSd(), afterTable.getSd())) {
>     skipFileMetadata = true;
>   }
> } else if (!isCustomTblPropsChanged(whitelistedTblProperties, beforeTable,
>     afterTable)) {
>   skipFileMetadata = true;
> }
> return skipFileMetadata;
> {code}
> [https://github.com/apache/impala/blob/11396d3146dfa2193420f79ec284f5212f058982/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L1940-L1944]
> This is problematic since some HMS clients (e.g. Spark) could modify both the
> table properties and StorageDescriptor. If there is a non-trivial change in
> table properties (e.g. a 'location' change), we shouldn't skip reloading file
> metadata.
> CC [~hemanth619], [~VenuReddy], [~csringhofer]
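
To make the problem concrete, here is a minimal, self-contained sketch of the
two StorageDescriptor/table-property branches from the snippet quoted above,
next to one possible reordering of the checks. It is not the code from the
commit: the class name SkipReloadSketch, the method names skipOriginal and
skipRevised, and the three boolean inputs are hypothetical stand-ins for the
results of comparing beforeTable and afterTable, and the schema/owner branch
is left out for brevity.

```java
public class SkipReloadSketch {

  // Mirrors the SD and table-property branches of the quoted snippet.
  // All parameters are hypothetical summaries of the before/after comparison.
  static boolean skipOriginal(boolean sdChanged, boolean sdChangeTrivial,
      boolean tblPropsChanged) {
    if (sdChanged) {
      // Once the SD differs, table properties are never consulted.
      return sdChangeTrivial;
    }
    return !tblPropsChanged;
  }

  // One possible fix (an assumption, not the committed change): a trivial SD
  // change no longer short-circuits the table-property check.
  static boolean skipRevised(boolean sdChanged, boolean sdChangeTrivial,
      boolean tblPropsChanged) {
    if (sdChanged && !sdChangeTrivial) {
      // A non-trivial SD change always requires a reload.
      return false;
    }
    // SD is unchanged or only trivially changed: the table properties decide.
    return !tblPropsChanged;
  }

  public static void main(String[] args) {
    // Spark-like event: trivial SD change plus a relevant table-property change.
    System.out.println("original skips reload: " + skipOriginal(true, true, true));
    System.out.println("revised skips reload:  " + skipRevised(true, true, true));
  }
}
```

The only input on which the two methods disagree is a trivial SD change
combined with a relevant table-property change, which is the case described
in this issue.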