Quanlong Huang created IMPALA-13403:
---------------------------------------

             Summary: Trivial changes in StorageDescriptor of ALTER_TABLE event 
is not enough to decide file metadata reload can be skipped
                 Key: IMPALA-13403
                 URL: https://issues.apache.org/jira/browse/IMPALA-13403
             Project: IMPALA
          Issue Type: Bug
          Components: Catalog
            Reporter: Quanlong Huang
            Assignee: Sai Hemanth Gantasala


IMPALA-12487 added an optimization: if an ALTER_TABLE event has only trivial 
changes in StorageDescriptor (e.g. removing the optional field 
'storedAsSubDirectories'=false, which defaults to false), the file metadata 
reload is skipped, regardless of what changed in the table properties 
(parameters):
{code:java}
      boolean skipFileMetadata = false;
      if (isFieldSchemaChanged(beforeTable, afterTable) ||
          isTableOwnerChanged(beforeTable.getOwner(), afterTable.getOwner())) {
        skipFileMetadata = true;
      } else if (!Objects.equals(beforeTable.getSd(), afterTable.getSd())) {
        if (isTrivialSdPropsChanged(beforeTable.getSd(), afterTable.getSd())) {
          skipFileMetadata = true;
        }
      } else if (!isCustomTblPropsChanged(whitelistedTblProperties, beforeTable,
          afterTable)) {
        skipFileMetadata = true;
      }
      return skipFileMetadata;{code}
[https://github.com/apache/impala/blob/11396d3146dfa2193420f79ec284f5212f058982/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L1940-L1944]

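This is problematic since some HMS clients (e.g. Spark) could modify both the 
table properties and StorageDescriptor in the same ALTER_TABLE event. Because 
the StorageDescriptor check is an else-if branch that comes before the 
table-property check, isCustomTblPropsChanged() is never consulted once the 
StorageDescriptor differs. If there is a non-trivial change in table 
properties (e.g. a 'location' change), we shouldn't skip reloading file 
metadata.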
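
One possible shape of a fix is sketched below. This is not the actual Impala patch, only a minimal illustration under stated assumptions: TableSnapshot is a hypothetical, flattened stand-in for the HMS Table object, normalizeSd() and isReloadPropChanged() are hypothetical helpers, and the set of properties that require a reload (reloadProps) is assumed to be supplied by the caller. The idea is that a trivial StorageDescriptor change no longer short-circuits the decision; the table properties are checked as well.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

class SkipFileMetadataSketch {
  /** Hypothetical, simplified stand-in for an HMS Table; NOT the real Thrift class. */
  static final class TableSnapshot {
    final String schema;
    final String owner;
    final Map<String, String> sd;      // StorageDescriptor fields, flattened
    final Map<String, String> params;  // table properties (parameters)

    TableSnapshot(String schema, String owner, Map<String, String> sd,
        Map<String, String> params) {
      this.schema = schema;
      this.owner = owner;
      this.sd = sd;
      this.params = params;
    }
  }

  /** Hypothetical helper: drops optional SD fields that carry their default value. */
  static Map<String, String> normalizeSd(Map<String, String> sd) {
    Map<String, String> copy = new HashMap<>(sd);
    copy.remove("storedAsSubDirectories", "false");  // removed only if value is "false"
    return copy;
  }

  /** True if the SDs are identical or differ only by default-valued optional fields. */
  static boolean isSdUnchangedOrTrivial(TableSnapshot before, TableSnapshot after) {
    return normalizeSd(before.sd).equals(normalizeSd(after.sd));
  }

  /** Assumed semantics: true if any property listed in reloadProps differs. */
  static boolean isReloadPropChanged(Set<String> reloadProps, TableSnapshot before,
      TableSnapshot after) {
    for (String key : reloadProps) {
      if (!Objects.equals(before.params.get(key), after.params.get(key))) return true;
    }
    return false;
  }

  static boolean canSkipFileMetadataReload(TableSnapshot before, TableSnapshot after,
      Set<String> reloadProps) {
    // Mirrors the first branch of the quoted snippet unchanged.
    if (!Objects.equals(before.schema, after.schema)
        || !Objects.equals(before.owner, after.owner)) {
      return true;
    }
    // Changed: a trivial SD diff alone is not sufficient; the table
    // properties must also be free of reload-relevant changes.
    return isSdUnchangedOrTrivial(before, after)
        && !isReloadPropChanged(reloadProps, before, after);
  }

  public static void main(String[] args) {
    TableSnapshot before = new TableSnapshot("c1 int", "alice",
        Map.of("storedAsSubDirectories", "false"), Map.of("location", "/warehouse/a"));
    // Trivial SD change combined with a 'location' property change.
    TableSnapshot after = new TableSnapshot("c1 int", "alice",
        Map.of(), Map.of("location", "/warehouse/b"));
    System.out.println(canSkipFileMetadataReload(before, after, Set.of("location")));  // false
  }
}
```

With this ordering, the case described above (trivial StorageDescriptor change plus a 'location' property change) results in a file metadata reload, while a trivial StorageDescriptor change on its own can still be skipped.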

CC [~hemanth619], [~VenuReddy], [~csringhofer] 


