[
https://issues.apache.org/jira/browse/IMPALA-14646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18051096#comment-18051096
]
ASF subversion and git services commented on IMPALA-14646:
----------------------------------------------------------
Commit de0e417d3e33348c35555c7d4133a466476880b0 in impala's branch
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=de0e417d3 ]
IMPALA-14646: StorageDescriptor normalization should deal with parameters
When checking whether an ALTER_TABLE event has trivial changes in
StorageDescriptor, we normalize fields that are unrelated to file
metadata, e.g. cols, bucketCols, sortCols, etc. However, the parameters
map of StorageDescriptor is not normalized, which causes a null map and
an empty map to be considered different.
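A minimal sketch of the problem and of one way to normalize it is shown
here. This is not the code from the patch: the Sd class is a hypothetical
stand-in for the Hive StorageDescriptor, and normalizeParameters() and
isTrivialChange() are illustrative names only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class SdParamsNormalization {
  /** Hypothetical stand-in for the metastore StorageDescriptor (two fields only). */
  static final class Sd {
    final String location;
    final Map<String, String> parameters; // may be null, as in the reported log

    Sd(String location, Map<String, String> parameters) {
      this.location = location;
      this.parameters = parameters;
    }
  }

  /** Maps null to an empty map so that null and {} compare as equal. */
  static Map<String, String> normalizeParameters(Map<String, String> params) {
    return params == null ? new HashMap<>() : new HashMap<>(params);
  }

  /** Assumed check: the change is trivial if the normalized fields match. */
  static boolean isTrivialChange(Sd before, Sd after) {
    return Objects.equals(before.location, after.location)
        && normalizeParameters(before.parameters)
            .equals(normalizeParameters(after.parameters));
  }

  public static void main(String[] args) {
    Sd before = new Sd("hdfs://table_path", new HashMap<>());
    Sd after = new Sd("hdfs://table_path", null);

    // Without normalization, an empty map and null are not equal.
    System.out.println(Objects.equals(before.parameters, after.parameters)); // prints "false"
    // With normalization, the two descriptors are treated as the same.
    System.out.println(isTrivialChange(before, after)); // prints "true"
  }
}
```

Copying the map in normalizeParameters() keeps the sketch free of side
effects on the event objects; whether the real code normalizes in place
or on a copy is not stated in this message.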
This patch adds normalization of the parameters map of
StorageDescriptor. It also improves the logs when non-trivial SD changes
are detected. Currently we just dump the full SD objects, which is pretty
verbose and hard to analyze. This patch adds logs that show the actual
changes.
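To illustrate the idea of logging only the actual changes, here is a
rough sketch under stated assumptions. It is not the logging code from
the patch: the SD is modeled as a plain field-name-to-value map, and
describeChanges() is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

public class SdDiffLogging {
  /**
   * Hypothetical helper: returns one entry per field whose value differs,
   * instead of dumping both descriptors in full.
   */
  static List<String> describeChanges(Map<String, Object> before,
      Map<String, Object> after) {
    // Union of field names, keeping the order in which they first appear.
    Set<String> fields = new LinkedHashSet<>(before.keySet());
    fields.addAll(after.keySet());

    List<String> changes = new ArrayList<>();
    for (String field : fields) {
      Object oldVal = before.get(field);
      Object newVal = after.get(field);
      if (!Objects.equals(oldVal, newVal)) {
        changes.add(field + ": " + oldVal + " -> " + newVal);
      }
    }
    return changes;
  }

  public static void main(String[] args) {
    Map<String, Object> before = new LinkedHashMap<>();
    before.put("location", "hdfs://table_path");
    before.put("numBuckets", 0);

    Map<String, Object> after = new LinkedHashMap<>();
    after.put("location", "hdfs://new_table_path"); // made-up value for the demo
    after.put("numBuckets", 0);

    // Only the differing field is reported.
    for (String change : describeChanges(before, after)) {
      System.out.println(change);
    }
  }
}
```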
Tests
- Added FE test
Change-Id: I6a9fcf2d60a41e9669d49412d49a2416c13d17bc
Reviewed-on: http://gerrit.cloudera.org:8080/23814
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> StorageDescriptor normalization should deal with parameters
> -----------------------------------------------------------
>
> Key: IMPALA-14646
> URL: https://issues.apache.org/jira/browse/IMPALA-14646
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
>
> When checking whether an ALTER_TABLE event has trivial changes in
> StorageDescriptor, we normalize fields that are unrelated to file metadata,
> e.g. cols, bucketCols, sortCols, etc. See code in
> [normalizeStorageDescriptor()|https://github.com/apache/impala/blob/85d77b908b12ae3d3f48ed5d49f38fb3832edc4e/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L2334-L2347].
> However, the parameters map of StorageDescriptor is not normalized, which
> causes a null map and an empty map to be considered different:
> {code:bash}
> I1217 21:53:42.995173 279445 MetastoreEvents.java:825] EventId: 365355029
> EventType: ALTER_TABLE Non-trivial change in table storage descriptor (SD)
> detected for table my_db.my_tbl. So file metadata should be reloaded. SD
> before: StorageDescriptor(cols:[...], location:hdfs://table_path,
> inputFormat:org.apache.iceberg.mr.hive.HiveIcebergInputFormat,
> outputFormat:org.apache.iceberg.mr.hive.HiveIcebergOutputFormat,
> compressed:false, numBuckets:0, serdeInfo:SerDeInfo(name:null,
> serializationLib:org.apache.iceberg.mr.hive.HiveIcebergSerDe, parameters:{}),
> bucketCols:[], sortCols:[], parameters:{},
> skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[],
> skewedColValueLocationMaps:{}), storedAsSubDirectories:false), SD after:
> StorageDescriptor(cols:[...], location:hdfs://table_path,
> inputFormat:org.apache.iceberg.mr.hive.HiveIcebergInputFormat,
> outputFormat:org.apache.iceberg.mr.hive.HiveIcebergOutputFormat,
> compressed:false, numBuckets:0, serdeInfo:SerDeInfo(name:null,
> serializationLib:org.apache.iceberg.mr.hive.HiveIcebergSerDe, parameters:{}),
> bucketCols:null, sortCols:null, parameters:null){code}
> In the above log, the old SD has an empty parameters map ({}), while the new
> SD has a null parameters map. This should be considered a trivial change.