[ 
https://issues.apache.org/jira/browse/IMPALA-14646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang resolved IMPALA-14646.
-------------------------------------
    Fix Version/s: Impala 5.0.0
       Resolution: Fixed

> StorageDescriptor normalization should deal with parameters
> -----------------------------------------------------------
>
>                 Key: IMPALA-14646
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14646
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>             Fix For: Impala 5.0.0
>
>
> When checking whether an ALTER_TABLE event has trivial changes in 
> StorageDescriptor, we normalize fields that are unrelated to file metadata, 
> e.g. cols, bucketCols, sortCols, etc. See code in 
> [normalizeStorageDescriptor()|https://github.com/apache/impala/blob/85d77b908b12ae3d3f48ed5d49f38fb3832edc4e/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L2334-L2347].
> However, the parameters map of StorageDescriptor is not normalized, which 
> causes null and empty maps to be considered different:
> {code:bash}
> I1217 21:53:42.995173 279445 MetastoreEvents.java:825] EventId: 365355029 
> EventType: ALTER_TABLE Non-trivial change in table storage descriptor (SD) 
> detected for table my_db.my_tbl. So file metadata should be reloaded. SD 
> before: StorageDescriptor(cols:[...], location:hdfs://table_path, 
> inputFormat:org.apache.iceberg.mr.hive.HiveIcebergInputFormat, 
> outputFormat:org.apache.iceberg.mr.hive.HiveIcebergOutputFormat, 
> compressed:false, numBuckets:0, serdeInfo:SerDeInfo(name:null, 
> serializationLib:org.apache.iceberg.mr.hive.HiveIcebergSerDe, parameters:{}), 
> bucketCols:[], sortCols:[], parameters:{}, 
> skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
> skewedColValueLocationMaps:{}), storedAsSubDirectories:false), SD after: 
> StorageDescriptor(cols:[...], location:hdfs://table_path, 
> inputFormat:org.apache.iceberg.mr.hive.HiveIcebergInputFormat, 
> outputFormat:org.apache.iceberg.mr.hive.HiveIcebergOutputFormat, 
> compressed:false, numBuckets:0, serdeInfo:SerDeInfo(name:null, 
> serializationLib:org.apache.iceberg.mr.hive.HiveIcebergSerDe, parameters:{}), 
> bucketCols:null, sortCols:null, parameters:null){code}
> In the above log, the old SD has an empty parameters map ({}), while the new SD has 
> a null parameters map. This should be considered a trivial change.
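
The null-versus-empty comparison described above can be illustrated with a minimal, self-contained sketch. It is not the actual patch and does not use the Hive metastore classes: FakeSd, normalizeParams() and sameParams() are hypothetical names introduced here, and FakeSd models only the parameters field of a StorageDescriptor.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class SdParamsNormalization {
  // Hypothetical stand-in that models only the parameters field of a
  // storage descriptor; the real class has many more fields.
  static final class FakeSd {
    Map<String, String> parameters;
    FakeSd(Map<String, String> parameters) { this.parameters = parameters; }
  }

  // Assumed normalization rule: a null map is treated as an empty map.
  static Map<String, String> normalizeParams(Map<String, String> params) {
    return params == null ? Collections.emptyMap() : params;
  }

  // Compares the parameters of two descriptors after normalization.
  static boolean sameParams(FakeSd before, FakeSd after) {
    return normalizeParams(before.parameters)
        .equals(normalizeParams(after.parameters));
  }

  public static void main(String[] args) {
    FakeSd before = new FakeSd(new HashMap<>());  // parameters:{}
    FakeSd after = new FakeSd(null);              // parameters:null

    // Direct comparison sees a difference between {} and null.
    System.out.println(Objects.equals(before.parameters, after.parameters));
    // Comparison after normalization does not.
    System.out.println(sameParams(before, after));
  }
}
```

Run as written, this prints false and then true: the raw comparison distinguishes {} from null, while the normalized comparison treats them as equal. Maps with different entries still compare as different.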



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
