[ https://issues.apache.org/jira/browse/HIVE-29265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhihua Deng updated HIVE-29265:
-------------------------------
Affects Version/s: 4.0.1, 4.2.0
> UnsupportedDoubleException could leave the stale column marker in
> COLUMN_STATS_ACCURATE
> ---------------------------------------------------------------------------------------
>
> Key: HIVE-29265
> URL: https://issues.apache.org/jira/browse/HIVE-29265
> Project: Hive
> Issue Type: Bug
> Components: Statistics
> Affects Versions: 4.0.1, 4.2.0
> Reporter: Zhihua Deng
> Priority: Major
>
> Take schema_evol_orc_nonvec_part.q as an example:
> {code:java}
> CREATE TABLE
> part_change_lower_to_higher_numeric_group_decimal_to_float_n7(insert_num int,
> c1 decimal(38,18), c2 decimal(38,18),
> c3 float,
> b STRING) PARTITIONED BY(part INT);
> insert into table
> part_change_lower_to_higher_numeric_group_decimal_to_float_n7
> partition(part=1) SELECT insert_num,
> decimal1, decimal1,
> float1,
> 'original' FROM schema_evolution_data_n25; {code}
> For column c3, the query above throws UnsupportedDoubleException while
> gathering the column stats. As a result, the stats for this column are
> ignored and no entry for c3 can be found in part_col_stats. In
> partition_params, however, the column stats marker for c3 is still set to true:
> {"BASIC_STATS":"true","COLUMN_STATS":{"b":"true","c1":"true","c2":"true","c3":"true","insert_num":"true"}}
> If a valid insert happens afterwards, the new column stats for c3 take
> over while the marker still claims they are accurate, which makes the c3
> stats incorrect.
>
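The failure mode described above can be modeled in a few lines. This is only an illustrative sketch, not Hive code: `UnsupportedDouble`, `collect_column_stats`, `write_partition`, and the `drop_failed_markers` flag are hypothetical names, and treating NaN or infinite values as the trigger for the exception is an assumption made for the example.

```python
import json
import math


class UnsupportedDouble(Exception):
    """Hypothetical stand-in for Hive's UnsupportedDoubleException."""


def collect_column_stats(rows, columns):
    """Gather min/max per column; a column that fails is silently skipped."""
    stats = {}
    for col in columns:
        values = [row[col] for row in rows]
        try:
            for v in values:
                # Assumption: non-finite doubles cannot be stored as stats.
                if isinstance(v, float) and (math.isnan(v) or math.isinf(v)):
                    raise UnsupportedDouble(col)
            stats[col] = {"min": min(values), "max": max(values)}
        except UnsupportedDouble:
            continue  # stats for this column are dropped
    return stats


def write_partition(rows, columns, drop_failed_markers=False):
    """Return (stored column stats, COLUMN_STATS_ACCURATE-style JSON string)."""
    stats = collect_column_stats(rows, columns)
    # Reported behavior: every column is marked accurate, even a skipped one.
    # Possible remedy (assumption): mark only columns whose stats were stored.
    marked = stats.keys() if drop_failed_markers else columns
    params = {
        "BASIC_STATS": "true",
        "COLUMN_STATS": {c: "true" for c in sorted(marked)},
    }
    return stats, json.dumps(params)


rows = [{"insert_num": 1, "c3": float("nan"), "b": "original"}]
cols = ["insert_num", "c3", "b"]

stats, params = write_partition(rows, cols)
print(sorted(stats))   # c3 has no stats entry
print(params)          # yet c3 is still marked "true"
```

With `drop_failed_markers=True` the marker for c3 is omitted, so the stored stats and the accuracy marker stay consistent. Whether the actual fix should clear the marker or handle the exception differently is left open here.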
--
This message was sent by Atlassian Jira
(v8.20.10#820010)