[
https://issues.apache.org/jira/browse/FLINK-14663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kurt Young updated FLINK-14663:
-------------------------------
Issue Type: Bug (was: Improvement)
> Distinguish unknown column stats and zero
> -----------------------------------------
>
> Key: FLINK-14663
> URL: https://issues.apache.org/jira/browse/FLINK-14663
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Hive, Table SQL / API
> Reporter: Kurt Young
> Priority: Major
>
> When converting Hive stats to Flink's column stats, we don't check
> whether a column statistic was actually set or is just an initial value. For
> example:
> {code:java}
> // Every getter is read unconditionally, so an unset field comes back as 0.
> LongColumnStatsData longColStats = stats.getLongStats();
> return new CatalogColumnStatisticsDataLong(
>     longColStats.getLowValue(),
>     longColStats.getHighValue(),
>     longColStats.getNumDVs(),
>     longColStats.getNumNulls());
> {code}
> Hive's `LongColumnStatsData` does record whether each statistic has been set,
> through methods like `isSetNumDVs()`. Because the initial values are all 0, we
> cannot tell whether a 0 is a real statistic or just an unset field.
>
> We can use -1 to represent an UNKNOWN value for column stats.
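
A minimal sketch of the proposed check, for illustration only. `LongStats` is a hypothetical stand-in that mimics the `isSetXxx()` pattern of Hive's `LongColumnStatsData` so the example runs without Hive on the classpath; `UNKNOWN`, `ndvOrUnknown` and `nullCountOrUnknown` are likewise assumed names, not existing Flink or Hive APIs. Only the two count fields are shown, since counts are never negative and -1 is unambiguous for them.

```java
public class ColumnStatsSketch {
    /** Sentinel for a statistic that was never set (assumed name). */
    static final long UNKNOWN = -1L;

    /** Hypothetical stand-in for Hive's LongColumnStatsData, reduced to two fields. */
    static class LongStats {
        private long numDVs;          // initial value is 0, as described in the issue
        private long numNulls;        // initial value is 0, as described in the issue
        private boolean numDVsSet;
        private boolean numNullsSet;

        void setNumDVs(long value) { numDVs = value; numDVsSet = true; }
        void setNumNulls(long value) { numNulls = value; numNullsSet = true; }
        long getNumDVs() { return numDVs; }
        long getNumNulls() { return numNulls; }
        boolean isSetNumDVs() { return numDVsSet; }
        boolean isSetNumNulls() { return numNullsSet; }
    }

    /** Returns the distinct-value count, or UNKNOWN if it was never set. */
    static long ndvOrUnknown(LongStats stats) {
        return stats.isSetNumDVs() ? stats.getNumDVs() : UNKNOWN;
    }

    /** Returns the null count, or UNKNOWN if it was never set. */
    static long nullCountOrUnknown(LongStats stats) {
        return stats.isSetNumNulls() ? stats.getNumNulls() : UNKNOWN;
    }

    public static void main(String[] args) {
        LongStats stats = new LongStats();
        stats.setNumNulls(0);                          // a genuine zero
        System.out.println(ndvOrUnknown(stats));       // prints -1 (never set)
        System.out.println(nullCountOrUnknown(stats)); // prints 0 (explicitly set)
    }
}
```

With this check, a null count that was explicitly set to 0 stays 0, while a distinct-value count that was never set is reported as -1 instead of a misleading 0.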
--
This message was sent by Atlassian Jira
(v8.3.4#803005)