Kurt Young created FLINK-14663:
----------------------------------

             Summary: Distinguish unknown column stats and zero
                 Key: FLINK-14663
                 URL: https://issues.apache.org/jira/browse/FLINK-14663
             Project: Flink
          Issue Type: Improvement
          Components: Connectors / Hive, Table SQL / API
            Reporter: Kurt Young


When converting Hive stats to Flink's column stats, we don't check whether 
a column statistic has actually been set or is still at its initial value. For example:
{code:java}
// Every getter is read unconditionally, even if Hive never set the field.
LongColumnStatsData longColStats = stats.getLongStats();
return new CatalogColumnStatisticsDataLong(
      longColStats.getLowValue(),
      longColStats.getHighValue(),
      longColStats.getNumDVs(),
      longColStats.getNumNulls());
{code}
Hive's `LongColumnStatsData` does record whether each statistic has been set, 
through APIs like `isSetNumDVs()`. Because the initial values are all 0, we 
cannot tell whether a reported 0 is a real value or just the initial one. 

 

We can use -1 to represent an UNKNOWN value for a column statistic that Hive has not set. 
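
A minimal sketch of that idea, not the actual patch. To stay self-contained it does not depend on Hive or Flink: `LongStats` is a hypothetical stand-in that only mimics the `isSetNumDVs()` / `getNumDVs()` pattern of Hive's `LongColumnStatsData`, and `UnknownStatsSketch`, `UNKNOWN` and `convert` are names made up for illustration.

```java
import java.util.Arrays;

public class UnknownStatsSketch {
    /** Sentinel meaning "this statistic was never set" (assumed name). */
    public static final long UNKNOWN = -1L;

    /** Hypothetical stand-in for Hive's LongColumnStatsData; models two fields only. */
    public static class LongStats {
        private long numDVs;          // initial value is 0, as described in the issue
        private boolean numDVsSet;
        private long numNulls;        // initial value is 0 as well
        private boolean numNullsSet;

        public void setNumDVs(long v) { numDVs = v; numDVsSet = true; }
        public long getNumDVs() { return numDVs; }
        public boolean isSetNumDVs() { return numDVsSet; }

        public void setNumNulls(long v) { numNulls = v; numNullsSet = true; }
        public long getNumNulls() { return numNulls; }
        public boolean isSetNumNulls() { return numNullsSet; }
    }

    /** Returns {ndv, nullCount}, substituting UNKNOWN for any field that was not set. */
    public static long[] convert(LongStats stats) {
        return new long[] {
            stats.isSetNumDVs() ? stats.getNumDVs() : UNKNOWN,
            stats.isSetNumNulls() ? stats.getNumNulls() : UNKNOWN
        };
    }

    public static void main(String[] args) {
        LongStats stats = new LongStats();
        stats.setNumNulls(0L); // a real, explicitly recorded zero
        // NDV is never set, so it is reported as UNKNOWN rather than 0.
        System.out.println(Arrays.toString(convert(stats))); // prints [-1, 0]
    }
}
```

One caveat with this sketch: -1 is only unambiguous for counts such as NDV and null count. For the low and high values, -1 can be a legitimate column value, so those fields may need a different representation of "unknown".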



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
