konstantinb commented on code in PR #6208:
URL: https://github.com/apache/hive/pull/6208#discussion_r2612598553
##########
ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java:
##########
@@ -894,6 +878,40 @@ public static ColStatistics getColStatistics(ColumnStatisticsObj cso, String col
     return cs;
   }
+  public static void fillColumnStatisticsData(ColumnStatisticsData data, ColStatistics cs,
+      String colType) throws MetaException {
+    ColStatistics.Range r = cs.getRange();
+    Object lowValue = r != null ? r.minValue : null;
+    Object highValue = r != null ? r.maxValue : null;
+    StatObjectConverter.fillColumnStatisticsData(colType, data, lowValue, highValue,
+        cs.getNumNulls(), cs.getCountDistint(), cs.getBitVectors(), cs.getHistogram(),
+        cs.getAvgColLen(), cs.getAvgColLen(), cs.getNumTrues(), cs.getNumFalses());
+  }
+
+  private static void fillColStatisticsFromLongStatsData(ColStatistics cs, LongColumnStatsData longStats,
+      double avgColLen) {
+    cs.setCountDistint(longStats.getNumDVs());
+    cs.setNumNulls(longStats.getNumNulls());
+    cs.setAvgColLen(avgColLen);
+    Long lowVal = longStats.isSetLowValue() ? longStats.getLowValue() : null;
+    Long highVal = longStats.isSetHighValue() ? longStats.getHighValue() : null;
+    cs.setRange(lowVal, highVal);
+    cs.setBitVectors(longStats.getBitVectors());
+    cs.setHistogram(longStats.getHistogram());
+  }
+
+  private static void fillColStatisticsFromDoubleStatsData(ColStatistics cs, DoubleColumnStatsData doubleStats,
+      double avgColLen) {
+    cs.setCountDistint(doubleStats.getNumDVs());
+    cs.setNumNulls(doubleStats.getNumNulls());
+    cs.setAvgColLen(avgColLen);
+    Double lowVal = doubleStats.isSetLowValue() ? doubleStats.getLowValue() : null;
+    Double highVal = doubleStats.isSetHighValue() ? doubleStats.getHighValue() : null;
+    cs.setRange(lowVal, highVal);
+    cs.setBitVectors(doubleStats.getBitVectors());
+    cs.setHistogram(doubleStats.getHistogram());
+  }
+
Review Comment:
There is still some code duplication here, but roughly half as much as in
the original code. Since the column stats classes do not share a common base
class, further refactoring could bring more side effects than benefits.
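
For illustration only, here is a minimal sketch of what removing the remaining duplication might look like without a common base class: a single generic helper that takes the accessors as method references. `LongStats`, `DoubleStats`, `ColStats`, and `fill` are hypothetical stand-ins invented for this example, not the Hive or metastore classes, and only a subset of the fields is modelled.

```java
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.ToLongFunction;

class StatsFillSketch {
  // Hypothetical stand-in for a Thrift-style long stats class (no shared base type).
  static final class LongStats {
    long low, high, numNulls, numDVs;
    boolean lowSet, highSet;
    boolean isSetLowValue() { return lowSet; }
    boolean isSetHighValue() { return highSet; }
    long getLowValue() { return low; }
    long getHighValue() { return high; }
    long getNumNulls() { return numNulls; }
    long getNumDVs() { return numDVs; }
  }

  // Hypothetical stand-in for the double variant; same shape, unrelated type.
  static final class DoubleStats {
    double low, high;
    long numNulls, numDVs;
    boolean lowSet, highSet;
    boolean isSetLowValue() { return lowSet; }
    boolean isSetHighValue() { return highSet; }
    double getLowValue() { return low; }
    double getHighValue() { return high; }
    long getNumNulls() { return numNulls; }
    long getNumDVs() { return numDVs; }
  }

  // Hypothetical, simplified target object.
  static final class ColStats {
    long numNulls, countDistinct;
    Number min, max;
  }

  // One generic helper: every accessor is passed in explicitly because
  // the source types have nothing in common that the compiler can use.
  static <S, N extends Number> void fill(ColStats cs, S stats,
      ToLongFunction<S> numNulls, ToLongFunction<S> numDVs,
      Predicate<S> hasLow, Function<S, N> low,
      Predicate<S> hasHigh, Function<S, N> high) {
    cs.countDistinct = numDVs.applyAsLong(stats);
    cs.numNulls = numNulls.applyAsLong(stats);
    cs.min = hasLow.test(stats) ? low.apply(stats) : null;
    cs.max = hasHigh.test(stats) ? high.apply(stats) : null;
  }

  static void fillFromLong(ColStats cs, LongStats s) {
    fill(cs, s, LongStats::getNumNulls, LongStats::getNumDVs,
        LongStats::isSetLowValue, LongStats::getLowValue,
        LongStats::isSetHighValue, LongStats::getHighValue);
  }

  static void fillFromDouble(ColStats cs, DoubleStats s) {
    fill(cs, s, DoubleStats::getNumNulls, DoubleStats::getNumDVs,
        DoubleStats::isSetLowValue, DoubleStats::getLowValue,
        DoubleStats::isSetHighValue, DoubleStats::getHighValue);
  }

  public static void main(String[] args) {
    LongStats ls = new LongStats();
    ls.low = 3; ls.lowSet = true; ls.numNulls = 2; ls.numDVs = 5;
    ColStats cs = new ColStats();
    fillFromLong(cs, ls);
    System.out.println(cs.min + " " + cs.max + " " + cs.numNulls + " " + cs.countDistinct);
  }
}
```

Each call site still has to list every accessor, so the per-type wrappers stay almost as long as the two private methods in this patch, with an extra layer of indirection on top.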
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]