zabetak commented on code in PR #6410:
URL: https://github.com/apache/hive/pull/6410#discussion_r3045375737


##########
ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java:
##########
@@ -613,6 +614,24 @@ public static int getNDVPartitionColumn(PartitionIterable partitions, String par
     return distinctVals.size();
   }
 
+  private static long getNumNullsForPartCol(PartitionIterable partitions, String partColName, HiveConf conf) {
+    long numNulls = 0;
+    String defaultPartitionName = HiveConf.getVar(conf, HiveConf.ConfVars.DEFAULT_PARTITION_NAME);
+    for (Partition partition : partitions) {
+      String partVal = partition.getSpec().get(partColName);
+      if (partVal != null && partVal.equals(defaultPartitionName)) {
+        Map<String, String> parameters = partition.getParameters();
+        if (parameters != null && parameters.get(StatsSetupConst.ROW_COUNT) != null) {
+          long rowCount = Long.parseLong(parameters.get(StatsSetupConst.ROW_COUNT));
+          if (rowCount > 0) {
+            numNulls = safeAdd(numNulls, rowCount);
+          }
+        }
+      }
+    }
+    return numNulls;
+  }
+

Review Comment:
   I am wondering if we could take advantage of the existing 
`StatsUtils#getNumRows` method to some extent. At the very least, we may be able 
to reuse some existing classes such as 
`org.apache.hadoop.hive.ql.stats.BasicStats`.
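
   To make the idea concrete, here is a rough, self-contained sketch of the 
shape such a refactoring might take. It does not use the real Hive classes: 
`PartStats` is a hypothetical stand-in for whatever per-partition row count an 
existing helper would provide, and `safeAdd` here is only an assumed saturating 
add in the spirit of the call in the diff. The point is that the loop no longer 
parses `ROW_COUNT` from the raw parameter map itself.

```java
import java.util.List;
import java.util.Map;

public class DefaultPartitionNullCount {

  /** Hypothetical stand-in for per-partition basic stats (NOT Hive's BasicStats API). */
  public static final class PartStats {
    final Map<String, String> spec; // partition column name -> partition value
    final long numRows;             // row count, or a value <= 0 when unknown

    public PartStats(Map<String, String> spec, long numRows) {
      this.spec = spec;
      this.numRows = numRows;
    }
  }

  /** Assumed saturating add: returns Long.MAX_VALUE when the sum overflows. */
  public static long safeAdd(long a, long b) {
    try {
      return Math.addExact(a, b);
    } catch (ArithmeticException e) {
      return Long.MAX_VALUE;
    }
  }

  /**
   * Sums the row counts of the partitions whose value for partColName is the
   * default partition name; those rows are the NULLs of the partition column.
   */
  public static long numNullsForPartCol(
      List<PartStats> parts, String partColName, String defaultPartitionName) {
    long numNulls = 0;
    for (PartStats p : parts) {
      boolean isDefault = defaultPartitionName.equals(p.spec.get(partColName));
      if (isDefault && p.numRows > 0) {
        numNulls = safeAdd(numNulls, p.numRows);
      }
    }
    return numNulls;
  }
}
```

Partitions with an unknown or non-positive row count are skipped, as in the 
patch, so the result is a lower bound whenever basic stats are missing.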



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

