vineetgarg02 commented on a change in pull request #1907: [CALCITE-3909] RelMdMinRowCount doesn't take into account UNION DISTINCT
URL: https://github.com/apache/calcite/pull/1907#discussion_r406970102
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/rel/metadata/RelMdMinRowCount.java
 ##########
 @@ -54,14 +54,31 @@
   }
 
   public Double getMinRowCount(Union rel, RelMetadataQuery mq) {
-    double rowCount = 0.0;
-    for (RelNode input : rel.getInputs()) {
-      Double partialRowCount = mq.getMinRowCount(input);
-      if (partialRowCount != null) {
-        rowCount += partialRowCount;
+    if (rel.all) {
+      double rowCount = 0.0;
+      for (RelNode input : rel.getInputs()) {
+        Double partialRowCount = mq.getMinRowCount(input);
+        if (partialRowCount != null) {
+          rowCount += partialRowCount;
+        }
+      }
+      return rowCount;
+    } else {
+      boolean valid = false;
+      double rowCount = Double.POSITIVE_INFINITY;
+      for (RelNode input : rel.getInputs()) {
+        Double partialRowCount = mq.getMinRowCount(input);
+        if (partialRowCount != null && rowCount > partialRowCount) {
+          rowCount = partialRowCount;
 
 Review comment:
   @chunweilei If I understand correctly, the DISTINCT branch finds the input 
with the smallest minimum row count and uses that value as the minimum row 
count of the union. If so, this looks wrong: the minimum row count for UNION 
DISTINCT should be 1, because the rows of every input (including the one with 
the fewest rows) can all be duplicates of a single row, in which case only one 
row is returned.
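
To make the point concrete, here is a minimal, self-contained sketch. It is not Calcite code: the class and method names are hypothetical, and input lower bounds are passed as plain nullable `Double` values instead of going through `RelMetadataQuery`. It also assumes that when no input is known to produce any row, the bound should be 0 rather than 1, which goes slightly beyond what the comment above states.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Calcite.
final class UnionMinRowCountSketch {

  /** UNION ALL keeps every row, so the known input lower bounds add up. */
  static double minRowCountUnionAll(List<Double> inputMinRowCounts) {
    double rowCount = 0.0;
    for (Double partial : inputMinRowCounts) {
      if (partial != null) {
        rowCount += partial;
      }
    }
    return rowCount;
  }

  /**
   * UNION DISTINCT may collapse all rows into one, so the bound is capped
   * at 1. Assumption: if no input is known to be non-empty, the bound is 0.
   */
  static double minRowCountUnionDistinct(List<Double> inputMinRowCounts) {
    double largest = 0.0;
    for (Double partial : inputMinRowCounts) {
      if (partial != null && partial > largest) {
        largest = partial;
      }
    }
    return Math.min(1.0, largest);
  }

  /** Evaluates UNION DISTINCT over in-memory inputs and counts the result. */
  static int unionDistinctSize(List<List<String>> inputs) {
    Set<String> distinct = new HashSet<>();
    for (List<String> input : inputs) {
      distinct.addAll(input);
    }
    return distinct.size();
  }

  public static void main(String[] args) {
    // Two inputs with 3 and 2 rows, every row identical.
    List<List<String>> inputs = Arrays.asList(
        Arrays.asList("a", "a", "a"),
        Arrays.asList("a", "a"));
    System.out.println(unionDistinctSize(inputs));
    System.out.println(minRowCountUnionDistinct(Arrays.asList(3.0, 2.0)));
    System.out.println(minRowCountUnionAll(Arrays.asList(3.0, 2.0)));
  }
}
```

With inputs of 3 and 2 rows, taking the smallest input bound would report 2, yet the actual UNION DISTINCT result in `main` contains a single row, so 2 is not a valid lower bound.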

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
