vineetgarg02 commented on a change in pull request #1907: [CALCITE-3909]
RelMdMinRowCount doesn't take into account UNION DISTINCT
URL: https://github.com/apache/calcite/pull/1907#discussion_r406970102
##########
File path:
core/src/main/java/org/apache/calcite/rel/metadata/RelMdMinRowCount.java
##########
@@ -54,14 +54,31 @@
   }
   public Double getMinRowCount(Union rel, RelMetadataQuery mq) {
-    double rowCount = 0.0;
-    for (RelNode input : rel.getInputs()) {
-      Double partialRowCount = mq.getMinRowCount(input);
-      if (partialRowCount != null) {
-        rowCount += partialRowCount;
+    if (rel.all) {
+      double rowCount = 0.0;
+      for (RelNode input : rel.getInputs()) {
+        Double partialRowCount = mq.getMinRowCount(input);
+        if (partialRowCount != null) {
+          rowCount += partialRowCount;
+        }
+      }
+      return rowCount;
+    } else {
+      boolean valid = false;
+      double rowCount = Double.POSITIVE_INFINITY;
+      for (RelNode input : rel.getInputs()) {
+        Double partialRowCount = mq.getMinRowCount(input);
+        if (partialRowCount != null && rowCount > partialRowCount) {
+          rowCount = partialRowCount;
Review comment:
@chunweilei If I understand correctly, this finds the input with the fewest
rows and uses that number as the minimum row count. If so, this looks wrong:
the minimum number of rows for UNION DISTINCT will be 1, since the input with
the fewest rows can consist entirely of duplicates, in which case only one row
will be returned.
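
To make the concern concrete, here is a minimal sketch in plain Java. It does
not use any Calcite classes; the class name UnionDistinctMinRows and its helper
methods are hypothetical, and rows are modelled as integers purely for
illustration. It compares the "smallest input size" estimate against what a
UNION DISTINCT actually returns when every row is a duplicate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class UnionDistinctMinRows {
  /** UNION ALL: concatenates the inputs and keeps duplicates. */
  static List<Integer> unionAll(List<List<Integer>> inputs) {
    List<Integer> out = new ArrayList<>();
    for (List<Integer> input : inputs) {
      out.addAll(input);
    }
    return out;
  }

  /** UNION DISTINCT: concatenates the inputs, then removes duplicates. */
  static Set<Integer> unionDistinct(List<List<Integer>> inputs) {
    return new LinkedHashSet<>(unionAll(inputs));
  }

  /** The estimate under discussion: the size of the smallest input. */
  static int smallestInputSize(List<List<Integer>> inputs) {
    int min = Integer.MAX_VALUE;
    for (List<Integer> input : inputs) {
      min = Math.min(min, input.size());
    }
    return min;
  }

  public static void main(String[] args) {
    // Two inputs in which every row has the same value.
    List<List<Integer>> inputs = Arrays.asList(
        Arrays.asList(7, 7, 7),
        Arrays.asList(7, 7, 7, 7, 7));

    System.out.println("UNION ALL rows: " + unionAll(inputs).size());
    System.out.println("smallest input size: " + smallestInputSize(inputs));
    System.out.println("UNION DISTINCT rows: " + unionDistinct(inputs).size());
  }
}
```

With these inputs the smallest input has 3 rows, yet UNION DISTINCT returns a
single row, so the smallest input size is not a safe lower bound.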