amansinha100 commented on a change in pull request #729: Drill 1328: Support table statistics for Parquet
URL: https://github.com/apache/drill/pull/729#discussion_r250695535
##########
File path: exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillRelMdDistinctRowCount.java
##########
@@ -56,9 +73,159 @@ public Double getDistinctRowCount(Join rel, RelMetadataQuery mq,
     return getDistinctRowCount((RelNode) rel, mq, groupKey, predicate);
   }

-  public Double getDistinctRowCount(DrillScanRel scan, RelMetadataQuery mq,
-      ImmutableBitSet groupKey, RexNode predicate) {
-    // Consistent with the estimation of Aggregate row count in RelMdRowCount : distinctRowCount = rowCount * 10%.
-    return scan.estimateRowCount(mq) * 0.1;
+  @Override
+  public Double getDistinctRowCount(RelNode rel, RelMetadataQuery mq, ImmutableBitSet groupKey, RexNode predicate) {
+    if (rel instanceof TableScan && !DrillRelOptUtil.guessRows(rel)) {
+      return getDistinctRowCount((TableScan) rel, mq, groupKey, predicate);
+    } else if (rel instanceof SingleRel && !DrillRelOptUtil.guessRows(rel)) {
+      if (rel instanceof Window) {
+        int childFieldCount = ((Window)rel).getInput().getRowType().getFieldCount();
+        // For window aggregates delegate ndv to parent
+        for (int bit : groupKey) {
+          if (bit >= childFieldCount) {

Review comment:
Since window functions don't change the NDV, it is not clear why this additional check is needed here. The comment above says 'delegate ndv to parent', but the code does so only conditionally.
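
For readers following the thread, the condition under discussion can be sketched in isolation. This is only an illustration of what `bit >= childFieldCount` tests, assuming (as in Calcite) that a Window's output row is the child's fields followed by the window aggregate columns. The class and method names are hypothetical, `java.util.BitSet` stands in for `ImmutableBitSet`, and what the PR does once the condition holds is not visible in the quoted hunk, so it is not reproduced here.

```java
import java.util.BitSet;

// Hypothetical stand-in for the check in the quoted diff. java.util.BitSet
// replaces Calcite's ImmutableBitSet so the sketch runs without Calcite.
class WindowGroupKeySketch {

  // Returns true if any group-key column index points past the child's fields,
  // i.e. at a column that would be produced by the Window operator itself.
  static boolean referencesWindowOutput(BitSet groupKey, int childFieldCount) {
    for (int bit = groupKey.nextSetBit(0); bit >= 0; bit = groupKey.nextSetBit(bit + 1)) {
      if (bit >= childFieldCount) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    int childFieldCount = 3; // assumed: the Window's input has columns 0, 1, 2

    BitSet inputOnly = new BitSet();
    inputOnly.set(0);
    inputOnly.set(2);

    BitSet withWindowColumn = new BitSet();
    withWindowColumn.set(1);
    withWindowColumn.set(3); // column 3 would be a window aggregate output

    System.out.println(referencesWindowOutput(inputOnly, childFieldCount));
    System.out.println(referencesWindowOutput(withWindowColumn, childFieldCount));
  }
}
```

Under these assumptions, a group key made up only of child columns never triggers the condition, which is the case the review comment argues could be handled unconditionally.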