buraksenn commented on code in PR #20926:
URL: https://github.com/apache/datafusion/pull/20926#discussion_r3040198754
##########
datafusion/physical-plan/src/aggregates/mod.rs:
##########
@@ -1113,6 +1163,34 @@ impl AggregateExec {
}
}
+ /// Computes the estimated number of distinct groups across all grouping sets.
+ /// For each grouping set, computes `product(NDV_i + null_adj_i)` for active columns,
+ /// then sums across all sets. Returns `None` if any active column is not a direct
+ /// column reference or lacks `distinct_count` stats.
+ /// When `null_count` is absent or unknown, null_adjustment defaults to 0.
+ fn compute_group_ndv(&self, child_statistics: &Statistics) -> Option<usize> {
Review Comment:
I think this makes sense. To avoid overcomplicating `compute_group_ndv`, I've
extracted `estimate_num_rows` from the caller side to wrap it. Hope that's OK.
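
For readers following along, the estimate described in the doc comment above could be sketched roughly as below. This is a standalone illustration, not the code in the PR: `ColStats` and `estimate_group_ndv` are hypothetical names, grouping sets are modelled as lists of optional column indices (`None` standing in for an expression that is not a direct column reference), and the sketch assumes the null adjustment is 1 when `null_count` is known and greater than zero. The saturating arithmetic is also an assumption added to keep the sketch overflow-safe.

```rust
/// Hypothetical, simplified per-column statistics
/// (not DataFusion's own statistics types).
#[derive(Clone, Copy)]
struct ColStats {
    distinct_count: Option<usize>,
    null_count: Option<usize>,
}

/// Sums `product(NDV_i + null_adj_i)` over all grouping sets.
/// Each set lists its active columns as optional indices into `cols`;
/// `None` models an expression that is not a direct column reference.
fn estimate_group_ndv(
    grouping_sets: &[Vec<Option<usize>>],
    cols: &[ColStats],
) -> Option<usize> {
    let mut total: usize = 0;
    for set in grouping_sets {
        let mut product: usize = 1;
        for col in set {
            // Bail out if the expression is not a column or the index is unknown.
            let stats = cols.get((*col)?)?;
            // Bail out if the column has no distinct-count estimate.
            let ndv = stats.distinct_count?;
            // Assumption: nulls add one extra group value when known to be present;
            // an absent or zero null count contributes nothing.
            let null_adj = match stats.null_count {
                Some(n) if n > 0 => 1,
                _ => 0,
            };
            product = product.saturating_mul(ndv.saturating_add(null_adj));
        }
        total = total.saturating_add(product);
    }
    Some(total)
}
```

With two columns (NDV 10 with some nulls, NDV 4 with unknown null count) and grouping sets `(a, b)` and `(a)`, the sketch adds `11 * 4` and `11`.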
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]