msirek commented on code in PR #8038:
URL: https://github.com/apache/arrow-datafusion/pull/8038#discussion_r1387164578


##########
datafusion/physical-plan/src/aggregates/mod.rs:
##########
@@ -800,6 +807,39 @@ impl AggregateExec {
     pub fn group_by(&self) -> &PhysicalGroupBy {
         &self.group_by
     }
+
+    /// true, if this Aggregate has a group-by with no required or explicit 
ordering,
+    /// no filtering and no aggregate expressions
+    /// This method qualifies the use of the LimitedDistinctAggregation 
rewrite rule
+    /// on an AggregateExec.
+    pub fn is_unordered_unfiltered_group_by_distinct(&self) -> bool {
+        // ensure there is a group by
+        if self.group_by().is_empty() {
+            return false;
+        }
+        // ensure there are no aggregate expressions
+        if !self.aggr_expr().is_empty() {
+            return false;
+        }
+        // ensure there are no filters on aggregate expressions; the above 
check
+        // may preclude this case
+        if self.filter_expr().iter().any(|e| e.is_some()) {
+            return false;
+        }
+        // ensure there are no order by expressions
+        if self.order_by_expr().iter().any(|e| e.is_some()) {
+            return false;
+        }
+        // ensure there is no output ordering; can this rule be relaxed?

Review Comment:
   > it is probably subsumed by the check on the required input ordering
   
   OK. Even if there is an ORDER BY in a nested expression (e.g derived table), 
there's no requirement that the rows are presented in that order unless there's 
a top-level ORDER BY. 
   
   Is it OK to keep this check in until we have a known test case where there 
is an output ordering but no required input ordering?  Just want to avoid 
incorrect results in the event there is some edge case not considered.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to