alamb commented on code in PR #5132:
URL: https://github.com/apache/arrow-datafusion/pull/5132#discussion_r1100745375


##########
datafusion/expr/src/logical_plan/builder.rs:
##########
@@ -403,6 +403,28 @@ impl LogicalPlanBuilder {
                 Ok(())
             })?;
 
+        // if current plan is distinct or current plan is repartition and its 
child plan is distinct,
+        // then this plan is a select distinct plan
+        let is_select_distinct = match self.plan {
+            LogicalPlan::Distinct(_) => true,
+            LogicalPlan::Repartition(Repartition { ref input, .. }) => {
+                matches!(input.as_ref(), &LogicalPlan::Distinct(_))
+            }
+            _ => false,

Review Comment:
   🤔  that is a good point. If the plan is being created from SQL I think this 
is the pattern.
   
   However, if it is built directly (via a `DataFrame` or someone using 
`LogicalPlanBuilder` directly) it is probably incorrect to throw an error here 
(as they may have meant to to run the distinct before the sort for some reason)
   
   Maybe we should move the check to the SQL planner



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to