berkaysynnada commented on code in PR #14919: URL: https://github.com/apache/datafusion/pull/14919#discussion_r1975931108
########## datafusion/core/tests/physical_optimizer/enforce_sorting.rs: ########## @@ -3346,3 +3351,62 @@ async fn test_window_partial_constant_and_set_monotonicity() -> Result<()> { Ok(()) } + +#[tokio::test] +async fn test_preserve_needed_coalesce() -> Result<()> { + // Input to EnforceSorting, from our test case. + let plan = projection_exec_with_alias( + union_exec(vec![parquet_exec_with_stats(); 2]), + vec![ + ("a".to_string(), "a".to_string()), + ("b".to_string(), "value".to_string()), + ], + ); + let plan = Arc::new(CoalescePartitionsExec::new(plan)); + let schema = schema(); + let sort_key = LexOrdering::new(vec![PhysicalSortExpr { + expr: col("a", &schema).unwrap(), + options: SortOptions::default(), + }]); + let plan: Arc<dyn ExecutionPlan> = + single_partitioned_aggregate(plan, vec![("a".to_string(), "a1".to_string())]); + let plan = sort_exec(sort_key, plan); + + // Starting plan: as in our test case. + assert_eq!( + get_plan_string(&plan), + vec![ + "SortExec: expr=[a@0 ASC], preserve_partitioning=[false]", + " AggregateExec: mode=SinglePartitioned, gby=[a@0 as a1], aggr=[]", Review Comment: ```rust /// Applies the entire logical aggregation operation in a single operator, /// as opposed to Partial / Final modes which apply the logical aggregation using /// two operators. /// /// This mode requires that the input is partitioned by group key (like /// FinalPartitioned) SinglePartitioned, ``` It means that when an `AggregateExec` is `SinglePartitioned`, then its input should be multi-partitioned (I should say the wording is a bit confusing -- SinglePartitioned means **Single** layer of aggregation, but its input is **Partitioned**). However, in your example plan, `AggregateExec` has a single partition (as it is above CoalescePartitions). So, what I was trying to say that your initial plan should cannot ever exists in general conditions. I'd also like to remind that; EnforceSorting can handle invalid plans and make them valid in terms of **ordering** conditions (and expects valid distribution conditions), and EnforceDistribution does the same for **distribution** conditions (and again expects valid ordering conditions). However, in this reproducer, you are giving an invalid plan to the EnforceSorting in terms of **distribution**. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org