mingmwang commented on code in PR #5171:
URL: https://github.com/apache/arrow-datafusion/pull/5171#discussion_r1099911470
##########
datafusion/core/src/physical_optimizer/sort_enforcement.rs:
##########
@@ -102,44 +128,189 @@ impl TreeNodeRewritable for PlanWithCorrespondingSort {
.collect::<Result<Vec<_>>>()?;
let children_plans = children_requirements
.iter()
- .map(|elem| elem.plan.clone())
+ .map(|item| item.plan.clone())
.collect::<Vec<_>>();
let sort_onwards = children_requirements
+ .into_iter()
+ .enumerate()
+ .map(|(idx, item)| {
+ let plan = &item.plan;
+ // Leaves of the `sort_onwards` are `SortExec`(Introduces
ordering). This tree collects
+ // all the intermediate executors that maintain this
ordering. If
+ // we just saw a sort-introducing operator, we reset the
tree and
+ // start accumulating.
+ if is_sort(plan) {
+ return Some(ExecTree {
+ idx,
+ plan: item.plan,
+ children: vec![],
+ });
+ } else if is_limit(plan) {
+ // There is no sort linkage for this path, it starts
at a limit.
+ return None;
+ }
+ let is_spm = is_sort_preserving_merge(plan);
+ let output_ordering = plan.output_ordering();
+ let required_orderings = plan.required_input_ordering();
+ let children =
+ izip!(&plan.children(), item.sort_onwards,
required_orderings)
+ .filter_map(|(child, element, required_ordering)| {
+ // Executor maintains or partially maintains
its child's output ordering
+ let maintains = ordering_satisfy(
+ child.output_ordering(),
+ output_ordering,
+ || child.equivalence_properties(),
+ );
+ if (required_ordering.is_none() && maintains)
|| is_spm {
+ element
+ } else {
+ None
+ }
+ })
Review Comment:
It is quite dangerous to compare the child `output ordering` with the parent
plan's output ordering. Some times the column indexes are changed and can not
satisfy even the plan actually maintains the output ordering. I think we
already have the `maintains_input_order()` method in the `ExecutionPlan`, can
we use that ?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]