2010YOUY01 commented on code in PR #21976:
URL: https://github.com/apache/datafusion/pull/21976#discussion_r3194201853
##########
datafusion/physical-optimizer/src/optimizer.rs:
##########
@@ -170,18 +169,12 @@ impl PhysicalOptimizer {
// those are handled by the later `FilterPushdown` rule.
// See `FilterPushdownPhase` for more details.
Arc::new(FilterPushdown::new()),
- // The EnforceDistribution rule is for adding essential repartitioning to satisfy distribution
- // requirements. Please make sure that the whole plan tree is determined before this rule.
- // This rule increases parallelism if doing so is beneficial to the physical plan; i.e. at
- // least one of the operators in the plan benefits from increased parallelism.
- Arc::new(EnforceDistribution::new()),
- // The CombinePartialFinalAggregate rule should be applied after the EnforceDistribution rule
+ // EnsureRequirements: merged EnforceDistribution + EnforceSorting into a
+ // single idempotent rule with distribution-aware pushdown_sorts.
+ // See https://github.com/apache/datafusion/issues/21973
Review Comment:
```suggestion
// Ensures each input plan satisfies the distribution and ordering requirements
// declared by `ExecutionPlan::required_input_distribution` and
// `ExecutionPlan::required_input_ordering`.
// If the requirements are already satisfied, this rule leaves the plan
// unchanged. For example, it does not add sorting when the input is a file
// scan whose existing order already satisfies the required ordering.
// Otherwise, this rule inserts the necessary repartitioning and sorting
// operators.
// This used to be implemented as two separate rules: `EnforceDistribution`
// and `EnforceSorting`. It is now a single idempotent rule with
// distribution-aware `pushdown_sorts`.
// See https://github.com/apache/datafusion/issues/21973.
```
Added more comments since this is the entry point.
I also have a question: what is this 'distribution-aware pushdown_sorts'? We
could link a reference to it.
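
To illustrate the behavior the suggested comment describes (leave the plan unchanged when the requirement is already met, otherwise insert an operator, and stay idempotent), here is a minimal toy sketch. It covers ordering only and ignores distribution. The `Plan` enum and the `ensure_ordering` function are hypothetical names made up for this example; they are not DataFusion types or the PR's implementation.

```rust
// Toy model only: `Plan` and `ensure_ordering` are hypothetical and are
// NOT part of DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// A scan whose output is ordered by `ordered_by` (`None` = unordered).
    Scan { ordered_by: Option<String> },
    /// A sort operator inserted to satisfy a required ordering.
    Sort { key: String, input: Box<Plan> },
}

impl Plan {
    /// The ordering this node's output is known to have, if any.
    fn output_ordering(&self) -> Option<&str> {
        match self {
            Plan::Scan { ordered_by } => ordered_by.as_deref(),
            Plan::Sort { key, .. } => Some(key.as_str()),
        }
    }
}

/// Returns `input` unchanged if it already satisfies `required`;
/// otherwise wraps it in a `Sort` on `required`.
fn ensure_ordering(input: Plan, required: &str) -> Plan {
    if input.output_ordering() == Some(required) {
        input
    } else {
        Plan::Sort {
            key: required.to_string(),
            input: Box::new(input),
        }
    }
}

fn main() {
    // Already-sorted scan: no sort is added.
    let sorted_scan = Plan::Scan { ordered_by: Some("a".to_string()) };
    assert_eq!(ensure_ordering(sorted_scan.clone(), "a"), sorted_scan);

    // Unsorted scan: one sort is added, and re-running the rule adds nothing.
    let once = ensure_ordering(Plan::Scan { ordered_by: None }, "a");
    assert!(matches!(once, Plan::Sort { .. }));
    let twice = ensure_ordering(once.clone(), "a");
    assert_eq!(once, twice);
}
```

The second half of `main` is the idempotency property in miniature: applying the rule to its own output returns the same plan.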
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]