zhuqi-lucas commented on code in PR #21976:
URL: https://github.com/apache/datafusion/pull/21976#discussion_r3278975177
##########
datafusion/core/src/optimizer_rule_reference.md:
##########
@@ -67,27 +67,26 @@ Rule order matters. The default pipeline may change between
releases.
The same rule name may appear more than once when the default pipeline runs it
in multiple phases.
-| order | rule | phase | summary |
-| ----- | ------------------------------ | ----------------------- | ------------------------------------------------------------------------------------------------------------ |
-| 1 | `OutputRequirements` | add phase | Adds helper nodes so output requirements survive later physical rewrites. |
-| 2 | `aggregate_statistics` | - | Uses exact source statistics to answer some aggregates without scanning data. |
-| 3 | `join_selection` | - | Chooses join implementation, build side, and partition mode from statistics and stream properties. |
-| 4 | `LimitedDistinctAggregation` | - | Pushes limit hints into grouped distinct-style aggregations when only a small result is needed. |
-| 5 | `FilterPushdown` | pre-optimization phase | Pushes supported physical filters down toward data sources before distribution and sorting are enforced. |
-| 6 | `EnforceDistribution` | - | Adds repartitioning only where needed to satisfy physical distribution requirements. |
-| 7 | `CombinePartialFinalAggregate` | - | Collapses adjacent partial and final aggregates when the distributed shape makes them redundant. |
-| 8 | `EnforceSorting` | - | Adds or removes local sorts to satisfy required input orderings. |
-| 9 | `OptimizeAggregateOrder` | - | Updates aggregate expressions to use the best ordering once sort requirements are known. |
-| 10 | `WindowTopN` | - | Replaces eligible row-number window and filter patterns with per-partition TopK execution. |
-| 11 | `ProjectionPushdown` | early pass | Pushes projections toward inputs before later physical rewrites add more limit and TopK structure. |
-| 12 | `OutputRequirements` | remove phase | Removes the temporary output-requirement helper nodes after requirement-sensitive planning is done. |
-| 13 | `LimitAggregation` | - | Passes a limit hint into eligible aggregations so they can keep fewer accumulator buckets. |
-| 14 | `LimitPushPastWindows` | - | Pushes fetch limits through bounded window operators when doing so keeps the result correct. |
-| 15 | `HashJoinBuffering` | - | Adds buffering on the probe side of hash joins so probing can start before build completion. |
-| 16 | `LimitPushdown` | - | Moves physical limits into child operators or fetch-enabled variants to cut data early. |
-| 17 | `TopKRepartition` | - | Pushes TopK below hash repartition when the partition key is a prefix of the sort key. |
-| 18 | `ProjectionPushdown` | late pass | Runs projection pushdown again after limit and TopK rewrites expose new pruning opportunities. |
-| 19 | `PushdownSort` | - | Pushes sort requirements into data sources that can already return sorted output. |
-| 20 | `EnsureCooperative` | - | Wraps non-cooperative plan parts so long-running tasks yield fairly. |
-| 21 | `FilterPushdown(Post)` | post-optimization phase | Pushes dynamic filters at the end of optimization, after plan references stop moving. |
-| 22 | `SanityCheckPlan` | - | Validates that the final physical plan meets ordering, distribution, and infinite-input safety requirements. |
+| order | rule | phase | summary |
+| ----- | ------------------------------ | ----------------------- | -------------------------------------------------------------------------------------------------------------------------------- |
+| 1 | `OutputRequirements` | add phase | Adds helper nodes so output requirements survive later physical rewrites. |
+| 2 | `aggregate_statistics` | - | Uses exact source statistics to answer some aggregates without scanning data. |
+| 3 | `join_selection` | - | Chooses join implementation, build side, and partition mode from statistics and stream properties. |
+| 4 | `LimitedDistinctAggregation` | - | Pushes limit hints into grouped distinct-style aggregations when only a small result is needed. |
+| 5 | `FilterPushdown` | pre-optimization phase | Pushes supported physical filters down toward data sources before distribution and sorting are enforced. |
+| 6 | `EnsureRequirements` | - | Enforces both distribution and sorting requirements in a single idempotent rule (replaces EnforceDistribution + EnforceSorting). |
Review Comment:
Good point — once those rules are gone, the "replaces" parenthetical only
dangles a reference readers can't grep. Dropped it; now just "Enforces both
distribution and sorting requirements in a single idempotent rule."
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]