peter-toth commented on code in PR #56277:
URL: https://github.com/apache/spark/pull/56277#discussion_r3381297254
##########
sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala:
##########
@@ -132,6 +132,12 @@ class SparkSessionExtensionSuite extends PlanTest with
AdaptiveSparkPlanHelper {
}
}
+ test("SPARK-57194: inject early operator optimization rule") {
+ withSession(Seq(_.injectPreOperatorOptimizationRule(MyRule))) { session =>
+
assert(session.sessionState.optimizer.batches.flatMap(_.rules).contains(MyRule(session)))
Review Comment:
`batches.flatMap(_.rules).contains(MyRule(session))` only proves the rule
landed in *some* batch -- it would still pass if a wiring bug put it in
`operatorOptimizationBatch`, the very interleaving this feature exists to
avoid. The sibling `injectPreCBORule` test pins the specific accessor instead
(`optimizer.preCBORules.contains(...)`, line 131). Since this PR exposes
`preOperatorOptimizationRules`, assert against it so the test actually checks
the wiring:
```suggestion
assert(session.sessionState.optimizer.preOperatorOptimizationRules.contains(MyRule(session)))
```
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala:
##########
@@ -228,6 +228,8 @@ abstract class Optimizer(catalogManager: CatalogManager)
Batch("Aggregate", fixedPoint,
RemoveLiteralFromGroupExpressions,
RemoveRepetitionFromGroupExpressions),
+ Batch("Pre Operator Optimization", Once,
Review Comment:
Several placement-sensitive batches in this list carry a comment explaining
the constraint that pins them where they are -- `Subquery` (FixedPoint(1) to
skip idempotence), `Pre CBO Rules` ("after operator optimization, before
stats-dependent batches"), `Early Filter and Projection Push-Down`, `Join
Reorder`. This batch's entire purpose is its position -- it has to run before
`FoldablePropagation` / `ConstantFolding` in `operatorOptimizationBatch` -- but
nothing here records that, so a future batch reordering has no signal to
preserve. Worth a one-liner:
```scala
// Injected rules run once here, before the operator-optimization fixed
point, so they can
// observe the plan before FoldablePropagation/ConstantFolding rewrite it.
Keep this before
// `operatorOptimizationBatch`.
Batch("Pre Operator Optimization", Once,
preOperatorOptimizationRules: _*),
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]