peter-toth commented on code in PR #56277:
URL: https://github.com/apache/spark/pull/56277#discussion_r3381297254


##########
sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala:
##########
@@ -132,6 +132,12 @@ class SparkSessionExtensionSuite extends PlanTest with 
AdaptiveSparkPlanHelper {
     }
   }
 
+  test("SPARK-57194: inject early operator optimization rule") {
+    withSession(Seq(_.injectPreOperatorOptimizationRule(MyRule))) { session =>
+      
assert(session.sessionState.optimizer.batches.flatMap(_.rules).contains(MyRule(session)))

Review Comment:
   `batches.flatMap(_.rules).contains(MyRule(session))` only proves the rule 
landed in *some* batch -- it would still pass if a wiring bug put it in 
`operatorOptimizationBatch`, the very interleaving this feature exists to 
avoid. The sibling `injectPreCBORule` test pins the specific accessor instead 
(`optimizer.preCBORules.contains(...)`, line 131). Since this PR exposes 
`preOperatorOptimizationRules`, assert against it so the test actually checks 
the wiring:
   
   ```suggestion
         
assert(session.sessionState.optimizer.preOperatorOptimizationRules.contains(MyRule(session)))
   ```
   



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala:
##########
@@ -228,6 +228,8 @@ abstract class Optimizer(catalogManager: CatalogManager)
     Batch("Aggregate", fixedPoint,
       RemoveLiteralFromGroupExpressions,
       RemoveRepetitionFromGroupExpressions),
+    Batch("Pre Operator Optimization", Once,

Review Comment:
   Several placement-sensitive batches in this list carry a comment explaining 
the constraint that pins them where they are -- `Subquery` (FixedPoint(1) to 
skip idempotence), `Pre CBO Rules` ("after operator optimization, before 
stats-dependent batches"), `Early Filter and Projection Push-Down`, `Join 
Reorder`. This batch's entire purpose is its position -- it has to run before 
`FoldablePropagation` / `ConstantFolding` in `operatorOptimizationBatch` -- but 
nothing here records that, so a future batch reordering has no signal to 
preserve. Worth a one-liner:
   
   ```scala
   // Injected rules run once here, before the operator-optimization fixed 
point, so they can
   // observe the plan before FoldablePropagation/ConstantFolding rewrite it. 
Keep this before
   // `operatorOptimizationBatch`.
   Batch("Pre Operator Optimization", Once,
     preOperatorOptimizationRules: _*),
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to