peter-toth opened a new pull request, #56275:
URL: https://github.com/apache/spark/pull/56275

   ### What changes were proposed in this pull request?
   
   Thread an optional `QueryPlanningTracker` parameter into 
`QueryExecution.prepareForExecution` and 
`AdaptiveSparkPlanExec.applyPhysicalRules`, and record per-rule timing the same 
way `QueryExecution.normalize` already does. The primary call site in 
`lazyExecutedPlan`, the five in-class call sites in `AdaptiveSparkPlanExec` 
(initial preparations, query post-planner strategy rules, post-stage creation 
-- both result and intermediate -- and AQE replanning), and the preprocessing 
call site in `InsertAdaptiveSparkPlan` all pass `Some(context.qe.tracker)`. 
`AdaptiveSparkPlanExec.reOptimize` now uses `optimizer.executeAndTrack` instead 
of `optimizer.execute` so AQE re-optimizer rule timing is also recorded.
   
   After this change, preparation rules of the main query 
(`EnsureRequirements`, `CollapseCodegenStages`, `ReuseExchangeAndSubquery`, 
etc.) and AQE-only rules (`AdjustShuffleExchangePosition`, `ValidateSparkPlan`, 
`OptimizeSkewedJoin`, `PlanAdaptiveSubqueries`, etc.) become visible via 
`QueryPlanningTracker.rules` and `topRulesByTime`.
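   
   For readers unfamiliar with the pattern, the following is a minimal, self-contained sketch of a `foldLeft` over rules that records per-rule timing only when a tracker is supplied. It is not the code in this PR: `ToyTracker`, `RuleStats`, `ToyRule`, and `TimedFold.applyRules` are hypothetical stand-ins, and the plan is modelled as a plain `String` so that the example runs without Spark.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a planning tracker: accumulates total time and
// invocation count per rule name. Not the real QueryPlanningTracker.
final case class RuleStats(totalTimeNs: Long, invocations: Long)

final class ToyTracker {
  private val stats = mutable.LinkedHashMap.empty[String, RuleStats]

  def record(ruleName: String, timeNs: Long): Unit = {
    val prev = stats.getOrElse(ruleName, RuleStats(0L, 0L))
    stats(ruleName) = RuleStats(prev.totalTimeNs + timeNs, prev.invocations + 1)
  }

  def rules: Map[String, RuleStats] = stats.toMap

  // Rules sorted by accumulated time, slowest first.
  def topRulesByTime(k: Int): Seq[(String, RuleStats)] =
    stats.toSeq.sortBy { case (_, s) => -s.totalTimeNs }.take(k)
}

// Hypothetical rule: a named plan-to-plan function (the plan is a String here).
final case class ToyRule(name: String, transform: String => String)

object TimedFold {
  // Applies the rules in order. With the default `None` the fold behaves like
  // a plain foldLeft; with `Some(tracker)` each rule's elapsed time is recorded.
  def applyRules(
      plan: String,
      rules: Seq[ToyRule],
      tracker: Option[ToyTracker] = None): String =
    rules.foldLeft(plan) { (current, rule) =>
      tracker match {
        case Some(t) =>
          val start = System.nanoTime()
          val result = rule.transform(current)
          t.record(rule.name, System.nanoTime() - start)
          result
        case None =>
          rule.transform(current)
      }
    }
}
```

   In this sketch, a caller that passes `Some(tracker)` can afterwards inspect `tracker.rules` or `tracker.topRulesByTime(k)`, while callers that keep the default `None` see no behavior change, which mirrors the opt-in shape described above.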
   
   The following preparation paths keep the default `None` and are 
intentionally left as follow-ups:
   - `QueryExecution.prepareExecutedPlan(spark, plan)` -- subquery preparation 
called from `PlanSubqueries`.
   - `QueryExecution.prepareExecutedPlan(plan, context)` -- AQE dynamic pruning 
subquery preparation called from `PlanAdaptiveDynamicPruningFilters`.
   - `AdaptiveSparkPlanExec.optimizeQueryStage` -- the per-stage 
`queryStageOptimizerRules` foldLeft, which has its own `AQEShuffleReadRule` 
rollback handling and is structurally different from `applyPhysicalRules`.
   
   ### Why are the changes needed?
   
   Preparation rules ran through plain `foldLeft` patterns and bypassed 
`RuleExecutor.execute`, so neither `RuleExecutor.dumpTimeSpent()` nor 
`QueryPlanningTracker.topRulesByTime` reported them. Their wall-clock time was 
only visible as part of the `planning` phase total. In real workloads, a 
long-running preparation rule -- e.g. `EnsureRequirements` over a key-grouped 
join with many partitions, or AQE rules applied per stage -- had no per-rule 
timing, which made planning-time gaps hard to diagnose. 
`QueryExecution.normalize` already takes an optional tracker and records 
per-rule timing in exactly this shape; this change extends that precedent 
to `prepareForExecution` and to `AdaptiveSparkPlanExec.applyPhysicalRules`, 
and switches the AQE re-optimizer call to its tracking variant.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No. Observability only. `tracker.rules` and `tracker.topRulesByTime(...)` 
now contain entries for preparation and AQE rules in addition to 
analyzer/optimizer rules. No plan changes, no public API additions.
   
   ### How was this patch tested?
   
   New tests in `QueryPlanningTrackerEndToEndSuite`:
   - `SPARK-57212: Track preparation rules` -- runs a non-shuffle query and 
asserts that `EnsureRequirements` and `CollapseCodegenStages` appear in 
`tracker.rules`.
   - `SPARK-57212: Track AQE-internal preparation rules` -- runs a shuffle 
query and asserts that the AQE-only rules `AdjustShuffleExchangePosition` and 
`ValidateSparkPlan` appear, confirming the AQE side of the wiring.
   
   `build/sbt 'sql/testOnly *QueryPlanningTrackerEndToEndSuite'`
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Opus 4.7


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

