[PR] [SPARK-56869][SQL] Speed up TreeNode transforms when rule doesn't match [spark]

via GitHub Thu, 14 May 2026 14:51:37 -0700


gengliangwang opened a new pull request, #55889:
URL: https://github.com/apache/spark/pull/55889


   ### What changes were proposed in this pull request?
   
   Gate `CurrentOrigin.withOrigin(origin) { rule.applyOrElse(...) }` in four 
`TreeNode` rule-driven transform methods behind `rule.isDefinedAt(...)`, so the 
ThreadLocal wrap only runs when the rule actually fires:
   
   - `TreeNode.transformDownWithPruning`
   - `TreeNode.transformUpWithPruning`
   - `TreeNode.transformUpWithBeforeAndAfterRuleOnChildren`
   - `TreeNode.multiTransformDownWithPruning` (also drops a side-effecting 
default closure that became unnecessary)
   
   ### Why are the changes needed?
   
   `CurrentOrigin.withOrigin` is only observable when the rule constructs new 
nodes — they pick up `CurrentOrigin.get` in their `override val origin` field. 
On nodes the rule doesn't match, the wrap is pure overhead: two `ThreadLocal` 
writes plus a `try`/`finally` per node visit.
   
   JFR profiling (60s sample, 1.1M iterations of `transformDown` over a 
1024-leaf balanced `Add` tree with a non-matching rule) shows:
   
   - 66% of CPU samples in `ThreadLocalMap.set` (line 486)
   - 13% in `ThreadLocalMap.getEntryAfterMiss`
   -  9% more in `ThreadLocalMap.set` (line 493)
   
   Total: **~88% of transform CPU spent inside `CurrentOrigin.withOrigin` for 
nodes the rule never matched.**
   
   Microbenchmark (best time per N iterations, JDK 17, Xeon 8175M @ 2.50GHz, 
baseline = `upstream/master`):
   
   | case                                         | baseline  | optimized | 
speedup |
   
|----------------------------------------------|----------:|----------:|--------:|
   | `transformDown` deep chain(5000) no-op       |     12 ms |      6 ms |    
2.0x |
   | `transformDown` deep chain(5000) rewrite leaf|  20109 ms |  15850 ms |   
1.27x |
   | `transformDown` wide(100) no-op              |      5 ms |      3 ms |    
1.7x |
   | `transformDown` balanced(1024) no-op         |      7 ms |      2 ms |    
3.5x |
   | `transformDown` balanced(4096) no-op         |     34 ms |     15 ms |    
2.3x |
   | `transformUp` deep chain(1000) no-op         |      3 ms |      1 ms |    
3.0x |
   | `transformUp` deep chain(5000) no-op         |     15 ms |      6 ms |    
2.5x |
   | `transformUp` balanced(1024) no-op           |      8 ms |      3 ms |    
2.7x |
   | `transformUp` balanced(4096) no-op           |     39 ms |     25 ms |    
1.6x |
   
   Rewrite-heavy cases are unchanged because `withOrigin` still runs when the 
rule fires. Real Spark workloads (analyzer / optimizer batches running many 
rules across many nodes, where each rule matches a small subset) are dominated 
by the no-match case, so these savings should compound.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No. The change is internal to `TreeNode` rule machinery and preserves all 
observable semantics:
   
   - `CurrentOrigin` is still set before any rule body that constructs new 
nodes runs.
   - `markRuleAsIneffective` / `isRuleIneffective` bookkeeping unchanged.
   - `copyTagsFrom` ordering on rule-replacement unchanged.
   - `fastEquals` short-circuit unchanged.
   - Result identity (`this eq result` on a no-op transform) preserved.
   
   ### How was this patch tested?
   
   - `build/sbt 'catalyst/testOnly *TreeNodeSuite'` — 36 / 36 pass.
   - `build/sbt 'catalyst/test'` — 9272 / 9272 pass across 352 suites (~7 min).
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Code (Anthropic Claude Opus 4.7)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[PR] [SPARK-56869][SQL] Speed up TreeNode transforms when rule doesn't match [spark]

Reply via email to