shrirangmhalgi opened a new pull request, #56083:
URL: https://github.com/apache/spark/pull/56083

   
   ### What changes were proposed in this pull request?
   Normalize the CTE IDs of orphan `CTERelationRef` nodes (refs that sit outside 
any `WithCTE`) in `NormalizeCTEIds`. Previously, only refs inside a `WithCTE` 
were normalized, via `canonicalizeCTE`; orphan refs kept their original IDs.
   
   ### Why are the changes needed?
   After `InlineCTE` or `MergeSubplans` runs, some `CTERelationRef` nodes can 
end up outside their parent `WithCTE` node. `NormalizeCTEIds` skips these 
orphan refs, so they keep non-normalized IDs. This breaks plan comparison and 
caching: `CTERelationDef` IDs come from a global, monotonically increasing 
counter, so the same logical plan gets different CTE IDs across sessions.
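
The effect described above can be sketched with a toy model. This is not Spark's code: `Scan`, `Ref`, `Def`, `With`, `Join`, `ToyNormalizeCTEIds` and `samplePlan` are hypothetical stand-ins for the Catalyst nodes, and the first-seen numbering is an assumption made for illustration, not necessarily the scheme the patch uses.

```scala
import scala.collection.mutable

// Toy stand-ins for Catalyst nodes. These are NOT Spark's real classes.
sealed trait Plan
case class Scan(table: String) extends Plan
case class Ref(cteId: Long) extends Plan                  // models CTERelationRef
case class Def(id: Long, child: Plan)                     // models CTERelationDef
case class With(body: Plan, defs: Seq[Def]) extends Plan  // models WithCTE
case class Join(left: Plan, right: Plan) extends Plan

object ToyNormalizeCTEIds {
  // normalizeOrphans = false models the old behavior: a ref is rewritten only
  // if its definition was registered by an enclosing With.
  // normalizeOrphans = true models the fix: every ref is rewritten.
  def normalize(plan: Plan, normalizeOrphans: Boolean): Plan = {
    val newIds = mutable.Map.empty[Long, Long]
    // Assumed scheme: hand out 0, 1, 2, ... in first-seen order.
    def idFor(old: Long): Long = newIds.getOrElseUpdate(old, newIds.size.toLong)

    def go(p: Plan): Plan = p match {
      case With(body, defs) =>
        // Register each definition's ID, then rewrite its child and the body.
        val newDefs = defs.map(d => Def(idFor(d.id), go(d.child)))
        With(go(body), newDefs)
      case Ref(id) if newIds.contains(id) || normalizeOrphans =>
        Ref(idFor(id))
      case Join(l, r) => Join(go(l), go(r))
      case other => other // leaves, and orphan refs under the old behavior
    }
    go(plan)
  }

  // The same logical shape with different raw IDs, as if the global counter
  // had advanced by different amounts in two sessions.
  def samplePlan(defId: Long, orphanId: Long): Plan =
    Join(With(Ref(defId), Seq(Def(defId, Scan("t")))), Ref(orphanId))

  def main(args: Array[String]): Unit = {
    val a = samplePlan(7, 19)
    val b = samplePlan(42, 57)
    // Old behavior: the orphan refs keep 19 and 57, so the plans differ.
    println(normalize(a, normalizeOrphans = false) == normalize(b, normalizeOrphans = false))
    // With orphans normalized, both plans become structurally equal.
    println(normalize(a, normalizeOrphans = true) == normalize(b, normalizeOrphans = true))
  }
}
```

In this sketch the two plans compare equal only when orphan refs are rewritten too, which is the property that plan comparison and caching rely on.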
   
   
   ### Does this PR introduce _any_ user-facing change?
   No. This is an internal plan normalization fix that affects plan caching 
correctness.
   
   
   ### How was this patch tested?
   Added `NormalizeCTEIdsSuite` with a test that constructs a plan with a 
`CTERelationRef` outside `WithCTE` and verifies all ref IDs are normalized. 
Without the fix, the orphan ref retains its original ID (100); with the fix, 
it's normalized to 0.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   Yes.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

