Mark Jarvin created SPARK-57158:
-----------------------------------

             Summary: ExplainUtils: extract operator ID assignment phase from 
processPlan into a private helper
                 Key: SPARK-57158
                 URL: https://issues.apache.org/jira/browse/SPARK-57158
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 4.2.0, 4.1.3
            Reporter: Mark Jarvin


ExplainUtils.processPlan currently conflates two distinct phases in a single 
method body:
 # Operator ID assignment: traversing the plan tree (main plan, subqueries, and 
adaptively-optimized-out exchanges per SPARK-42753) to populate an 
IdentityHashMap with monotonically increasing IDs.
 # Text output generation: calling processPlanSkippingSubqueries on each 
discovered subtree to format the verbose explain string.

These phases are sequential and independent: the second begins only after the 
first is fully complete. Despite this, both live in the same 40-line method 
body, and the ID-assignment scaffolding (reused-exchange tracking, subquery 
collection, the SPARK-42753 optimized-out exchange loop) makes it hard to see 
where phase 1 ends and phase 2 begins.

We should extract the ID-assignment phase into a private 
assignOperatorIds(plan, idMap) helper that returns the discovered (subqueries, 
optimizedOutExchanges) pair to the caller. processPlan is then reduced to three 
steps: initialize the idMap, call assignOperatorIds, then perform the 
text-output pass over what was found. Behavior is unchanged; this is a pure 
refactoring.
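
A minimal sketch of the proposed shape, for illustration only. It is not the 
actual ExplainUtils code: Node, ExplainSketch, the traversal order, and the 
output format are all hypothetical stand-ins, and reused-exchange tracking is 
omitted. Only the names assignOperatorIds, processPlan, and idMap come from the 
description above.

```scala
import java.util.IdentityHashMap
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a physical plan node (NOT Spark's SparkPlan).
final class Node(
    val name: String,
    val children: Seq[Node] = Nil,
    val subqueries: Seq[Node] = Nil,
    val optimizedOut: Seq[Node] = Nil)

object ExplainSketch {

  // Phase 1: walk everything reachable from `plan`, assign each node the next
  // ID, and return the subquery roots and optimized-out exchange roots found.
  private def assignOperatorIds(
      plan: Node,
      idMap: IdentityHashMap[Node, Int]): (Seq[Node], Seq[Node]) = {
    val subqueries = ArrayBuffer.empty[Node]
    val optimizedOut = ArrayBuffer.empty[Node]

    def visit(node: Node): Unit = {
      // Keyed by reference identity, so each node instance gets exactly one ID.
      if (!idMap.containsKey(node)) {
        idMap.put(node, idMap.size + 1)
        node.children.foreach(visit)
        node.subqueries.foreach { sq => subqueries += sq; visit(sq) }
        node.optimizedOut.foreach { ex => optimizedOut += ex; visit(ex) }
      }
    }

    visit(plan)
    (subqueries.toSeq, optimizedOut.toSeq)
  }

  // Phase 2 lives in the caller: it only reads idMap, never writes to it.
  def processPlan(plan: Node): String = {
    val idMap = new IdentityHashMap[Node, Int]()
    val (subqueries, optimizedOut) = assignOperatorIds(plan, idMap)

    val out = new StringBuilder
    def render(root: Node, header: String): Unit = {
      out.append(header).append('\n')
      def go(n: Node, depth: Int): Unit = {
        out.append("  " * depth).append(s"(${idMap.get(n)}) ${n.name}\n")
        n.children.foreach(go(_, depth + 1))
      }
      go(root, 0)
    }

    render(plan, "== Plan ==")
    subqueries.foreach(render(_, "== Subquery =="))
    optimizedOut.foreach(render(_, "== Optimized-out exchange =="))
    out.toString
  }
}
```

With this split, the boundary between the two phases is the single call to 
assignOperatorIds: everything before it sets up state, and everything after it 
only formats what was found.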



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
