Mark Jarvin created SPARK-57158:
-----------------------------------
Summary: ExplainUtils: extract operator ID assignment phase from
processPlan into a private helper
Key: SPARK-57158
URL: https://issues.apache.org/jira/browse/SPARK-57158
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 4.2.0, 4.1.3
Reporter: Mark Jarvin
ExplainUtils.processPlan currently conflates two distinct phases in a single
method body:
# Operator ID assignment: traversing the plan tree (main plan, subqueries, and
adaptively-optimized-out exchanges per SPARK-42753) to populate an
IdentityHashMap with monotonically-increasing IDs.
# Text output generation: calling processPlanSkippingSubqueries on each
discovered subtree to format the verbose explain string.
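
To illustrate the first phase, here is a minimal sketch of identity-keyed ID assignment. It assumes a post-order traversal and uses a hypothetical `Node` class and `assignIds` helper rather than Spark's actual plan types or methods. Keying by reference means two structurally equal operators still receive distinct IDs.

```scala
import java.util.IdentityHashMap

// Hypothetical stand-in for a plan node; not Spark's QueryPlan/SparkPlan.
final case class Node(name: String, children: Seq[Node] = Nil)

object IdSketch {
  // Walks the tree children-first and gives each not-yet-seen node
  // (compared by reference, not by equals) the next ID.
  // Returns the last ID handed out so a later traversal can continue from it.
  def assignIds(root: Node, idMap: IdentityHashMap[Node, Int], startId: Int): Int = {
    var current = startId
    def visit(n: Node): Unit = {
      n.children.foreach(visit)
      if (!idMap.containsKey(n)) {
        current += 1
        idMap.put(n, current)
      }
    }
    visit(root)
    current
  }
}
```

Passing the returned ID as `startId` for the next subtree keeps the numbering monotonically increasing across the main plan and any subtrees visited afterwards.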
These phases are sequential and independent: the second begins only after the
first is fully complete. Despite this, both share the same 40-line method body,
and the ID-assignment scaffolding (reused-exchange tracking, subquery
collection, the SPARK-42753 optimized-out exchange loop) makes it hard to see
where phase 1 ends and phase 2 begins.
We should extract the ID-assignment phase into a private
assignOperatorIds(plan, idMap) helper that returns the discovered (subqueries,
optimizedOutExchanges) for the caller. processPlan is then reduced to:
initialize the idMap, call assignOperatorIds, then perform the text-output pass
over what was found. Behavior is unchanged; this is a pure refactoring.
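
The proposed shape could look roughly like the sketch below. It is a simplified model under stated assumptions, not the actual ExplainUtils code: `Plan`, `generateIds`, `collectSubqueries`, and `render` are hypothetical stand-ins, and the SPARK-42753 optimized-out exchange discovery is reduced to an empty placeholder. Only the names `processPlan`, `assignOperatorIds`, and `idMap` come from the description above.

```scala
import java.util.IdentityHashMap
import scala.collection.mutable.ArrayBuffer

// Hypothetical, simplified plan model; the real code operates on Spark plan classes.
final case class Plan(name: String, children: Seq[Plan] = Nil, subqueries: Seq[Plan] = Nil)

object ExplainSketch {
  // Assigns the next IDs to unseen nodes (by reference), children first.
  private def generateIds(plan: Plan, startId: Int, idMap: IdentityHashMap[Plan, Int]): Int = {
    var current = startId
    def visit(p: Plan): Unit = {
      p.children.foreach(visit)
      if (!idMap.containsKey(p)) { current += 1; idMap.put(p, current) }
    }
    visit(plan)
    current
  }

  // Collects subquery roots reachable from the plan, including nested ones.
  private def collectSubqueries(plan: Plan, out: ArrayBuffer[Plan]): Unit = {
    plan.subqueries.foreach { s => out += s; collectSubqueries(s, out) }
    plan.children.foreach(collectSubqueries(_, out))
  }

  // Phase 1: populate idMap and return what was discovered for the caller.
  private def assignOperatorIds(
      plan: Plan,
      idMap: IdentityHashMap[Plan, Int]): (Seq[Plan], Seq[Plan]) = {
    val subqueries = ArrayBuffer.empty[Plan]
    collectSubqueries(plan, subqueries)
    var currentId = generateIds(plan, 0, idMap)
    subqueries.foreach { s => currentId = generateIds(s, currentId, idMap) }
    // Placeholder: optimized-out exchange discovery is omitted in this sketch.
    val optimizedOutExchanges = Seq.empty[Plan]
    (subqueries.toSeq, optimizedOutExchanges)
  }

  // Phase 2 helper: writes one indented line per operator with its assigned ID.
  private def render(plan: Plan, idMap: IdentityHashMap[Plan, Int], append: String => Unit): Unit = {
    def visit(p: Plan, depth: Int): Unit = {
      append(("  " * depth) + s"${p.name} (${idMap.get(p)})\n")
      p.children.foreach(visit(_, depth + 1))
    }
    visit(plan, 0)
  }

  // Initialize the map, run phase 1, then run the text-output pass.
  def processPlan(plan: Plan, append: String => Unit): Unit = {
    val idMap = new IdentityHashMap[Plan, Int]()
    val (subqueries, optimizedOutExchanges) = assignOperatorIds(plan, idMap)
    (plan +: (subqueries ++ optimizedOutExchanges)).foreach(render(_, idMap, append))
  }
}
```

With this split, the boundary between the two phases is the single `assignOperatorIds` call, and everything after it in `processPlan` only reads from `idMap`.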
--
This message was sent by Atlassian Jira
(v8.20.10#820010)