Ali Afroozeh created SPARK-31721:
------------------------------------
Summary: Assert optimized plan is initialized before tracking the
execution of planning
Key: SPARK-31721
URL: https://issues.apache.org/jira/browse/SPARK-31721
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.0.0
Reporter: Ali Afroozeh
The {{QueryPlanningTracker}} in {{QueryExecution}} reports a planning time
that also includes the optimization time. This happens because
{{optimizedPlan}} in {{QueryExecution}} is lazy and is initialized only when
first accessed. When {{df.queryExecution.executedPlan}} is called, the
tracker starts recording the planning time and only then evaluates the
optimized plan. As a result, the planning measurement starts before
optimization and also includes the optimization time.
This PR fixes this behavior by introducing a method {{assertOptimized}},
similar to {{assertAnalyzed}}, that explicitly initializes the optimized plan.
This method is called before measuring the time for {{sparkPlan}} and
{{executedPlan}}. We call it before {{sparkPlan}} because that also counts as
planning time.
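
The effect can be reproduced outside Spark with a small toy model. The sketch
below is not Spark's code: {{PhaseTracker}}, {{ToyQueryExecution}} and the
{{forceOptimizedFirst}} flag are hypothetical names invented for illustration,
and the 50 ms sleep merely stands in for optimizer work. It only assumes that
a lazy value is evaluated inside whichever timed block touches it first.

```scala
import scala.collection.mutable

// Toy model only: PhaseTracker and ToyQueryExecution are hypothetical
// stand-ins, not Spark's QueryPlanningTracker or QueryExecution.
object LazyPhaseTimingSketch {

  // Accumulates wall-clock nanoseconds per named phase.
  final class PhaseTracker {
    val nanos: mutable.Map[String, Long] = mutable.Map.empty

    def measure[T](phase: String)(body: => T): T = {
      val start = System.nanoTime()
      try body
      finally nanos(phase) = nanos.getOrElse(phase, 0L) + (System.nanoTime() - start)
    }
  }

  final class ToyQueryExecution(tracker: PhaseTracker, forceOptimizedFirst: Boolean) {
    // Records the order in which phases start and finish.
    val events: mutable.ArrayBuffer[String] = mutable.ArrayBuffer.empty

    // Lazy, so the work runs on first access, wherever that happens.
    lazy val optimizedPlan: String = tracker.measure("optimization") {
      events += "optimize:start"
      Thread.sleep(50) // simulated optimizer work
      events += "optimize:end"
      "optimized"
    }

    // Forces the lazy val; the result is intentionally discarded.
    def assertOptimized(): Unit = optimizedPlan

    lazy val sparkPlan: String = {
      // With the flag set, optimization finishes before the planning timer starts.
      if (forceOptimizedFirst) assertOptimized()
      tracker.measure("planning") {
        events += "plan:start"
        val plan = s"physical($optimizedPlan)"
        events += "plan:end"
        plan
      }
    }
  }

  // Builds a fresh toy execution, forces sparkPlan, and returns what happened.
  def run(forceOptimizedFirst: Boolean): (Seq[String], Map[String, Long]) = {
    val tracker = new PhaseTracker
    val qe = new ToyQueryExecution(tracker, forceOptimizedFirst)
    qe.sparkPlan
    (qe.events.toSeq, tracker.nanos.toMap)
  }

  def main(args: Array[String]): Unit = {
    for (force <- Seq(false, true)) {
      val (events, nanos) = run(force)
      println(s"forceOptimizedFirst=$force events=${events.mkString(",")} nanos=$nanos")
    }
  }
}
```

Without the flag, the optimization events are nested inside the planning
events, so the "planning" figure absorbs the optimizer's time. With the flag,
the two phases run back to back and are measured separately.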
--
This message was sent by Atlassian Jira
(v8.3.4#803005)