[
https://issues.apache.org/jira/browse/SPARK-31721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17108223#comment-17108223
]
Apache Spark commented on SPARK-31721:
--------------------------------------
User 'dbaliafroozeh' has created a pull request for this issue:
https://github.com/apache/spark/pull/28543
> Assert optimized plan is initialized before tracking the execution of planning
> ------------------------------------------------------------------------------
>
> Key: SPARK-31721
> URL: https://issues.apache.org/jira/browse/SPARK-31721
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Ali Afroozeh
> Priority: Major
>
> The {{QueryPlanningTracker}} in {{QueryExecution}} reports a planning time
> that also includes the optimization time. This happens because the
> {{optimizedPlan}} in {{QueryExecution}} is lazy and is only initialized when
> first accessed. When {{df.queryExecution.executedPlan}} is called, the
> tracker starts recording the planning time and only then triggers
> initialization of the optimized plan. As a result, the planning measurement
> starts before optimization and also includes the optimization time.
> This PR fixes this behavior by introducing a method {{assertOptimized}},
> similar to {{assertAnalyzed}}, that explicitly initializes the optimized plan.
> This method is called before measuring the time for {{sparkPlan}} and
> {{executedPlan}}. We call it before {{sparkPlan}} because that also counts as
> planning time.
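
The effect described above can be reproduced without Spark. The following is a minimal sketch, not the code from the pull request: {{FakeClock}}, {{PhaseTimer}}, {{ToyQueryExecution}}, the {{eagerOptimize}} flag and the tick costs are all hypothetical names and values chosen for illustration. Only the idea of forcing the lazy {{optimizedPlan}} before the planning measurement starts is taken from the issue description.

```scala
import scala.collection.mutable

// Logical clock so the example is deterministic; real code would read System.nanoTime().
final class FakeClock {
  private var now = 0L
  def advance(ticks: Long): Unit = now += ticks
  def time: Long = now
}

// Minimal per-phase timer (hypothetical; only loosely plays the role of QueryPlanningTracker).
final class PhaseTimer(clock: FakeClock) {
  private val totals = mutable.Map.empty[String, Long]

  def measure[T](phase: String)(body: => T): T = {
    val start = clock.time
    try body
    finally totals(phase) = totals.getOrElse(phase, 0L) + (clock.time - start)
  }

  def total(phase: String): Long = totals.getOrElse(phase, 0L)
}

// Hypothetical stand-in for QueryExecution; `eagerOptimize` switches the fix on or off.
final class ToyQueryExecution(clock: FakeClock, eagerOptimize: Boolean) {
  val timer = new PhaseTimer(clock)

  lazy val optimizedPlan: String = timer.measure("optimization") {
    clock.advance(200) // pretend optimization costs 200 ticks
    "optimized"
  }

  // Forces the lazy val so that later phases do not pay for its initialization.
  def assertOptimized(): Unit = optimizedPlan

  lazy val sparkPlan: String = {
    if (eagerOptimize) assertOptimized()
    timer.measure("planning") {
      clock.advance(10) // pretend planning costs 10 ticks
      s"physical($optimizedPlan)"
    }
  }
}

object Demo {
  // Returns the ticks attributed to the "planning" phase.
  def planningTicks(eagerOptimize: Boolean): Long = {
    val qe = new ToyQueryExecution(new FakeClock, eagerOptimize)
    qe.sparkPlan
    qe.timer.total("planning")
  }

  def main(args: Array[String]): Unit = {
    println(planningTicks(eagerOptimize = false)) // 210: optimization is counted as planning
    println(planningTicks(eagerOptimize = true))  // 10: optimization was forced beforehand
  }
}
```

In the sketch, the first access to {{optimizedPlan}} happens inside the planning measurement unless {{assertOptimized()}} is called first, which is why the planning total drops from 210 ticks to 10 once the lazy value is forced up front.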