dbaliafroozeh opened a new pull request #28543:
URL: https://github.com/apache/spark/pull/28543


   ### What changes were proposed in this pull request?
   The QueryPlanningTracker in QueryExecution reports a planning time that 
also includes the optimization time. This happens because optimizedPlan in 
QueryExecution is lazy and is only initialized when first accessed. When 
df.queryExecution.executedPlan is called, the tracker starts recording the 
planning time and then accesses the optimized plan. As a result, the planning 
timer starts before optimization runs, so the reported planning time also 
includes the optimization time.
   This PR fixes this behavior by introducing a method assertOptimized, similar 
to assertAnalyzed, that explicitly initializes the optimized plan. This method 
is called before the time for sparkPlan and executedPlan is measured. It is 
called before sparkPlan as well because computing sparkPlan also counts as 
planning time.
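
The effect can be reproduced outside Spark with a toy model. The sketch below is not the code in this PR: `Tracker`, `Execution`, `eagerOptimize`, and `planningMillis` are hypothetical stand-ins, and the sleeps only simulate work. It shows how a lazy value that is first touched inside a timed block inflates that block's measurement, and how forcing the value beforehand avoids it.

```scala
// Toy model, not Spark code: every name here is a hypothetical stand-in.
import scala.collection.mutable

object LazyTimingSketch {
  // Accumulates the elapsed milliseconds recorded for each named phase.
  final class Tracker {
    val phases = mutable.LinkedHashMap.empty[String, Long]

    def measure[T](phase: String)(body: => T): T = {
      val start = System.nanoTime()
      try body
      finally {
        val elapsedMs = (System.nanoTime() - start) / 1000000L
        phases(phase) = phases.getOrElse(phase, 0L) + elapsedMs
      }
    }
  }

  final class Execution(tracker: Tracker, eagerOptimize: Boolean) {
    // Lazy: the optimization cost is paid by whoever touches this first.
    lazy val optimizedPlan: String = tracker.measure("optimization") {
      Thread.sleep(200) // simulated optimization work
      "optimized"
    }

    // Forces the lazy value so that later timers do not absorb its cost.
    def assertOptimized(): Unit = optimizedPlan

    lazy val sparkPlan: String = {
      if (eagerOptimize) assertOptimized()
      tracker.measure("planning") {
        Thread.sleep(20) // simulated physical planning work
        s"physical($optimizedPlan)"
      }
    }
  }

  // Returns the time attributed to the "planning" phase, in milliseconds.
  def planningMillis(eagerOptimize: Boolean): Long = {
    val tracker = new Tracker
    new Execution(tracker, eagerOptimize).sparkPlan
    tracker.phases("planning")
  }
}
```

Calling `LazyTimingSketch.planningMillis(eagerOptimize = false)` attributes the simulated optimization time to the planning phase, while `eagerOptimize = true` keeps the two phases separate.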
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Unit tests




