zhongyu09 commented on a change in pull request #30998:
URL: https://github.com/apache/spark/pull/30998#discussion_r551166759
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala
##########
@@ -189,8 +189,17 @@ case class AdaptiveSparkPlanExec(
stagesToReplace = result.newStages ++ stagesToReplace
executionId.foreach(onUpdatePlan(_, result.newStages.map(_.plan)))
+ // SPARK-33933: we should submit tasks of broadcast stages first, to avoid waiting
+ // for tasks to be scheduled and leading to broadcast timeout.
+ val reorderedNewStages = result.newStages
Review comment:
As for flakiness, I admit there is a little.
I believe the order in which `materialize` is called determines the order in
which tasks are scheduled under normal circumstances. To be honest, though,
this guarantee is not strict, since the broadcast job and the shuffle map
stage (job) are submitted from different threads. But at least we reach the
same level as non-AQE.
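
To make the ordering idea concrete, here is a minimal, self-contained sketch of "broadcast stages first". It does not use Spark's classes: `Stage`, `BroadcastStage`, `ShuffleStage`, and `broadcastFirst` are hypothetical stand-ins invented for illustration, and this is not necessarily how the PR implements the reordering.

```scala
object StageOrderingSketch {
  // Hypothetical stand-ins for query stages; not Spark's actual classes.
  sealed trait Stage { def id: Int }
  final case class BroadcastStage(id: Int) extends Stage
  final case class ShuffleStage(id: Int) extends Stage

  // Move broadcast stages to the front while keeping the relative order
  // inside each group. `partition` preserves encounter order.
  def broadcastFirst(stages: Seq[Stage]): Seq[Stage] = {
    val (broadcasts, others) = stages.partition {
      case _: BroadcastStage => true
      case _                 => false
    }
    broadcasts ++ others
  }
}
```

Calling `materialize` on the stages in the returned order would request the broadcast jobs first. As noted above, that only fixes the order of the calls; because the jobs are submitted from different threads, it does not strictly fix the order in which tasks reach the scheduler.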