[
https://issues.apache.org/jira/browse/SPARK-20288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Imran Rashid reassigned SPARK-20288:
------------------------------------
Assignee: jin xing (was: Imran Rashid)
> Improve BasicSchedulerIntegrationSuite "multi-stage job"
> --------------------------------------------------------
>
> Key: SPARK-20288
> URL: https://issues.apache.org/jira/browse/SPARK-20288
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.1.0
> Reporter: jin xing
> Assignee: jin xing
> Priority: Minor
> Fix For: 2.3.0
>
>
> The shuffleId is determined before the job is submitted, but it is hard to
> predict the stageId from the shuffleId.
> Stages are created in DAGScheduler
> (https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L381),
> but the order in which they are created is not deterministic.
> I added a log statement,
> println(s"Creating ShufflMapStage-$id on shuffle-${shuffleDep.shuffleId}"),
> after
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L331
> and ran BasicSchedulerIntegrationSuite "multi-stage job". It prints either:
> Creating ShufflMapStage-0 on shuffle-0
> Creating ShufflMapStage-1 on shuffle-2
> Creating ShufflMapStage-2 on shuffle-1
> Creating ShufflMapStage-3 on shuffle-3
> or
> Creating ShufflMapStage-0 on shuffle-1
> Creating ShufflMapStage-1 on shuffle-3
> Creating ShufflMapStage-2 on shuffle-0
> Creating ShufflMapStage-3 on shuffle-2
> So it might be better for the test to avoid generating the MapStatus based on stageId.
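
The two orderings printed above can be replayed in a small standalone model to show why a lookup keyed by stageId is unstable while one keyed by shuffleId is not. This is only a sketch: StageInfo, createStages, reducersByShuffle and reducersFor are hypothetical names, the reducer counts are made up, and none of this is Spark's or the test suite's actual code. It assumes the test needs some per-stage value (here, a reducer count) when it builds a MapStatus.

```scala
// Hypothetical, simplified model; these are not Spark's classes.
object ShuffleKeyedOutputSketch {
  // stageId is assigned when the stage is created;
  // shuffleId is already fixed before the job is submitted.
  final case class StageInfo(stageId: Int, shuffleId: Int)

  // Assign stage ids in whatever order the shuffle map stages happen to be created.
  def createStages(shuffleCreationOrder: Seq[Int]): Seq[StageInfo] =
    shuffleCreationOrder.zipWithIndex.map { case (shuffleId, stageId) =>
      StageInfo(stageId, shuffleId)
    }

  // Made-up reducer counts per shuffle, keyed by the stable shuffleId.
  val reducersByShuffle: Map[Int, Int] = Map(0 -> 10, 1 -> 5, 2 -> 7, 3 -> 3)

  // Lookup by shuffleId, so the result does not depend on creation order.
  def reducersFor(stage: StageInfo): Int = reducersByShuffle(stage.shuffleId)

  def main(args: Array[String]): Unit = {
    // The two creation orders reported in the issue.
    val run1 = createStages(Seq(0, 2, 1, 3))
    val run2 = createStages(Seq(1, 3, 0, 2))

    // The same shuffle ends up with a different stageId in each run.
    println(run1.find(_.shuffleId == 2).map(_.stageId))
    println(run2.find(_.shuffleId == 2).map(_.stageId))

    // Keyed by shuffleId, both runs agree on every reducer count.
    val byShuffle1 = run1.map(s => s.shuffleId -> reducersFor(s)).toMap
    val byShuffle2 = run2.map(s => s.shuffleId -> reducersFor(s)).toMap
    println(byShuffle1 == byShuffle2)
  }
}
```

In this model, a table indexed by stageId would hand shuffle 2 the entry for stage 1 in one run and the entry for stage 3 in the other, which is the kind of mismatch the issue describes.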