[GitHub] spark pull request #21577: [SPARK-24589][core] Correctly identify tasks in o...

cloud-fan Mon, 18 Jun 2018 21:21:01 -0700

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21577#discussion_r196290828
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/OutputCommitCoordinator.scala ---
    @@ -109,20 +116,21 @@ private[spark] class OutputCommitCoordinator(conf: SparkConf, isDriver: Boolean)
        * @param maxPartitionId the maximum partition id that could appear in this stage's tasks (i.e.
        *                       the maximum possible value of `context.partitionId`).
        */
    -  private[scheduler] def stageStart(stage: StageId, maxPartitionId: Int): Unit = synchronized {
    +  private[scheduler] def stageStart(stage: Int, maxPartitionId: Int): Unit = synchronized {
         stageStates(stage) = new StageState(maxPartitionId + 1)
    --- End diff --
    
    I checked the related code in `DAGScheduler`: if `T1_1.1` succeeds, the
    retried stage won't launch a task for this partition, because Spark tracks
    finished tasks for a job.
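
    To illustrate the point above, here is a minimal sketch of the idea, not
    the actual `DAGScheduler` code. `JobModel`, `markFinished`, and
    `missingPartitions` are hypothetical names made up for this example; the
    only assumption is that the job keeps one "finished" flag per partition
    and that a retried stage attempt launches tasks only for partitions
    whose flag is still unset.

```scala
// Hypothetical, simplified model of per-job completion tracking.
// This is NOT Spark's scheduler code; all names here are illustrative.
final class JobModel(numPartitions: Int) {
  // One flag per partition, set once any task attempt for it succeeds.
  private val finished = Array.fill(numPartitions)(false)

  def markFinished(partition: Int): Unit = {
    finished(partition) = true
  }

  // Partitions that a new stage attempt would still need tasks for.
  def missingPartitions: Seq[Int] =
    (0 until numPartitions).filter(p => !finished(p))
}

object RetryDemo {
  def main(args: Array[String]): Unit = {
    val job = new JobModel(numPartitions = 3)
    // A task from an earlier stage attempt succeeds for partition 1.
    job.markFinished(1)
    // The retried stage attempt launches tasks only for the remaining partitions.
    println(job.missingPartitions.mkString(", ")) // prints "0, 2"
  }
}
```

    Under this model, partition 1 never gets a second task in the retried
    stage, so no competing committer for that partition is created there.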


