Github user tgravescs commented on a diff in the pull request:
https://github.com/apache/spark/pull/11916#discussion_r57369488
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -620,6 +620,14 @@ private[spark] class TaskSetManager(
// Note: "result.value()" only deserializes the value when it's called at the first time, so
// here "result.value()" just returns the value and won't block other threads.
sched.dagScheduler.taskEnded(tasks(index), Success, result.value(), result.accumUpdates, info)
+ // Kill other task attempts if any as the one attempt succeeded
+ for (attemptInfo <- taskAttempts(index) if attemptInfo.attemptNumber != info.attemptNumber
--- End diff --
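For context, here is a rough, self-contained sketch of what the kill-other-attempts loop above presumably does once one attempt succeeds. AttemptInfo and the killTask callback are stand-ins for illustration only; the real TaskSetManager state and the SchedulerBackend kill call in the patch may differ:

object KillOtherAttemptsSketch {
  // Stand-in for the per-attempt bookkeeping kept by TaskSetManager.
  case class AttemptInfo(taskId: Long, executorId: String, attemptNumber: Int)

  // Once the `succeeded` attempt finishes, ask the backend to kill every other
  // attempt of the same task. killTask(taskId, executorId, interruptThread) is a
  // placeholder for the scheduler backend's kill call.
  def killOtherAttempts(
      attempts: Seq[AttemptInfo],
      succeeded: AttemptInfo,
      killTask: (Long, String, Boolean) => Unit): Unit = {
    for (attempt <- attempts if attempt.attemptNumber != succeeded.attemptNumber) {
      killTask(attempt.taskId, attempt.executorId, true)
    }
  }
}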
So I'll try the patch out, but I'm pretty sure it will still show multiple succeeded tasks that were speculative.
In SparkHadoopMapRedUtil.commitTask there is this check:
if (committer.needsTaskCommit(mrTaskContext)) {
  ...
} else {
  // Some other attempt committed the output, so we do nothing and signal success
  logInfo(s"No need to commit output of task because needsTaskCommit=false: $mrTaskAttemptID")
}
So if another task attempt commits first and then the second, speculative attempt tries to commit, it's simply going to log this message and send the task-finished event back to the driver, and the driver is going to treat that as a success.
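To make the race concrete, here is a minimal standalone sketch (not the actual Spark or Hadoop code; needsTaskCommit is simulated) of how two attempts of the same task can both report success even though only one of them actually commits:

import java.util.concurrent.atomic.AtomicBoolean

object CommitRaceSketch {
  // Simulates OutputCommitter.needsTaskCommit: only the first attempt to ask
  // gets to commit; every later attempt sees "false".
  private val committed = new AtomicBoolean(false)
  private def needsTaskCommit(): Boolean = committed.compareAndSet(false, true)

  def runAttempt(attemptNumber: Int): String =
    if (needsTaskCommit()) {
      s"attempt $attemptNumber committed the output"
    } else {
      // Nothing is committed here, but the attempt still finishes normally,
      // so the driver records it as a success as well.
      s"attempt $attemptNumber skipped the commit but still reported success"
    }

  def main(args: Array[String]): Unit = {
    println(runAttempt(0)) // original attempt
    println(runAttempt(1)) // speculative attempt
  }
}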
If your intention is just to solve the issue of killing the other task attempts, perhaps move this PR to be for https://issues.apache.org/jira/browse/SPARK-10530 and leave SPARK-13343 open.