GitHub user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11916#discussion_r57316516
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
    @@ -620,6 +620,14 @@ private[spark] class TaskSetManager(
         // Note: "result.value()" only deserializes the value when it's called at the first time, so
         // here "result.value()" just returns the value and won't block other threads.
         sched.dagScheduler.taskEnded(tasks(index), Success, result.value(), result.accumUpdates, info)
    +    // Kill other task attempts if any as the one attempt succeeded
    +    for (attemptInfo <- taskAttempts(index) if attemptInfo.attemptNumber != info.attemptNumber
    --- End diff ---
    
    I've only taken a quick look at this PR, but I think there is a race here: if the speculative task finishes before this kill event actually goes out, it will still be marked as a success. Please correct me if I'm wrong.
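
    To make the concern concrete, here is a minimal, self-contained sketch (plain Scala with hypothetical names like RaceGuardSketch, not the actual TaskSetManager internals) of the kind of guard I'd expect: once one attempt has been recorded as successful, a finished event from a sibling attempt is logged and ignored instead of being counted as a second success.

        // Minimal sketch, not Spark code: guard against a speculative attempt
        // finishing after a sibling already succeeded but before the kill landed.
        class RaceGuardSketch(numTasks: Int) {
          private val successful = Array.fill(numTasks)(false)

          def handleSuccessfulTask(index: Int, attemptNumber: Int): Unit = synchronized {
            if (successful(index)) {
              // The kill request lost the race; drop the duplicate success.
              println(s"Ignoring finished event for task $index, attempt $attemptNumber: already successful")
            } else {
              successful(index) = true
              println(s"Task $index succeeded via attempt $attemptNumber; would kill other attempts here")
            }
          }
        }

        // Two attempts of the same task race to finish; only the first counts.
        object RaceDemo extends App {
          val mgr = new RaceGuardSketch(numTasks = 1)
          mgr.handleSuccessfulTask(index = 0, attemptNumber = 0)
          mgr.handleSuccessfulTask(index = 0, attemptNumber = 1) // arrives after the kill was sent
        }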
    
    SparkHadoopMapRedUtil.commitTask can just log if something is already committed (sometimes it gets a CommitDeniedException instead, depending on timing). At one point I experimented with changing that case to a commit denied, but other issues came up and I didn't have time to finish investigating.
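
    For reference, here is a rough sketch of that first-committer-wins arbitration (illustrative only: CommitDeniedException is a local stand-in and the names are hypothetical; in Spark the actual decision is coordinated on the driver). A repeat commit from the winning attempt is just logged, while a different attempt gets denied.

        import scala.collection.mutable

        // Local stand-in for the real exception type; illustrative only.
        final class CommitDeniedException(msg: String) extends Exception(msg)

        class CommitCoordinatorSketch {
          // partition -> attempt number that won the right to commit
          private val authorized = mutable.Map.empty[Int, Int]

          def commitTask(partition: Int, attemptNumber: Int): Unit = synchronized {
            authorized.get(partition) match {
              case Some(winner) if winner == attemptNumber =>
                // Already committed by this attempt: just log and move on.
                println(s"Partition $partition already committed by attempt $attemptNumber; logging only")
              case Some(winner) =>
                // A different attempt won; depending on timing, this is what
                // surfaces as a CommitDeniedException.
                throw new CommitDeniedException(
                  s"Partition $partition committed by attempt $winner; denying attempt $attemptNumber")
              case None =>
                authorized(partition) = attemptNumber
                println(s"Attempt $attemptNumber authorized to commit partition $partition")
            }
          }
        }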

