Github user devaraj-kavali commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11916#discussion_r57340394
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
    @@ -620,6 +620,14 @@ private[spark] class TaskSetManager(
         // Note: "result.value()" only deserializes the value when it's called at the first time, so
         // here "result.value()" just returns the value and won't block other threads.
         sched.dagScheduler.taskEnded(tasks(index), Success, result.value(), result.accumUpdates, info)
    +    // Kill other task attempts if any as the one attempt succeeded
    +    for (attemptInfo <- taskAttempts(index) if attemptInfo.attemptNumber != info.attemptNumber
    --- End diff --
    
    Thanks @tgravescs for the comment.
    
    If one attempt has actually completed (succeeded) but its success event has
    not yet reached this point, and another attempt tries to commit its output in
    the meantime, SparkHadoopMapRedUtil.commitTask would prevent it from doing so.
    The other case is a task attempt that completes in the Executor before it
    receives the kill signal from TaskSetManager.handleSuccessfulTask: the
    Executor then ignores the kill request, so there is no problem. I don't see a
    case where two attempts both succeed when the task attempts use commit
    coordination. Please help me understand if there is one.
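
    To make the first case concrete, here is a minimal sketch of the
    "first attempt to ask wins" idea behind commit coordination. It is a toy
    model only: ToyCommitCoordinator and its canCommit method are hypothetical
    names made up for illustration, not the code that
    SparkHadoopMapRedUtil.commitTask actually runs.

```scala
import scala.collection.mutable

// Hypothetical toy coordinator (an assumption for illustration, not Spark's class).
final class ToyCommitCoordinator {
  // partition id -> attempt number that was authorized to commit
  private val authorized = mutable.Map.empty[Int, Int]

  // The first attempt that asks for a partition is authorized; any other
  // attempt of the same partition is refused from then on.
  def canCommit(partition: Int, attemptNumber: Int): Boolean = synchronized {
    authorized.get(partition) match {
      case Some(winner) => winner == attemptNumber
      case None =>
        authorized(partition) = attemptNumber
        true
    }
  }
}
```

    Under this model a second attempt of the same partition is always refused,
    which is why two attempts should not both commit their output.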
    
    The major issue here is that other task attempts keep running and do not
    release their executor threads even after an attempt of the same task has
    already succeeded. Sometimes these unnecessary attempts keep running until the
    job/application completes (if the worker nodes running them are very slow),
    which degrades application performance.
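
    The selection of attempts to kill can be sketched roughly as follows. This
    is a simplified stand-in, not the patch itself: AttemptInfo,
    SpeculationCleanup and attemptsToKill are hypothetical names, and the
    extra "still running" check is an assumption, since the loop condition in
    the quoted diff is cut off.

```scala
// Hypothetical, simplified stand-in for the scheduler's per-attempt bookkeeping.
final case class AttemptInfo(taskId: Long, attemptNumber: Int, running: Boolean)

object SpeculationCleanup {
  // Once `succeeded` has finished, every other attempt of the same task that
  // is still running only occupies an executor thread, so it is selected for
  // killing. The `running` filter is an assumption, not taken from the diff.
  def attemptsToKill(attempts: Seq[AttemptInfo], succeeded: AttemptInfo): Seq[AttemptInfo] =
    attempts.filter(a => a.attemptNumber != succeeded.attemptNumber && a.running)
}
```

    If a selected attempt finishes on its own before the kill arrives, the
    kill request is simply ignored, as described above.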


