pgandhi999 commented on a change in pull request #22806: [SPARK-25250][CORE] :
On successful completion of a task attempt on a parti…
URL: https://github.com/apache/spark/pull/22806#discussion_r244770628
##########
File path: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala
##########
@@ -286,6 +286,29 @@ private[spark] class TaskSchedulerImpl(
}
}
+ /**
+ * SPARK-25250: Whenever any Result Task gets successfully completed, we simply mark the
+ * corresponding partition id as completed in all attempts for that particular stage. As a
+ * result, we do not see any Killed tasks due to TaskCommitDenied Exceptions showing up
+ * in the UI.
+ */
+ override def markPartitionIdAsCompletedAndKillCorrespondingTaskAttempts(
Review comment:
As far as I understand the code, `killAllTaskAttempts` kills every running
task of a particular stage, whereas
`markPartitionIdAsCompletedAndKillCorrespondingTaskAttempts` kills only the
running tasks, across all attempts of that stage, that are working on a
partition already marked as completed by an earlier task for that partition.
The logic therefore differs between the two cases, but we could refactor the
code into a single method that handles both. Let me know what you think!
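
To make the difference concrete, here is a rough, self-contained sketch of what a single selection method might look like. It is not the code in this PR or in `TaskSchedulerImpl`: `RunningTask`, `KillSelection`, and `tasksToKill` are hypothetical names, the task model is heavily simplified, and I am assuming the partition-based kill is scoped to one stage, as the Scaladoc in the diff describes.

```scala
// Hypothetical, simplified stand-in for a running task attempt.
// This is NOT Spark's TaskInfo or TaskSetManager state.
final case class RunningTask(
    taskId: Long,
    stageId: Int,
    stageAttemptId: Int,
    partitionId: Int)

object KillSelection {
  // Hypothetical unified selector (an assumption, not an existing Spark API).
  // partitionId = None    => every running task of the stage, in any attempt
  //                          (the killAllTaskAttempts-style case).
  // partitionId = Some(p) => only the stage's running tasks on partition p,
  //                          in any attempt (the completed-partition case).
  def tasksToKill(
      running: Seq[RunningTask],
      stageId: Int,
      partitionId: Option[Int] = None): Seq[RunningTask] =
    running.filter { t =>
      t.stageId == stageId && partitionId.forall(_ == t.partitionId)
    }
}
```

With something along these lines, the two existing entry points could both delegate to one selector and differ only in whether they pass a partition id. Whether that is cleaner than keeping two methods is open for discussion.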