[GitHub] spark pull request #21943: [SPARK-24795][Core][FOLLOWUP] Kill all running ta...

2018-08-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21943


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21943: [SPARK-24795][Core][FOLLOWUP] Kill all running ta...

2018-08-01 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21943#discussion_r207103696
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskScheduler.scala ---
@@ -51,7 +51,7 @@ private[spark] trait TaskScheduler {
   // Submit a sequence of tasks to run.
   def submitTasks(taskSet: TaskSet): Unit
 
-  // Cancel a stage.
+  // Kill all the tasks in a stage and fail the stage and all the jobs that depend on the stage.
--- End diff --

Updated the comment to note that if the backend doesn't support killing a 
task, the method throws UnsupportedOperationException.
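
A minimal sketch of that contract, using a toy model rather than Spark's own classes. `ToyBackend`, `KillableBackend`, and `ToyScheduler` are hypothetical names introduced only for illustration; the one behavior taken from this thread is that a backend that cannot kill tasks surfaces an `UnsupportedOperationException`.

```scala
import scala.collection.mutable

// Toy model, NOT Spark's classes: every name below is hypothetical.
trait ToyBackend {
  // Default: a backend that cannot kill tasks says so explicitly.
  def killTask(taskId: Long, interruptThread: Boolean, reason: String): Unit =
    throw new UnsupportedOperationException("this backend cannot kill tasks")
}

// A backend that supports killing; it records the task ids it was asked to kill.
class KillableBackend extends ToyBackend {
  val killed = mutable.ArrayBuffer.empty[Long]
  override def killTask(taskId: Long, interruptThread: Boolean, reason: String): Unit = {
    killed += taskId
    ()
  }
}

class ToyScheduler(backend: ToyBackend, runningByStage: Map[Int, Seq[Long]]) {
  // Asks the backend to kill every running task of the stage. If the backend
  // throws UnsupportedOperationException, it propagates to the caller unchanged.
  def killAllTaskAttempts(stageId: Int, interruptThread: Boolean, reason: String): Unit =
    runningByStage.getOrElse(stageId, Seq.empty).foreach { taskId =>
      backend.killTask(taskId, interruptThread, reason)
    }
}
```

In this sketch a caller on a backend without kill support has to catch the exception itself and decide how to proceed.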





[GitHub] spark pull request #21943: [SPARK-24795][Core][FOLLOWUP] Kill all running ta...

2018-08-01 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21943#discussion_r207098475
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskScheduler.scala ---
@@ -51,7 +51,7 @@ private[spark] trait TaskScheduler {
   // Submit a sequence of tasks to run.
   def submitTasks(taskSet: TaskSet): Unit
 
-  // Cancel a stage.
+  // Kill all the tasks in a stage and fail the stage and all the jobs that depend on the stage.
--- End diff --

Is this guaranteed to work for every backend, such as YARN, Mesos, and K8s?





[GitHub] spark pull request #21943: [SPARK-24795][Core][FOLLOWUP] Kill all running ta...

2018-08-01 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/21943#discussion_r206982253
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
@@ -252,6 +252,22 @@ private[spark] class TaskSchedulerImpl(
     }
   }
 
+  override def killAllTaskAttempts(
+      stageId: Int,
+      interruptThread: Boolean,
+      reason: String): Unit = synchronized {
+    logInfo(s"Killing all running tasks in stage $stageId: $reason")
+    taskSetsByStageIdAndAttempt.get(stageId).foreach { attempts =>
--- End diff --

This duplicates code from `cancelTasks` and drops its useful comments. It 
would be great to move the common code here, along with the comments, and 
have `cancelTasks` call this method.
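
A rough sketch of the suggested shape, assuming a simplified scheduler in which a task set is just a set of running task ids plus an `aborted` flag. `ToyTaskSet` and `ToyTaskScheduler` are hypothetical stand-ins, not the actual `TaskSchedulerImpl` code.

```scala
import scala.collection.mutable

// Hypothetical stand-in for one task set (stage attempt); not a Spark class.
class ToyTaskSet(val runningTasks: mutable.Set[Long]) {
  var aborted = false
}

class ToyTaskScheduler(taskSetsByStage: Map[Int, Seq[ToyTaskSet]]) {
  val killed = mutable.ArrayBuffer.empty[Long]

  // The shared logic lives here, together with its comments: kill every
  // running task in every attempt of the stage, but do not abort the task sets.
  def killAllTaskAttempts(stageId: Int, interruptThread: Boolean, reason: String): Unit =
    synchronized {
      taskSetsByStage.getOrElse(stageId, Seq.empty).foreach { taskSet =>
        killed ++= taskSet.runningTasks
        taskSet.runningTasks.clear()
      }
    }

  // cancelTasks reuses the kill loop, then additionally aborts the task sets.
  // In this toy model the abort is what stands for failing the stage and job.
  def cancelTasks(stageId: Int, interruptThread: Boolean): Unit = synchronized {
    killAllTaskAttempts(stageId, interruptThread, reason = "Stage cancelled")
    taskSetsByStage.getOrElse(stageId, Seq.empty).foreach { taskSet =>
      taskSet.aborted = true
    }
  }
}
```

The nested `synchronized` call is safe because JVM monitors are reentrant.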





[GitHub] spark pull request #21943: [SPARK-24795][Core][FOLLOWUP] Kill all running ta...

2018-08-01 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request:

https://github.com/apache/spark/pull/21943

[SPARK-24795][Core][FOLLOWUP] Kill all running tasks when a task in a 
barrier stage fail

## What changes were proposed in this pull request?

Kill all running tasks when a task in a barrier stage fails midway. 
`TaskScheduler.cancelTasks()` also fails the job, so we implemented a new 
method, `killAllTaskAttempts()`, that only kills all running tasks of a stage 
without cancelling the stage or job.
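
As a sketch of the intended behavior (a toy model, not the scheduler code in this patch): when one task in a barrier stage fails, the other running tasks of that stage are killed, and the job itself is not marked as failed. `BarrierStageSketch`, `Outcome`, and `onTaskFailed` are hypothetical names, and the non-barrier branch reflects the assumption that other running tasks are left alone there.

```scala
// Hypothetical model for illustration; none of these names exist in Spark.
object BarrierStageSketch {
  // What the scheduler decides after a task failure in this toy model.
  final case class Outcome(killed: Set[Long], jobFailed: Boolean)

  // running: ids of tasks currently running in the stage (including the failed one)
  // failed:  id of the task that just failed
  def onTaskFailed(running: Set[Long], failed: Long, isBarrier: Boolean): Outcome =
    if (isBarrier) {
      // Barrier stage: kill every other running task, but do not fail the job.
      Outcome(killed = running - failed, jobFailed = false)
    } else {
      // Assumed non-barrier behavior: the other running tasks keep going.
      Outcome(killed = Set.empty, jobFailed = false)
    }
}
```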

## How was this patch tested?

To be added.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jiangxb1987/spark killAllTasks

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21943.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21943


commit e4752c520f602e1d31aa5a51bc08a1b738b8aebb
Author: Xingbo Jiang 
Date:   2018-08-01T14:34:03Z

kill all task attempts for a stage without fail the entire job.

commit 97bfba8b0d4e464690d89f743a68b03a92f8c3e7
Author: Xingbo Jiang 
Date:   2018-08-01T14:45:27Z

update comments



