GitHub user ericl opened a pull request:

    https://github.com/apache/spark/pull/17531

    [SPARK-20217][core] Executor should not fail stage if killed task throws 
non-interrupted exception

    ## What changes were proposed in this pull request?
    
    If tasks throw non-interrupted exceptions on kill (e.g. 
java.nio.channels.ClosedByInterruptException), their death is reported back as 
TaskFailed instead of TaskKilled. This causes stage failure in some cases.
    
    This is reproducible as follows. Run the snippet below, then use 
SparkContext.killTaskAttempt to kill one of the tasks. The entire stage will 
fail because the task threw a RuntimeException instead of an InterruptedException.
    
    We should probably unconditionally return TaskKilled instead of TaskFailed 
if the task was killed by the driver, regardless of the actual exception thrown.
    
    ```
    spark.range(100).repartition(100).foreach { i =>
      try {
        Thread.sleep(10000000)
      } catch {
        case t: InterruptedException =>
          throw new RuntimeException(t)
      }
    }
    ```
    Based on the code in TaskSetManager, I think this also affects kills of 
speculative tasks. However, since few tasks are speculated, and a task usually 
needs to fail a few times before the stage is cancelled, probably no one 
noticed this in production.
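    
    The proposal above can be sketched roughly as follows. This is a minimal
    illustration, not the code in Executor.scala: the object name, the
    `EndReason` types, the `killRequested` flag and both helper functions are
    hypothetical stand-ins, and the "before" rule is an assumption inferred
    from the behaviour described here.
    
    ```scala
    object KillClassification {
      // Simplified stand-ins for the end reasons named above; these are
      // NOT Spark's real classes.
      sealed trait EndReason
      case object Killed extends EndReason
      final case class Failed(cause: Throwable) extends EndReason

      // Assumed old behaviour: only an InterruptedException counts as a kill.
      def classifyBefore(error: Throwable, killRequested: Boolean): EndReason =
        error match {
          case _: InterruptedException if killRequested => Killed
          case other => Failed(other)
        }

      // Proposed behaviour: a driver-requested kill wins, whatever was thrown.
      def classifyAfter(error: Throwable, killRequested: Boolean): EndReason =
        if (killRequested) Killed else Failed(error)
    }
    ```
    
    With the wrapped RuntimeException from the repro and a kill requested,
    `classifyBefore` yields `Failed` while `classifyAfter` yields `Killed`.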
    
    ## How was this patch tested?
    
    Unit test. The test fails before the change in Executor.scala.
    
    cc @JoshRosen

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ericl/spark fix-task-interrupt

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17531.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17531
    
----
commit 8f6283a7c407d28c043523d91b8c3a24da0eff52
Author: Eric Liang <e...@databricks.com>
Date:   2017-04-04T23:52:51Z

    Tue Apr  4 16:52:51 PDT 2017

commit 9d59960626178acb68918f1fce1a4f85b0da7493
Author: Eric Liang <e...@databricks.com>
Date:   2017-04-05T00:04:06Z

    Tue Apr  4 17:04:06 PDT 2017

----


