GitHub user JoshRosen opened a pull request:

    https://github.com/apache/spark/pull/16340

    [SPARK-18928] Check TaskContext.isInterrupted() in FileScanRDD, JDBCRDD & 
UnsafeSorter

    ## What changes were proposed in this pull request?
    
    In order to respond to task cancellation, Spark tasks must periodically 
check `TaskContext.isInterrupted()`, but this check is missing on a few 
critical read paths used in Spark SQL, including `FileScanRDD`, `JDBCRDD`, and 
`UnsafeSorter`-based sorts. This can cause interrupted or cancelled tasks to 
continue running and become zombies (as also described in #16189).
    
    This patch aims to fix this problem by adding `TaskContext.isInterrupted()` 
checks to these paths. Note that I could have simply wrapped a bunch of 
iterators in `InterruptibleIterator`, but in some cases this would incur a 
performance penalty or might not be effective due to certain special uses of 
Iterators in Spark SQL. Instead, I inlined `InterruptibleIterator`-style logic 
into existing iterator subclasses.
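
To illustrate what inlining `InterruptibleIterator`-style logic could look like, 
here is a minimal, self-contained sketch. It is not the code from this patch: 
`FakeTaskContext`, `TaskKilledSketchException`, and `RecordIterator` are 
hypothetical stand-ins invented for the example, and only the idea of checking 
an `isInterrupted()` flag from inside an existing iterator is taken from the 
description above.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicBoolean;

public class InterruptibleIteratorSketch {

    /** Hypothetical stand-in for a task's cancellation flag (not Spark's TaskContext). */
    static final class FakeTaskContext {
        private final AtomicBoolean interrupted = new AtomicBoolean(false);

        boolean isInterrupted() {
            return interrupted.get();
        }

        void markInterrupted() {
            interrupted.set(true);
        }
    }

    /** Hypothetical exception used by this sketch to signal cancellation. */
    static final class TaskKilledSketchException extends RuntimeException {
        TaskKilledSketchException() {
            super("task was cancelled");
        }
    }

    /**
     * An existing iterator subclass with the cancellation check inlined into
     * hasNext(), rather than being wrapped in a separate iterator object.
     */
    static final class RecordIterator<T> implements Iterator<T> {
        private final Iterator<T> underlying;
        private final FakeTaskContext context; // may be null outside of a task

        RecordIterator(Iterator<T> underlying, FakeTaskContext context) {
            this.underlying = underlying;
            this.context = context;
        }

        @Override
        public boolean hasNext() {
            // Stop promptly once the task has been marked as interrupted.
            if (context != null && context.isInterrupted()) {
                throw new TaskKilledSketchException();
            }
            return underlying.hasNext();
        }

        @Override
        public T next() {
            return underlying.next();
        }
    }

    public static void main(String[] args) {
        FakeTaskContext ctx = new FakeTaskContext();
        Iterator<Integer> records =
            new RecordIterator<>(Arrays.asList(1, 2, 3, 4, 5).iterator(), ctx);
        int consumed = 0;
        try {
            while (records.hasNext()) {
                records.next();
                consumed++;
                if (consumed == 2) {
                    ctx.markInterrupted(); // simulate a cancellation request
                }
            }
        } catch (TaskKilledSketchException e) {
            System.out.println("stopped after " + consumed + " records");
        }
    }
}
```

In this sketch the check runs on every `hasNext()` call, so the consumer stops 
at the next record boundary after the flag is set instead of draining the whole 
input.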
    
    ## How was this patch tested?
    
    Tested manually in `spark-shell` with two different reproductions of 
non-cancellable tasks, one involving scans of huge files and another involving 
sort-merge joins that spill to disk. Both causes of zombie tasks are fixed by 
the changes added here.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JoshRosen/spark sql-task-interruption

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16340.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16340
    
----
commit 08f9aa5a1dde58cf45f75b67ee88a9b978c3486c
Author: Josh Rosen <[email protected]>
Date:   2016-12-19T19:27:26Z

    Fix FileScanRDD interruption.

commit 2d43d5a501a162384d45eedf188ffed178e96351
Author: Josh Rosen <[email protected]>
Date:   2016-12-19T19:28:40Z

    Fix for JDBCRDD interruption.

commit 236efe580b52ed7ec6ce9e36f9b814147ebad99d
Author: Josh Rosen <[email protected]>
Date:   2016-12-19T21:22:07Z

    Make UnsafeSorterIterator interruptible.

----

