[ 
https://issues.apache.org/jira/browse/LIVY-533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Björn Lohrmann updated LIVY-533:
--------------------------------
    Description: 
Running stages of Spark jobs submitted via Livy's programmatic API cannot 
always be cancelled successfully.

The current implementation of JobWrapper.cancel() interrupts the worker thread 
on the Spark driver (via Future.cancel(true)):

[https://github.com/apache/incubator-livy/blob/4cfb6bcb8fb9ac6b2d6c8b3d04b20f647b507e1f/rsc/src/main/java/org/apache/livy/rsc/driver/JobWrapper.java#L84]

This does not always cancel all activity in Spark, e.g. long-running stages may 
remain unaffected.

The Spark way of cancelling jobs seems to be 
SparkContext.setJobGroup()/cancelJobGroup(), which Livy's REPL Session also 
uses:

[https://github.com/apache/incubator-livy/blob/4cfb6bcb8fb9ac6b2d6c8b3d04b20f647b507e1f/repl/src/main/scala/org/apache/livy/repl/Session.scala#L164]

I have opened a PR that invokes setJobGroup()/cancelJobGroup() in addition to 
interrupting the worker thread running on the driver:

[https://github.com/apache/incubator-livy/pull/128]
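
For illustration, the combined approach could look roughly like the sketch 
below. This is not the code from the PR. JobGroupContext and CancellableJob are 
hypothetical names; JobGroupContext only stands in for the two SparkContext 
methods mentioned above so that the example compiles without Spark on the 
classpath.

```java
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical stand-in for the two SparkContext methods named in this issue. */
interface JobGroupContext {
  void setJobGroup(String groupId, String description, boolean interruptOnCancel);
  void cancelJobGroup(String groupId);
}

/** Sketch: cancel the Spark job group in addition to interrupting the worker thread. */
final class CancellableJob<T> {
  private final JobGroupContext sc;
  private final Callable<T> job;
  // One group id per submitted job (assumption: a random UUID is unique enough).
  private final String groupId = UUID.randomUUID().toString();
  private volatile Future<T> future;

  CancellableJob(JobGroupContext sc, Callable<T> job) {
    this.sc = sc;
    this.job = job;
  }

  String groupId() {
    return groupId;
  }

  Future<T> submit(ExecutorService executor) {
    future = executor.submit(() -> {
      // Spark associates the job group with the calling thread, so the group
      // is set on the worker thread that actually runs the job.
      sc.setJobGroup(groupId, "Job group " + groupId, true);
      return job.call();
    });
    return future;
  }

  boolean cancel() {
    // Ask Spark to cancel every job (and its stages) tagged with this group.
    sc.cancelJobGroup(groupId);
    // Still interrupt the worker thread on the driver, as before.
    Future<T> f = future;
    return f != null && f.cancel(true);
  }
}
```

With a real SparkContext in place of JobGroupContext, cancel() would request 
cancellation of the running stages as well as interrupt the driver-side thread.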

 

  was:
Running stages of Spark jobs submitted via Livy's programmatic API cannot 
always be cancelled successfully.

The current implementation of JobWrapper.cancel() interrupts the worker thread 
on the Spark driver (via Future.cancel(true)):

[https://github.com/apache/incubator-livy/blob/4cfb6bcb8fb9ac6b2d6c8b3d04b20f647b507e1f/rsc/src/main/java/org/apache/livy/rsc/driver/JobWrapper.java#L84]

This does not always cancel all activity in Spark, e.g. long-running stages may 
remain unaffected.

The Spark way of cancelling jobs seems to be 
SparkContext.setJobGroup()/cancelJobGroup(), which Livy's REPL Session also 
uses:

[https://github.com/apache/incubator-livy/blob/4cfb6bcb8fb9ac6b2d6c8b3d04b20f647b507e1f/repl/src/main/scala/org/apache/livy/repl/Session.scala#L164]


I have opened a PR that invokes setJobGroup()/cancelJobGroup() in addition to 
interrupting the worker thread running on the driver.


>  Spark jobs submitted via programmatic API cannot always be canceled 
> ---------------------------------------------------------------------
>
>                 Key: LIVY-533
>                 URL: https://issues.apache.org/jira/browse/LIVY-533
>             Project: Livy
>          Issue Type: Bug
>          Components: RSC
>    Affects Versions: 0.5.0
>            Reporter: Björn Lohrmann
>            Priority: Major
>              Labels: pull-request-available
>
> Running stages of Spark jobs submitted via Livy's programmatic API cannot 
> always be cancelled successfully.
> The current implementation of JobWrapper.cancel() interrupts the worker 
> thread on the Spark driver (via Future.cancel(true)):
> [https://github.com/apache/incubator-livy/blob/4cfb6bcb8fb9ac6b2d6c8b3d04b20f647b507e1f/rsc/src/main/java/org/apache/livy/rsc/driver/JobWrapper.java#L84]
> This does not always cancel all activity in Spark, e.g. long-running stages 
> may remain unaffected.
> The Spark way of cancelling jobs seems to be 
> SparkContext.setJobGroup()/cancelJobGroup(), which Livy's REPL Session also 
> uses:
> [https://github.com/apache/incubator-livy/blob/4cfb6bcb8fb9ac6b2d6c8b3d04b20f647b507e1f/repl/src/main/scala/org/apache/livy/repl/Session.scala#L164]
> I have opened a PR that invokes setJobGroup()/cancelJobGroup() in addition to 
> interrupting the worker thread running on the driver:
> [https://github.com/apache/incubator-livy/pull/128]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
