juliuszsompolski opened a new pull request, #41005:
URL: https://github.com/apache/spark/pull/41005
### What changes were proposed in this pull request?
Currently, queries that are run using Spark Connect cannot be interrupted.
Even when the RPC connection is broken, the Spark jobs on the server keep
running.
This PR proposes a
```
rpc Interrupt(InterruptRequest) returns (InterruptResponse) {}
```
server RPC, which the client can call via `SparkSession.interruptAll()` to
interrupt all Spark jobs actively running for ExecutePlan executions. In most
user scenarios a SparkSession is not used for multiple concurrent executions
but sequentially, so `interruptAll()` should cover the bulk of user needs. It
can also be used for cleanup.
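To illustrate the intended client-side usage, here is a minimal sketch on the Scala client. It assumes `spark` is an existing Spark Connect `SparkSession`; the query and the timing are illustrative only, while `interruptAll()` is the API this PR adds.
```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// A long-running action launched from another thread of the same session.
val result = Future {
  spark.range(10000000000L).filter("id % 2 == 0").count()
}

Thread.sleep(1000)    // let the query start running Spark jobs on the server
spark.interruptAll()  // sends the Interrupt RPC; the running jobs are cancelled
                      // and `result` then completes with the cancellation error
```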
To keep track of executions, this PR introduces `ExecutionHolder` to hold the
execution state, and makes `SessionHolder` keep track of the executions
currently running in the session. In this first PR, the interrupt only cancels
running Spark jobs, so it is best effort to a degree: it will not interrupt
commands that do not run Spark jobs, and it will not interrupt anything if no
Spark job is running at the moment the server receives the interrupt; in that
case the command keeps running and may launch more Spark jobs later.
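For concreteness, here is a rough sketch of how this tracking and best-effort interrupt could look on the server. `ExecutionHolder` and `SessionHolder` are the classes the PR introduces, but the method names, the use of Spark job groups, and all bodies below are assumptions for illustration, not the PR's actual code.
```scala
import java.util.concurrent.ConcurrentHashMap
import scala.collection.JavaConverters._
import org.apache.spark.sql.SparkSession

class ExecutionHolder(val operationId: String, session: SparkSession) {
  private val jobGroupId = s"spark-connect-execute-$operationId"

  // Run the execution body with all of its Spark jobs attached to this group.
  def withJobGroup[T](body: => T): T = {
    val sc = session.sparkContext
    sc.setJobGroup(jobGroupId, s"Spark Connect execution $operationId",
      interruptOnCancel = true)
    try body finally sc.clearJobGroup()
  }

  // Best effort: cancels Spark jobs already running under this group; it does
  // not stop driver-side work that is not running a job right now.
  def interrupt(): Unit = session.sparkContext.cancelJobGroup(jobGroupId)
}

class SessionHolder(val session: SparkSession) {
  private val executions = new ConcurrentHashMap[String, ExecutionHolder]()

  def addExecution(e: ExecutionHolder): Unit = executions.put(e.operationId, e)
  def removeExecution(operationId: String): Unit = executions.remove(operationId)

  // Backs the Interrupt RPC: interrupt every execution tracked in this session.
  def interruptAll(): Unit = executions.values().asScala.foreach(_.interrupt())
}
```
Keying the cancellation scope by an operation id is also what would make a later, selective interrupt of a single execution straightforward.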
Future work that I plan to design and implement:
* Implement this in the Python client (TODO: will be added to this PR).
* Interrupting any execution. This will involve moving the execution off the
gRPC handler thread that handles ExecutePlan and launching it in a separate
thread that can be interrupted, tracked via `ExecutionHolder` (see the sketch
after this list).
* Interrupting executions selectively. This will involve exposing the
operationId to the user.
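As a sketch of the second bullet, under the assumption that the future design simply runs the ExecutePlan body on its own thread (all names below are hypothetical):
```scala
// Run the execution on a dedicated thread so an Interrupt RPC can stop it
// even when no Spark job is currently running (e.g. driver-side work).
class InterruptibleExecuteThread(operationId: String)(body: => Unit) {
  @volatile private var interrupted = false

  private val thread = new Thread(s"spark-connect-execute-$operationId") {
    override def run(): Unit =
      try body catch {
        case _: InterruptedException if interrupted => // stopped by the user
      }
  }

  def start(): Unit = thread.start()
  def join(): Unit = thread.join()

  // Interrupts driver-side work too, not just the running Spark jobs.
  def interrupt(): Unit = {
    interrupted = true
    thread.interrupt()
  }
}
```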
### Why are the changes needed?
We need APIs to interrupt running queries in Spark Connect.
### Does this PR introduce _any_ user-facing change?
Yes.
Users of Spark Connect can now call `interruptAll()` on the client
`SparkSession` object to send an interrupt RPC to the server, which interrupts
the currently running queries.
Follow-up PRs will extend this to the Python client and to interrupting more
than just running Spark jobs.
### How was this patch tested?
Added E2E tests to `ClientE2ETestSuite`.
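For reference, such an E2E test could look roughly like the sketch below (ScalaTest style, assuming the suite provides a remote `spark` session; this is not the suite's actual test body):
```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

test("interruptAll cancels running Spark jobs") {
  val query = Future {
    // Assumed to fail with an error once its Spark jobs are cancelled.
    intercept[Exception] {
      spark.range(10000000000L).filter("id % 2 == 0").count()
    }
  }
  Thread.sleep(2000)   // give the query time to start jobs on the server
  spark.interruptAll()
  Await.result(query, 1.minute)   // the query should fail promptly
}
```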