Juliusz Sompolski created SPARK-43952:
-----------------------------------------
Summary: Cancel Spark jobs not only by a single "jobgroup", but
allow multiple "job tags"
Key: SPARK-43952
URL: https://issues.apache.org/jira/browse/SPARK-43952
Project: Spark
Issue Type: New Feature
Components: Spark Core
Affects Versions: 3.5.0
Reporter: Juliusz Sompolski
Currently, the only way to cancel a set of running Spark jobs is by calling
SparkContext.cancelJobGroup with a job group name that was previously set
using SparkContext.setJobGroup. This is problematic if multiple different parts
of the system want to do cancellation and each set their own IDs.
For example,
[https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala#L133]
sets its own job group, which may override the job group set by the user. As a
result, if the user cancels the job group they set, the broadcast jobs launched
from within their jobs are not cancelled.
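
The override problem can be illustrated with a small toy model. This is only a sketch and does not use Spark: the class ToyContext and its methods are hypothetical stand-ins for SparkContext.setJobGroup / SparkContext.cancelJobGroup, and it assumes a simplified view in which a job can carry only the single group that was active when it was submitted.

```python
class ToyContext:
    """Hypothetical stand-in (not Spark): holds ONE job group at a time."""

    def __init__(self):
        self.current_group = None  # single slot for the active job group
        self.running = {}          # job name -> group it was submitted under

    def set_job_group(self, group):
        # Overwrites whatever group was set before.
        self.current_group = group

    def submit(self, job):
        # A job records only the group that is active at submission time.
        self.running[job] = self.current_group

    def cancel_job_group(self, group):
        # Cancel (remove) every running job recorded under this group.
        cancelled = sorted(j for j, g in self.running.items() if g == group)
        for j in cancelled:
            del self.running[j]
        return cancelled


ctx = ToyContext()
ctx.set_job_group("user-group")
ctx.submit("user-query")

# An internal component sets its own group, replacing the user's group.
ctx.set_job_group("broadcast-group")
ctx.submit("broadcast-job")

cancelled = ctx.cancel_job_group("user-group")
remaining = sorted(ctx.running)
print(cancelled, remaining)
```

In this model, cancelling "user-group" leaves "broadcast-job" running, because that job only ever knew about the group set by the internal component.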
As a solution, consider adding SparkContext.addJobTag /
SparkContext.removeJobTag, which would allow multiple "tags" to be attached to
jobs, and introducing SparkContext.cancelJobsByTag to allow more flexible
cancellation of jobs.
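
A minimal sketch of the intended semantics, again as a toy model rather than Spark code: the class TaggedContext is hypothetical, and its method names only mirror the proposed addJobTag / removeJobTag / cancelJobsByTag. It assumes that a job is tagged with every tag active at submission time, so tags from different parts of the system accumulate instead of overriding each other.

```python
class TaggedContext:
    """Hypothetical stand-in (not Spark): jobs carry a SET of tags."""

    def __init__(self):
        self.active_tags = set()  # tags applied to newly submitted jobs
        self.running = {}         # job name -> frozenset of its tags

    def add_job_tag(self, tag):
        self.active_tags.add(tag)

    def remove_job_tag(self, tag):
        self.active_tags.discard(tag)

    def submit(self, job):
        # Snapshot the active tags so later changes do not affect this job.
        self.running[job] = frozenset(self.active_tags)

    def cancel_jobs_by_tag(self, tag):
        # Cancel (remove) every running job that carries the given tag.
        cancelled = sorted(j for j, tags in self.running.items() if tag in tags)
        for j in cancelled:
            del self.running[j]
        return cancelled


ctx = TaggedContext()
ctx.add_job_tag("user-request-1")
ctx.submit("user-query")

# An internal component adds its own tag without displacing the user's tag.
ctx.add_job_tag("broadcast")
ctx.submit("broadcast-job")
ctx.remove_job_tag("broadcast")

# Cancelling by the user's tag now also reaches the broadcast job.
cancelled = ctx.cancel_jobs_by_tag("user-request-1")
remaining = sorted(ctx.running)
print(cancelled, remaining)
```

Because "broadcast-job" carries both tags in this model, either party can cancel it by its own tag, which is the flexibility that a single job group cannot provide.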