[
https://issues.apache.org/jira/browse/SPARK-31549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Apache Spark reassigned SPARK-31549:
------------------------------------
Assignee: Apache Spark
> Pyspark SparkContext.cancelJobGroup does not work correctly
> -----------------------------------------------------------
>
> Key: SPARK-31549
> URL: https://issues.apache.org/jira/browse/SPARK-31549
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 2.4.5, 3.0.0
> Reporter: Weichen Xu
> Assignee: Apache Spark
> Priority: Critical
>
> PySpark `SparkContext.cancelJobGroup` does not work correctly. This is a
> long-standing issue: PySpark threads are not pinned to JVM threads when
> invoking Java-side methods, so every PySpark API that relies on Java
> thread-local variables misbehaves (including `sc.setLocalProperty`,
> `sc.cancelJobGroup`, `sc.setJobDescription`, and so on).
> This is a serious issue. Spark 3.0 adds an experimental PySpark
> 'PIN_THREAD' mode that addresses it, but that mode has two problems:
> * It is disabled by default; an additional environment variable must be
> set to enable it.
> * It has a memory-leak issue that has not yet been addressed.
> A number of projects, such as hyperopt-spark and spark-joblib, rely on the
> `sc.cancelJobGroup` API to stop running jobs from their code, so it is
> critical to fix this issue and make it work under the default PySpark
> mode. An alternative approach is to implement methods like
> `rdd.setGroupAndCollect`.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)