[jira] [Commented] (SPARK-1740) Pyspark cancellation kills unrelated pyspark workers

Josh Rosen (JIRA) Thu, 26 Jun 2014 11:23:57 -0700

    [ https://issues.apache.org/jira/browse/SPARK-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044981#comment-14044981 ]


Josh Rosen commented on SPARK-1740:
-----------------------------------

The "Python daemon -> multiple workers" architecture was motivated by the high 
cost of forking a JVM with a large heap. This issue could therefore also cause 
performance problems: destroying the daemon forces us to re-launch it, and 
that re-launch may happen after a huge amount of data has been cached in the 
Spark worker JVM.
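
To make the difference concrete, here is a minimal toy model of the two
cancellation strategies. It is only a sketch: the Daemon class and its method
names are hypothetical, it starts no real processes, and it is not Spark's
actual implementation. It just counts which workers die and how often the
daemon would have to be re-launched from the JVM.

```python
class Daemon:
    """Toy stand-in for the Python daemon (hypothetical, not a Spark class)."""

    def __init__(self):
        self.workers = {}   # task id -> True while that worker is alive
        self.launches = 1   # times the daemon was launched from the JVM

    def fork_worker(self, task_id):
        # In the real design the daemon forks a cheap Python worker here.
        self.workers[task_id] = True

    def destroy_daemon(self):
        # Cancellation as described in this issue: every worker dies, and
        # the daemon must be re-launched (an expensive fork of the JVM).
        killed = sorted(self.workers)
        self.workers.clear()
        self.launches += 1
        return killed

    def destroy_worker(self, task_id):
        # Narrower alternative: only the cancelled task's worker dies.
        return [task_id] if self.workers.pop(task_id, None) else []
```

With workers for two unrelated jobs, destroy_daemon() returns both task ids
and increments launches, while destroy_worker("job-a") leaves the other
worker alive and the launch count unchanged.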

> Pyspark cancellation kills unrelated pyspark workers
> ----------------------------------------------------
>
>                 Key: SPARK-1740
>                 URL: https://issues.apache.org/jira/browse/SPARK-1740
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.0.0
>            Reporter: Aaron Davidson
>            Priority: Critical
>
> PySpark cancellation calls SparkEnv#destroyPythonWorker. Since there is one 
> Python worker per process, this would seem like a sensible thing to do. 
> Unfortunately, this method actually destroys a Python daemon and all of its 
> associated workers, which generally means that we can cause failures in 
> unrelated PySpark jobs.
> The severity of this bug is limited by the fact that the PySpark daemon is 
> easily recreated, so the tasks will succeed after being restarted.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
