Josh Rosen created SPARK-2764:
---------------------------------

             Summary: Simplify process structure of PySpark daemon / worker launching process
                 Key: SPARK-2764
                 URL: https://issues.apache.org/jira/browse/SPARK-2764
             Project: Spark
          Issue Type: Improvement
          Components: PySpark
            Reporter: Josh Rosen


PySpark's daemon-based worker factory has a very complicated process structure 
that I've always found confusing.  The per-java-worker daemon.py process 
launches a numCores-sized pool of subprocesses, and those subprocesses in turn 
launch the actual worker processes that process data.

I think we can simplify this by having daemon.py launch the workers directly 
without this extra layer of indirection.  See my comments on the pull request 
that introduced daemon.py: https://github.com/mesos/spark/pull/563
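
As a rough illustration of the proposed structure, here is a minimal sketch of a
daemon that forks each worker directly, with no intermediate pool of
subprocesses. This is not the actual daemon.py code: handle_task, launch_worker
and run_daemon are hypothetical names, a pipe stands in for the real
worker/JVM communication, and it assumes a Unix-like system where os.fork is
available.

```python
import os


def handle_task(payload):
    # Hypothetical stand-in for the data processing a real worker would do.
    return payload.upper()


def launch_worker(payload):
    """Fork one worker directly from the daemon; return (pid, read_fd)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this process is the worker itself, one fork away from the daemon.
        status = 1
        try:
            os.close(read_fd)
            os.write(write_fd, handle_task(payload).encode("utf-8"))
            os.close(write_fd)
            status = 0
        finally:
            os._exit(status)  # never fall back into the daemon's code path
    # Parent (daemon): keep only the read end of the pipe.
    os.close(write_fd)
    return pid, read_fd


def run_daemon(payloads):
    """Launch one worker per payload and collect their outputs in order."""
    workers = [launch_worker(p) for p in payloads]
    results = []
    for pid, read_fd in workers:
        with os.fdopen(read_fd, "rb") as pipe:
            results.append(pipe.read().decode("utf-8"))
        os.waitpid(pid, 0)  # reap the worker so it does not linger as a zombie
    return results
```

In this sketch the process tree is only two levels deep (daemon, then workers),
which is the simplification the issue asks for; how the real daemon would
accept requests and hand off sockets is left out.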



--
This message was sent by Atlassian JIRA
(v6.2#6252)
