Re: Large number of pyspark.daemon processes

2015-01-27 Thread Sven Krasser
After slimming down the job quite a bit, it looks like a call to coalesce() on a larger RDD can cause these Python worker spikes (additional details in Jira:
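
A minimal sketch of the job shape described above (a coalesce() on a larger RDD); the data size, partition counts, and app name are made up for illustration and are not taken from the reporter's actual job:

    from pyspark import SparkContext

    sc = SparkContext(appName="coalesce-worker-spike-sketch")  # hypothetical app name

    # Hypothetical stand-in for the "larger RDD": many partitions of generated data.
    rdd = sc.parallelize(range(1000000), numSlices=2000)

    # Some per-record Python work, so Python workers are actually exercised.
    mapped = rdd.map(lambda x: (x % 1000, x * x))

    # The reported trigger: coalescing down to far fewer partitions.
    print(mapped.coalesce(20).count())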

Re: Large number of pyspark.daemon processes

2015-01-24 Thread Sven Krasser
Hey Davies, Sure thing, it's filed here now: https://issues.apache.org/jira/browse/SPARK-5395 As far as a repro goes, what is a normal number of workers I should expect? Even shortly after kicking the job off, I see workers in the double-digits per container. Here's an example using pstree on a
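
Not from the original mail, but for reference: one way to get a quick count of pyspark.daemon forks on a node, if pstree output is hard to eyeball, is to scan the process table (Linux-only sketch):

    import subprocess

    # Full command lines of all processes; pyspark.daemon workers are started
    # as "python -m pyspark.daemon", so the module name shows up in the args.
    out = subprocess.check_output(["ps", "-eo", "args"]).decode("utf-8", "replace")
    daemons = [line for line in out.splitlines() if "pyspark.daemon" in line]
    print("pyspark.daemon processes:", len(daemons))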

Large number of pyspark.daemon processes

2015-01-23 Thread Sven Krasser
Hey all, I am running into a problem where YARN kills containers for being over their memory allocation (which is about 8G for executors plus 6G for overhead), and I noticed that in those containers there are tons of pyspark.daemon processes hogging memory. Here's a snippet from a container with
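
For context, the allocation described (roughly 8G per executor plus 6G of overhead) would correspond to settings along these lines; the exact values and app name here are illustrative rather than the reporter's actual configuration:

    from pyspark import SparkConf, SparkContext

    # Roughly the split described above: an 8G executor heap plus 6G of YARN
    # overhead, which is the headroom the Python worker processes must fit into.
    conf = (SparkConf()
            .setAppName("example-yarn-job")  # hypothetical app name
            .set("spark.executor.memory", "8g")
            .set("spark.yarn.executor.memoryOverhead", "6144"))  # value in MB

    sc = SparkContext(conf=conf)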

Re: Large number of pyspark.daemon processes

2015-01-23 Thread Sandy Ryza
Hi Sven, What version of Spark are you running? Recent versions have a change that allows PySpark to share a pool of processes instead of starting a new one for each task. -Sandy

Re: Large number of pyspark.daemon processes

2015-01-23 Thread Adam Diaz
YARN only has the ability to kill, not to checkpoint or suspend via signal. If you use too much memory, it will simply kill tasks based on the YARN config. https://issues.apache.org/jira/browse/YARN-2172

Re: Large number of pyspark.daemon processes

2015-01-23 Thread Sven Krasser
Hey Adam, I'm not sure I understand just yet what you have in mind. My takeaway from the logs is that the container actually was above its allotment of about 14G. Since 6G of that is for overhead, I assumed there would be plenty of space for Python workers, but there seem to be more of those than

Re: Large number of pyspark.daemon processes

2015-01-23 Thread Davies Liu
It should be a bug; the Python worker did not exit normally. Could you file a JIRA for this? Also, could you show how to reproduce this behavior?

Re: Large number of pyspark.daemon processes

2015-01-23 Thread Sven Krasser
Hey Sandy, I'm using Spark 1.2.0. I assume you're referring to worker reuse? In this case I've already set spark.python.worker.reuse to false (but I also see this behavior when keeping it enabled). Best, -Sven
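
For reference, the worker-reuse setting mentioned above can be toggled in the application itself; a minimal sketch (app name made up), assuming it is not already set on the command line:

    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .setAppName("worker-reuse-test")  # hypothetical app name
            # "false" forks a fresh Python worker per task; "true" (the default
            # in Spark 1.2) keeps a pool of workers around for reuse.
            .set("spark.python.worker.reuse", "false"))

    sc = SparkContext(conf=conf)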