Re: Large number of pyspark.daemon processes

2015-01-27 Thread Sven Krasser
After slimming down the job quite a bit, it looks like a call to coalesce()
on a larger RDD can cause these Python worker spikes (additional details in
Jira:
https://issues.apache.org/jira/browse/SPARK-5395?focusedCommentId=14294570&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14294570
).

Any ideas as to what coalesce() is doing that triggers the creation of
additional workers?
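
For reference, a stripped-down sketch of the kind of job that shows this
(the input path and partition counts here are placeholders; the actual
repro details are in the JIRA comment above):

from pyspark import SparkContext

sc = SparkContext(appName="coalesce-daemon-repro")

# Placeholder input and partition counts -- not the real job.
rdd = sc.textFile("hdfs:///some/large/input", minPartitions=2000)
pairs = rdd.map(lambda line: (hash(line) % 1000, 1))

# The coalesce() step is where the extra pyspark.daemon forks show up
# on the worker nodes.
print(pairs.coalesce(100).reduceByKey(lambda a, b: a + b).count())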

On Sat, Jan 24, 2015 at 12:27 AM, Sven Krasser kras...@gmail.com wrote:

 Hey Davies,

 Sure thing, it's filed here now:
 https://issues.apache.org/jira/browse/SPARK-5395

 As far as a repro goes, what is a normal number of workers I should
 expect? Even shortly after kicking the job off, I see workers in the
 double-digits per container. Here's an example using pstree on a worker
 node (the worker node runs 16 containers, i.e. the java--bash branches
 below). The initial Python process per container is forked between 7 and 33
 times.

  ├─java─┬─bash───java─┬─python───22*[python]
  │  │ └─89*[{java}]
  │  ├─bash───java─┬─python───13*[python]
  │  │ └─80*[{java}]
  │  ├─bash───java─┬─python───9*[python]
  │  │ └─78*[{java}]
  │  ├─bash───java─┬─python───24*[python]
  │  │ └─91*[{java}]
  │  ├─3*[bash───java─┬─python───19*[python]]
  │  │└─86*[{java}]]
  │  ├─bash───java─┬─python───25*[python]
  │  │ └─90*[{java}]
  │  ├─bash───java─┬─python───33*[python]
  │  │ └─98*[{java}]
  │  ├─bash───java─┬─python───7*[python]
  │  │ └─75*[{java}]
  │  ├─bash───java─┬─python───15*[python]
  │  │ └─80*[{java}]
  │  ├─bash───java─┬─python───18*[python]
  │  │ └─84*[{java}]
  │  ├─bash───java─┬─python───10*[python]
  │  │ └─85*[{java}]
  │  ├─bash───java─┬─python───26*[python]
  │  │ └─90*[{java}]
  │  ├─bash───java─┬─python───17*[python]
  │  │ └─85*[{java}]
  │  ├─bash───java─┬─python───24*[python]
  │  │ └─92*[{java}]
  │  └─265*[{java}]

 Each container has 2 cores in case that makes a difference.

 Thank you!
 -Sven

 On Fri, Jan 23, 2015 at 11:52 PM, Davies Liu dav...@databricks.com
 wrote:

 It should be a bug, the Python worker did not exit normally, could you
 file a JIRA for this?

 Also, could you show how to reproduce this behavior?

 On Fri, Jan 23, 2015 at 11:45 PM, Sven Krasser kras...@gmail.com wrote:
  Hey Adam,
 
  I'm not sure I understand just yet what you have in mind. My takeaway
 from
  the logs is that the container actually was above its allotment of about
  14G. Since 6G of that are for overhead, I assumed there to be plenty of
  space for Python workers, but there seem to be more of those than I'd
  expect.
 
  Does anyone know if that is actually the intended behavior, i.e. in this
  case over 90 Python processes on a 2 core executor?
 
  Best,
  -Sven
 
 
  On Fri, Jan 23, 2015 at 10:04 PM, Adam Diaz adam.h.d...@gmail.com
 wrote:
 
  Yarn only has the ability to kill not checkpoint or sig suspend.  If
 you
  use too much memory it will simply kill tasks based upon the yarn
 config.
  https://issues.apache.org/jira/browse/YARN-2172
 
 
  On Friday, January 23, 2015, Sandy Ryza sandy.r...@cloudera.com
 wrote:
 
  Hi Sven,
 
  What version of Spark are you running?  Recent versions have a change
  that allows PySpark to share a pool of processes instead of starting
 a new
  one for each task.
 
  -Sandy
 
  On Fri, Jan 23, 2015 at 9:36 AM, Sven Krasser kras...@gmail.com
 wrote:
 
  Hey all,
 
  I am running into a problem where YARN kills containers for being
 over
  their memory allocation (which is about 8G for executors plus 6G for
  overhead), and I noticed that in those containers there are tons of
  pyspark.daemon processes hogging memory. Here's a snippet from a
 container
  with 97 pyspark.daemon processes. The total sum of RSS usage across
 all of
  these is 1,764,956 pages (i.e. 6.7GB on the system).
 
  Any ideas what's happening here and how I can get the number of
  pyspark.daemon processes back to a more reasonable count?
 
  2015-01-23 15:36:53,654 INFO  [Reporter] yarn.YarnAllocationHandler
  (Logging.scala:logInfo(59)) - Container marked as failed:
  container_1421692415636_0052_01_30. Exit status: 143.
 Diagnostics:
  Container
 [pid=35211,containerID=container_1421692415636_0052_01_30] is
  running beyond physical memory limits. Current usage: 14.9 GB of
 14.5 GB
  physical memory used; 41.3 GB of 72.5 GB virtual memory used. Killing
  container.
  Dump of the process-tree for container_1421692415636_0052_01_30 :
 |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
  SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES)
 FULL_CMD_LINE
 |- 54101 36625 

Re: Large number of pyspark.daemon processes

2015-01-24 Thread Sven Krasser
Hey Davies,

Sure thing, it's filed here now:
https://issues.apache.org/jira/browse/SPARK-5395

As far as a repro goes, what is a normal number of workers I should expect?
Even shortly after kicking the job off, I see workers in the double-digits
per container. Here's an example using pstree on a worker node (the worker
node runs 16 containers, i.e. the java--bash branches below). The initial
Python process per container is forked between 7 and 33 times.

 ├─java─┬─bash───java─┬─python───22*[python]
 │  │ └─89*[{java}]
 │  ├─bash───java─┬─python───13*[python]
 │  │ └─80*[{java}]
 │  ├─bash───java─┬─python───9*[python]
 │  │ └─78*[{java}]
 │  ├─bash───java─┬─python───24*[python]
 │  │ └─91*[{java}]
 │  ├─3*[bash───java─┬─python───19*[python]]
 │  │└─86*[{java}]]
 │  ├─bash───java─┬─python───25*[python]
 │  │ └─90*[{java}]
 │  ├─bash───java─┬─python───33*[python]
 │  │ └─98*[{java}]
 │  ├─bash───java─┬─python───7*[python]
 │  │ └─75*[{java}]
 │  ├─bash───java─┬─python───15*[python]
 │  │ └─80*[{java}]
 │  ├─bash───java─┬─python───18*[python]
 │  │ └─84*[{java}]
 │  ├─bash───java─┬─python───10*[python]
 │  │ └─85*[{java}]
 │  ├─bash───java─┬─python───26*[python]
 │  │ └─90*[{java}]
 │  ├─bash───java─┬─python───17*[python]
 │  │ └─85*[{java}]
 │  ├─bash───java─┬─python───24*[python]
 │  │ └─92*[{java}]
 │  └─265*[{java}]

Each container has 2 cores in case that makes a difference.
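
In case it helps, here's a rough Python sketch for tallying the same thing
with ps instead of pstree (this assumes procps ps, which reports RSS in
kilobytes):

import subprocess
from collections import Counter

# List every process with its parent PID and resident memory.
out = subprocess.check_output(
    ["ps", "-eo", "pid,ppid,rss,args"]).decode("utf-8", "replace")
rows = [line.split(None, 3) for line in out.splitlines()[1:]]

daemons = [r for r in rows if len(r) == 4 and "pyspark.daemon" in r[3]]
per_parent = Counter(ppid for _pid, ppid, _rss, _args in daemons)

print("pyspark.daemon forks per parent PID:", dict(per_parent))
print("total RSS: %.1f GB"
      % (sum(int(rss) for _, _, rss, _ in daemons) / 1024.0 ** 2))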

Thank you!
-Sven

On Fri, Jan 23, 2015 at 11:52 PM, Davies Liu dav...@databricks.com wrote:

 It should be a bug, the Python worker did not exit normally, could you
 file a JIRA for this?

 Also, could you show how to reproduce this behavior?

 On Fri, Jan 23, 2015 at 11:45 PM, Sven Krasser kras...@gmail.com wrote:
  Hey Adam,
 
  I'm not sure I understand just yet what you have in mind. My takeaway
 from
  the logs is that the container actually was above its allotment of about
  14G. Since 6G of that are for overhead, I assumed there to be plenty of
  space for Python workers, but there seem to be more of those than I'd
  expect.
 
  Does anyone know if that is actually the intended behavior, i.e. in this
  case over 90 Python processes on a 2 core executor?
 
  Best,
  -Sven
 
 
  On Fri, Jan 23, 2015 at 10:04 PM, Adam Diaz adam.h.d...@gmail.com
 wrote:
 
  Yarn only has the ability to kill not checkpoint or sig suspend.  If you
  use too much memory it will simply kill tasks based upon the yarn
 config.
  https://issues.apache.org/jira/browse/YARN-2172
 
 
  On Friday, January 23, 2015, Sandy Ryza sandy.r...@cloudera.com
 wrote:
 
  Hi Sven,
 
  What version of Spark are you running?  Recent versions have a change
  that allows PySpark to share a pool of processes instead of starting a
 new
  one for each task.
 
  -Sandy
 
  On Fri, Jan 23, 2015 at 9:36 AM, Sven Krasser kras...@gmail.com
 wrote:
 
  Hey all,
 
  I am running into a problem where YARN kills containers for being over
  their memory allocation (which is about 8G for executors plus 6G for
  overhead), and I noticed that in those containers there are tons of
  pyspark.daemon processes hogging memory. Here's a snippet from a
 container
  with 97 pyspark.daemon processes. The total sum of RSS usage across
 all of
  these is 1,764,956 pages (i.e. 6.7GB on the system).
 
  Any ideas what's happening here and how I can get the number of
  pyspark.daemon processes back to a more reasonable count?
 
  2015-01-23 15:36:53,654 INFO  [Reporter] yarn.YarnAllocationHandler
  (Logging.scala:logInfo(59)) - Container marked as failed:
  container_1421692415636_0052_01_30. Exit status: 143. Diagnostics:
  Container
 [pid=35211,containerID=container_1421692415636_0052_01_30] is
  running beyond physical memory limits. Current usage: 14.9 GB of 14.5
 GB
  physical memory used; 41.3 GB of 72.5 GB virtual memory used. Killing
  container.
  Dump of the process-tree for container_1421692415636_0052_01_30 :
 |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
  SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES)
 FULL_CMD_LINE
 |- 54101 36625 36625 35211 (python) 78 1 332730368 16834 python -m
  pyspark.daemon
 |- 52140 36625 36625 35211 (python) 58 1 332730368 16837 python -m
  pyspark.daemon
 |- 36625 35228 36625 35211 (python) 65 604 331685888 17694 python
 -m
  pyspark.daemon
 
 [...]
 
 
  Full output here:
 https://gist.github.com/skrasser/e3e2ee8dede5ef6b082c
 
  Thank you!
  -Sven
 
  --
  krasser
 
 
 
 
 
  --
  http://sites.google.com/site/krasser/?utm_source=sig




-- 
http://sites.google.com/site/krasser/?utm_source=sig


Large number of pyspark.daemon processes

2015-01-23 Thread Sven Krasser
Hey all,

I am running into a problem where YARN kills containers for exceeding
their memory allocation (about 8G for executors plus 6G for overhead),
and I noticed that those containers are full of pyspark.daemon processes
hogging memory. Here's a snippet from a container with 97 pyspark.daemon
processes. The total RSS usage across all of these is 1,764,956 pages
(i.e. 6.7 GB on the system).
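
For a quick sanity check of those figures (assuming the standard 4 KiB
page size, since the YARN process dump below reports RSS in pages):

PAGE_SIZE = 4096            # bytes; standard x86_64 page size
rss_pages = 1764956         # summed over the 97 pyspark.daemon processes
print(rss_pages * PAGE_SIZE / 1024.0 ** 3)  # ~6.7 GB from Python workers alone

# The container was granted 14.5 GB, roughly matching the 8 GB executor
# heap plus 6 GB overhead, so the Python workers alone consume close to
# half of the container's budget.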

Any ideas what's happening here and how I can get the number of
pyspark.daemon processes back to a more reasonable count?

2015-01-23 15:36:53,654 INFO  [Reporter] yarn.YarnAllocationHandler
(Logging.scala:logInfo(59)) - Container marked as failed:
container_1421692415636_0052_01_30. Exit status: 143. Diagnostics:
Container [pid=35211,containerID=container_1421692415636_0052_01_30]
is running beyond physical memory limits. Current usage: 14.9 GB of
14.5 GB physical memory used; 41.3 GB of 72.5 GB virtual memory used.
Killing container.
Dump of the process-tree for container_1421692415636_0052_01_30 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES)
FULL_CMD_LINE
|- 54101 36625 36625 35211 (python) 78 1 332730368 16834 python -m
pyspark.daemon
|- 52140 36625 36625 35211 (python) 58 1 332730368 16837 python -m
pyspark.daemon
|- 36625 35228 36625 35211 (python) 65 604 331685888 17694 python -m
pyspark.daemon

[...]


Full output here: https://gist.github.com/skrasser/e3e2ee8dede5ef6b082c

Thank you!
-Sven

-- 
http://sites.google.com/site/krasser/?utm_source=sig


Re: Large number of pyspark.daemon processes

2015-01-23 Thread Sandy Ryza
Hi Sven,

What version of Spark are you running?  Recent versions have a change that
allows PySpark to share a pool of processes instead of starting a new one
for each task.

-Sandy

On Fri, Jan 23, 2015 at 9:36 AM, Sven Krasser kras...@gmail.com wrote:

 Hey all,

 I am running into a problem where YARN kills containers for being over
 their memory allocation (which is about 8G for executors plus 6G for
 overhead), and I noticed that in those containers there are tons of
 pyspark.daemon processes hogging memory. Here's a snippet from a container
 with 97 pyspark.daemon processes. The total sum of RSS usage across all of
 these is 1,764,956 pages (i.e. 6.7GB on the system).

 Any ideas what's happening here and how I can get the number of
 pyspark.daemon processes back to a more reasonable count?

 2015-01-23 15:36:53,654 INFO  [Reporter] yarn.YarnAllocationHandler 
 (Logging.scala:logInfo(59)) - Container marked as failed: 
 container_1421692415636_0052_01_30. Exit status: 143. Diagnostics: 
 Container [pid=35211,containerID=container_1421692415636_0052_01_30] is 
 running beyond physical memory limits. Current usage: 14.9 GB of 14.5 GB 
 physical memory used; 41.3 GB of 72.5 GB virtual memory used. Killing 
 container.
 Dump of the process-tree for container_1421692415636_0052_01_30 :
   |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
 SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
   |- 54101 36625 36625 35211 (python) 78 1 332730368 16834 python -m 
 pyspark.daemon
   |- 52140 36625 36625 35211 (python) 58 1 332730368 16837 python -m 
 pyspark.daemon
   |- 36625 35228 36625 35211 (python) 65 604 331685888 17694 python -m 
 pyspark.daemon

   [...]


 Full output here: https://gist.github.com/skrasser/e3e2ee8dede5ef6b082c

 Thank you!
 -Sven

 --
 http://sites.google.com/site/krasser/?utm_source=sig



Re: Large number of pyspark.daemon processes

2015-01-23 Thread Adam Diaz
YARN only has the ability to kill tasks; it cannot checkpoint or suspend
them (e.g. via SIGSTOP). If you use too much memory, it will simply kill
tasks based upon the YARN config.
https://issues.apache.org/jira/browse/YARN-2172


On Friday, January 23, 2015, Sandy Ryza sandy.r...@cloudera.com wrote:

 Hi Sven,

 What version of Spark are you running?  Recent versions have a change that
 allows PySpark to share a pool of processes instead of starting a new one
 for each task.

 -Sandy

 On Fri, Jan 23, 2015 at 9:36 AM, Sven Krasser kras...@gmail.com wrote:

 Hey all,

 I am running into a problem where YARN kills containers for being over
 their memory allocation (which is about 8G for executors plus 6G for
 overhead), and I noticed that in those containers there are tons of
 pyspark.daemon processes hogging memory. Here's a snippet from a container
 with 97 pyspark.daemon processes. The total sum of RSS usage across all of
 these is 1,764,956 pages (i.e. 6.7GB on the system).

 Any ideas what's happening here and how I can get the number of
 pyspark.daemon processes back to a more reasonable count?

 2015-01-23 15:36:53,654 INFO  [Reporter] yarn.YarnAllocationHandler 
 (Logging.scala:logInfo(59)) - Container marked as failed: 
 container_1421692415636_0052_01_30. Exit status: 143. Diagnostics: 
 Container [pid=35211,containerID=container_1421692415636_0052_01_30] is 
 running beyond physical memory limits. Current usage: 14.9 GB of 14.5 GB 
 physical memory used; 41.3 GB of 72.5 GB virtual memory used. Killing 
 container.
 Dump of the process-tree for container_1421692415636_0052_01_30 :
  |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
 SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
  |- 54101 36625 36625 35211 (python) 78 1 332730368 16834 python -m 
 pyspark.daemon
  |- 52140 36625 36625 35211 (python) 58 1 332730368 16837 python -m 
 pyspark.daemon
  |- 36625 35228 36625 35211 (python) 65 604 331685888 17694 python -m 
 pyspark.daemon

  [...]


 Full output here: https://gist.github.com/skrasser/e3e2ee8dede5ef6b082c

 Thank you!
 -Sven

 --
 krasser http://sites.google.com/site/krasser/?utm_source=sig





Re: Large number of pyspark.daemon processes

2015-01-23 Thread Sven Krasser
Hey Adam,

I'm not sure I understand just yet what you have in mind. My takeaway from
the logs is that the container actually was above its allotment of about
14G. Since 6G of that is for overhead, I assumed there would be plenty of
space for Python workers, but there seem to be more of those than I'd
expect.

Does anyone know if that is actually the intended behavior, i.e. in this
case over 90 Python processes on a 2 core executor?

Best,
-Sven


On Fri, Jan 23, 2015 at 10:04 PM, Adam Diaz adam.h.d...@gmail.com wrote:

 Yarn only has the ability to kill not checkpoint or sig suspend.  If you
 use too much memory it will simply kill tasks based upon the yarn config.
 https://issues.apache.org/jira/browse/YARN-2172


 On Friday, January 23, 2015, Sandy Ryza sandy.r...@cloudera.com wrote:

 Hi Sven,

 What version of Spark are you running?  Recent versions have a change
 that allows PySpark to share a pool of processes instead of starting a new
 one for each task.

 -Sandy

 On Fri, Jan 23, 2015 at 9:36 AM, Sven Krasser kras...@gmail.com wrote:

 Hey all,

 I am running into a problem where YARN kills containers for being over
 their memory allocation (which is about 8G for executors plus 6G for
 overhead), and I noticed that in those containers there are tons of
 pyspark.daemon processes hogging memory. Here's a snippet from a container
 with 97 pyspark.daemon processes. The total sum of RSS usage across all of
 these is 1,764,956 pages (i.e. 6.7GB on the system).

 Any ideas what's happening here and how I can get the number of
 pyspark.daemon processes back to a more reasonable count?

 2015-01-23 15:36:53,654 INFO  [Reporter] yarn.YarnAllocationHandler 
 (Logging.scala:logInfo(59)) - Container marked as failed: 
 container_1421692415636_0052_01_30. Exit status: 143. Diagnostics: 
 Container [pid=35211,containerID=container_1421692415636_0052_01_30] is 
 running beyond physical memory limits. Current usage: 14.9 GB of 14.5 GB 
 physical memory used; 41.3 GB of 72.5 GB virtual memory used. Killing 
 container.
 Dump of the process-tree for container_1421692415636_0052_01_30 :
 |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
 SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
 |- 54101 36625 36625 35211 (python) 78 1 332730368 16834 python -m 
 pyspark.daemon
 |- 52140 36625 36625 35211 (python) 58 1 332730368 16837 python -m 
 pyspark.daemon
 |- 36625 35228 36625 35211 (python) 65 604 331685888 17694 python -m 
 pyspark.daemon

 [...]


 Full output here: https://gist.github.com/skrasser/e3e2ee8dede5ef6b082c

 Thank you!
 -Sven

 --
 krasser http://sites.google.com/site/krasser/?utm_source=sig





-- 
http://sites.google.com/site/krasser/?utm_source=sig


Re: Large number of pyspark.daemon processes

2015-01-23 Thread Davies Liu
It should be a bug; the Python worker did not exit normally. Could you
file a JIRA for this?

Also, could you show how to reproduce this behavior?

On Fri, Jan 23, 2015 at 11:45 PM, Sven Krasser kras...@gmail.com wrote:
 Hey Adam,

 I'm not sure I understand just yet what you have in mind. My takeaway from
 the logs is that the container actually was above its allotment of about
 14G. Since 6G of that are for overhead, I assumed there to be plenty of
 space for Python workers, but there seem to be more of those than I'd
 expect.

 Does anyone know if that is actually the intended behavior, i.e. in this
 case over 90 Python processes on a 2 core executor?

 Best,
 -Sven


 On Fri, Jan 23, 2015 at 10:04 PM, Adam Diaz adam.h.d...@gmail.com wrote:

 Yarn only has the ability to kill not checkpoint or sig suspend.  If you
 use too much memory it will simply kill tasks based upon the yarn config.
 https://issues.apache.org/jira/browse/YARN-2172


 On Friday, January 23, 2015, Sandy Ryza sandy.r...@cloudera.com wrote:

 Hi Sven,

 What version of Spark are you running?  Recent versions have a change
 that allows PySpark to share a pool of processes instead of starting a new
 one for each task.

 -Sandy

 On Fri, Jan 23, 2015 at 9:36 AM, Sven Krasser kras...@gmail.com wrote:

 Hey all,

 I am running into a problem where YARN kills containers for being over
 their memory allocation (which is about 8G for executors plus 6G for
 overhead), and I noticed that in those containers there are tons of
 pyspark.daemon processes hogging memory. Here's a snippet from a container
 with 97 pyspark.daemon processes. The total sum of RSS usage across all of
 these is 1,764,956 pages (i.e. 6.7GB on the system).

 Any ideas what's happening here and how I can get the number of
 pyspark.daemon processes back to a more reasonable count?

 2015-01-23 15:36:53,654 INFO  [Reporter] yarn.YarnAllocationHandler
 (Logging.scala:logInfo(59)) - Container marked as failed:
 container_1421692415636_0052_01_30. Exit status: 143. Diagnostics:
 Container [pid=35211,containerID=container_1421692415636_0052_01_30] is
 running beyond physical memory limits. Current usage: 14.9 GB of 14.5 GB
 physical memory used; 41.3 GB of 72.5 GB virtual memory used. Killing
 container.
 Dump of the process-tree for container_1421692415636_0052_01_30 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
 SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 54101 36625 36625 35211 (python) 78 1 332730368 16834 python -m
 pyspark.daemon
|- 52140 36625 36625 35211 (python) 58 1 332730368 16837 python -m
 pyspark.daemon
|- 36625 35228 36625 35211 (python) 65 604 331685888 17694 python -m
 pyspark.daemon

[...]


 Full output here: https://gist.github.com/skrasser/e3e2ee8dede5ef6b082c

 Thank you!
 -Sven

 --
 krasser





 --
 http://sites.google.com/site/krasser/?utm_source=sig




Re: Large number of pyspark.daemon processes

2015-01-23 Thread Sven Krasser
Hey Sandy,

I'm using Spark 1.2.0. I assume you're referring to worker reuse? In this
case I've already set spark.python.worker.reuse to false (but I also saw
this behavior when keeping it enabled).
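
For completeness, the relevant settings look roughly like this (a sketch,
not the exact job config; the app name is a placeholder and the overhead
value is in MB for Spark 1.2):

from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setAppName("daemon-count-test")                     # placeholder name
        .set("spark.python.worker.reuse", "false")           # also saw it with "true"
        .set("spark.executor.memory", "8g")
        .set("spark.yarn.executor.memoryOverhead", "6144"))  # ~6G overhead, in MB
sc = SparkContext(conf=conf)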

Best,
-Sven


On Fri, Jan 23, 2015 at 4:51 PM, Sandy Ryza sandy.r...@cloudera.com wrote:

 Hi Sven,

 What version of Spark are you running?  Recent versions have a change that
 allows PySpark to share a pool of processes instead of starting a new one
 for each task.

 -Sandy

 On Fri, Jan 23, 2015 at 9:36 AM, Sven Krasser kras...@gmail.com wrote:

 Hey all,

 I am running into a problem where YARN kills containers for being over
 their memory allocation (which is about 8G for executors plus 6G for
 overhead), and I noticed that in those containers there are tons of
 pyspark.daemon processes hogging memory. Here's a snippet from a container
 with 97 pyspark.daemon processes. The total sum of RSS usage across all of
 these is 1,764,956 pages (i.e. 6.7GB on the system).

 Any ideas what's happening here and how I can get the number of
 pyspark.daemon processes back to a more reasonable count?

 2015-01-23 15:36:53,654 INFO  [Reporter] yarn.YarnAllocationHandler 
 (Logging.scala:logInfo(59)) - Container marked as failed: 
 container_1421692415636_0052_01_30. Exit status: 143. Diagnostics: 
 Container [pid=35211,containerID=container_1421692415636_0052_01_30] is 
 running beyond physical memory limits. Current usage: 14.9 GB of 14.5 GB 
 physical memory used; 41.3 GB of 72.5 GB virtual memory used. Killing 
 container.
 Dump of the process-tree for container_1421692415636_0052_01_30 :
  |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
 SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
  |- 54101 36625 36625 35211 (python) 78 1 332730368 16834 python -m 
 pyspark.daemon
  |- 52140 36625 36625 35211 (python) 58 1 332730368 16837 python -m 
 pyspark.daemon
  |- 36625 35228 36625 35211 (python) 65 604 331685888 17694 python -m 
 pyspark.daemon

  [...]


 Full output here: https://gist.github.com/skrasser/e3e2ee8dede5ef6b082c

 Thank you!
 -Sven

 --
 http://sites.google.com/site/krasser/?utm_source=sig





-- 
http://sites.google.com/site/krasser/?utm_source=sig