Re: ALS.trainImplicit running out of mem when using higher rank

Antony Mayi Sat, 10 Jan 2015 01:49:15 -0800

the actual case looks like this:
* spark 1.1.0 on yarn (cdh 5.2.1)
* ~8-10 executors, 36GB phys RAM per host
* input RDD is roughly 3GB containing ~150-200M items (and this RDD is made
  persistent using .cache())
* using pyspark

yarn is configured with the limit yarn.nodemanager.resource.memory-mb of 33792
(33GB), spark is set to be:

SPARK_EXECUTOR_CORES=6
SPARK_EXECUTOR_INSTANCES=9
SPARK_EXECUTOR_MEMORY=30G

when using higher rank (above 20) for ALS.trainImplicit the executor exceeds
the yarn container memory limit after some time (~hour) of execution and gets
killed:
2015-01-09 17:51:27,130 WARN 
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
 Container [pid=27125,containerID=container_1420871936411_0002_01_000023] is 
running beyond physical memory limits. Current usage: 31.2 GB of 31 GB physical 
memory used; 34.7 GB of 65.1 GB virtual memory used. Killing container.
thanks for any ideas,
Antony.
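
The 31 GB limit in the log is consistent with how YARN sizes a Spark executor container: the executor memory plus an off-heap overhead, rounded up to the scheduler's allocation increment. The sketch below works through that arithmetic. The 384 MB overhead (spark.yarn.executor.memoryOverhead) and the 1024 MB increment (yarn.scheduler.minimum-allocation-mb) are assumed defaults and are not stated in this thread, and the helper name is hypothetical.

```python
def yarn_container_mb(executor_memory_mb, overhead_mb=384, increment_mb=1024):
    """Estimate the YARN container size requested for one Spark executor.

    overhead_mb and increment_mb are ASSUMED defaults, not values taken
    from the thread; check the cluster's actual configuration.
    """
    requested = executor_memory_mb + overhead_mb
    # Round up to the next multiple of the allocation increment
    # (integer ceiling division, independent of Python version).
    return -(-requested // increment_mb) * increment_mb


executor_mb = 30 * 1024                 # SPARK_EXECUTOR_MEMORY=30G
container_mb = yarn_container_mb(executor_mb)
headroom_mb = container_mb - executor_mb

print("container limit: %.1f GB" % (container_mb / 1024.0))
print("headroom beyond executor memory: %d MB" % headroom_mb)
```

Under these assumptions only about 1 GB of the container is left for everything outside the 30 GB JVM heap, which on a pyspark job includes the Python worker processes.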


On Saturday, 10 January 2015, 10:11, Antony Mayi <[email protected]> wrote:

the memory requirements seem to be growing rapidly when using higher rank... I
am unable to get over 20 without running out of memory. is this expected?
thanks,
Antony.
