Memory under-utilization
Hi, I'm a Spark newbie.

We installed spark-1.0.2-bin-cdh4 on a 'super machine' with 256gb memory and 48 cores. We tried to allocate 64gb of memory to a job, but for whatever reason Spark is only using around 9gb max.

Submitted the spark job with the following command:

  /bin/spark-submit --class SimpleApp --master local[16] --executor-memory 64G /var/tmp/simple-project_2.10-1.0.jar /data/lucene/ns.gz

When I run the 'top' command I see only about 9gb of memory used by the Spark process:

  PID      USER  PR  NI  VIRT   RES   SHR  S  %CPU   %MEM  TIME+     COMMAND
  3047005  fran  30  10  8785m  703m  18m  S  112.9  0.3   48:19.63  java

Any idea why this is happening? I've also tried to set the memory programmatically using new SparkConf().set("spark.executor.memory", "64g"), but that also didn't do anything.

Is there some limitation when running in 'local' mode?

Thanks.
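For reference, this is roughly the programmatic variant I tried, as a minimal sketch (the config key and value are plain strings, so they need quotes; the app name and master are just placeholders from my setup):

  import org.apache.spark.{SparkConf, SparkContext}

  // Minimal sketch: set executor memory programmatically before creating the context.
  // Key and value are plain strings.
  val conf = new SparkConf()
    .setAppName("SimpleApp")
    .setMaster("local[16]")
    .set("spark.executor.memory", "64g")
  val sc = new SparkContext(conf)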
Re: Memory under-utilization
Perhaps your job simply does not use more than 9g. Even though the dashboard shows 64g, the process only uses what's needed and can grow up to the 64g max.

On Tue, Sep 16, 2014 at 5:40 PM, francisco ftanudj...@nextag.com wrote:
> Tried to allocate a task with 64gb memory but for whatever reason Spark is only using around 9gb max.
Re: Memory under-utilization
Thanks for the reply.

I doubt that's the case, though ... the executor kept spilling to disk because memory was full:

  ...
  14/09/16 15:00:18 WARN ExternalAppendOnlyMap: Spilling in-memory map of 67 MB to disk (668 times so far)
  14/09/16 15:00:21 WARN ExternalAppendOnlyMap: Spilling in-memory map of 66 MB to disk (669 times so far)
  14/09/16 15:00:24 WARN ExternalAppendOnlyMap: Spilling in-memory map of 70 MB to disk (670 times so far)
  14/09/16 15:00:31 WARN ExternalAppendOnlyMap: Spilling in-memory map of 127 MB to disk (671 times so far)
  14/09/16 15:00:43 WARN ExternalAppendOnlyMap: Spilling in-memory map of 67 MB to disk (672 times so far)
  ...
Re: Memory under-utilization
I see. What does http://localhost:4040/executors/ show for memory usage?

I personally find it easier to work with a standalone cluster with a single worker: start a master with sbin/start-master.sh and then connect to it.

On Tue, Sep 16, 2014 at 6:04 PM, francisco ftanudj...@nextag.com wrote:
> I doubt that's the case, though ... the executor kept spilling to disk because memory was full.
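Something along these lines (a rough sketch from memory; the host name and port are whatever your master reports, 7077 being the default):

  # Start a standalone master; it prints its spark://host:port URL in the log and web UI
  ./sbin/start-master.sh

  # Start a single worker and point it at the master
  ./bin/spark-class org.apache.spark.deploy.worker.Worker spark://localhost:7077

  # Submit against the standalone master instead of local[16]
  ./bin/spark-submit --class SimpleApp --master spark://localhost:7077 \
    --executor-memory 64G /var/tmp/simple-project_2.10-1.0.jar /data/lucene/ns.gz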
Re: Memory under-utilization
Thanks for the tip.

http://localhost:4040/executors/ is showing:

  Executors (1)
  Memory: 0.0 B used (294.9 MB total)
  Disk: 0.0 B used

However, running against a standalone cluster does resolve the problem: I can see a worker process running with the allocated memory.

My conclusion (I may be wrong) is that in 'local' mode the 'executor-memory' parameter is not honored.

Thanks again for the help!
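For anyone who still wants to stay in 'local' mode: my understanding (unverified) is that the executor there runs inside the driver JVM, so the heap would be governed by the driver memory setting rather than --executor-memory. Something like this is what I would try next, the 64g value being just my example:

  # Sketch (assumption): in local mode, size the driver JVM instead of the executor
  ./bin/spark-submit --class SimpleApp --master local[16] \
    --driver-memory 64g /var/tmp/simple-project_2.10-1.0.jar /data/lucene/ns.gz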