Aaron, spark.executor.memory is set to 2454m in my spark-defaults.conf, which is a reasonable value for the EC2 instances I use (m3.medium machines). However, it doesn't help: each executor still uses only 512 MB of memory. To figure out why, I examined the spark-submit and spark-class scripts and found that the Java options and Java heap size are computed in the spark-class script (see the OUR_JAVA_OPTS and OUR_JAVA_MEM variables there). These values are then used to compose the following string:

JAVA_OPTS="$JAVA_OPTS -Xms$OUR_JAVA_MEM -Xmx$OUR_JAVA_MEM"

Note that the -Xms and -Xmx flags, both set to OUR_JAVA_MEM, are appended to the end of the string. For some reason I haven't found yet, OUR_JAVA_MEM keeps its default value of 512 MB. I was able to fix it only by setting the SPARK_MEM variable in spark-env.sh:

export SPARK_MEM=2g

However, this variable is deprecated, so my solution doesn't seem to be a good one :)
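If I read bin/spark-class correctly, the fallback chain is roughly the following (this is a paraphrase of my understanding, not a verbatim excerpt from the script):

    # sketch of the memory fallback as I understand it, not an exact copy of spark-class
    SPARK_MEM=${SPARK_MEM:-512m}                        # global default: 512m unless spark-env.sh exports SPARK_MEM
    OUR_JAVA_MEM=${SPARK_EXECUTOR_MEMORY:-$SPARK_MEM}   # executor heap falls back to SPARK_MEM
    JAVA_OPTS="$JAVA_OPTS -Xms$OUR_JAVA_MEM -Xmx$OUR_JAVA_MEM"

That would explain why exporting SPARK_MEM is the only thing that works for me: it looks like spark.executor.memory from spark-defaults.conf never makes it into SPARK_EXECUTOR_MEMORY on the worker side, so the 512m default always wins.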
On Thu, Jun 12, 2014 at 10:16 PM, Aaron Davidson <ilike...@gmail.com> wrote:

> The scripts for Spark 1.0 actually specify this property in
> /root/spark/conf/spark-defaults.conf
>
> I didn't know that this would override the --executor-memory flag, though,
> that's pretty odd.
>
>
> On Thu, Jun 12, 2014 at 6:02 PM, Aliaksei Litouka <aliaksei.lito...@gmail.com> wrote:
>
>> Yes, I am launching a cluster with the spark_ec2 script. I checked
>> /root/spark/conf/spark-env.sh on the master node and on the slaves, and it
>> looks like this:
>>
>>> #!/usr/bin/env bash
>>> export SPARK_LOCAL_DIRS="/mnt/spark"
>>> # Standalone cluster options
>>> export SPARK_MASTER_OPTS=""
>>> export SPARK_WORKER_INSTANCES=1
>>> export SPARK_WORKER_CORES=1
>>> export HADOOP_HOME="/root/ephemeral-hdfs"
>>> export SPARK_MASTER_IP=ec2-54-89-95-238.compute-1.amazonaws.com
>>> export MASTER=`cat /root/spark-ec2/cluster-url`
>>> export SPARK_SUBMIT_LIBRARY_PATH="$SPARK_SUBMIT_LIBRARY_PATH:/root/ephemeral-hdfs/lib/native/"
>>> export SPARK_SUBMIT_CLASSPATH="$SPARK_CLASSPATH:$SPARK_SUBMIT_CLASSPATH:/root/ephemeral-hdfs/conf"
>>> # Bind Spark's web UIs to this machine's public EC2 hostname:
>>> export SPARK_PUBLIC_DNS=`wget -q -O - http://169.254.169.254/latest/meta-data/public-hostname`
>>> # Set a high ulimit for large shuffles
>>> ulimit -n 1000000
>>
>> None of these variables seem to be related to memory size. Let me know if
>> I am missing something.
>>
>>
>> On Thu, Jun 12, 2014 at 7:17 PM, Matei Zaharia <matei.zaha...@gmail.com> wrote:
>>
>>> Are you launching this using our EC2 scripts? Or have you set up a
>>> cluster by hand?
>>>
>>> Matei
>>>
>>> On Jun 12, 2014, at 2:32 PM, Aliaksei Litouka <aliaksei.lito...@gmail.com> wrote:
>>>
>>> spark-env.sh doesn't seem to contain any settings related to memory size :(
>>> I will continue searching for a solution and will post it if I find it :)
>>> Thank you, anyway
>>>
>>>
>>> On Wed, Jun 11, 2014 at 12:19 AM, Matei Zaharia <matei.zaha...@gmail.com> wrote:
>>>
>>>> It might be that conf/spark-env.sh on EC2 is configured to set it to 512,
>>>> and is overriding the application's settings. Take a look in there and
>>>> delete that line if possible.
>>>>
>>>> Matei
>>>>
>>>> On Jun 10, 2014, at 2:38 PM, Aliaksei Litouka <aliaksei.lito...@gmail.com> wrote:
>>>>
>>>> I am testing my application in an EC2 cluster of m3.medium machines. By
>>>> default, only 512 MB of memory on each machine is used. I want to increase
>>>> this amount and I'm trying to do it by passing the --executor-memory 2G
>>>> option to the spark-submit script, but it doesn't seem to work - each
>>>> machine uses only 512 MB instead of 2 gigabytes. What am I doing wrong?
>>>> How do I increase the amount of memory?