I'm using the Scala 2.10 branch of Spark in standalone mode, and am finding
that the executor gets started with the default 512M even after setting
spark.executor.memory to 6G. This leads to my job failing with an OOM. I've
tried setting spark.executor.memory both programmatically (using
System.setProperty("spark.executor.memory", "6g")) and as an environment
variable (using export SPARK_JAVA_OPTS="-Dspark.executor.memory=6g"). And
in both cases, the executor gets started with the default 512M as displayed
in the UI (*Executor Memory:* 512 M). Interestingly, the startup command
for the executor in its log is

"-cp"  "-Dspark.executor.memory=6g" "-Xms512M" "-Xmx512M"

So it looks like spark.executor.memory gets ignored and the default Xmx
value of 512M is used instead.
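For reference, the programmatic attempt looks roughly like this (a
paraphrased sketch, not my exact code; the master URL and app name here are
made up, and I do set the property before constructing the SparkContext):

```scala
import org.apache.spark.SparkContext

// Paraphrased sketch of my driver setup. The property is set *before*
// the SparkContext is constructed, since system properties read at
// context-creation time wouldn't see a later setProperty call.
System.setProperty("spark.executor.memory", "6g")

// Hypothetical master URL and app name for illustration.
val sc = new SparkContext("spark://master:7077", "MyJob")
```

Even with this ordering, the executor still comes up with -Xmx512M.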

Finally, what worked for me was setting SPARK_MEM=6G in spark-env.sh and
copying the file onto each of the slaves. That solved my OOM, but now, even
though the UI indicates (*Executor Memory:* 6 G), the executor's startup
command in the log looks like

"-cp"  "-Dspark.executor.memory=40g" "-Xms6144M" "-Xmx6144M"

Here, I think it got the 40g from my SPARK_WORKER_MEMORY, which was set to 40g.
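In case it helps, the relevant lines of my spark-env.sh (the copy pushed to
every slave) now look like this; the comments are my current understanding
of what each variable does, which may well be wrong:

```shell
# spark-env.sh (paraphrased) -- sourced by the standalone daemons on each node.

# Heap size each executor JVM is actually launched with (-Xms/-Xmx).
SPARK_MEM=6g

# Total memory the worker daemon is allowed to hand out to executors on
# this node (this seems to be where the -Dspark.executor.memory=40g comes from).
SPARK_WORKER_MEMORY=40g
```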

So I'm a bit confused about how Spark treats executors in standalone mode.
As I understand from the docs, an executor is a per-job concept, whereas
workers persist across jobs. Is the "-Dspark.executor.memory" setting really
ignored, as it appears to be in both cases above? If so, what actually
determines the executor's heap size?

Also, I'd like to know how to properly set spark.executor.memory in
standalone mode. I'd prefer not to set SPARK_MEM, since I want to control
the executor's memory footprint on a per-job basis.

Thanks,
Ameet
