Sorry for the late reply.

I bypassed this by setting _JAVA_OPTIONS.

And here is the ps aux | grep spark output:

hadoop   14442  0.6  0.2 34334552 128560 pts/0 Sl+  14:37   0:01 
/usr/java/latest/bin/java org.apache.spark.deploy.SparkSubmitDriverBootstrapper 
--driver-memory=5G --executor-memory=10G --master yarn-client --class 
com.***.FinancialEngineExecutor 
/home/hadoop/lib/Engine-2.0-jar-with-dependencies.jar 
hadoop   14544  158 13.4 37206420 8472272 pts/0 Sl+ 14:37   4:21 
/usr/java/latest/bin/java -cp 
/home/hadoop/spark/conf:/home/hadoop/conf:/home/hadoop/spark/classpath/emr/*:/home/hadoop/spark/classpath/emrfs/*:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/share/hadoop/common/lib/hadoop-lzo.jar::/home/hadoop/spark/conf:/home/hadoop/spark/lib/spark-assembly-1.3.0-hadoop2.4.0.jar:/home/hadoop/spark/lib/datanucleus-core-3.2.10.jar:/home/hadoop/spark/lib/datanucleus-rdbms-3.2.9.jar:/home/hadoop/spark/lib/datanucleus-api-jdo-3.2.6.jar:/home/hadoop/conf:/home/hadoop/conf
 -XX:MaxPermSize=128m -Dspark.driver.log.level=INFO -Xms512m -Xmx512m 
org.apache.spark.deploy.SparkSubmit --driver-memory=5G --executor-memory=10G 
--master yarn-client --class com.*executor.FinancialEngineExecutor 
/home/hadoop/lib/MiddlewareEngine-2.0-jar-with-dependencies.jar 

The above was run with _JAVA_OPTIONS="-Xmx30g" already set, but it doesn't show
up in the command line. I guess SparkSubmit reads _JAVA_OPTIONS, but I would
expect it to be overridden by the command-line parameters. I'm not sure what
happens here and have no time to dig into it, but if you want me to provide
more information, I will be happy to do that.
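
In case it helps anyone reproducing this: the heap the driver JVM actually got
can be read straight off the ps line by grepping for -Xmx. A small sketch (the
sample line below is abbreviated from the output above; nothing Spark-specific
is assumed):

```shell
# Pull the heap flag out of a captured spark-submit process line.
# The sample line is abbreviated from the ps output above.
ps_line='java -cp ... -XX:MaxPermSize=128m -Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit --driver-memory=5G'
heap=$(printf '%s\n' "$ps_line" | grep -o -- '-Xmx[0-9]*[mMgG]')
printf '%s\n' "$heap"   # -Xmx512m, i.e. the requested 5G was never applied
```

Against a live process you would feed it the real `ps aux | grep SparkSubmit`
output instead of the canned string.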

Regards,

Shuai


-----Original Message-----
From: Bozeman, Christopher [mailto:bozem...@amazon.com] 
Sent: Wednesday, April 01, 2015 4:59 PM
To: Shuai Zheng; 'Sean Owen'
Cc: 'Akhil Das'; user@spark.apache.org
Subject: RE: --driver-memory parameter doesn't work for spark-submmit on yarn?

Shuai,

What did " ps aux | grep spark-submit" reveal?

When you compare using _JAVA_OPTIONS and without using it, where do you see the 
difference?

Thanks
Christopher




-----Original Message-----
From: Shuai Zheng [mailto:szheng.c...@gmail.com]
Sent: Wednesday, April 01, 2015 11:12 AM
To: 'Sean Owen'
Cc: 'Akhil Das'; user@spark.apache.org
Subject: RE: --driver-memory parameter doesn't work for spark-submmit on yarn?

Nice.

But my case shows that even when I use yarn-client, I have the same issue. I
verified it several times.

And I am running 1.3.0 on EMR (using the version installed by the installSpark
script from AWS).

I agree _JAVA_OPTIONS is not the right solution, but I will use it until 1.4.0
is out :)

Regards,

Shuai

-----Original Message-----
From: Sean Owen [mailto:so...@cloudera.com]
Sent: Wednesday, April 01, 2015 10:51 AM
To: Shuai Zheng
Cc: Akhil Das; user@spark.apache.org
Subject: Re: --driver-memory parameter doesn't work for spark-submmit on yarn?

I feel like I recognize that problem, and it's almost the inverse of
https://issues.apache.org/jira/browse/SPARK-3884 which I was looking at today. 
The spark-class script didn't seem to handle all the ways that driver memory 
can be set.
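
For readers hitting this later: in 1.x client mode, driver memory can reach the
driver JVM through several channels, which is what the script had trouble
reconciling. A hedged summary of the usual three (option names per the Spark 1.x
configuration docs; the class and jar names are placeholders):

```shell
# Three ways to request driver memory in yarn-client mode (Spark 1.x):
spark-submit --driver-memory 5G --master yarn-client --class Foo app.jar
# or in conf/spark-defaults.conf:
#   spark.driver.memory   5g
# or in conf/spark-env.sh:
#   SPARK_DRIVER_MEMORY=5g
```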

I think this is also something fixed by the new launcher library in 1.4.0.

_JAVA_OPTIONS is not a good solution since it's global.
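
To illustrate "global": an exported _JAVA_OPTIONS is inherited by every JVM
launched from that shell, not just the Spark driver. If you must use it, a
per-command assignment at least limits the scope (plain shell demo, nothing
Spark-specific):

```shell
# Per-command assignment: the variable is visible only to that one child process.
child=$(_JAVA_OPTIONS="-Xmx5g" sh -c 'printf %s "$_JAVA_OPTIONS"')
printf 'child saw: %s\n' "$child"                       # child saw: -Xmx5g
printf 'parent sees: %s\n' "${_JAVA_OPTIONS:-<unset>}"  # parent sees: <unset>
```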

On Wed, Apr 1, 2015 at 3:21 PM, Shuai Zheng <szheng.c...@gmail.com> wrote:
> Hi Akhil,
>
>
>
> Thanks a lot!
>
>
>
> After setting export _JAVA_OPTIONS="-Xmx5g", the OutOfMemory exception
> disappeared. But this makes me confused: the --driver-memory option
> doesn't work for spark-submit to YARN (I haven't checked other cluster
> managers); is it a bug?
>
>
>
> Regards,
>
>
>
> Shuai
>
>
>
>
>
> From: Akhil Das [mailto:ak...@sigmoidanalytics.com]
> Sent: Wednesday, April 01, 2015 2:40 AM
> To: Shuai Zheng
> Cc: user@spark.apache.org
> Subject: Re: --driver-memory parameter doesn't work for spark-submmit 
> on yarn?
>
>
>
> Once you submit the job, do a ps aux | grep spark-submit and see how
> much heap space is allocated to the process (the -Xmx params); if you
> are seeing a lower value you could try increasing it yourself with:
>
>
>
> export _JAVA_OPTIONS="-Xmx5g"
>
>
> Thanks
>
> Best Regards
>
>
>
> On Wed, Apr 1, 2015 at 1:57 AM, Shuai Zheng <szheng.c...@gmail.com> wrote:
>
> Hi All,
>
>
>
> Below is the my shell script:
>
>
>
> /home/hadoop/spark/bin/spark-submit --driver-memory=5G 
> --executor-memory=40G --master yarn-client --class 
> com.***.FinancialEngineExecutor /home/hadoop/lib/my.jar 
> s3://bucket/vriscBatchConf.properties
>
>
>
> My driver will load some resources and then broadcast them to all executors.
>
>
>
> Those resources are only 600MB in serialized form, but I always get an
> out-of-memory exception; it looks like the right amount of memory is
> not being allocated to my driver.
>
>
>
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>         at java.lang.reflect.Array.newArray(Native Method)
>         at java.lang.reflect.Array.newInstance(Array.java:70)
>         at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1670)
>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
>         at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>         at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>         at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>         at com.***.executor.support.S3FileUtils.loadCache(S3FileUtils.java:68)
>
>
>
> Am I doing anything wrong here?
>
>
>
> And no matter what I set the --driver-memory value to (from 512M to
> 20G), it always gives me the error on the same line (the line that
> tries to load a 600MB Java serialization file). So it looks like the
> script doesn't allocate the right memory to the driver in my case.
>
>
>
> Regards,
>
>
>
> Shuai
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



