[ 
https://issues.apache.org/jira/browse/SPARK-9515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660286#comment-14660286
 ] 

nirav patel commented on SPARK-9515:
------------------------------------

[~srowen] I just gave the reason why I can't use the spark-submit script, as 
you asked in your first comment. I agree this is more of a forum question, but 
I thought an NPE is something you may want to handle better.

> Creating JavaSparkContext with yarn-cluster mode throws NPE
> -----------------------------------------------------------
>
>                 Key: SPARK-9515
>                 URL: https://issues.apache.org/jira/browse/SPARK-9515
>             Project: Spark
>          Issue Type: Bug
>          Components: Java API
>    Affects Versions: 1.3.1
>            Reporter: nirav patel
>
> I have a Spark application that runs against a YARN cluster. I run the Spark 
> application as part of my web application. I can't use the spark-submit 
> script. The way I run it is `java -cp myApp.jar com.myapp.Application`, which 
> in turn initializes a JavaSparkContext. It used to work with Spark 1.0.2 and a 
> standalone cluster, but now with 1.3.1 and YARN it's failing.
> Caused by: java.lang.NullPointerException
>       at org.apache.spark.deploy.yarn.ApplicationMaster$.sparkContextInitialized(ApplicationMaster.scala:580)
>       at org.apache.spark.scheduler.cluster.YarnClusterScheduler.postStartHook(YarnClusterScheduler.scala:32)
>       at org.apache.spark.SparkContext.<init>(SparkContext.scala:541)
>       at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
> EDIT:
> I got it working in yarn-client mode; however, I want to test it in 
> yarn-cluster mode as well.
> The application design is as follows: we create a singleton SparkContext 
> object and preload a few RDDs in memory when our spring-boot application 
> (Tomcat container) starts. That allows us to submit subsequent Spark jobs 
> without the overhead of creating a new SparkContext and RDDs. It performs 
> excellently for our SLA. We are serving real-time GLM in milliseconds with 
> that. I hope this is reason enough why we can't use the spark-submit script 
> to submit a job.
> The code is pretty simple. This is how we create the SparkContext:
> SparkConf conf = new SparkConf().setAppName(appName.toString()).setMaster("yarn-client");
> conf.set("spark.eventLog.enabled", "true");
> conf.set("spark.executor.extraClassPath", "/opt/mapr/hbase/hbase-0.98.12/lib/*");
> conf.set("spark.cores.max", sparkCoreMax);
> conf.set("spark.executor.memory", sparkExecMem);
> conf.set("spark.executor.extraJavaOptions", executorJavaOPts);
> conf.set("spark.akka.threads", sparkDriverThreads);
> JavaSparkContext sparkContext = new JavaSparkContext(conf);
> This is how we actually run the spring-boot app:
> java -Dloader.path=myspringbootapp.jar,/spark/spark-1.3.1/lib,/opt/mapr/hadoop/hadoop-2.5.1/etc/hadoop,/opt/mapr/hadoop/hadoop-2.5.1/share/hadoop/yarn -XX:PermSize=512m -XX:MaxPermSize=512m -Xms1024m -jar myspringbootapp.jar
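
The design described in the quoted issue (one SparkContext created at startup and reused by every later job) is a create-once, share-everywhere pattern. A minimal sketch of that pattern in plain Java might look like the following. `SharedContextHolder` and `SharedContextDemo` are hypothetical names, not part of Spark or of the reporter's application, and a `String` stands in for the real `JavaSparkContext` so the sketch runs without Spark on the classpath.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical demo: shows that the expensive factory runs only once.
class SharedContextDemo {
    public static void main(String[] args) {
        AtomicInteger creations = new AtomicInteger();

        // The lambda stands in for "new JavaSparkContext(conf)" (assumption).
        SharedContextHolder<String> holder = new SharedContextHolder<>(() -> {
            creations.incrementAndGet();
            return "context";
        });

        String first = holder.get();   // triggers creation
        String second = holder.get();  // reuses the cached instance

        System.out.println(first == second);  // prints "true"
        System.out.println(creations.get());  // prints "1"
    }
}

// Hypothetical holder: lazily builds a shared resource and caches it.
final class SharedContextHolder<T> {
    private final Supplier<T> factory;
    private volatile T instance;  // volatile makes double-checked locking safe

    SharedContextHolder(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    // Only the first caller reaches this point.
                    local = factory.get();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

In a Spring Boot application a singleton-scoped bean would typically play the role of this holder; the sketch only illustrates the reuse itself and says nothing about the NPE reported above.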



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
