[ https://issues.apache.org/jira/browse/SPARK-9515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660286#comment-14660286 ]
nirav patel commented on SPARK-9515:
------------------------------------

[~srowen] I just gave the reason why I can't use the spark-submit script, as you asked in the first comment. I agree this is more of a forum question, but I thought an NPE is something you may want to handle better.

> Creating JavaSparkContext with yarn-cluster mode throws NPE
> -----------------------------------------------------------
>
>                 Key: SPARK-9515
>                 URL: https://issues.apache.org/jira/browse/SPARK-9515
>             Project: Spark
>          Issue Type: Bug
>          Components: Java API
>    Affects Versions: 1.3.1
>            Reporter: nirav patel
>
> I have a Spark application that runs against a YARN cluster. I run the Spark
> application as part of my web application, so I can't use the spark-submit script.
> The way I run it is `java -cp myApp.jar com.myapp.Application`, which in turn
> creates a JavaSparkContext. It used to work with Spark 1.0.2 and a standalone
> cluster, but now with 1.3.1 and YARN it's failing.
>
> Caused by: java.lang.NullPointerException
>     at org.apache.spark.deploy.yarn.ApplicationMaster$.sparkContextInitialized(ApplicationMaster.scala:580)
>     at org.apache.spark.scheduler.cluster.YarnClusterScheduler.postStartHook(YarnClusterScheduler.scala:32)
>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:541)
>     at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
>
> EDIT:
> I got it working with yarn-client mode; however, I want to test it with
> yarn-cluster mode as well.
>
> The application design is this: we create a singleton SparkContext object and preload
> a few RDDs in memory when our spring-boot application (Tomcat container) starts.
> That allows us to submit subsequent Spark jobs without the overhead of creating
> a new SparkContext and RDDs. It performs excellently for our SLA. We are serving
> real-time GLM in milliseconds with that. I hope this is reason enough why we can't
> use the spark-submit script to submit a job.
>
> Code is pretty simple.
> This is how we create the SparkContext:
>
> SparkConf conf = new SparkConf().setAppName(appName.toString()).setMaster("yarn-client");
> conf.set("spark.eventLog.enabled", "true");
> conf.set("spark.executor.extraClassPath", "/opt/mapr/hbase/hbase-0.98.12/lib/*");
> conf.set("spark.cores.max", sparkCoreMax);
> conf.set("spark.executor.memory", sparkExecMem);
> conf.set("spark.executor.extraJavaOptions", executorJavaOPts);
> conf.set("spark.akka.threads", sparkDriverThreads);
> JavaSparkContext sparkContext = new JavaSparkContext(conf);
>
> This is how we actually run the spring-boot app:
>
> java -Dloader.path=myspringbootapp.jar,/spark/spark-1.3.1/lib,/opt/mapr/hadoop/hadoop-2.5.1/etc/hadoop,/opt/mapr/hadoop/hadoop-2.5.1/share/hadoop/yarn -XX:PermSize=512m -XX:MaxPermSize=512m -Xms1024m -jar myspringbootapp.jar

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
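
The design described in the report (one shared context, created once at application start and reused for every job) can be sketched in plain Java. This is only an illustration under stated assumptions: `SharedContext` and `requireClientSideMaster` are hypothetical names, not part of Spark, and the guard reflects one reading of the stack trace, namely that a context built with the "yarn-cluster" master inside an ordinary JVM reaches a YARN ApplicationMaster hook that was never set up. The report itself does not confirm that cause. The idea is to fail early with a readable message instead of a NullPointerException.

```java
import java.util.function.Supplier;

/**
 * Hypothetical helper (not a Spark class): lazily builds one shared
 * instance and refuses master values that are assumed to be unusable
 * from an embedded driver.
 */
final class SharedContext<T> {
    private final String master;
    private final Supplier<T> factory;
    private T instance; // guarded by "this"

    SharedContext(String master, Supplier<T> factory) {
        this.master = requireClientSideMaster(master);
        if (factory == null) {
            throw new IllegalArgumentException("factory must not be null");
        }
        this.factory = factory;
    }

    /**
     * Assumption: "yarn-cluster" expects the driver to be started by YARN,
     * so an application that creates its own context rejects it up front.
     */
    static String requireClientSideMaster(String master) {
        if (master == null || master.trim().isEmpty()) {
            throw new IllegalArgumentException("master must be set");
        }
        String trimmed = master.trim();
        if ("yarn-cluster".equals(trimmed)) {
            throw new IllegalArgumentException(
                "master 'yarn-cluster' cannot be used from an embedded driver; "
                + "use 'yarn-client' or launch through spark-submit");
        }
        return trimmed;
    }

    /** Builds the instance on first use and returns the same one afterwards. */
    synchronized T get() {
        if (instance == null) {
            instance = factory.get();
        }
        return instance;
    }

    String master() {
        return master;
    }
}
```

In the application from the report, the factory would wrap the existing construction code, for example `() -> new JavaSparkContext(conf)`, and the web application would call `get()` from whichever bean preloads the RDDs. The holder is generic so that the sketch compiles without Spark on the classpath.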