spark-submit fails after setting userClassPathFirst to true
After I set spark.driver.userClassPathFirst=true, my spark-submit --master yarn-client run fails with the error below; it works fine if I remove the userClassPathFirst setting. I need this setting to avoid class conflicts in another job, so I am trying to make it work in a simple job first and will later try it in the job with the conflicts.

From a quick search, this error seems to occur when the driver cannot find the YARN and Hadoop configuration, so I exported SPARK_CONF_DIR and HADOOP_CONF_DIR and also added the config files via the --jars option, but I still get the same error. Any ideas on how to fix this?

org.apache.spark.SparkException: Unable to load YARN support
	at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:399)
	at org.apache.spark.deploy.SparkHadoopUtil$.yarn$lzycompute(SparkHadoopUtil.scala:394)
	at org.apache.spark.deploy.SparkHadoopUtil$.yarn(SparkHadoopUtil.scala:394)
	at org.apache.spark.deploy.SparkHadoopUtil$.get(SparkHadoopUtil.scala:411)
	at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:2119)
	at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:105)
	at org.apache.spark.SparkEnv$.create(SparkEnv.scala:365)
	at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)
	at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:289)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:462)
	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
	at com.citi.ripcurl.timeseriesbatch.BatchContext.<init>(BatchContext.java:27)
	at com.citi.ripcurl.timeseriesbatch.example.EqDataQualityExample.runReportQuery(EqDataQualityExample.java:28)
	at com.citi.ripcurl.timeseriesbatch.example.EqDataQualityExample.main(EqDataQualityExample.java:70)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: class org.apache.hadoop.security.ShellBasedUnixGroupsMapping not org.apache.hadoop.security.GroupMappingServiceProvider
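For what it's worth, the "class ShellBasedUnixGroupsMapping not GroupMappingServiceProvider" cause usually means two copies of the Hadoop classes ended up in different classloaders: with userClassPathFirst=true, any Hadoop/YARN jars bundled into the application jar shadow the cluster's copies, and the instanceof check against the interface loaded by the other classloader fails. A sketch of a submission that keeps Hadoop out of the user classpath (all paths and the jar name here are hypothetical; userClassPathFirst is documented as an experimental setting):

```shell
# Point Spark at the cluster configs instead of bundling them in the jar
export HADOOP_CONF_DIR=/etc/hadoop/conf
export SPARK_CONF_DIR=/etc/spark/conf

# Build the application jar WITHOUT hadoop-*/yarn-* dependencies
# (mark them "provided" in the build), then submit:
spark-submit \
  --master yarn-client \
  --conf spark.driver.userClassPathFirst=true \
  --class com.citi.ripcurl.timeseriesbatch.example.EqDataQualityExample \
  myapp.jar
```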
Re: JavaRDD to DataFrame fails with null pointer exception in 1.6.0
Tested with Java 7 & 8; same issue on both versions.

On Aug 17, 2016 12:29 PM, "spats" wrote:
> Cannot convert JavaRDD to DataFrame in Spark 1.6.0; it throws a null pointer
> exception with no more details. Can't really figure out what is happening.
> Any pointers to fixes?
>
> // convert JavaRDD to DataFrame
> DataFrame schemaPeople = sqlContext.createDataFrame(people, Person.class);
>
> // exception with no more details
> Exception in thread "main" java.lang.NullPointerException
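Not confirmed by the thread, but a common cause of an opaque NPE in createDataFrame(rdd, Person.class) is the bean class not following JavaBean conventions: Spark's bean introspection needs a public class, a public no-arg constructor, and getter/setter pairs for every field. A minimal sketch of a conforming Person bean (the class and fields are hypothetical), with a main that checks the properties are visible to java.beans introspection, the same mechanism Spark relies on:

```java
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.io.Serializable;

// Must be public and Serializable; if it is a nested class, it must be static.
public class Person implements Serializable {
    private String name;
    private int age;

    public Person() {}  // public no-arg constructor required

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }

    public static void main(String[] args) throws Exception {
        // Confirm every field has a readable and writable bean property;
        // a missing getter/setter is invisible here and to Spark alike.
        for (PropertyDescriptor pd :
                Introspector.getBeanInfo(Person.class, Object.class).getPropertyDescriptors()) {
            System.out.println(pd.getName()
                    + " readable=" + (pd.getReadMethod() != null)
                    + " writable=" + (pd.getWriteMethod() != null));
        }
    }
}
```

If the bean checks out, the other usual suspect is a null sqlContext or a null element inside the JavaRDD.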
Spark Standalone cluster job to connect Hbase is Stuck
Spark job on a Standalone cluster is stuck; it shows no logs after "util.AkkaUtils: Connecting to HeartbeatReceiver" on the worker nodes and "storage.BlockManagerInfo: Added broadcast..." on the client driver side.

Would be great if you could clarify any of these (or better, all of these :)
1. Did anyone see a similar issue, or have any clues on what the reason could be?
2. How do I increase the debug or log level to see what is actually happening?
3. Any clues or links on how to use kerberized HBase in Spark standalone?
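On question 2: the usual way to raise the log level is through Spark's log4j configuration. A sketch, assuming the default conf layout: copy conf/log4j.properties.template to conf/log4j.properties on the driver and workers and set the root category to DEBUG:

```properties
# conf/log4j.properties - raise the root level from INFO to DEBUG
log4j.rootCategory=DEBUG, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```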
Re: Spark Standalone cluster job to connect Hbase is Stuck
Thanks Ted for the quick reply. I am using Spark 1.2, exporting the HBase conf directory containing hbase-site.xml in HADOOP_CLASSPATH & SPARK_CLASSPATH. Do I need to do anything else?

Issues connecting to kerberized HBase through Spark on a YARN cluster were fixed in Spark 1.4+ (https://issues.apache.org/jira/browse/SPARK-6918), so I am trying whether it works in Spark standalone mode on 1.2, as I cannot upgrade the cluster.

On Tue, Feb 2, 2016 at 8:31 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> Is the hbase-site.xml on the classpath of the worker nodes ?
>
> Which Spark release are you using ?
>
> Cheers
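Per Ted's point, the classpath exports have to be in effect on every worker node, not only where spark-submit runs, since executors open their own HBase connections. A sketch of the setup described above for Spark 1.2 standalone, with hypothetical paths:

```shell
# On the driver AND on each standalone worker (e.g. in spark-env.sh):
export HBASE_CONF_DIR=/etc/hbase/conf        # must contain hbase-site.xml
export HADOOP_CLASSPATH="$HBASE_CONF_DIR:$HADOOP_CLASSPATH"
export SPARK_CLASSPATH="$HBASE_CONF_DIR:$SPARK_CLASSPATH"

# For Kerberos, standalone mode has no delegation-token distribution the
# way YARN does, so each worker needs valid credentials of its own:
kinit -kt /etc/security/keytabs/hbase.keytab hbase-user@EXAMPLE.COM
```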