spark-submit fails after setting userClassPathFirst to true

2016-10-29 Thread sudhir patil
After I set spark.driver.userClassPathFirst=true, my spark-submit --master
yarn-client run fails with the error below; it works fine if I remove the
userClassPathFirst setting. I need this setting to avoid class conflicts in
another job, so I am first trying to make it work in a simple job before
moving on to the job with the class conflicts.

From a quick search, this error seems to occur when the driver cannot find the
YARN and Hadoop configuration, so I exported SPARK_CONF_DIR and
HADOOP_CONF_DIR and also added the config files via the --jars option, but I
still get the same error.

Any ideas on how to fix this?

org.apache.spark.SparkException: Unable to load YARN support
        at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:399)
        at org.apache.spark.deploy.SparkHadoopUtil$.yarn$lzycompute(SparkHadoopUtil.scala:394)
        at org.apache.spark.deploy.SparkHadoopUtil$.yarn(SparkHadoopUtil.scala:394)
        at org.apache.spark.deploy.SparkHadoopUtil$.get(SparkHadoopUtil.scala:411)
        at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:2119)
        at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:105)
        at org.apache.spark.SparkEnv$.create(SparkEnv.scala:365)
        at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)
        at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:289)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:462)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
        at com.citi.ripcurl.timeseriesbatch.BatchContext.<init>(BatchContext.java:27)
        at com.citi.ripcurl.timeseriesbatch.example.EqDataQualityExample.runReportQuery(EqDataQualityExample.java:28)
        at com.citi.ripcurl.timeseriesbatch.example.EqDataQualityExample.main(EqDataQualityExample.java:70)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: class org.apache.hadoop.security.ShellBasedUnixGroupsMapping not org.apache.hadoop.security.GroupMappingServiceProvider
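The "class X not Y" message in the Caused by is the classic symptom of the same Hadoop class being loaded by two different classloaders, which userClassPathFirst makes easy to trigger when Hadoop/YARN classes end up in the user classpath. One workaround sketch, with placeholder paths, jar names, and class name (none of them from the original post), is to keep Hadoop/YARN classes out of the jars handed to the user classloader, e.g. by marking them "provided" at build time instead of bundling them:

```shell
# Sketch only: paths, the --class value, and jar names are placeholders.
# The idea is to let userClassPathFirst apply to application dependencies
# while leaving Hadoop/YARN classes to the system classloader.
spark-submit \
  --master yarn-client \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  --jars /path/to/deps-without-hadoop-classes.jar \
  --class com.example.MyJob \
  /path/to/app.jar
```

Note that in Spark 1.x both userClassPathFirst flags are documented as experimental, so behavior can differ between releases.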


Re: JavaRDD to DataFrame fails with null pointer exception in 1.6.0

2016-08-16 Thread sudhir patil
Tested with Java 7 & 8; same issue on both versions.

On Aug 17, 2016 12:29 PM, "spats" wrote:

> Cannot convert a JavaRDD to a DataFrame in Spark 1.6.0; it throws a null
> pointer exception with no further details. I can't figure out what is really
> happening. Any pointers to fixes?
>
> //convert JavaRDD to DataFrame
> DataFrame schemaPeople = sqlContext.createDataFrame(people, Person.class);
>
> // exception with no more details
> Exception in thread "main" java.lang.NullPointerException
>
>
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/JavaRDD-to-DataFrame-fails-with-null-pointer-exception-in-1-6-0-tp27547.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
>
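For context: in Spark 1.6, createDataFrame(JavaRDD, Class) infers the schema by reflecting on JavaBean accessors, so a Person class that is not a proper public bean (missing a no-arg constructor or getters/setters) is one common cause of an opaque NullPointerException here. A minimal sketch of the bean shape the reflection expects; the field names are illustrative assumptions, not taken from the original post:

```java
// Minimal sketch of a JavaBean that sqlContext.createDataFrame(people,
// Person.class) can reflect on: a public class with a no-arg constructor
// and a getter/setter pair per field.
public class Person implements java.io.Serializable {
    private String name;
    private int age;

    public Person() {}  // no-arg constructor required for bean reflection

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }

    public static void main(String[] args) {
        Person p = new Person();
        p.setName("Alice");
        p.setAge(30);
        System.out.println(p.getName() + " " + p.getAge());  // prints "Alice 30"
    }
}
```

If the bean checks out, another thing worth ruling out is null elements in the JavaRDD itself before the conversion.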


Spark Standalone cluster job to connect Hbase is Stuck

2016-02-01 Thread sudhir patil
Spark job on a Standalone cluster is stuck: it shows no logs after
"util.AkkaUtils: Connecting to HeartbeatReceiver" on the worker nodes and
"storage.BlockManagerInfo: Added broadcast..." on the client/driver side.

It would be great if you could clarify any of these (or better, all of
them :)
1. Has anyone seen a similar issue, or any clues on what the reason could be?
2. How do I increase the debug or log level, to see what is actually
happening?
3. Any clues or links on how to use a kerberized HBase in Spark standalone?
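On the log-level question: in Spark 1.x, verbosity is controlled by the log4j.properties file under conf/ on each node. A minimal sketch, assuming the stock log4j 1.x setup that Spark ships with:

```properties
# conf/log4j.properties sketch: raise the root level from INFO to DEBUG to
# see scheduler and connection activity around the point where the job hangs.
log4j.rootCategory=DEBUG, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

The file has to be changed on the worker nodes as well as the driver, since each JVM reads its own copy.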


Re: Spark Standalone cluster job to connect Hbase is Stuck

2016-02-01 Thread sudhir patil
Thanks Ted for the quick reply.

I am using Spark 1.2, exporting the HBase conf directory containing
hbase-site.xml in HADOOP_CLASSPATH & SPARK_CLASSPATH. Do I need to do
anything else?

Connecting to a kerberized HBase through Spark on a YARN cluster was fixed
in Spark 1.4+, so I am trying whether it works in Spark standalone mode on
1.2, as I cannot upgrade the cluster.
https://issues.apache.org/jira/browse/SPARK-6918
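For reference, the classpath wiring described above is commonly done like this in Spark 1.x standalone; the paths are placeholders, and SPARK_CLASSPATH, while deprecated in later releases, is still honored in 1.2:

```shell
# Placeholder paths: the directory must contain hbase-site.xml and these
# exports must be in effect on every worker node (e.g. via spark-env.sh),
# not only on the submitting client.
export HBASE_CONF_DIR=/etc/hbase/conf
export SPARK_CLASSPATH="$HBASE_CONF_DIR:$SPARK_CLASSPATH"
export HADOOP_CLASSPATH="$HBASE_CONF_DIR:$HADOOP_CLASSPATH"
```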



On Tue, Feb 2, 2016 at 8:31 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> Is the hbase-site.xml on the classpath of the worker nodes?
>
> Which Spark release are you using ?
>
> Cheers
>
> On Mon, Feb 1, 2016 at 4:25 PM, sudhir patil <spatil.sud...@gmail.com>
> wrote:
>
>> Spark job on a Standalone cluster is stuck: it shows no logs after
>> "util.AkkaUtils: Connecting to HeartbeatReceiver" on the worker nodes and
>> "storage.BlockManagerInfo: Added broadcast..." on the client/driver side.
>>
>> It would be great if you could clarify any of these (or better, all of
>> them :)
>> 1. Has anyone seen a similar issue, or any clues on what the reason could be?
>> 2. How do I increase the debug or log level, to see what is actually
>> happening?
>> 3. Any clues or links on how to use a kerberized HBase in Spark standalone?
>>
>
>