Hi all,

I have created an assembly jar from the 1.2 snapshot source by running [1], which sets the correct Hadoop version for our cluster and uses the hive profile. I have also written a relatively simple test program which starts by reading data from Parquet using a HiveContext. I compile the code against the assembly jar and then submit it to the cluster using [2]. The job fails at an early stage, on creating the HiveContext itself. The important part of the stack trace is [3].
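For reference, the failing part of the program is essentially just the context creation (a minimal sketch, not the actual code; the Parquet path is a placeholder, only the class name comes from the submit command in [2]):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object CreateGuidDomainDictionary {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("CreateGuidDomainDictionary"))

    // The exception in [3] is thrown here, while instantiating the HiveContext,
    // before any data is touched.
    val hiveContext = new HiveContext(sc)

    // Never reached: the first real step would be reading the Parquet data.
    val data = hiveContext.parquetFile("/some/parquet/path")
    // ...
  }
}
```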
Could some of you please explain what is wrong and how it should be fixed? Looking for anything related, I have found only SPARK-4532 (https://issues.apache.org/jira/browse/SPARK-4532), but the fix for that bug is already merged in the source I used, so it is ruled out...

Thanks for the help,
Jakub

[1] ./sbt/sbt -Dhadoop.version=2.3.0-cdh5.1.3 -Pyarn -Phive assembly/assembly

[2] ./bin/spark-submit --num-executors 200 --master yarn-cluster --conf spark.yarn.jar=assembly/target/scala-2.10/spark-assembly-1.2.1-SNAPSHOT-hadoop2.3.0-cdh5.1.3.jar --class org.apache.spark.mllib.CreateGuidDomainDictionary root-0.1.jar ...some-args-here

[3] 14/12/05 20:28:15 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient)
Exception in thread "Driver" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate ...
Caused by: java.lang.ClassNotFoundException: org.datanucleus.api.jdo.JDOPersistenceManagerFactory
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        ...