Thanks for the reply. Actually, I don't think excluding spark-hive from
spark-submit --packages is a good idea.

I don't want to recompile Spark into an assembly for my cluster every time a
new Spark release is out. I prefer using the binary distribution of Spark and
then adding some jars at job submission, e.g. spark-hive for HiveContext
usage. FYI, spark-hive is just 1.2 MB:
http://mvnrepository.com/artifact/org.apache.spark/spark-hive_2.10/1.4.0

On Wed, Jul 8, 2015 at 2:03 AM, Burak Yavuz <brk...@gmail.com> wrote:

> spark-hive is excluded when using --packages, because it can be included
> in the spark-assembly by adding -Phive during mvn package or sbt assembly.
>
> Best,
> Burak
>
> On Tue, Jul 7, 2015 at 8:06 AM, Hao Ren <inv...@gmail.com> wrote:
>
>> I want to add spark-hive as a dependency when submitting my job, but it
>> seems that spark-submit cannot resolve it.
>>
>> $ ./bin/spark-submit \
>>     --packages org.apache.spark:spark-hive_2.10:1.4.0,org.postgresql:postgresql:9.3-1103-jdbc3,joda-time:joda-time:2.8.1 \
>>     --class fr.leboncoin.etl.jobs.dwh.AdStateTraceDWHTransform \
>>     --master spark://localhost:7077 \
>>
>> Ivy Default Cache set to: /home/invkrh/.ivy2/cache
>> The jars for the packages stored in: /home/invkrh/.ivy2/jars
>> https://repository.jboss.org/nexus/content/repositories/releases/ added as a remote repository with the name: repo-1
>> :: loading settings :: url = jar:file:/home/invkrh/workspace/scala/spark/assembly/target/scala-2.10/spark-assembly-1.4.0-SNAPSHOT-hadoop2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
>> org.apache.spark#spark-hive_2.10 added as a dependency
>> org.postgresql#postgresql added as a dependency
>> joda-time#joda-time added as a dependency
>> :: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
>>         confs: [default]
>>         found org.postgresql#postgresql;9.3-1103-jdbc3 in local-m2-cache
>>         found joda-time#joda-time;2.8.1 in central
>> :: resolution report :: resolve 139ms :: artifacts dl 3ms
>>         :: modules in use:
>>         joda-time#joda-time;2.8.1 from central in [default]
>>         org.postgresql#postgresql;9.3-1103-jdbc3 from local-m2-cache in [default]
>>         ---------------------------------------------------------------------
>>         |                  |            modules            ||   artifacts   |
>>         |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
>>         ---------------------------------------------------------------------
>>         |      default     |   2   |   0   |   0   |   0   ||   2   |   0   |
>>         ---------------------------------------------------------------------
>> :: retrieving :: org.apache.spark#spark-submit-parent
>>         confs: [default]
>>         0 artifacts copied, 2 already retrieved (0kB/6ms)
>> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/hive/HiveContext
>>         at java.lang.Class.forName0(Native Method)
>>         at java.lang.Class.forName(Class.java:348)
>>         at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:633)
>>         at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
>>         at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
>>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
>>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>> Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.hive.HiveContext
>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>         ... 7 more
>> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
>> 15/07/07 16:57:59 INFO Utils: Shutdown hook called
>>
>> Any help is appreciated. Thank you.
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/spark-submit-can-not-resolve-spark-hive-2-10-tp23695.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>

--
Hao Ren
Data Engineer @ leboncoin
Paris, France
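[For readers finding this thread later: the two approaches discussed above can
be sketched as shell commands. This is a minimal sketch, not a definitive
recipe: the -Phive profile is the documented way to build Hive support into
the Spark 1.x assembly, while the --jars workaround assumes you have fetched
the spark-hive jar yourself, and its transitive Hive dependencies may also
need to be on the classpath. The jar path and download URL layout are
illustrative.]

```shell
# Option 1 (Burak's suggestion): rebuild the assembly with Hive support
# baked in, so jobs can use HiveContext without extra --packages.
# Run from a Spark 1.4.0 source checkout.
mvn -Phive -DskipTests clean package

# Option 2 (workaround for the --packages exclusion): fetch the
# spark-hive jar manually and ship it with --jars, which is not
# subject to the spark-* exclusion applied to --packages.
# Transitive Hive dependencies may still be missing at runtime.
wget http://repo1.maven.org/maven2/org/apache/spark/spark-hive_2.10/1.4.0/spark-hive_2.10-1.4.0.jar
./bin/spark-submit \
    --jars spark-hive_2.10-1.4.0.jar \
    --packages org.postgresql:postgresql:9.3-1103-jdbc3,joda-time:joda-time:2.8.1 \
    --class fr.leboncoin.etl.jobs.dwh.AdStateTraceDWHTransform \
    --master spark://localhost:7077 \
    your-job.jar
```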