Re: spark-submit can not resolve spark-hive_2.10

2015-07-15 Thread Hao Ren
Thanks for the reply.

Actually, I don't think excluding spark-hive from spark-submit --packages
is a good idea.

I don't want to recompile Spark into an assembly for my cluster every time a
new Spark release comes out.

I prefer using the binary distribution of Spark and adding the jars needed for
job execution, e.g. spark-hive for HiveContext usage.
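
For example, something like this is what I have in mind (a sketch only: the jar
path and the application jar below are placeholders, and spark-hive's transitive
Hive dependencies would have to be shipped as well):

# Supply the Hive module as an extra jar instead of rebuilding the assembly.
./bin/spark-submit \
  --jars /path/to/spark-hive_2.10-1.4.0.jar \
  --class fr.leboncoin.etl.jobs.dwh.AdStateTraceDWHTransform \
  --master spark://localhost:7077 \
  my-etl-job.jar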

FYI, spark-hive is just 1.2MB:
http://mvnrepository.com/artifact/org.apache.spark/spark-hive_2.10/1.4.0

On Wed, Jul 8, 2015 at 2:03 AM, Burak Yavuz brk...@gmail.com wrote:

 spark-hive is excluded when using --packages, because it can be included
 in the spark-assembly by adding -Phive during mvn package or sbt assembly.

 Best,
 Burak

 On Tue, Jul 7, 2015 at 8:06 AM, Hao Ren inv...@gmail.com wrote:

 I want to add spark-hive as a dependency when submitting my job, but it seems
 that spark-submit cannot resolve it.

 $ ./bin/spark-submit \
 → --packages org.apache.spark:spark-hive_2.10:1.4.0,org.postgresql:postgresql:9.3-1103-jdbc3,joda-time:joda-time:2.8.1 \
 → --class fr.leboncoin.etl.jobs.dwh.AdStateTraceDWHTransform \
 → --master spark://localhost:7077 \

 Ivy Default Cache set to: /home/invkrh/.ivy2/cache
 The jars for the packages stored in: /home/invkrh/.ivy2/jars
 https://repository.jboss.org/nexus/content/repositories/releases/ added as a remote repository with the name: repo-1
 :: loading settings :: url = jar:file:/home/invkrh/workspace/scala/spark/assembly/target/scala-2.10/spark-assembly-1.4.0-SNAPSHOT-hadoop2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
 org.apache.spark#spark-hive_2.10 added as a dependency
 org.postgresql#postgresql added as a dependency
 joda-time#joda-time added as a dependency
 :: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
 confs: [default]
 found org.postgresql#postgresql;9.3-1103-jdbc3 in local-m2-cache
 found joda-time#joda-time;2.8.1 in central
 :: resolution report :: resolve 139ms :: artifacts dl 3ms
 :: modules in use:
 joda-time#joda-time;2.8.1 from central in [default]
 org.postgresql#postgresql;9.3-1103-jdbc3 from local-m2-cache in [default]
 ---------------------------------------------------------------------
 |                  |            modules            ||   artifacts   |
 |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
 ---------------------------------------------------------------------
 |      default     |   2   |   0   |   0   |   0   ||   2   |   0   |
 ---------------------------------------------------------------------
 :: retrieving :: org.apache.spark#spark-submit-parent
 confs: [default]
 0 artifacts copied, 2 already retrieved (0kB/6ms)
 Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/hive/HiveContext
     at java.lang.Class.forName0(Native Method)
     at java.lang.Class.forName(Class.java:348)
     at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:633)
     at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
     at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
 Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.hive.HiveContext
     at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
     ... 7 more
 Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
 15/07/07 16:57:59 INFO Utils: Shutdown hook called

 Any help is appreciated. Thank you.








-- 
Hao Ren

Data Engineer @ leboncoin

Paris, France


spark-submit can not resolve spark-hive_2.10

2015-07-07 Thread Hao Ren
I want to add spark-hive as a dependency when submitting my job, but it seems that
spark-submit cannot resolve it.

$ ./bin/spark-submit \
→ --packages org.apache.spark:spark-hive_2.10:1.4.0,org.postgresql:postgresql:9.3-1103-jdbc3,joda-time:joda-time:2.8.1 \
→ --class fr.leboncoin.etl.jobs.dwh.AdStateTraceDWHTransform \
→ --master spark://localhost:7077 \

Ivy Default Cache set to: /home/invkrh/.ivy2/cache
The jars for the packages stored in: /home/invkrh/.ivy2/jars
https://repository.jboss.org/nexus/content/repositories/releases/ added as a remote repository with the name: repo-1
:: loading settings :: url = jar:file:/home/invkrh/workspace/scala/spark/assembly/target/scala-2.10/spark-assembly-1.4.0-SNAPSHOT-hadoop2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.apache.spark#spark-hive_2.10 added as a dependency
org.postgresql#postgresql added as a dependency
joda-time#joda-time added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
found org.postgresql#postgresql;9.3-1103-jdbc3 in local-m2-cache
found joda-time#joda-time;2.8.1 in central
:: resolution report :: resolve 139ms :: artifacts dl 3ms
:: modules in use:
joda-time#joda-time;2.8.1 from central in [default]
org.postgresql#postgresql;9.3-1103-jdbc3 from local-m2-cache in [default]
---------------------------------------------------------------------
|                  |            modules            ||   artifacts   |
|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
|      default     |   2   |   0   |   0   |   0   ||   2   |   0   |
---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
confs: [default]
0 artifacts copied, 2 already retrieved (0kB/6ms)
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/hive/HiveContext
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:633)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.hive.HiveContext
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 7 more
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/07/07 16:57:59 INFO Utils: Shutdown hook called

Any help is appreciated. Thank you.






Re: spark-submit can not resolve spark-hive_2.10

2015-07-07 Thread Burak Yavuz
spark-hive is excluded when using --packages, because it can be included in
the spark-assembly by adding -Phive during mvn package or sbt assembly.
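
For example, something like the following should produce a Hive-enabled assembly
(a sketch based on the standard Spark build instructions; run it from the Spark
source root and add the Hadoop/profile flags that match your environment):

# Maven build with Hive support baked into the assembly:
./build/mvn -Phive -DskipTests clean package
# or the sbt equivalent:
./build/sbt -Phive assembly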

Best,
Burak

On Tue, Jul 7, 2015 at 8:06 AM, Hao Ren inv...@gmail.com wrote:

 I want to add spark-hive as a dependency when submitting my job, but it seems
 that spark-submit cannot resolve it.

 $ ./bin/spark-submit \
 → --packages org.apache.spark:spark-hive_2.10:1.4.0,org.postgresql:postgresql:9.3-1103-jdbc3,joda-time:joda-time:2.8.1 \
 → --class fr.leboncoin.etl.jobs.dwh.AdStateTraceDWHTransform \
 → --master spark://localhost:7077 \

 Ivy Default Cache set to: /home/invkrh/.ivy2/cache
 The jars for the packages stored in: /home/invkrh/.ivy2/jars
 https://repository.jboss.org/nexus/content/repositories/releases/ added as a remote repository with the name: repo-1
 :: loading settings :: url = jar:file:/home/invkrh/workspace/scala/spark/assembly/target/scala-2.10/spark-assembly-1.4.0-SNAPSHOT-hadoop2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
 org.apache.spark#spark-hive_2.10 added as a dependency
 org.postgresql#postgresql added as a dependency
 joda-time#joda-time added as a dependency
 :: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
 confs: [default]
 found org.postgresql#postgresql;9.3-1103-jdbc3 in local-m2-cache
 found joda-time#joda-time;2.8.1 in central
 :: resolution report :: resolve 139ms :: artifacts dl 3ms
 :: modules in use:
 joda-time#joda-time;2.8.1 from central in [default]
 org.postgresql#postgresql;9.3-1103-jdbc3 from local-m2-cache in [default]
 ---------------------------------------------------------------------
 |                  |            modules            ||   artifacts   |
 |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
 ---------------------------------------------------------------------
 |      default     |   2   |   0   |   0   |   0   ||   2   |   0   |
 ---------------------------------------------------------------------
 :: retrieving :: org.apache.spark#spark-submit-parent
 confs: [default]
 0 artifacts copied, 2 already retrieved (0kB/6ms)
 Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/hive/HiveContext
     at java.lang.Class.forName0(Native Method)
     at java.lang.Class.forName(Class.java:348)
     at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:633)
     at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
     at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
 Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.hive.HiveContext
     at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
     ... 7 more
 Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
 15/07/07 16:57:59 INFO Utils: Shutdown hook called

 Any help is appreciated. Thank you.


