[ 
https://issues.apache.org/jira/browse/SPARK-3217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Lian updated SPARK-3217:
------------------------------

    Description: 
PR [#1813|https://github.com/apache/spark/pull/1813] shaded the Guava jar file
and moved Guava classes to package {{org.spark-project.guava}} when Spark is
built by Maven. But if developers set the environment variable
{{SPARK_PREPEND_CLASSES}} to {{true}}, commands like {{bin/spark-shell}} throw
{{ClassNotFoundException}}:
{code}
# Set the env var
$ export SPARK_PREPEND_CLASSES=true

# Build Spark with Maven
$ mvn clean package -Phive,hadoop-2.3 -Dhadoop.version=2.3.0 -DskipTests
...

# Then spark-shell complains
$ ./bin/spark-shell
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/util/concurrent/ThreadFactoryBuilder
        at org.apache.spark.util.Utils$.<init>(Utils.scala:636)
        at org.apache.spark.util.Utils$.<clinit>(Utils.scala)
        at org.apache.spark.repl.SparkILoop.<init>(SparkILoop.scala:134)
        at org.apache.spark.repl.SparkILoop.<init>(SparkILoop.scala:65)
        at org.apache.spark.repl.Main$.main(Main.scala:30)
        at org.apache.spark.repl.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:317)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:73)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: com.google.common.util.concurrent.ThreadFactoryBuilder
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        ... 13 more

# Check the assembly jar file
$ jar tf assembly/target/scala-2.10/spark-assembly-1.1.0-SNAPSHOT-hadoop2.3.0.jar | grep -i ThreadFactoryBuilder
org/spark-project/guava/common/util/concurrent/ThreadFactoryBuilder$1.class
org/spark-project/guava/common/util/concurrent/ThreadFactoryBuilder.class
{code}
The SBT build is fine since we don't shade Guava with SBT right now.
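
The jar check above can be turned into a small reusable helper. This is a minimal sketch, not part of Spark: the function names {{class_to_entry}}, {{shaded_entry}} and {{guava_location}} are hypothetical, and the relocation prefix {{org/spark-project/guava/common}} is inferred only from the jar listing in this report. It reads a {{jar tf}} listing on stdin and reports whether a Guava class is present under its original name, its relocated name, or both.

```shell
#!/bin/sh
# Hypothetical helpers for checking where a Guava class lives in a jar listing.

# Convert a dotted class name to a jar entry path (dots become slashes).
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr . /)"
}

# Map a com.google.common class name to its relocated jar entry.
# Assumption: the relocation prefix matches the listing shown in this report.
shaded_entry() {
  rel=${1#com.google.common.}
  class_to_entry "org.spark-project.guava.common.$rel"
}

# Read a `jar tf` listing on stdin and report which names are present.
guava_location() {
  listing=$(cat)
  orig=$(class_to_entry "$1")
  shaded=$(shaded_entry "$1")
  found_orig=no
  found_shaded=no
  # -x matches whole lines, -F treats the entry (which may contain '$') literally.
  printf '%s\n' "$listing" | grep -qxF "$orig" && found_orig=yes
  printf '%s\n' "$listing" | grep -qxF "$shaded" && found_shaded=yes
  echo "original=$found_orig shaded=$found_shaded"
}
```

For example, {{jar tf <assembly jar> | guava_location com.google.common.util.concurrent.ThreadFactoryBuilder}} should report {{original=no shaded=yes}} for the assembly listed above, which is consistent with the {{ClassNotFoundException}} for the unrelocated name.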

  was:
PR [#1813|https://github.com/apache/spark/pull/1813] shaded the Guava jar file
and moved Guava classes to package {{org.spark-project.guava}} when Spark is
built by Maven. But code in {{org.apache.spark.util.Utils}} still refers to
classes (e.g. {{ThreadFactoryBuilder}}) in package {{com.google.common}}.

The result is that, when Spark is built with Maven (or
{{make-distribution.sh}}), commands like {{bin/spark-shell}} throw
{{ClassNotFoundException}}:
{code}
# Build Spark with Maven
$ mvn clean package -Phive,hadoop-2.3 -Dhadoop.version=2.3.0 -DskipTests
...

# Then spark-shell complains
$ ./bin/spark-shell
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/util/concurrent/ThreadFactoryBuilder
        at org.apache.spark.util.Utils$.<init>(Utils.scala:636)
        at org.apache.spark.util.Utils$.<clinit>(Utils.scala)
        at org.apache.spark.repl.SparkILoop.<init>(SparkILoop.scala:134)
        at org.apache.spark.repl.SparkILoop.<init>(SparkILoop.scala:65)
        at org.apache.spark.repl.Main$.main(Main.scala:30)
        at org.apache.spark.repl.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:317)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:73)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: com.google.common.util.concurrent.ThreadFactoryBuilder
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        ... 13 more

# Check the assembly jar file
$ jar tf assembly/target/scala-2.10/spark-assembly-1.1.0-SNAPSHOT-hadoop2.3.0.jar | grep -i ThreadFactoryBuilder
org/spark-project/guava/common/util/concurrent/ThreadFactoryBuilder$1.class
org/spark-project/guava/common/util/concurrent/ThreadFactoryBuilder.class
{code}
The SBT build is fine since we don't shade Guava with SBT right now (and that's
why Jenkins didn't complain about this).

Possible solutions:
# revert PR #1813 to be safe, or
# also shade Guava in the SBT build and use only {{org.spark-project.guava}} in Spark


> Shaded Guava jar doesn't play well with Maven build when 
> SPARK_PREPEND_CLASSES is set
> -------------------------------------------------------------------------------------
>
>                 Key: SPARK-3217
>                 URL: https://issues.apache.org/jira/browse/SPARK-3217
>             Project: Spark
>          Issue Type: Bug
>          Components: Build
>    Affects Versions: 1.2.0
>            Reporter: Cheng Lian
>            Assignee: Marcelo Vanzin
>            Priority: Blocker


