[jira] [Commented] (SPARK-1520) Assembly Jar with more than 65536 files won't work when compiled on JDK7 and run on JDK6

2014-06-04 Thread Qiuzhuang Lian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14017753#comment-14017753
 ] 

Qiuzhuang Lian commented on SPARK-1520:
---

I can run the assembly jar via bin\spark-shell.cmd, but I couldn't run the 
example LocalKMeans in IntelliJ IDEA, which throws Exception in thread "main" 
java.lang.NoClassDefFoundError: breeze/linalg/Vector. Can somebody suggest a 
fix, since I prefer to code in IntelliJ IDEA? Thanks.

 Assembly Jar with more than 65536 files won't work when compiled on JDK7 and 
 run on JDK6
 ------------------------------------------------------------------------------

 Key: SPARK-1520
 URL: https://issues.apache.org/jira/browse/SPARK-1520
 Project: Spark
  Issue Type: Bug
  Components: MLlib, Spark Core
Reporter: Patrick Wendell
Assignee: Xiangrui Meng
Priority: Blocker
 Fix For: 1.0.0


 This is a real doozie - when compiling a Spark assembly with JDK7, the 
 produced jar does not work well with JRE6. I confirmed the byte code being 
 produced is JDK 6 compatible (major version 50). What happens is that, 
 silently, the JRE will not load any class files from the assembled jar.
 {code}
 $ sbt/sbt assembly/assembly
 $ /usr/lib/jvm/java-1.7.0-openjdk-amd64/bin/java -cp 
 /home/patrick/Documents/spark/assembly/target/scala-2.10/spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4.jar
  org.apache.spark.ui.UIWorkloadGenerator
 usage: ./bin/spark-class org.apache.spark.ui.UIWorkloadGenerator [master] 
 [FIFO|FAIR]
 $ /usr/lib/jvm/java-1.6.0-openjdk-amd64/bin/java -cp 
 /home/patrick/Documents/spark/assembly/target/scala-2.10/spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4.jar
  org.apache.spark.ui.UIWorkloadGenerator
 Exception in thread "main" java.lang.NoClassDefFoundError: 
 org/apache/spark/ui/UIWorkloadGenerator
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.spark.ui.UIWorkloadGenerator
   at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
 Could not find the main class: org.apache.spark.ui.UIWorkloadGenerator. 
 Program will exit.
 {code}
 I also noticed that if the jar is unzipped, and the classpath set to the 
 current directory, it just works. Finally, if the assembly jar is 
 compiled with JDK6, it also works. The error is seen with any class, not just 
 the UIWorkloadGenerator. Also, this error doesn't exist in branch 0.9, only 
 in master.
 h1. Isolation and Cause
 The package-time behavior of Java 6 and 7 differ with respect to the format 
 used for jar files:
 ||Number of entries||JDK 6||JDK 7||
 |<= 65536|zip|zip|
 |> 65536|zip*|zip64|
 zip* is a workaround for the original zip format, [described in 
 JDK-6828461|https://bugs.openjdk.java.net/browse/JDK-4828461], that allows 
 some versions of Java 6 to support larger assembly jars.
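 As a quick check of which side of the limit a given assembly falls on, here 
 is a minimal Scala sketch (the object name and approach are illustrative, 
 not from this ticket): ZipFile parses the central directory, whose entry 
 count is only 16 bits wide in the classic zip format, while ZipInputStream 
 walks the local file headers sequentially and never consults that count.
 {code}
 import java.io.FileInputStream
 import java.util.zip.{ZipFile, ZipInputStream}

 // Illustrative diagnostic: run it under the target JRE to see how many
 // jar entries that runtime can actually enumerate.
 object JarEntryCount {
   def main(args: Array[String]): Unit = {
     val path = args(0)

     // ZipFile trusts the central directory, whose total-entry field is
     // 16-bit in the classic zip format.
     val zf = new ZipFile(path)
     try println("central directory entries: " + zf.size())
     finally zf.close()

     // ZipInputStream streams the local file headers one by one, so it
     // sees every entry regardless of the central directory count.
     val zis = new ZipInputStream(new FileInputStream(path))
     try {
       var n = 0
       while (zis.getNextEntry != null) n += 1
       println("local header entries: " + n)
     } finally zis.close()
   }
 }
 {code}
 If the two counts disagree (or ZipFile fails outright) under Java 6 but 
 agree under Java 7, the jar has crossed the threshold above.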
 The Scala libraries we depend on have added a large number of classes which 
 bumped us over the limit. This causes the Java 7 packaging to not work with 
 Java 6. We can probably go back under the limit by clearing out some 
 accidental inclusion of FastUtil, but eventually we'll go over again.
 The real answer is to force people to build with JDK 6 if they want to run 
 Spark on JRE 6.
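 For what it's worth, the format difference is also visible in the bytes 
 themselves: a zip64 archive carries a zip64 end-of-central-directory record 
 (signature PK\x06\x06) near the end of the file, which the classic format 
 never has. A rough Scala sketch of that check (again illustrative, not from 
 this ticket; it can false-positive if the signature bytes happen to occur 
 inside compressed data within the scanned window):
 {code}
 import java.io.RandomAccessFile

 // Rough heuristic: scan the tail of the archive for the zip64
 // end-of-central-directory record signature, PK\x06\x06.
 object IsZip64 {
   def main(args: Array[String]): Unit = {
     val raf = new RandomAccessFile(args(0), "r")
     try {
       // The end-of-central-directory machinery sits in the last ~128 KB
       // of the file (the EOCD comment is at most 64 KB).
       val tailLen = math.min(raf.length(), 128 * 1024L).toInt
       raf.seek(raf.length() - tailLen)
       val tail = new Array[Byte](tailLen)
       raf.readFully(tail)
       val sig = Array[Byte](0x50, 0x4B, 0x06, 0x06)
       val zip64 = tail.sliding(4).exists(_.sameElements(sig))
       println(if (zip64) "zip64" else "classic zip")
     } finally raf.close()
   }
 }
 {code}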
 -I've found that if I just unpack and re-pack the jar (using `jar`) it always 
 works:-
 {code}
 $ cd assembly/target/scala-2.10/
 $ /usr/lib/jvm/java-1.6.0-openjdk-amd64/bin/java -cp 
 ./spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4.jar 
 org.apache.spark.ui.UIWorkloadGenerator # fails
 $ jar xvf spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4.jar
 $ jar cvf spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4.jar *
 $ /usr/lib/jvm/java-1.6.0-openjdk-amd64/bin/java -cp 
 ./spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4.jar 
 org.apache.spark.ui.UIWorkloadGenerator # succeeds
 {code}
 -I also noticed something of note. The Breeze package contains single 
 directories that have huge numbers of files in them (e.g. 2000+ class files 
 in one directory). It's possible we are hitting some weird bugs/corner cases 
 with compatibility of the internal storage format of the jar itself.-
 -I narrowed this down specifically to the inclusion of the breeze library. 
 Just adding breeze to an older (unaffected) build triggered the issue.-
 -I ran a git bisection and this appeared after the MLLib sparse vector patch 
 was merged:-
 https://github.com/apache/spark/commit/80c29689ae3b589254a571da3ddb5f9c866ae534
 SPARK-1212





[jira] [Commented] (SPARK-1520) Assembly Jar with more than 65536 files won't work when compiled on JDK7 and run on JDK6

2014-05-05 Thread Koert Kuipers (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989430#comment-13989430
 ] 

Koert Kuipers commented on SPARK-1520:
--

I will try the latest master. Thanks.





[jira] [Commented] (SPARK-1520) Assembly Jar with more than 65536 files won't work when compiled on JDK7 and run on JDK6

2014-05-04 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989098#comment-13989098
 ] 

Sean Owen commented on SPARK-1520:
--

[~pwendell] On this note, I wonder if it's also best to make Jenkins build 
with Java 6? I'm not sure it would catch things like this exactly, but it 
would catch things with similar roots. I had a request open at 
https://issues.apache.org/jira/browse/SPARK-1437, but it's a Jenkins change 
rather than a code change.



[jira] [Commented] (SPARK-1520) Assembly Jar with more than 65536 files won't work when compiled on JDK7 and run on JDK6

2014-05-04 Thread Xiangrui Meng (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989206#comment-13989206
 ] 

Xiangrui Meng commented on SPARK-1520:
--

Koert, which JDK6 did you use? This problem was fixed in a later version of 
openjdk-6. If you are using openjdk-6, you can try upgrading it to the latest 
version and see whether the problem still exists.



[jira] [Commented] (SPARK-1520) Assembly Jar with more than 65536 files won't work when compiled on JDK7 and run on JDK6

2014-05-04 Thread Koert Kuipers (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989240#comment-13989240
 ] 

Koert Kuipers commented on SPARK-1520:
--

Xiangrui,
I have to stick to Sun Java. Currently on java version 1.6.0_43.










[jira] [Commented] (SPARK-1520) Assembly Jar with more than 65536 files won't work when compiled on JDK7 and run on JDK6

2014-04-17 Thread Xiangrui Meng (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973463#comment-13973463
 ] 

Xiangrui Meng commented on SPARK-1520:
--

It seems HyperLogLog doesn't need fastutil, so we can exclude fastutil 
directly. Will send a patch.
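
Assuming the HyperLogLog here comes in through the com.clearspring.analytics 
stream artifact, the exclusion would look roughly like this in sbt 
(coordinates and version are a guess, not taken from the actual patch):

{code}
// Hypothetical sketch: drop the transitive fastutil dependency that the
// stream library (HyperLogLog) pulls in; the version shown is a guess.
libraryDependencies += "com.clearspring.analytics" % "stream" % "2.5.1" exclude("it.unimi.dsi", "fastutil")
{code}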

--
This message was sent by Atlassian JIRA
(v6.2#6252)