I can run it now with the suggested method. However, I have encountered a new 
problem that I have not faced before (I sent another email about it, but here 
it goes again ...)

I ran SparkKMeans on a big file (~7 GB of data) for one iteration with 
spark-0.8.0, with this line in my .bashrc: export _JAVA_OPTIONS="-Xmx15g 
-Xms15g -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails". It finished 
in a decent time, ~50 seconds, and I saw only a few "Full GC ..." messages 
from Java (4-5 at most).

Now, using the same export in .bashrc but with spark-1.0.0 (and running it 
with spark-submit), the first loop never finishes and I get a lot of:
"18.537: [GC (Allocation Failure) --[PSYoungGen: 
11796992K->11796992K(13762560K)] 11797442K->11797450K(13763072K), 2.8420311 
secs] [Times: user=5.81 sys=2.12, real=2.85 secs]
"
or 

 "31.867: [Full GC (Ergonomics) [PSYoungGen: 11796992K->3177967K(13762560K)] 
[ParOldGen: 505K->505K(512K)] 11797497K->3178473K(13763072K), [Metaspace: 
37646K->37646K(1081344K)], 2.3053283 secs] [Times: user=37.74 sys=0.11, 
real=2.31 secs]"
 
I tried passing different parameters for the JVM through spark-submit, but 
the results are the same.
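
For reference, the variations I tried looked roughly like this (class name, 
paths, and sizes here are illustrative, not my exact command):

  ./bin/spark-submit --class SparkKMeans --master local[16] \
    --driver-memory 15g \
    --driver-java-options "-verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails" \
    myjar.jar /path/to/big_file.txt 10 0.01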
This happens with both java 1.7 and java 1.8.
I do not know what "Ergonomics" stands for ...

How can I get decent performance from spark-1.0.0, considering that 
spark-0.8.0 did not need any fine-tuning of the garbage collection settings 
(the defaults worked well)?

Thank you


On Wednesday, July 2, 2014 4:45 PM, Yana Kadiyska <yana.kadiy...@gmail.com> 
wrote:
 


The scripts that Xiangrui mentions set up the classpath... Can you run
./run-example successfully for one of the provided examples?

What you can try is to set SPARK_PRINT_LAUNCH_COMMAND=1 and then call
run-example -- at the start of execution that will show you the exact java
command used to run the example. Assuming you can run the examples
successfully, you should be able to just copy that command and add your jar
to the front of the classpath. If that works, you can start removing the
extra jars (run-example puts all the example jars on the classpath, which
you won't need).
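
For example (the example name and arguments are placeholders for whatever
you are running):

  SPARK_PRINT_LAUNCH_COMMAND=1 ./bin/run-example SparkKMeans <input> <k> <convergeDist>

The first line printed should look something like
"Spark Command: /path/to/java -cp <long classpath> ..." -- that -cp value
is the one to copy and prepend your jar to.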

As you said, the error you see indicates that the class is not
available/visible at runtime, but it's hard to tell why.
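
One thing worth checking: when you launch with "java -jar myjar.jar", the
JVM takes the classpath only from the jar's manifest and ignores both -cp
and the CLASSPATH variable, so the Spark assembly never makes it onto the
runtime classpath that way. Something along these lines should behave
differently (assembly path taken from your earlier mail):

  java -cp myjar.jar:/home/wanda/spark-1.0.0/assembly/target/scala-2.10/spark-assembly-1.0.0-hadoop1.0.4.jar SparkKMeans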


On Wed, Jul 2, 2014 at 2:13 AM, Wanda Hawk <wanda_haw...@yahoo.com> wrote:
> I want to make some minor modifications to SparkKMeans.scala, so running
> the basic example won't do.
> I have also packaged my code into a jar with sbt. The build completes
> successfully, but when I try to run it with "java -jar myjar.jar" I get
> the same error:
> "Exception in thread "main" java.lang.NoClassDefFoundError:
> breeze/linalg/Vector
>         at java.lang.Class.getDeclaredMethods0(Native Method)
>         at java.lang.Class.privateGetDeclaredMethods(Class.java:2531)
>         at java.lang.Class.getMethod0(Class.java:2774)
>         at java.lang.Class.getMethod(Class.java:1663)
>         at
> sun.launcher.LauncherHelper.getMainMethod(LauncherHelper.java:494)
>         at
> sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:486)
> "
>
> If "scalac -d classes/ SparkKMeans.scala" can't see my classpath, why does
> it succeeds in compiling and does not give the same error ?
> The error itself "NoClassDefFoundError" means that the files are available
> at compile time, but for some reason I cannot figure out they are not
> available at run time. Does anyone know why ?
>
> Thank you
>
>
> On Tuesday, July 1, 2014 7:03 PM, Xiangrui Meng <men...@gmail.com> wrote:
>
>
> You can use either bin/run-example or bin/spark-submit to run the example
> code. "scalac -d classes/ SparkKMeans.scala" doesn't pick up the Spark
> classpath. There are examples in the official docs:
> http://spark.apache.org/docs/latest/quick-start.html#where-to-go-from-here
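> For example (arguments are placeholders, and the examples jar name depends
> on your build):
>
>   bin/run-example SparkKMeans <input_file> <k> <convergeDist>
>
> or, through spark-submit:
>
>   bin/spark-submit --class org.apache.spark.examples.SparkKMeans \
>     --master local[4] \
>     examples/target/scala-2.10/spark-examples-1.0.0-hadoop1.0.4.jar \
>     <input_file> <k> <convergeDist>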
> -Xiangrui
>
> On Tue, Jul 1, 2014 at 4:39 AM, Wanda Hawk <wanda_haw...@yahoo.com> wrote:
>> Hello,
>>
>> I have installed spark-1.0.0 with scala 2.10.3. I have built spark with
>> "sbt/sbt assembly" and added
>>
>> "/home/wanda/spark-1.0.0/assembly/target/scala-2.10/spark-assembly-1.0.0-hadoop1.0.4.jar"
>> to my CLASSPATH variable.
>> Then I went to
>> "../spark-1.0.0/examples/src/main/scala/org/apache/spark/examples", created
>> a new directory "classes", and compiled SparkKMeans.scala with
>> "scalac -d classes/ SparkKMeans.scala".
>> Then I navigated to "classes" (I had commented out this line in the scala
>> file: package org.apache.spark.examples) and tried to run it with
>> "java -cp . SparkKMeans", and I get the following error:
>> "Exception in thread "main" java.lang.NoClassDefFoundError:
>> breeze/linalg/Vector
>>        at java.lang.Class.getDeclaredMethods0(Native Method)
>>        at java.lang.Class.privateGetDeclaredMethods(Class.java:2531)
>>        at java.lang.Class.getMethod0(Class.java:2774)
>>        at java.lang.Class.getMethod(Class.java:1663)
>>        at
>> sun.launcher.LauncherHelper.getMainMethod(LauncherHelper.java:494)
>>        at
>> sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:486)
>> Caused by: java.lang.ClassNotFoundException: breeze.linalg.Vector
>>        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>        at java.security.AccessController.doPrivileged(Native Method)
>>        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>        ... 6 more
>> "
>> The jar at
>>
>> "/home/wanda/spark-1.0.0/assembly/target/scala-2.10/spark-assembly-1.0.0-hadoop1.0.4.jar"
>> contains the breeze/linalg/Vector* classes. I even tried unpacking it and
>> putting it on the CLASSPATH, but it does not seem to be picked up.
>>
>>
>> I am currently running java 1.8
>> "java version "1.8.0_05"
>> Java(TM) SE Runtime Environment (build 1.8.0_05-b13)
>> Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode)"
>>
>> What am I doing wrong?
>>
>
>
