Try looking at the running processes with “ps” to see their full command lines, and check whether any options differ between the two runs. In both cases your young generation seems quite large (11 GB), which doesn’t make a lot of sense with a heap of 15 GB. But maybe I’m misreading something.
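As a rough check on those sizes, here is a minimal Python sketch that pulls the young-generation and whole-heap figures out of the first GC line in the quoted report. The function name `young_gen_summary` and the regular expression are illustrative assumptions, not part of Spark or any JVM tool, and the parser only handles the minor-GC line shape shown in the report (young-generation entry first, whole-heap entry second).

```python
import re

# First GC line from the quoted report, with the mail client's wrapping removed.
GC_LINE = (
    "18.537: [GC (Allocation Failure) --[PSYoungGen: "
    "11796992K->11796992K(13762560K)] 11797442K->11797450K(13763072K), "
    "2.8420311 secs]"
)

# Matches "before->after(capacity)" triples, all in KiB.
SIZES = re.compile(r"(\d+)K->(\d+)K\((\d+)K\)")

KIB_PER_GIB = 1024 * 1024


def young_gen_summary(line):
    """Summarise young-gen vs. heap capacity for a minor-GC log line.

    Assumption: the first triple is the young generation and the second
    is the whole heap, as in the minor-GC line quoted in the report.
    """
    triples = SIZES.findall(line)
    if len(triples) < 2:
        raise ValueError("expected a young-gen entry and a whole-heap entry")
    young_cap = int(triples[0][2])
    heap_cap = int(triples[1][2])
    return {
        "young_capacity_kib": young_cap,
        "heap_capacity_kib": heap_cap,
        # Whatever the young generation does not take is left for the old generation.
        "old_capacity_kib": heap_cap - young_cap,
        "young_share": young_cap / heap_cap,
        "young_capacity_gib": young_cap / KIB_PER_GIB,
    }


if __name__ == "__main__":
    for key, value in young_gen_summary(GC_LINE).items():
        print(f"{key}: {value}")
```

On that line the difference between heap capacity and young-generation capacity comes out to 512 KiB, which matches the `ParOldGen` capacity of 512K printed in the Full GC line of the report, so almost the entire heap is being given to the young generation.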
Matei

On Jul 2, 2014, at 4:50 AM, Wanda Hawk <wanda_haw...@yahoo.com> wrote:

> I ran SparkKMeans with a big file (~ 7 GB of data) for one iteration with
> spark-0.8.0 with this line in bash.rc " export _JAVA_OPTIONS="-Xmx15g -Xms15g
> -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails" ". It finished in a
> decent time, ~50 seconds, and I had only a few "Full GC...." messages from
> Java. (a max of 4-5)
>
> Now, using the same export in bash.rc but with spark-1.0.0 (and running it
> with spark-submit) the first loop never finishes and I get a lot of:
> "18.537: [GC (Allocation Failure) --[PSYoungGen:
> 11796992K->11796992K(13762560K)] 11797442K->11797450K(13763072K), 2.8420311
> secs] [Times: user=5.81 sys=2.12, real=2.85 secs]
> "
> or
>
> "31.867: [Full GC (Ergonomics) [PSYoungGen: 11796992K->3177967K(13762560K)]
> [ParOldGen: 505K->505K(512K)] 11797497K->3178473K(13763072K), [Metaspace:
> 37646K->37646K(1081344K)], 2.3053283 secs] [Times: user=37.74 sys=0.11,
> real=2.31 secs]"
>
> I tried passing different parameters for the JVM through spark-submit, but
> the results are the same
> This happens with java 1.7 and also with java 1.8.
> I do not know what the "Ergonomics" stands for ...
>
> How can I get a decent performance from spark-1.0.0 considering that
> spark-0.8.0 did not need any fine tuning on the garbage collection method
> (the default worked well) ?
>
> Thank you