It looks like it spent more time writing/transferring the 40GB of shuffle data when you used Kryo. And surprisingly, the JavaSerializer run has only 700MB of shuffle?
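
To rule out a configuration mix-up between the two runs, it may help to make the serializer an explicit argument and print what was actually set. This is only a minimal sketch, not the code from TeraSort.scala: the object name, the confWith helper, and the app name are made up for illustration. Only the "spark.serializer" key and the two serializer class names come from your message and from Spark itself.

```scala
import org.apache.spark.SparkConf

object SerializerChoice { // hypothetical name, not part of spark-terasort
  // Fully qualified class names of the two serializers being compared.
  val Kryo = "org.apache.spark.serializer.KryoSerializer"
  val Java = "org.apache.spark.serializer.JavaSerializer"

  // Build a SparkConf with the given serializer. loadDefaults = false keeps
  // the sketch independent of any spark.* system properties.
  def confWith(serializer: String): SparkConf =
    new SparkConf(false)
      .setAppName("TeraSort-serializer-check") // hypothetical app name
      .set("spark.serializer", serializer)

  def main(args: Array[String]): Unit = {
    // Use the Java serializer only when the first argument is "java";
    // otherwise fall back to Kryo.
    val serializer = if (args.headOption.contains("java")) Java else Kryo
    val conf = confWith(serializer)
    // Print the effective setting so each run records which serializer it used.
    println(conf.get("spark.serializer"))
  }
}
```

With the setting printed for each run, the shuffle read/write sizes in the history server can be compared knowing which serializer each run really used.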

Thanks
Best Regards

On Sun, Jul 5, 2015 at 12:01 PM, Gavin Liu <ilovesonsofanar...@gmail.com> wrote:

> Hi,
>
> I am using TeraSort benchmark from ehiggs's branch
> https://github.com/ehiggs/spark-terasort . Then I noticed that in
> TeraSort.scala, it is using Kryo Serializer. So I made a small change from
> "org.apache.spark.serializer.KryoSerializer" to
> "org.apache.spark.serializer.JavaSerializer" to see the time difference.
>
> Curiously, using Java Serializer is much quicker than using Kryo and there
> is no error reported when I run the program. Here is the record from history
> server, first one is kryo. second one is java default.
>
> 1. <http://apache-spark-user-list.1001560.n3.nabble.com/file/n23621/kryo.png>
>
> 2. <http://apache-spark-user-list.1001560.n3.nabble.com/file/n23621/java.png>
>
> I am wondering if I did something wrong or there is any other reason behind
> this result.
>
> Thanks for any help,
> Gavin
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Why-Kryo-Serializer-is-slower-than-Java-Serializer-in-TeraSort-tp23621.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.