That code doesn't appear to register its classes with Kryo, which means the
fully-qualified class name is written along with every serialized object,
adding overhead to each record. The Spark documentation has more on this:
https://spark.apache.org/docs/latest/tuning.html#data-serialization
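
As a rough sketch of what registration could look like (the record class and
app name below are hypothetical placeholders, not taken from spark-terasort),
SparkConf.registerKryoClasses lets you list the classes up front:

```scala
import org.apache.spark.SparkConf

// Hypothetical record type, used only for illustration.
case class SortRecord(key: Array[Byte], value: Array[Byte])

val conf = new SparkConf()
  .setAppName("KryoRegistrationSketch") // hypothetical app name
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // Registered classes are written as a small numeric ID instead of
  // the fully-qualified class name.
  .registerKryoClasses(Array(classOf[SortRecord], classOf[Array[Byte]]))
```

Setting spark.kryo.registrationRequired to true should make Kryo fail on any
unregistered class, which is one way to check whether something was missed.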

Regards,
Will

On July 5, 2015, at 2:31 AM, Gavin Liu <ilovesonsofanar...@gmail.com> wrote:

Hi,

I am using the TeraSort benchmark from ehiggs's branch
https://github.com/ehiggs/spark-terasort . Then I noticed that
TeraSort.scala uses the Kryo serializer, so I made a small change from
"org.apache.spark.serializer.KryoSerializer" to
"org.apache.spark.serializer.JavaSerializer" to see the time difference.

Curiously, the Java serializer is much quicker than Kryo, and no error is
reported when I run the program. Here are the records from the history
server: the first is Kryo, the second is the Java default.

1.
<http://apache-spark-user-list.1001560.n3.nabble.com/file/n23621/kryo.png> 

2.
<http://apache-spark-user-list.1001560.n3.nabble.com/file/n23621/java.png> 

I am wondering whether I did something wrong or whether there is some other
reason behind this result.

Thanks for any help,
Gavin



