Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19586#discussion_r147565429

    --- Diff: core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala ---
    @@ -205,11 +205,45 @@ class KryoSerializationStream(

       private[this] var kryo: Kryo = serInstance.borrowKryo()

    +  // This is only used when we write object and class separately.
    +  var classWrote = false
    +
       override def writeObject[T: ClassTag](t: T): SerializationStream = {
         kryo.writeClassAndObject(output, t)
    --- End diff --

    I was expecting Kryo to keep track of the distinct classes it has already written and to store only an identifier/pointer for repeated ones. In that case, even if we write the class together with the object every time, the overhead should be small. Is that not the case?
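
    One way to check the question above empirically is to compare output sizes directly. The following is only a rough sketch, not code from the pull request: `Point` and `KryoClassOverhead` are hypothetical names, and it assumes the Kryo library (`com.esotericsoftware:kryo`) is on the classpath. It also assumes, as far as I understand Kryo's default behaviour, that the table of already-written class names for unregistered classes is reset after each top-level `writeClassAndObject` call, so the name is only deduplicated within a single object graph.

```scala
import com.esotericsoftware.kryo.Kryo
import com.esotericsoftware.kryo.io.Output

// Hypothetical record type used only for this measurement.
class Point(var x: Int, var y: Int)

object KryoClassOverhead {
  private def newKryo(register: Boolean): Kryo = {
    val kryo = new Kryo()
    // Allow unregistered classes (newer Kryo versions require registration by default).
    kryo.setRegistrationRequired(false)
    if (register) kryo.register(classOf[Point])
    kryo
  }

  /** Total bytes when each record is written by its own top-level call. */
  def bytesPerCall(records: Array[Point], register: Boolean): Int = {
    val kryo = newKryo(register)
    val output = new Output(1024, -1) // in-memory buffer that grows without a limit
    records.foreach(r => kryo.writeClassAndObject(output, r))
    output.toBytes.length
  }

  /** Total bytes when all records are written as one object graph (an array). */
  def bytesSingleGraph(records: Array[Point]): Int = {
    val kryo = newKryo(register = false)
    val output = new Output(1024, -1)
    kryo.writeClassAndObject(output, records)
    output.toBytes.length
  }

  def main(args: Array[String]): Unit = {
    val records = Array.tabulate(1000)(i => new Point(i, i))
    println(s"unregistered, one call per record: ${bytesPerCall(records, register = false)} bytes")
    println(s"registered, one call per record:   ${bytesPerCall(records, register = true)} bytes")
    println(s"unregistered, single object graph: ${bytesSingleGraph(records)} bytes")
  }
}
```

    If the assumption holds, the first figure should be noticeably larger than the other two, and the gap should grow with the length of the fully qualified class name. Registering the class would be the usual way to shrink the per-record cost to a small integer ID.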