[GitHub] spark pull request #19586: [SPARK-22367][CORE] Separate the serialization of...

cloud-fan Sat, 28 Oct 2017 14:55:42 -0700

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19586#discussion_r147565429
  
    --- Diff: core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala ---
    @@ -205,11 +205,45 @@ class KryoSerializationStream(
     
       private[this] var kryo: Kryo = serInstance.borrowKryo()
     
    +  // This is only used when we write object and class separately.
    +  var classWrote = false
    +
       override def writeObject[T: ClassTag](t: T): SerializationStream = {
         kryo.writeClassAndObject(output, t)
    --- End diff --
    
    I expected Kryo to buffer the distinct classes and store only an 
identifier/pointer for repeated classes. If so, the overhead should be small 
even if we write the class and the object every time. Is that not the case?
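
To make the expectation above concrete, here is a minimal sketch that compares writing the full class name before every record with writing it once and then a small integer id. It is only an illustration under stated assumptions: `ClassTagWriter`, `ClassTagDemo`, and the class name `org.example.Record` are hypothetical, and the code uses only the JDK and the Scala standard library. It does not reproduce Kryo's actual class resolver or its reset behavior.

```scala
import java.io.{ByteArrayOutputStream, DataOutputStream}
import scala.collection.mutable

// Hypothetical helper, NOT Kryo's API: writes a class tag before each record.
final class ClassTagWriter(dedupe: Boolean) {
  private val bytes = new ByteArrayOutputStream()
  private val out = new DataOutputStream(bytes)
  // Maps each class name already written to a small integer id.
  private val ids = mutable.HashMap.empty[String, Int]

  def write(className: String, payload: Int): Unit = {
    ids.get(className) match {
      case Some(id) if dedupe =>
        out.writeByte(1)          // marker: known class, id follows
        out.writeInt(id)
      case _ =>
        out.writeByte(0)          // marker: full class name follows
        out.writeUTF(className)
        ids.getOrElseUpdate(className, ids.size)
    }
    out.writeInt(payload)         // stand-in for the serialized object
  }

  def size: Int = { out.flush(); bytes.size() }
}

object ClassTagDemo {
  // Returns (bytes when the name is written every time, bytes with id reuse).
  def sizes(records: Int): (Int, Int) = {
    val name = "org.example.Record" // hypothetical class name
    val naive = new ClassTagWriter(dedupe = false)
    val deduped = new ClassTagWriter(dedupe = true)
    (1 to records).foreach { i =>
      naive.write(name, i)
      deduped.write(name, i)
    }
    (naive.size, deduped.size)
  }

  def main(args: Array[String]): Unit = {
    val (naive, deduped) = sizes(1000)
    println(s"name every time: $naive bytes")
    println(s"name once, then id: $deduped bytes")
  }
}
```

If the serializer keeps such a name-to-id table across records, the per-record class overhead stays at a few bytes. If the table is cleared between top-level writes, every record pays for the full name again, and that is the case the question is asking about.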



