Merge pull request #43 from mateiz/kryo-fix

Don't allocate Kryo buffers unless needed

I noticed that the Kryo serializer could be 2-3x slower than the Java one on small shuffles because it spends a lot of time initializing Kryo Input and Output objects. This is because our default buffer size for them is very large. Since the serializer is often used on streams, I made the initialization lazy for that case, and used a smaller buffer (auto-managed by Kryo) for input.

Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/e67d5b96
Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/e67d5b96
Diff: http://git-wip-us.apache.org/repos/asf/incubator-spark/diff/e67d5b96

Branch: refs/heads/master
Commit: e67d5b962a2adddc073cfc9c99be9012fbb69838
Parents: ea34c52 a8725bf
Author: Reynold Xin <r...@apache.org>
Authored: Tue Oct 8 22:57:38 2013 -0700
Committer: Reynold Xin <r...@apache.org>
Committed: Tue Oct 8 22:57:38 2013 -0700

----------------------------------------------------------------------
 .../scala/org/apache/spark/serializer/KryoSerializer.scala | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
----------------------------------------------------------------------
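
The lazy-initialization idea in the commit message can be illustrated with a minimal sketch. This is not the actual change to KryoSerializer.scala: `BigBuffer`, `EagerInstance`, and `LazyInstance` are hypothetical names, and `BigBuffer` is a plain stand-in for a large buffer-backed object rather than the Kryo API. It only shows how a Scala `lazy val` defers an expensive allocation until first use.

```scala
// Hypothetical stand-in for an expensive, buffer-backed object (NOT the Kryo API).
final class BigBuffer(sizeBytes: Int) {
  BigBuffer.allocations += 1                 // count how many buffers get created
  val bytes = new Array[Byte](sizeBytes)
}

object BigBuffer {
  var allocations = 0
}

// Eager variant: the buffer is allocated on construction, even if never used.
final class EagerInstance(bufferSize: Int) {
  val output = new BigBuffer(bufferSize)
}

// Lazy variant: the buffer is allocated on the first call that touches it.
final class LazyInstance(bufferSize: Int) {
  lazy val output = new BigBuffer(bufferSize)

  // Copies data through the buffer; stands in for a one-shot serialize call.
  def serialize(data: Array[Byte]): Array[Byte] = {
    require(data.length <= bufferSize, "data does not fit in the buffer")
    System.arraycopy(data, 0, output.bytes, 0, data.length)
    output.bytes.take(data.length)
  }
}

object LazyBufferDemo {
  def main(args: Array[String]): Unit = {
    val before = BigBuffer.allocations
    val inst = new LazyInstance(1024 * 1024)
    println(s"allocations after construction: ${BigBuffer.allocations - before}")
    inst.serialize("hi".getBytes("UTF-8"))
    println(s"allocations after first use: ${BigBuffer.allocations - before}")
  }
}
```

In this sketch, code paths that only construct the instance (for example, to open a stream) never pay for the large buffer, which is the effect the commit message describes.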