[
https://issues.apache.org/jira/browse/SPARK-22450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16387630#comment-16387630
]
Richard Wilkinson commented on SPARK-22450:
-------------------------------------------
Just as an FYI, the change to
org.apache.spark.serializer.KryoSerializer#newKryo (from this ticket, I think)
is a performance hit compared to 2.2.1. I am calling
org.apache.spark.serializer.KryoSerializer#newInstance a lot, which is probably
an issue in itself (hence I am not raising a bug report), but I'm not aware of
how often this is called internally by Spark. I do not have the ML jars on my
classpath.
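
For anyone hitting the same slowdown, one possible caller-side workaround is
to stop creating a new instance per call and cache one per thread instead.
The following is only a rough sketch: PerThreadCache is a hypothetical helper
(not part of Spark), and the factory passed to it stands in for a call to
KryoSerializer#newInstance. It assumes the cached object is safe to reuse
within a single thread but not across threads.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not a Spark class): builds an expensive object at most
 * once per thread and hands the same object back on later calls.
 */
final class PerThreadCache<T> {
    // Counts how many times the factory actually ran, for diagnostics.
    private final AtomicInteger creations = new AtomicInteger();
    private final ThreadLocal<T> local;

    PerThreadCache(Supplier<T> factory) {
        // withInitial runs the lambda lazily, the first time each thread calls get().
        this.local = ThreadLocal.withInitial(() -> {
            creations.incrementAndGet();
            return factory.get();
        });
    }

    /** Returns the calling thread's cached object, creating it on first use. */
    T get() {
        return local.get();
    }

    /** Number of objects created so far across all threads. */
    int creationCount() {
        return creations.get();
    }
}
```

In a Spark application the factory would be something like
() -> serializer.newInstance(), where serializer is an existing
KryoSerializer; that wiring is an assumption and is not shown or tested here.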
> Safely register class for mllib
> -------------------------------
>
> Key: SPARK-22450
> URL: https://issues.apache.org/jira/browse/SPARK-22450
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 2.2.0
> Reporter: Xianyang Liu
> Assignee: Xianyang Liu
> Priority: Major
> Fix For: 2.3.0
>
>
> There are still some algorithms based on mllib, such as KMeans. For now,
> many common mllib classes (such as Vector, DenseVector, SparseVector, Matrix,
> DenseMatrix, SparseMatrix) are not registered in Kryo, so there are some
> performance issues when serializing or deserializing those objects.
> Previously discussed: https://github.com/apache/spark/pull/19586