[ https://issues.apache.org/jira/browse/SPARK-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063040#comment-14063040 ]
Xiangrui Meng commented on SPARK-2361:
--------------------------------------
PR that uses broadcast for both training and prediction:
https://github.com/apache/spark/pull/1427
> Decide whether to broadcast or serialize the weights directly in MLlib algorithms
> ---------------------------------------------------------------------------------
>
> Key: SPARK-2361
> URL: https://issues.apache.org/jira/browse/SPARK-2361
> Project: Spark
> Issue Type: Improvement
> Components: MLlib
> Reporter: Xiangrui Meng
>
> In the current implementation, MLlib serializes the weights directly into the
> task closure. This is fine for small feature dimensions, but it is inefficient
> for feature dimensions beyond 1M, especially since the default akka.frameSize
> is 10 MB. We should use broadcast when the serialized task is going to be
> large.
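
To make the size argument in the description concrete, here is a rough back-of-the-envelope sketch in Python. It is not MLlib code and not the logic used in the linked PR. It assumes dense weights stored as 8-byte doubles and that the frame size is counted in units of 1024 * 1024 bytes; the helper names and the `margin` parameter are hypothetical and exist only for illustration.

```python
# Rough estimate of how large a dense weight vector is when it is shipped
# inside every serialized task, compared with the default frame size.
# Assumptions (not from the issue): 8 bytes per weight, frame size counted
# in units of 1024 * 1024 bytes, and a hypothetical safety margin that
# leaves room for the rest of the serialized task.

BYTES_PER_DOUBLE = 8
FRAME_SIZE_MB = 10  # default akka.frameSize quoted in the issue


def dense_weights_bytes(num_features: int) -> int:
    """Approximate payload size of a dense weight vector, ignoring overhead."""
    return num_features * BYTES_PER_DOUBLE


def should_broadcast(num_features: int,
                     frame_size_mb: int = FRAME_SIZE_MB,
                     margin: float = 0.5) -> bool:
    """Hypothetical rule: broadcast once the weights exceed a fraction
    (margin) of the frame size."""
    limit = frame_size_mb * 1024 * 1024 * margin
    return dense_weights_bytes(num_features) > limit


if __name__ == "__main__":
    for d in (10_000, 1_000_000, 2_000_000):
        print(d, dense_weights_bytes(d), should_broadcast(d))
```

Under these assumptions, 1M features already take about 8 MB per task, which is most of a 10 MB frame before any other task state is counted, and 2M features exceed the frame outright. A broadcast variable sends the weights to each executor once rather than with every task, which is why it becomes attractive around this size.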
--
This message was sent by Atlassian JIRA
(v6.2#6252)