[jira] [Commented] (SPARK-10014) ML model broadcasts should be stored in private vars

Joseph K. Bradley (JIRA) Fri, 11 Sep 2015 10:25:24 -0700

    [ https://issues.apache.org/jira/browse/SPARK-10014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741185#comment-14741185 ]


Joseph K. Bradley commented on SPARK-10014:
-------------------------------------------

An optimization like a hashcode sounds cool, but it might be overkill.  Let's 
just rebroadcast on each predict/transform.  If a user really needs to call 
transform many times, they can handle broadcasting themselves; that is 
probably a special use case.  I can look at your PRs again once they are 
updated.  Thanks!

> ML model broadcasts should be stored in private vars
> ----------------------------------------------------
>
>                 Key: SPARK-10014
>                 URL: https://issues.apache.org/jira/browse/SPARK-10014
>             Project: Spark
>          Issue Type: Umbrella
>          Components: ML, MLlib
>            Reporter: Joseph K. Bradley
>            Priority: Minor
>
> In multiple places in MLlib, we broadcast a model before prediction.  Since 
> prediction may be called many times, we should store the broadcast variable 
> in a private var so that we broadcast at most once.
> I'll link subtasks for each problem case I find.
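
To make the trade-off in this thread concrete, here is a minimal sketch of the three options discussed: caching the broadcast in a private variable, rebroadcasting on every predict call, and letting the caller manage the broadcast. It is an illustration only. `CountingContext`, `Broadcast`, `CachedModel`, `RebroadcastModel`, and `predict_with_broadcast` are hypothetical stand-ins written for this example, not Spark or MLlib classes, and the "model" is just a weight vector applied as a dot product.

```python
# Illustration only: these are stand-ins, NOT Spark or MLlib classes.

class Broadcast:
    """Hypothetical stand-in for a broadcast variable: wraps a value."""
    def __init__(self, value):
        self.value = value


class CountingContext:
    """Hypothetical stand-in for a context that counts broadcast calls."""
    def __init__(self):
        self.broadcast_count = 0

    def broadcast(self, value):
        self.broadcast_count += 1
        return Broadcast(value)


def _dot(weights, row):
    # Toy "prediction": dot product of the weights with one input row.
    return sum(w * x for w, x in zip(weights, row))


class CachedModel:
    """Stores the broadcast in a private attribute, so it happens at most once."""
    def __init__(self, weights):
        self.weights = weights
        self._bc = None  # private cache for the broadcast

    def predict(self, ctx, rows):
        if self._bc is None:
            self._bc = ctx.broadcast(self.weights)
        return [_dot(self._bc.value, row) for row in rows]


class RebroadcastModel:
    """Broadcasts again on every predict call (the simpler option)."""
    def __init__(self, weights):
        self.weights = weights

    def predict(self, ctx, rows):
        bc = ctx.broadcast(self.weights)
        return [_dot(bc.value, row) for row in rows]


def predict_with_broadcast(bc, rows):
    """Caller-managed option: the user broadcasts once and passes it in."""
    return [_dot(bc.value, row) for row in rows]
```

In this toy setup, calling `predict` twice on a `CachedModel` increments the counter once, while a `RebroadcastModel` increments it on every call. A user who calls predict many times can instead broadcast the weights once and reuse them through `predict_with_broadcast`.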


