[jira] [Assigned] (SPARK-22707) Optimize Crossvalidator fitting memory occupation by models

Apache Spark (JIRA) Tue, 05 Dec 2017 19:19:33 -0800

     [ https://issues.apache.org/jira/browse/SPARK-22707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Apache Spark reassigned SPARK-22707:
------------------------------------

    Assignee: Apache Spark

> Optimize Crossvalidator fitting memory occupation by models
> -----------------------------------------------------------
>
>                 Key: SPARK-22707
>                 URL: https://issues.apache.org/jira/browse/SPARK-22707
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 2.2.0
>            Reporter: Weichen Xu
>            Assignee: Apache Spark
>
> Testing shows that CrossValidator still has a memory issue: while fitting, 
> it holds `O(n*sizeof(model))` memory for the n models it fits, where a 
> well-optimized implementation would need only `O(parallelism*sizeof(model))`.
> This is because each entry in `modelFutures` keeps a reference to its model 
> object after the future completes (the model can still be fetched with 
> `future.value.get.get`), and both `Future.sequence` and the `modelFutures` 
> array hold references to every model future. All model objects therefore 
> stay referenced until `fit` returns, so memory use remains 
> `O(n*sizeof(model))`.
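
The retention described above can be reproduced outside Spark with plain Scala futures. The following is only a minimal sketch, not the CrossValidator code: `FakeModel`, `fitModel`, `evaluate`, `retainedModelCount` and `metricsOnly` are hypothetical names made up for illustration. The first method mirrors the reported pattern, where completed futures still expose their models. The second shows one possible alternative, assumed here rather than taken from the issue: fit and evaluate inside the same future and return only the metric, so a model becomes unreachable once its future completes.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object ModelFutureSketch {
  // Hypothetical stand-in for a fitted model; the payload mimics its memory footprint.
  final case class FakeModel(paramIndex: Int, payload: Array[Byte])

  // Hypothetical fit and evaluate steps; these are not Spark APIs.
  def fitModel(paramIndex: Int): FakeModel =
    FakeModel(paramIndex, new Array[Byte](1024))
  def evaluate(model: FakeModel): Double = model.paramIndex.toDouble

  // Reported pattern: every future resolves to a model, so the array of
  // completed futures keeps all models reachable via `future.value`.
  // Returns how many models are still reachable through the array.
  def retainedModelCount(numModels: Int)(implicit ec: ExecutionContext): Int = {
    val modelFutures: Array[Future[FakeModel]] =
      (0 until numModels).map(i => Future(fitModel(i))).toArray
    Await.result(Future.sequence(modelFutures.toSeq), Duration.Inf)
    modelFutures.count(_.value.exists(_.isSuccess))
  }

  // Assumed alternative: the future's result is only a Double, so nothing
  // references the model after the future body finishes.
  def metricsOnly(numModels: Int)(implicit ec: ExecutionContext): Seq[Double] = {
    val metricFutures = (0 until numModels).map { i =>
      Future(evaluate(fitModel(i)))
    }
    Await.result(Future.sequence(metricFutures), Duration.Inf)
  }

  def main(args: Array[String]): Unit = {
    val pool = Executors.newFixedThreadPool(2) // parallelism = 2
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      println(s"models still referenced: ${retainedModelCount(8)}")
      println(s"metrics: ${metricsOnly(8)}")
    } finally pool.shutdown()
  }
}
```

With the second shape, at most `parallelism` models should be live at any time, since a model exists only while its future is running; whether that fits the real `fit` code path is not established by this sketch.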



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
