[ 
https://issues.apache.org/jira/browse/SPARK-17090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426876#comment-15426876
 ] 

DB Tsai commented on SPARK-17090:
---------------------------------

Coming up with a formula for the aggregation depth is pretty tricky, since it 
depends on the driver's memory setting, the dimension of the problem, the 
number of partitions, etc. That will take longer to discuss and implement 
properly. Let's get the API done in this PR and set the default value to 2. 
In a follow-up PR, we can work on the formula part. 
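
To illustrate why the depth interacts with the number of partitions and the 
problem dimension, here is a rough sketch in plain Python (no Spark needed). 
It is only loosely modeled on how treeAggregate shrinks the partition count 
level by level and is not Spark's actual implementation. The function names 
driver_fan_in and driver_bytes_upper_bound are hypothetical, and the memory 
figure assumes one dense vector of 8-byte doubles per partial aggregate, all 
held on the driver at once.

```python
import math


def driver_fan_in(num_partitions: int, depth: int) -> int:
    """Estimate how many partial aggregates are sent to the driver.

    Assumption: each intermediate level merges partitions by a factor of
    roughly num_partitions ** (1 / depth), and merging stops once another
    level would no longer pay off.
    """
    if num_partitions < 1 or depth < 1:
        raise ValueError("num_partitions and depth must be >= 1")
    # Per-level reduction factor, never smaller than 2.
    scale = max(math.ceil(num_partitions ** (1.0 / depth)), 2)
    remaining = num_partitions
    # Keep adding levels while doing so still reduces the driver's load.
    while remaining > scale + math.ceil(remaining / scale):
        remaining //= scale
    return remaining


def driver_bytes_upper_bound(num_partitions: int, depth: int,
                             num_features: int) -> int:
    """Rough upper bound on driver memory used by incoming gradients.

    Assumption: every partial aggregate is a dense array of 8-byte doubles
    and all of them are resident on the driver at the same time.
    """
    return driver_fan_in(num_partitions, depth) * num_features * 8


if __name__ == "__main__":
    # Hypothetical workload: 1024 partitions, 10 million features.
    for depth in (1, 2, 3):
        fan_in = driver_fan_in(1024, depth)
        size = driver_bytes_upper_bound(1024, depth, 10_000_000)
        print(f"depth={depth}: fan-in={fan_in}, ~{size / 1e9:.1f} GB")
```

Under these assumptions a larger depth means fewer partial results hit the 
driver, at the cost of extra shuffle stages on the executors. That trade-off 
is what a real formula would have to balance.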

> Make tree aggregation level in linear/logistic regression configurable
> ----------------------------------------------------------------------
>
>                 Key: SPARK-17090
>                 URL: https://issues.apache.org/jira/browse/SPARK-17090
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>            Reporter: Seth Hendrickson
>            Priority: Minor
>
> Linear/logistic regression use treeAggregate with default aggregation depth 
> for collecting coefficient gradient updates to the driver. For high 
> dimensional problems, this can cause OOM errors on the driver. We should make 
> it configurable, perhaps via an expert param, so that users can avoid this 
> problem if their data has many features.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
