GitHub user hqzizania opened a pull request:

    https://github.com/apache/spark/pull/14717

    [SPARK-17090][ML] Make tree aggregation level in linear/logistic regression configurable

    ## What changes were proposed in this pull request?
    
    Linear and logistic regression use `treeAggregate` with the default depth
    (always 2) to collect coefficient gradient updates on the driver. For
    high-dimensional problems, this can cause an out-of-memory (OOM) error on
    the driver. This patch makes the depth configurable so that users whose
    input data has many features can avoid the problem. It adds a
    `HasTreeDepth` API in `sharedParams.scala` and extends it to both linear
    regression and logistic regression in `spark.ml`.
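
To illustrate why the depth matters, here is a minimal sketch in plain Python (no Spark required) that estimates how many partial aggregates the driver has to merge for a given depth. The level-selection rule is modeled on how `RDD.treeAggregate` appears to pick its intermediate levels; treat that rule as an assumption rather than a statement of Spark's exact behavior. The function names `tree_aggregate_levels` and `driver_fan_in` are hypothetical and not part of this patch.

```python
import math


def tree_aggregate_levels(num_partitions, depth):
    """Return the partition count at each level of a simulated tree aggregation.

    The first entry is the input partition count; the last entry is the
    number of partial aggregates left for the driver to merge.
    Assumption: the rule below mirrors RDD.treeAggregate's level selection.
    """
    if num_partitions < 1 or depth < 1:
        raise ValueError("num_partitions and depth must be at least 1")
    # Fan-in per level, chosen so roughly `depth` levels cover all partitions.
    scale = max(math.ceil(num_partitions ** (1.0 / depth)), 2)
    levels = [num_partitions]
    # Add another level only while it still reduces the remaining work.
    while num_partitions > scale + math.ceil(num_partitions / scale):
        num_partitions //= scale
        levels.append(num_partitions)
    return levels


def driver_fan_in(num_partitions, depth):
    """Number of partial aggregates the driver merges in this simulation."""
    return tree_aggregate_levels(num_partitions, depth)[-1]


if __name__ == "__main__":
    for depth in (1, 2, 3):
        print(depth, tree_aggregate_levels(1000, depth))
```

Under these assumptions, 1000 partitions leave 31 partial aggregates for the driver at depth 2 and 10 at depth 3. Each partial is a full gradient vector, so with many features a smaller final fan-in lowers the driver's peak memory at the cost of extra intermediate stages.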
    
    ## How was this patch tested?
    
    Existing unit tests.
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hqzizania/spark SPARK-17090

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14717.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14717
    
----
commit 396ba1d07f894ca2b7dbf42868037bb4d70db5ba
Author: hqzizania <[email protected]>
Date:   2016-08-19T13:10:38Z

    Make tree aggregation level in linear/logistic regression configurable

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

