GitHub user mengxr opened a pull request:

    https://github.com/apache/spark/pull/3374

    [SPARK-4486][MLLIB] Improve GradientBoosting APIs and doc

    There are some inconsistencies in the gradient boosting APIs. The target is
    a general boosting meta-algorithm, but the implementation is attached to trees.
    This was partially due to the delay of SPARK-1856. But for the 1.2 release, we
    should make the APIs consistent.
    
    1. WeightedEnsembleModel -> private[tree] TreeEnsembleModel
    1. GradientBoosting -> GradientBoostedTrees
    1. Add RandomForestModel and GradientBoostedTreesModel and hide
       CombiningStrategy
    1. Slightly refactored TreeEnsembleModel
    1. Remove `trainClassifier` and `trainRegressor` from
       `GradientBoostedTrees` because they are the same as `train`
    1. Rename class `train` method to `run` because it hides the static methods
       with the same name in Java. Deprecate the `DecisionTree.train` class method.
    1. Simplify BoostingStrategy and make sure the input strategy is not
       modified. Users should put algo and numClasses in treeStrategy. We create
       ensembleStrategy inside boosting.
    1. Fix a bug in GradientBoostedTreesSuite with AbsoluteError
    1. doc updates
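
To illustrate the `train` -> `run` rename and the "input strategy is not modified" points in the list above, here is a minimal, dependency-free Scala sketch. Everything in it is a hypothetical stand-in: the case classes, their fields, the string-valued `algo`, and the toy constant-valued base learner are assumptions made for illustration and are not the MLlib classes or signatures changed by this PR.

```scala
// Hypothetical, simplified stand-ins; these are NOT the real MLlib classes.
final case class TreeStrategy(algo: String, numClasses: Int)

final case class BoostingStrategy(
    treeStrategy: TreeStrategy,
    numIterations: Int,
    learningRate: Double)

// The instance method is named `run`, so it does not share a name with the
// static-style `train` defined on the companion object below.
class GradientBoostedTrees(boostingStrategy: BoostingStrategy) {

  // Toy boosting loop: every "tree" is just the mean of the current residuals.
  def run(labels: Seq[Double]): Double = {
    require(labels.nonEmpty, "labels must be non-empty")
    // Build the strategy for the base learners as a modified copy, so the
    // caller's BoostingStrategy is left untouched.
    val ensembleStrategy = boostingStrategy.treeStrategy.copy(algo = "Regression")
    (1 to boostingStrategy.numIterations).foldLeft(0.0) { (prediction, _) =>
      val residuals = labels.map(_ - prediction)
      prediction + boostingStrategy.learningRate *
        GradientBoostedTrees.fitConstant(ensembleStrategy, residuals)
    }
  }
}

object GradientBoostedTrees {
  // Static-style entry point that simply delegates to the instance method.
  def train(labels: Seq[Double], boostingStrategy: BoostingStrategy): Double =
    new GradientBoostedTrees(boostingStrategy).run(labels)

  // Toy base learner (an assumption of this sketch): predicts the residual mean.
  private def fitConstant(strategy: TreeStrategy, residuals: Seq[Double]): Double = {
    require(strategy.algo == "Regression", "toy base learners are fit as regressors")
    residuals.sum / residuals.size
  }
}
```

In this sketch, `GradientBoostedTrees.train(labels, strategy)` and `new GradientBoostedTrees(strategy).run(labels)` return the same value, and `strategy.treeStrategy.algo` keeps whatever the caller set, because the regression variant lives only in the copied `ensembleStrategy`.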
    
    @manishamde @jkbradley

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mengxr/spark SPARK-4486

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3374.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3374
    
----
commit 19030a5edf8acc90010d2430fcf5c46d4389d86a
Author: Xiangrui Meng <[email protected]>
Date:   2014-11-19T21:17:01Z

    update boosting public APIs

commit 751da4e16a1fea86398abdb37ecb33b2b8f723a8
Author: Xiangrui Meng <[email protected]>
Date:   2014-11-19T22:09:11Z

    rename class method train -> run

commit ea4c467474ff488d4f4367edeb008cf2c042fc64
Author: Xiangrui Meng <[email protected]>
Date:   2014-11-19T22:25:51Z

    fix unit tests

commit 4aae3b761c5e98d19d6a5bf6b8a425f4bb4d2ebc
Author: Xiangrui Meng <[email protected]>
Date:   2014-11-19T23:25:16Z

    add RandomForestModel and GradientBoostedTreesModel, hide CombiningStrategy

----


