GitHub user MechCoder opened a pull request:
https://github.com/apache/spark/pull/4677
[SPARK-5436] [MLlib] Validate GradientBoostedTrees during train
Training can stop early if the decrease in error rate is less than a certain
tolerance (tol), or if the error increases because the model is overfitting
the training data.
This introduces a new method that takes a pair of RDDs, one for the
training data and the other for validation.
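
For reviewers who want the stopping rule spelled out, here is a minimal sketch in plain Python. It is not the code in this pull request and does not use the Spark API: the function name `early_stopping_iteration`, the `validation_errors` list, and the default `tol` value are all hypothetical, and it assumes the validation error is measured once after each boosting iteration.

```python
def early_stopping_iteration(validation_errors, tol=1e-5):
    """Return how many boosting iterations to keep (hypothetical helper).

    validation_errors[i] is assumed to be the validation error measured
    after i + 1 trees have been added to the ensemble.
    """
    if not validation_errors:
        return 0

    previous_error = validation_errors[0]
    for i in range(1, len(validation_errors)):
        improvement = previous_error - validation_errors[i]
        # A negative improvement means the error went up (overfitting);
        # a small positive one means the decrease fell below tol.
        # Both cases stop training and keep only the first i trees.
        if improvement < tol:
            return i
        previous_error = validation_errors[i]

    # The error kept decreasing by at least tol, so keep every iteration.
    return len(validation_errors)
```

A single comparison covers both conditions described above, since an increase in error is simply a decrease that is less than zero.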
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/MechCoder/spark spark-5436
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/4677.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4677
----
commit 07c8f12bc72b11ae780095a73662b5e049dc6e22
Author: MechCoder <[email protected]>
Date: 2015-02-18T21:23:33Z
[SPARK-5436] Validate GradientBoostedTrees during train
----