GitHub user smurching reopened a pull request:
https://github.com/apache/spark/pull/14872
[SPARK-3162][MLlib][WIP] Add local tree training for decision tree regressors
## What changes were proposed in this pull request?
Adds local (single-machine) training of decision tree regressors, based on
[Yggdrasil](https://github.com/fabuzaid21/yggdrasil).
Several of the new classes and objects correspond closely to Yggdrasil
classes and objects. Specifically:
* class LocalDecisionTreeRegressor --> class YggdrasilRegressor
* object LocalDecisionTree --> object YggdrasilRegression
* object LocalDecisionTreeUtils --> object Yggdrasil
## How was this patch tested?
Added unit tests in ml/tree/impl/LocalTreeTrainingSuite.scala verifying
that local and distributed training of a decision tree regressor produce the
same tree.
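
As a rough illustration of what "the same tree" could mean in such a check, here is a minimal sketch of a structural comparison. The `Node`, `Leaf`, and `Internal` types and the `sameTree` helper are hypothetical stand-ins written for this example, not Spark's tree classes and not the code in this PR. It assumes continuous splits only and compares leaf predictions and impurities within a small tolerance.

```scala
// Hypothetical, simplified tree model for illustration only;
// these are NOT Spark's node classes or the helpers used in this PR.
object TreeEquality {
  sealed trait Node
  final case class Leaf(prediction: Double, impurity: Double) extends Node
  final case class Internal(
      featureIndex: Int,   // feature used by the split
      threshold: Double,   // continuous split threshold
      left: Node,
      right: Node) extends Node

  // Approximate equality for floating-point values.
  private def close(a: Double, b: Double, eps: Double): Boolean =
    math.abs(a - b) <= eps

  /** True when both trees have the same shape, the same splits,
    * and approximately equal leaf predictions and impurities. */
  def sameTree(a: Node, b: Node, eps: Double = 1e-9): Boolean = (a, b) match {
    case (Leaf(p1, i1), Leaf(p2, i2)) =>
      close(p1, p2, eps) && close(i1, i2, eps)
    case (Internal(f1, t1, l1, r1), Internal(f2, t2, l2, r2)) =>
      f1 == f2 && close(t1, t2, eps) &&
        sameTree(l1, l2, eps) && sameTree(r1, r2, eps)
    case _ => false // a leaf never matches an internal node
  }
}
```

In a test, the tree from local training and the tree from distributed training would each be converted to this shape and passed to `sameTree`. Comparing impurities as well as predictions is a choice made here so that mismatched leaf impurity values would also be caught.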
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/smurching/spark local-trees-pr
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/14872.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #14872
----
commit acf5b3e29a346a0cb86f621269855a6a98a9a74e
Author: Siddharth Murching <[email protected]>
Date: 2016-08-29T23:51:33Z
Add local tree training for decision tree regressors
commit aa4fcc8d401385f38fe0cdfdb9fe39062c3a9f96
Author: Siddharth Murching <[email protected]>
Date: 2016-08-30T01:19:07Z
Fix setting of impurity values for leaf nodes to match values produced by
distributed Random Forest algorithm
commit f273fc6a4b5048ae577d03676def354dce5c87a7
Author: Siddharth Murching <[email protected]>
Date: 2016-08-31T07:01:26Z
WIP refactoring single-machine tree code
commit 5e61e3b29c236d27e0d655d15a48f2fe3e13d26a
Author: Siddharth Murching <[email protected]>
Date: 2016-09-01T01:22:48Z
Remove unused imports, remove array of single-node impurity aggregators
commit d2060fc460a97228a36bf81956cf8dd24c83106e
Author: Siddharth Murching <[email protected]>
Date: 2016-09-01T23:11:17Z
WIP
commit 634a3223374608d68018daac5500a429034bbc20
Author: Siddharth Murching <[email protected]>
Date: 2016-09-02T00:21:10Z
More work, tests still pass
commit eb7fde00e0db5aa5d04951f8f4a9cd62204f1609
Author: Siddharth Murching <[email protected]>
Date: 2016-09-02T17:02:06Z
WIP: Added tests for classes upon which local tree training is dependent.
Some integration tests fail
commit b748f05e3eaa7d58b1ad86d269e0dda5f35ee885
Author: Siddharth Murching <[email protected]>
Date: 2016-09-02T17:37:31Z
WIP debugging
commit 297052242727e6693ccbacf89f44b3ff6db584f7
Author: Siddharth Murching <[email protected]>
Date: 2016-09-02T21:34:13Z
Consolidate checking for valid splits
commit 8d443ce38f958e7b83b502e614e01c824cb63c4b
Author: Siddharth Murching <[email protected]>
Date: 2016-09-02T21:52:47Z
Delete empty test suite
commit ee56ffe98756ed78cefbc3f782a471f04e80b256
Author: Siddharth Murching <[email protected]>
Date: 2016-09-02T22:49:38Z
Fix some style errors
----