GitHub user smurching opened a pull request:
https://github.com/apache/spark/pull/19758
[SPARK-3162][MLlib] Local Tree Training Pt 1: Refactor RandomForest.scala
into utility classes
## What changes were proposed in this pull request?
This PR breaks up #19433 to help unblock #19666; once this PR is merged,
#19666 can be merged.
This PR migrates functionality from RandomForest.scala into the following
utility classes:
* AggUpdateUtils
* ImpurityUtils
* SplitUtils
The PR also adds tests for split selection logic in TreeSplitUtilsSuite.
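For readers unfamiliar with what "split selection logic" covers, the following is a minimal, self-contained sketch of impurity-based split selection for one continuous feature. It is not the code from this PR: the object name `SplitSelectionSketch` and the methods `gini`, `gain`, and `bestSplit` are hypothetical, and the sketch assumes Gini impurity over per-bin class counts, with candidate split k sending bins 0..k to the left child.

```scala
// Illustrative sketch only; all names here are hypothetical and do not
// correspond to the utility classes added by this PR.
object SplitSelectionSketch {

  /** Gini impurity of a node, given its per-class label counts. */
  def gini(counts: Array[Double]): Double = {
    val total = counts.sum
    if (total == 0.0) 0.0
    else 1.0 - counts.map { c => val p = c / total; p * p }.sum
  }

  /** Impurity gain of splitting a parent node into the given left/right children. */
  def gain(left: Array[Double], right: Array[Double]): Double = {
    val parent = left.zip(right).map { case (l, r) => l + r }
    val total = parent.sum
    if (total == 0.0) 0.0
    else {
      val leftWeight = left.sum / total
      val rightWeight = right.sum / total
      gini(parent) - leftWeight * gini(left) - rightWeight * gini(right)
    }
  }

  /**
   * binCounts(i) holds the class counts for feature bin i.
   * Candidate split k sends bins 0..k left and the remaining bins right.
   * Returns (index of the best candidate split, its impurity gain).
   */
  def bestSplit(binCounts: Array[Array[Double]]): (Int, Double) = {
    require(binCounts.length >= 2, "need at least two bins to split")
    val numClasses = binCounts.head.length
    val total = Array.tabulate(numClasses)(c => binCounts.map(_(c)).sum)
    val left = Array.fill(numClasses)(0.0)
    var best = (-1, Double.NegativeInfinity)
    for (k <- 0 until binCounts.length - 1) {
      // Accumulate bin k into the left child; the right child is the remainder.
      for (c <- 0 until numClasses) left(c) += binCounts(k)(c)
      val right = Array.tabulate(numClasses)(c => total(c) - left(c))
      val g = gain(left, right)
      if (g > best._2) best = (k, g)
    }
    best
  }
}
```

As a usage example, `SplitSelectionSketch.bestSplit(Array(Array(4.0, 0.0), Array(1.0, 0.0), Array(0.0, 5.0)))` selects candidate split 1, which separates the two classes completely.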
A follow-up PR will include the other changes from #19433:
* Local decision tree data structures & tests
* Local tree training logic & tests
## How was this patch tested?
Adds unit tests for split selection logic in TreeSplitUtilsSuite.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/smurching/spark refactor-random-forest
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19758.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19758
----
commit f2e3fbd40eea2919d249710eae5b5789d97543b7
Author: Sid Murching <[email protected]>
Date: 2017-11-15T17:52:01Z
Local tree training part 1 (refactor RandomForest.scala into utility
classes)
commit a2357c95672e94a148051d00e26b89245eb8e204
Author: Sid Murching <[email protected]>
Date: 2017-11-15T17:57:55Z
WIP adding TreeSplitUtilsSuite
commit 320c32ee8d0ac9bde457b0286d064470648c73af
Author: Sid Murching <[email protected]>
Date: 2017-11-15T19:37:56Z
WIP
commit b93f9f3da9cca0887c0264162f5b032f14fa87d7
Author: Sid Murching <[email protected]>
Date: 2017-11-15T19:57:25Z
Add TreeSplitUtilsSuite, refactor it to not depend on any local tree
training code
----