GitHub user facaiy opened a pull request:
https://github.com/apache/spark/pull/17503
[SPARK-3159][MLlib] Check for reducible DecisionTree
Add a canMergeChildren param: find pairs of leaves that share a parent
and output the same prediction, and merge them.
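
The merge described above can be illustrated with a minimal, language-neutral sketch. This is not the code in this pull request and it does not use Spark's classes: `Node` and `merge_children` are hypothetical names, the tree is assumed to be a full binary tree (every internal node has two children), and merging bottom-up so that collapses can cascade is an assumption as well.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical binary tree node (not Spark's Node class)."""
    prediction: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def merge_children(node: Node) -> Node:
    """Return a copy of the tree in which sibling leaves that output
    the same prediction are collapsed into their parent."""
    if node.is_leaf:
        return node
    # Process children first so that a merge lower down can enable
    # another merge at this level.
    left = merge_children(node.left)
    right = merge_children(node.right)
    if left.is_leaf and right.is_leaf and left.prediction == right.prediction:
        # The split no longer changes the output, so replace it with a leaf.
        return Node(prediction=left.prediction)
    return Node(prediction=node.prediction, left=left, right=right)


# The left subtree has two leaves that both predict 1.0, so it collapses;
# the root keeps its split because its children still disagree.
tree = Node(0.0, Node(1.0, Node(1.0), Node(1.0)), Node(0.0))
pruned = merge_children(tree)
```

The pruned tree gives the same prediction as the original for every input, since only splits whose two outcomes agree are removed.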
## How was this patch tested?
1. [x] add unit test: verify whether implementation is correct.
2. [ ] add unit test: verify whether setCanMergeChildren works.
3. [ ] perhaps we need to create a sample that produces a reducible tree,
and test it.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/facaiy/spark CLN/check_for_reducible_decision_tree
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/17503.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #17503
----
commit fab2a0e5a3c4db8beeaa78d98253d11e408f3b56
Author: 颜发才（Yan Facai） <[email protected]>
Date: 2017-03-31T02:33:38Z
TST: create new test suite
commit f5d52cce500290165ac7d8bad5aa38041ed21c54
Author: 颜发才（Yan Facai） <[email protected]>
Date: 2017-03-31T02:35:26Z
TST: helper method for constructing binary tree
commit b9248b7ae2e1048d93e44e2d3687c2f9fd286ce8
Author: 颜发才（Yan Facai） <[email protected]>
Date: 2017-03-31T07:09:51Z
TST: helper method, show tree node info
commit be12f4f23a5fd53870bc97b6cbdb8fa0a094f2c1
Author: 颜发才（Yan Facai） <[email protected]>
Date: 2017-03-31T07:21:41Z
TST: helper method, check if pairs of leaves with same prediction exist
commit b52420201576610613c782146dc9d6c2dc6ebb0c
Author: 颜发才（Yan Facai） <[email protected]>
Date: 2017-03-31T07:28:28Z
TST: helper method for modifying nodes
commit 98a73f952d1a199cf581cde2636d6dc831ae4ee3
Author: 颜发才（Yan Facai） <[email protected]>
Date: 2017-03-31T07:41:46Z
ENH: merge the pairs of leaves with same prediction of same parent
commit 632325d0e0d45d7fe9325686f90dbdc64b149960
Author: 颜发才（Yan Facai） <[email protected]>
Date: 2017-04-01T01:10:07Z
ENH: add mergeLeave param in Strategy
commit 12052958d30d015be537fbd1169da4406869fb3d
Author: 颜发才（Yan Facai） <[email protected]>
Date: 2017-04-01T01:18:50Z
ENH: support mergeChild when training
commit 434c762de76be2f1b4ec939ccba9c2ecb45c1c04
Author: 颜发才（Yan Facai） <[email protected]>
Date: 2017-04-01T02:48:37Z
ENH: add canMergeChildren param in DecisionTreeParams
commit 5162552a8db92283a514e94adeef439f6fb8f80e
Author: 颜发才（Yan Facai） <[email protected]>
Date: 2017-04-01T02:54:57Z
ENH: add set method in tree classifier
commit 21b1a851c89cd0a060720503bdc4a9441155236b
Author: 颜发才（Yan Facai） <[email protected]>
Date: 2017-04-01T04:19:07Z
ENH: stat: merge counts of each tree
commit 25b712a37bbc31cc9b8ff2b6330d79fd437cb17c
Author: 颜发才（Yan Facai） <[email protected]>
Date: 2017-04-01T06:16:01Z
BUG: depth=0 tree has no children
----