[GitHub] spark issue #20632: [SPARK-3159][ML] Add decision tree pruning

2018-03-03 Thread asolimando
Github user asolimando commented on the issue: https://github.com/apache/spark/pull/20632 Thanks to you @srowen and @sethah for all your feedback, which significantly improved the PR!

[GitHub] spark pull request #20632: [SPARK-3159][ML] Add decision tree pruning

2018-03-02 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r171983753 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -703,4 +707,16 @@ private object RandomForestSuite

[GitHub] spark issue #20632: [SPARK-3159][ML] Add decision tree pruning

2018-03-02 Thread asolimando
Github user asolimando commented on the issue: https://github.com/apache/spark/pull/20632 Thanks for the suggestion @sethah, I have updated the PR with the extra check (both tests).

[GitHub] spark issue #20632: [SPARK-3159][ML] Add decision tree pruning

2018-02-28 Thread asolimando
Github user asolimando commented on the issue: https://github.com/apache/spark/pull/20632 @sethah I have shortened the name as suggested and squashed the commits into a single one; let me know if that's ok

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-27 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r171045897 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -631,10 +634,99 @@ class RandomForestSuite extends

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-26 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r170734558 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -631,6 +651,160 @@ class RandomForestSuite extends

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-26 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r170733980 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala --- @@ -541,7 +541,7 @@ object DecisionTreeSuite extends

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-22 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r170140738 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -362,10 +365,10 @@ class RandomForestSuite extends

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-22 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r170140280 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala --- @@ -359,29 +339,6 @@ class DecisionTreeSuite extends

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-22 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r170139957 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -402,20 +405,40 @@ class RandomForestSuite extends

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-21 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r169742383 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala --- @@ -287,6 +291,34 @@ private[tree] class LearningNode

[GitHub] spark issue #20632: [SPARK-3159] added subtree pruning in the translation fr...

2018-02-20 Thread asolimando
Github user asolimando commented on the issue: https://github.com/apache/spark/pull/20632 Given that we are converging, I have squashed the commits into a single one. My local `mvn scalastyle:check` was passing (as well as the check done via the Scala plugin for IntelliJ

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-19 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r169172406 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -402,20 +406,40 @@ class RandomForestSuite extends

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-19 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r169172053 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala --- @@ -287,6 +291,34 @@ private[tree] class LearningNode

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-18 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r168961174 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -640,4 +740,55 @@ private object RandomForestSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-18 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r168961183 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -640,4 +740,55 @@ private object RandomForestSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-18 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r168961088 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala --- @@ -303,26 +303,6 @@ class DecisionTreeSuite extends

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-18 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r168946503 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -640,4 +689,96 @@ private object RandomForestSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-18 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r168946478 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala --- @@ -287,6 +292,41 @@ private[tree] class LearningNode

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-18 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r168946358 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala --- @@ -287,6 +292,41 @@ private[tree] class LearningNode

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-17 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r168925267 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -640,4 +689,96 @@ private object RandomForestSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-17 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r168925279 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -640,4 +689,96 @@ private object RandomForestSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-17 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r168925264 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -640,4 +689,96 @@ private object RandomForestSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-17 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r168925247 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala --- @@ -287,6 +292,41 @@ private[tree] class LearningNode

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-17 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r168925240 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -631,6 +654,32 @@ class RandomForestSuite extends

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-17 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r168925232 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala --- @@ -287,6 +292,41 @@ private[tree] class LearningNode

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-17 Thread asolimando
Github user asolimando commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r168925224 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala --- @@ -287,6 +292,41 @@ private[tree] class LearningNode

[GitHub] spark issue #20632: [SPARK-3159] added subtree pruning in the translation fr...

2018-02-17 Thread asolimando
Github user asolimando commented on the issue: https://github.com/apache/spark/pull/20632 Hello Sean, here is my understanding of the problem and the main intuition of the proposed solution: We want to have a tree such that it does not contain any redundant subtree

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-16 Thread asolimando
GitHub user asolimando opened a pull request: https://github.com/apache/spark/pull/20632 [SPARK-3159] added subtree pruning in the translation from LearningNode to Node, added unit tests for tree redundancy and adapted existing ones that were affected ## What changes were proposed