Github user sethah commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-210002149
@holdenk Thanks for the feedback. Upon some further thought, I think that
a.) We need to compute the statistics needed for both `minInstancesPerNode` and
Github user holdenk commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-209136374
@sethah So to avoid adding any overhead from computing stats for both these
params one option would be to selectively compute only the stats that are
required (e.g. if
Github user sethah commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-198428763
cc @MLnick thoughts on the above comments?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your
Github user sethah commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-193978528
@rotationsymmetry: Will you have time to work on this? I am more than happy
to send a PR to your PR if you do not have time.
@jkbradley @dbtsai Would you mind
Github user sethah commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-187920267
Another issue is that the information gain for candidate splits is not
computed correctly with fractional samples. This is because the information
gain calculation
Github user sethah commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-187524117
I noticed a problem with the current implementation regarding the
`minInstancesPerNode` parameter. The number of _instances_ in each node is now
a weighted count where
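
A minimal sketch of the concern above, not code from this PR: `WeightedInstance`, `rawCount` and `weightedCount` are hypothetical names, and the example assumes a node's weighted count is simply the sum of its instance weights. It shows how a threshold such as `minInstancesPerNode` can give a different answer depending on which count is used.

```scala
// Hypothetical illustration only; none of these names come from the PR.
final case class WeightedInstance(label: Double, weight: Double)

object WeightedCountSketch {
  /** Number of rows in the node, ignoring weights. */
  def rawCount(node: Seq[WeightedInstance]): Long = node.size.toLong

  /** Sum of the instance weights in the node (assumed definition). */
  def weightedCount(node: Seq[WeightedInstance]): Double = node.map(_.weight).sum

  def main(args: Array[String]): Unit = {
    // Three rows, each carrying a fractional weight of 0.25.
    val node = Seq.fill(3)(WeightedInstance(label = 1.0, weight = 0.25))
    val minInstancesPerNode = 2

    // The same node is compared against the threshold using both counts.
    println(s"raw count passes:      ${rawCount(node) >= minInstancesPerNode}")
    println(s"weighted count passes: ${weightedCount(node) >= minInstancesPerNode}")
  }
}
```

With three rows of weight 0.25, the raw count is 3 and the weighted count is 0.75, so the node satisfies a threshold of 2 under the raw count but not under the weighted one.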
Github user rotationsymmetry commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-185364988
@sethah Thank you very much for your review. I will incorporate the changes
in the next few days. Regarding the TODO in BaggedPoint.scala, I want to look
into
Github user sethah commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-184920013
@rotationsymmetry I made a pass on this, mostly minor comments. Thanks for
working on this, it would be great to get it merged in!
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/9008#discussion_r53097743
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala ---
@@ -1171,4 +1173,28 @@ private[ml] object RandomForest extends Logging {
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/9008#discussion_r53093008
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/regression/DecisionTreeRegressorSuite.scala
---
@@ -73,6 +76,56 @@ class DecisionTreeRegressorSuite
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/9008#discussion_r53092853
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/regression/RandomForestRegressorSuite.scala
---
@@ -101,6 +104,43 @@ class RandomForestRegressorSuite
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/9008#discussion_r53092821
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/DecisionTreeClassifierSuite.scala
---
@@ -275,6 +278,63 @@ class
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/9008#discussion_r53092866
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/regression/DecisionTreeRegressorSuite.scala
---
@@ -73,6 +76,56 @@ class DecisionTreeRegressorSuite
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/9008#discussion_r53091878
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/RandomForestClassifierSuite.scala
---
@@ -182,6 +184,53 @@ class
Github user fabboe commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-184901410
Thanks for working on this!
Minor: the PR title says `class weights`, but what is actually implemented is
`sample weights`.
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/9008#discussion_r5309
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/classification/DecisionTreeClassifierSuite.scala
---
@@ -275,6 +278,63 @@ class
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/9008#discussion_r53089894
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala ---
@@ -1171,4 +1173,28 @@ private[ml] object RandomForest extends Logging {
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/9008#discussion_r53089674
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/DecisionTreeRegressor.scala
---
@@ -17,18 +17,20 @@
package
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/9008#discussion_r53089711
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/RandomForestRegressor.scala
---
@@ -41,7 +41,7 @@ import org.apache.spark.sql.functions._
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/9008#discussion_r53089695
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/DecisionTreeRegressor.scala
---
@@ -40,7 +42,7 @@ import org.apache.spark.sql.DataFrame
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/9008#discussion_r53089599
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/RandomForestClassifier.scala
---
@@ -41,7 +41,7 @@ import org.apache.spark.sql.functions._
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-154858995
**[Test build #45315 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45315/consoleFull)**
for PR 9008 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-154858824
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-154858834
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-154875214
Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-154875163
**[Test build #45315 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45315/consoleFull)**
for PR 9008 at commit
Github user rotationsymmetry commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-154606846
retest this please
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-154381876
[Test build #45206 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45206/console)
for PR 9008 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-154381940
Build finished. Test FAILed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-154381942
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147461393
Merged build triggered.
Github user rotationsymmetry commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147461337
Jenkins failed tests unrelated to this patch. Let's try again.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147461420
Merged build started.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147463491
[Test build #43572 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43572/consoleFull)
for PR 9008 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147493385
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147493291
[Test build #43572 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43572/console)
for PR 9008 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147493384
Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147530643
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147530617
Merged build triggered.
Github user rotationsymmetry commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147574091
@sethah I have incorporated your comments in the latest patch. Thank you!
@jkbradley Do you have any comments or suggestions? Much appreciated.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147531959
[Test build #43588 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43588/consoleFull)
for PR 9008 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147553172
[Test build #43588 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43588/console)
for PR 9008 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147553264
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147553262
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147272653
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147272648
Merged build triggered.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147273096
[Test build #43553 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43553/consoleFull)
for PR 9008 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147282536
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147282474
[Test build #43553 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43553/console)
for PR 9008 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147282535
Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147102028
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147102033
Merged build started.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147102431
[Test build #43531 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43531/consoleFull)
for PR 9008 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147122387
**[Test build #43531 timed
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43531/console)**
for PR 9008 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147122395
Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147122396
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147021256
Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147021257
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147021241
[Test build #43503 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43503/console)
for PR 9008 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147013614
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147013601
Merged build triggered.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-147014205
[Test build #43503 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43503/consoleFull)
for PR 9008 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-146775989
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-146775984
Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-146775911
[Test build #43460 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43460/console)
for PR 9008 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-146773059
Merged build started.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-146773792
[Test build #43460 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43460/consoleFull)
for PR 9008 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-146773042
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-146995069
Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-146995030
**[Test build #43482 timed
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43482/console)**
for PR 9008 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-146995071
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/9008#discussion_r41648955
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala ---
@@ -1211,4 +1213,28 @@ private[ml] object RandomForest extends Logging {
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-146943400
[Test build #43482 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43482/consoleFull)
for PR 9008 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-146942918
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-146942850
Merged build triggered.
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/9008#discussion_r41573997
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala ---
@@ -1211,4 +1212,34 @@ private[ml] object RandomForest extends Logging {
Github user rotationsymmetry commented on a diff in the pull request:
https://github.com/apache/spark/pull/9008#discussion_r41591990
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala ---
@@ -1211,4 +1212,34 @@ private[ml] object RandomForest extends
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/9008#discussion_r41597118
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala ---
@@ -87,8 +86,10 @@ private[ml] object RandomForest extends Logging {
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-146098411
[Test build #43318 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43318/console)
for PR 9008 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-146098500
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-146098501
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
GitHub user rotationsymmetry opened a pull request:
https://github.com/apache/spark/pull/9008
[SPARK-9478] [ml] Add class weights to Random Forest
This PR adds weight support to
DecisionTreeClassifier
DecisionTreeRegressor
RandomForestClassifier
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-146073233
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-146073241
Merged build started.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9008#issuecomment-146073764
[Test build #43318 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43318/consoleFull)
for PR 9008 at commit