[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-26 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15214 Merged to master --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-25 Thread mpjlu
Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/15214 Hi @srowen, sorry for forgetting to update the doc and python/ml/feature.py in the last PR. This PR has added ml/feature.py. It looks good to me. Thanks.

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-25 Thread mpjlu
Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/15214 Thanks, this looks good to me.

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-25 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/15214 You can also refer to any other Estimator in ML: even if you swap the order in which the arguments are set, you still get the same model. Thanks.

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-25 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/15214 @srowen @mpjlu Another important reason for this change: it's error-prone for the Python ML API. ``` def __init__(self, numTopFeatures=50, featuresCol="features", outputCol=None,

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-25 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15214 OK, I could also support either behavior. After all, for any component, `.setFoo(x).setFoo(y)` also creates a different model if the order is swapped, so I am not so clear that's a 'problem'.

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-25 Thread mpjlu
Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/15214 Hi @srowen. My understanding of yanbo's comment here is that if a user uses ChiSqSelector like this: model1 = new ChiSqSelector().setFPR(0.05).setKBest(100).fit(data) model2 = new
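The snippet in this comment is truncated, so the following is only a minimal sketch of the concern it appears to raise, written in plain Python rather than against Spark. It assumes a "last setter wins" design in which each setter also switches the selection mode. The class `LastSetterWinsSelector` and its methods are hypothetical stand-ins, not the ChiSqSelector API.

```python
class LastSetterWinsSelector:
    """Toy stand-in, not Spark's ChiSqSelector: the most recent setter picks the mode."""

    def __init__(self):
        # Illustrative defaults only.
        self.mode = "kbest"
        self.k_best = 50
        self.fpr = 0.05

    def set_k_best(self, k):
        self.k_best = k
        self.mode = "kbest"  # side effect: silently switches the selection mode
        return self

    def set_fpr(self, alpha):
        self.fpr = alpha
        self.mode = "fpr"  # side effect: silently switches the selection mode
        return self

    def describe(self):
        """Return the (mode, threshold) pair that a fit would use."""
        threshold = self.k_best if self.mode == "kbest" else self.fpr
        return (self.mode, threshold)


# Same two values, set in opposite orders.
a = LastSetterWinsSelector().set_fpr(0.05).set_k_best(100).describe()
b = LastSetterWinsSelector().set_k_best(100).set_fpr(0.05).describe()
print(a)  # ('kbest', 100)
print(b)  # ('fpr', 0.05)
```

Under this assumed design the two chains configure different selection strategies even though they pass identical values, which is the kind of order dependence being debated in the thread.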

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-25 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15214 I'm OK with it. @mpjlu sounds like you approve?

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-24 Thread mpjlu
Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/15214 Hi @yanboliang , got it. Thanks.

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15214 Merged build finished. Test PASSed.

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15214 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65864/ Test PASSed. ---

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15214 **[Test build #65864 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65864/consoleFull)** for PR 15214 at commit

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-24 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/15214 @mpjlu The most important reason for this change is that the fitted/trained model should not depend on the order in which users set params. In other words, users should get the same model whether set
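One way to obtain the order independence described here is to make the selection mode an explicit parameter of its own, so that the value setters only store values. The sketch below is plain Python under that assumption; it is not the PR's actual implementation. The class `ExplicitTypeSelector`, the `selector_type` parameter, and the default values are all hypothetical and chosen only for illustration.

```python
class ExplicitTypeSelector:
    """Toy stand-in, not Spark's ChiSqSelector: the mode is its own parameter."""

    VALID_TYPES = ("kbest", "percentile", "fpr")

    def __init__(self):
        # Illustrative defaults only.
        self.selector_type = "kbest"
        self.num_top_features = 50
        self.percentile = 0.1
        self.alpha = 0.05

    def set_selector_type(self, value):
        if value not in self.VALID_TYPES:
            raise ValueError(f"unknown selector type: {value!r}")
        self.selector_type = value
        return self

    def set_num_top_features(self, k):
        self.num_top_features = k  # stores the value only; the mode is untouched
        return self

    def set_percentile(self, p):
        self.percentile = p  # stores the value only; the mode is untouched
        return self

    def active_setting(self):
        """Return the (mode, threshold) pair that a fit would use."""
        values = {
            "kbest": self.num_top_features,
            "percentile": self.percentile,
            "fpr": self.alpha,
        }
        return (self.selector_type, values[self.selector_type])


# Same two values, set in opposite orders.
first = ExplicitTypeSelector().set_num_top_features(100).set_percentile(0.2).active_setting()
second = ExplicitTypeSelector().set_percentile(0.2).set_num_top_features(100).active_setting()
print(first == second)  # True
```

Because no value setter touches the mode, swapping the calls cannot change which strategy is used; switching strategy requires an explicit `set_selector_type` call.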

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15214 **[Test build #65864 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65864/consoleFull)** for PR 15214 at commit

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-23 Thread mpjlu
Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/15214 Hi @srowen and @yanboliang, thanks for your follow-up PR. I partly agree with your comments on 17017. **1. "if users both set numTopFeatures and percentile, it will train kbest or

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15214 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65824/ Test PASSed. ---

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15214 Merged build finished. Test PASSed.

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15214 **[Test build #65824 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65824/consoleFull)** for PR 15214 at commit

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-23 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15214 Oh I see. I trust your judgment on this, just wish we could have gotten your review on the original PR. @mpjlu what do you think?

[GitHub] spark issue #15214: [SPARK-17017][Follow-up][ML] Refactor of ChiSqSelector a...

2016-09-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15214 **[Test build #65824 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65824/consoleFull)** for PR 15214 at commit