[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2017-05-03 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/9 Do you guys mind if I propose to close this PR? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2017-03-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75004/ Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2017-03-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Build finished. Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2017-03-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #75004 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75004/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2017-03-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #75004 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75004/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2017-01-31 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 ping! I could take this over if needed :)

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2017-01-09 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 @yinxusen Do you think you'll have time to work on this?

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-12-06 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 ping?

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-18 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 @yinxusen I took a look at the updates. Will you be able to create the design doc that Joseph mentioned?

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Merged build finished. Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68368/ Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #68368 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68368/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #68368 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68368/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-08 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 This is probably going to miss 2.1 since we are officially in QA now, just as an fyi.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Merged build finished. Test FAILed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68325/ Test FAILed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #68325 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68325/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #68325 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68325/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-07 Thread yinxusen
Github user yinxusen commented on the issue: https://github.com/apache/spark/pull/9 @sethah Sorry, I got stuck in other things. I'll update this PR tonight.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-03 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 @yinxusen Status update?

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-24 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/9 > setInitMode throws an error when called with setInitMode("initialModel") and instructs user to use setInitialModel instead On second thought, for this one, it could be good to have it

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-24 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 Ok, unless anyone has strong objections, it seems our plan moving forward with this PR should be: * Change the `setInitialModel` method to also set `initMode` to "initialModel" * Change
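
The first bullet of the plan above can be pictured with a small sketch. This is plain Python rather than Spark code: `KMeansSketch`, `set_initial_model`, and the default `"k-means||"` value are hypothetical stand-ins chosen for illustration, and only the rule "setting an initial model also switches `initMode` to `"initialModel"`" comes from the thread.

```python
class KMeansSketch:
    """Hypothetical stand-in for an estimator; this is NOT the Spark API."""

    def __init__(self):
        # Assumed default for illustration only.
        self.init_mode = "k-means||"
        self.initial_model = None

    def set_initial_model(self, model):
        # Setting an initial model also records that initialization
        # will come from that model, as the plan in the thread proposes.
        self.initial_model = model
        self.init_mode = "initialModel"
        return self


estimator = KMeansSketch().set_initial_model({"centers": [[0.0, 0.0]]})
print(estimator.init_mode)
```

Keeping both fields in sync inside one setter means a caller never ends up with an initial model that is silently ignored because the mode still points elsewhere.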

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-24 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/9 > Would you mind pointing me to an example of an algorithm which only copies some, but not all, of the estimator params? ALS is a good example:

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-24 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 @jkbradley Thanks for your thoughts. I agree it's a good idea to change the KMeans prediction function to not use the entire model in its closure, but that we need a more thorough solution when we
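
To make the closure point concrete, here is a rough sketch in plain Python (not Spark code). The helper `make_predict_fn` is a hypothetical name; the idea it illustrates is only that a prediction function can capture the cluster centers alone instead of the whole model object.

```python
def make_predict_fn(centers):
    """Build a predictor that closes over the centers only.

    Hypothetical helper: the returned function holds no reference to any
    model object, so nothing else would travel with it if it were shipped
    to workers.
    """
    frozen = [tuple(c) for c in centers]  # private copy of the centers

    def predict(point):
        # Index of the center with the smallest squared Euclidean distance.
        return min(
            range(len(frozen)),
            key=lambda i: sum((p - c) ** 2 for p, c in zip(point, frozen[i])),
        )

    return predict


predict = make_predict_fn([[0.0, 0.0], [10.0, 10.0]])
print(predict((9.0, 8.0)))
```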

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-23 Thread yinxusen
Github user yinxusen commented on the issue: https://github.com/apache/spark/pull/9 How about the following: 1. Since the new generated model is derived from an estimator, the model should have the same params as its parent estimator. That's why there is no need to

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-20 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 Related thought: if the model holds a pointer to its initialModel, then it will be serialized and shipped along with the model at prediction time. This will be inefficient for large models and even
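
The serialization concern can be illustrated outside Spark with Python's `pickle`. `ModelSketch` below is a hypothetical class, and the sizes are those of this toy example only; the sketch just shows that an object holding a reference to its initial model serializes that model along with it.

```python
import pickle


class ModelSketch:
    """Hypothetical model holding centers and an optional initial model."""

    def __init__(self, centers, initial_model=None):
        self.centers = centers
        self.initial_model = initial_model


initial = ModelSketch([[0.0] * 100 for _ in range(50)])
trained = ModelSketch([[1.0] * 100 for _ in range(50)], initial_model=initial)
detached = ModelSketch(trained.centers)  # same centers, no parent reference

# The parent's centers are pulled into the serialized form of `trained`.
size_with_parent = len(pickle.dumps(trained))
size_without_parent = len(pickle.dumps(detached))
print(size_with_parent > size_without_parent)
```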

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-19 Thread yinxusen
Github user yinxusen commented on the issue: https://github.com/apache/spark/pull/9 @MLnick @dbtsai @sethah Any thoughts on the new version?

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Merged build finished. Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67156/ Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #67156 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67156/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #67156 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67156/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-18 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/9 @yinxusen for cutting lineage do you intend to check whether the `initialModel` itself has an `initialModel`, and if so clear it? I think that can be a reasonable solution if we have a clean
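
One way to picture the lineage cut described here, as a sketch in plain Python (not Spark code): if the `initialModel` itself has an `initialModel`, clear it, so only the direct parent survives. `ModelSketch`, `with_truncated_lineage`, and `lineage_length` are hypothetical names used for illustration.

```python
import copy


class ModelSketch:
    """Hypothetical model that may point at the model it was started from."""

    def __init__(self, name, initial_model=None):
        self.name = name
        self.initial_model = initial_model


def with_truncated_lineage(model):
    """Return a copy of `model` that keeps only its direct parent."""
    trimmed = copy.copy(model)
    if trimmed.initial_model is not None:
        parent = copy.copy(trimmed.initial_model)
        parent.initial_model = None  # drop the grandparent and everything older
        trimmed.initial_model = parent
    return trimmed


def lineage_length(model):
    """Count the model plus all of its ancestors."""
    count = 0
    while model is not None:
        count += 1
        model = model.initial_model
    return count


day1 = ModelSketch("day1")
day2 = ModelSketch("day2", day1)
day3 = ModelSketch("day3", day2)
trimmed = with_truncated_lineage(day3)
print(lineage_length(day3), lineage_length(trimmed))
```

Working on shallow copies leaves the original chain untouched, so the cut applies only to the object that would be saved.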

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-18 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/9 @sethah agree with that yes.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-18 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 We should change the `initMode` doc to indicate that the param is ignored when `initialModel` is set.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67100/ Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Merged build finished. Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #67100 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67100/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread yinxusen
Github user yinxusen commented on the issue: https://github.com/apache/spark/pull/9 I agree that an overly long lineage would hurt performance and is also unnecessary. How about cutting any lineage that is longer than 2? Namely, we keep only the direct parent model when saving. Keeping direct

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #67100 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67100/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/9 Good points. Perhaps a solution - while slightly "verbose", is to introduce another param `initialModelWriteMode` which governs what is saved - `full`, `params` or `none`. Full is obviously the
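
A possible reading of the proposed `initialModelWriteMode` values, sketched in plain Python (not Spark code). The comment is truncated, so the exact meaning of each mode is an assumption here: `full` is taken to save the params and centers of the initial model, `params` only its params, and `none` nothing. The dict layout and the function name `initial_model_payload` are hypothetical.

```python
VALID_WRITE_MODES = ("full", "params", "none")


def initial_model_payload(initial_model, write_mode):
    """Pick what to persist about the initial model (assumed semantics)."""
    if write_mode not in VALID_WRITE_MODES:
        raise ValueError(f"unknown write mode: {write_mode!r}")
    if initial_model is None or write_mode == "none":
        return None
    # Both remaining modes keep the params of the initial model.
    payload = {"params": dict(initial_model["params"])}
    if write_mode == "full":
        # Only "full" also keeps the (potentially large) cluster centers.
        payload["centers"] = [list(c) for c in initial_model["centers"]]
    return payload


previous = {"params": {"k": 2, "maxIter": 20}, "centers": [[0.0, 0.0], [1.0, 1.0]]}
for mode in VALID_WRITE_MODES:
    print(mode, initial_model_payload(previous, mode))
```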

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 I agree that saving the initialModel may not be practical - since it can be large. However, not saving that param at all also seems a bit contrary to me. When we produce a model from an estimator,

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 @MLnick I tend to agree with your opinion about saving the initialModel with the model. We can remove saving the initial model entirely from the model. Though, right now users could avoid

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/9 I was also thinking that most people will use this for daily retraining by passing in the previous model, which will cause the model to grow larger and larger due to the model chain, which is unnecessary

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/9 For the `KMeansModel` saving the `initialModel` - it still seems weird to me. Another factor to consider: the main use case in my view is continually re-training a model with the result from the

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/9 For the `k` vs other params - I'm ok with just setting `k`. I do agree that in most cases the other params may be different (in fact likely different from both the defaults and initial model for

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 One minor comment, otherwise LGTM. Thanks a lot @yinxusen and reviewers.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/9 Please remove `WIP` in the description.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67051/ Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Merged build finished. Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #67051 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67051/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #67051 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67051/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-16 Thread yinxusen
Github user yinxusen commented on the issue: https://github.com/apache/spark/pull/9 @sethah Thanks, I've fixed these problems.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-16 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 @yinxusen Thanks for the updates. Regarding save/load compatibility, I tested it locally and had no issues. There are still some issues to address, but I think we can sort them out into

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Merged build finished. Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #67050 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67050/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67050/ Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #67050 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67050/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-16 Thread yinxusen
Github user yinxusen commented on the issue: https://github.com/apache/spark/pull/9 @sethah New behavior of `setK` is adapted.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Merged build finished. Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66994/ Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #66994 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66994/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Merged build finished. Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66991/ Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #66991 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66991/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #66994 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66994/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #66991 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66991/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-11 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 @MLnick Good questions. For the setting of params - I'm sure it varies in some cases, but for example, why would you want to use the same `maxIter` when training with an initial model vs without? I

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-11 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/9 We should also just check save / load is backward compatible with older versions. It should be, but subtle things can sneak in so let's be careful about that.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-11 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/9 I have a few high-level questions on this: Params Why are we only setting `k` based on the `initialModel`? I had thought from previous discussion above (it was a while ago now)

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-11 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/9 I misread DB's meaning in my previous comment. I agree that the parameter settings of `initialModel`, if set, should take precedence. If it conflicts with an existing `k` then log a warning.
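
The precedence rule agreed on here can be sketched in plain Python (not Spark code): when an initial model is set, its number of centers wins over any configured `k`, and a conflict only produces a warning. The function `resolve_k` and the logger name are hypothetical.

```python
import logging

logger = logging.getLogger("kmeans_sketch")  # hypothetical logger name


def resolve_k(configured_k, initial_centers):
    """Return the k to train with; the initial model takes precedence."""
    if initial_centers is None:
        return configured_k
    model_k = len(initial_centers)
    if configured_k is not None and configured_k != model_k:
        # Conflict: keep going with the initial model's k, but tell the user.
        logger.warning(
            "k=%d conflicts with the initial model (k=%d); using k=%d",
            configured_k, model_k, model_k,
        )
    return model_k


print(resolve_k(5, [[0.0], [1.0], [2.0]]))
```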

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-11 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/9 +1 on what @sethah proposed. We can log with warn when k is modified by setting the initial model. Thanks.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-08 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 @MLnick I'm not sure I understand what you're saying. Where are we discarding cluster centers? Maybe we should say that the `initialModel` always takes precedence over `k`. So we can just

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-08 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/9 Isn't the point of an initial model mainly the cluster centers? If you override k what happens to the cluster centers? Discard them? Why not then just start again rather than have an initial

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-07 Thread yinxusen
Github user yinxusen commented on the issue: https://github.com/apache/spark/pull/9 @dbtsai @sethah I updated the code. Now we check the equivalence of K when setting initialModel if K is set previously. We also check the equivalence when fitting a model.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Merged build finished. Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66524/ Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #66524 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66524/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #66524 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66524/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66446/ Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Merged build finished. Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #66446 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66446/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #66446 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66446/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-05 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 ping @dbtsai

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-05 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 One minor comment, otherwise LGTM. We should still consider overriding equals method as a follow up item. Also, we should discuss changing the behavior so that if initial model is set, then

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Merged build finished. Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66380/ Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #66380 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66380/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #66380 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66380/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-27 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/9 @yinxusen Looking good. I left a few small comments, and we should take care of the checking of initial model params in the read/write test now. After that, I think it will be ready to merge.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-27 Thread yinxusen
Github user yinxusen commented on the issue: https://github.com/apache/spark/pull/9 Ping @dbtsai @sethah

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65953/ Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Merged build finished. Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #65953 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65953/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #65953 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65953/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-26 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/9 Ping @yinxusen on update. Would like to have it merged soon so we can work on LiR and LoR parts. Thanks.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65507/ Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Merged build finished. Test PASSed.

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #65507 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65507/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/9 **[Test build #65507 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65507/consoleFull)** for PR 9 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-09-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/9 Merged build finished. Test FAILed.
