Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/9
Do you guys mind if I propose to close this PR?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75004/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #75004 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75004/consoleFull)**
for PR 9 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #75004 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75004/consoleFull)**
for PR 9 at commit
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
ping! I could take this over if needed :)
---
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
@yinxusen Do you think you'll have time to work on this?
---
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
ping?
---
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
@yinxusen I took a look at the updates. Will you be able to create the
design doc that Joseph mentioned?
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68368/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #68368 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68368/consoleFull)**
for PR 9 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #68368 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68368/consoleFull)**
for PR 9 at commit
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
This is probably going to miss 2.1 since we are officially in QA now, just
as an fyi.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68325/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #68325 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68325/consoleFull)**
for PR 9 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #68325 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68325/consoleFull)**
for PR 9 at commit
Github user yinxusen commented on the issue:
https://github.com/apache/spark/pull/9
@sethah Sorry, I got stuck in other things. I'll update this PR tonight.
---
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
@yinxusen Status update?
---
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/9
> setInitMode throws an error when called with setInitMode("initialModel")
and instructs user to use setInitialModel instead
On second thought, for this one, it could be good to have it
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
Ok, unless anyone has strong objections, it seems our plan moving forward
with this PR should be:
* Change the `setInitialModel` method to also set `initMode` to
"initialModel"
* Change
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/9
> Would you mind pointing me to an example of an algorithm which only
copies some, but not all, of the estimator params?
ALS is a good example:
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
@jkbradley Thanks for your thoughts. I agree it's a good idea to change the
KMeans prediction function to not use the entire model in its closure, but that
we need a more thorough solution when we
Github user yinxusen commented on the issue:
https://github.com/apache/spark/pull/9
How about the following:
1. Since the new generated model is derived from an estimator, the model
should have the same params as its parent estimator. That's why there is no
need to
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
Related thought: if the model holds a pointer to its initialModel, then it
will be serialized and shipped along with the model at prediction time. This
will be inefficient for large models and even
Github user yinxusen commented on the issue:
https://github.com/apache/spark/pull/9
@MLnick @dbtsai @sethah Any thoughts on the new version?
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67156/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #67156 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67156/consoleFull)**
for PR 9 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #67156 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67156/consoleFull)**
for PR 9 at commit
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/9
@yinxusen for cutting lineage do you intend to check whether the
`initialModel` itself has an `initialModel`, and if so clear it?
I think that can be a reasonable solution if we have a clean
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/9
@sethah agree with that yes.
---
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
We should change the `initMode` doc to indicate that the param is ignored
when `initialModel` is set.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67100/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #67100 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67100/consoleFull)**
for PR 9 at commit
Github user yinxusen commented on the issue:
https://github.com/apache/spark/pull/9
I agree that too long a lineage would hurt performance and is also unnecessary.
How about cutting a lineage that is longer than 2? Namely, we keep only the direct
parent model when saving. Keeping direct
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #67100 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67100/consoleFull)**
for PR 9 at commit
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/9
Good points. Perhaps a solution, while slightly "verbose", is to introduce
another param `initialModelWriteMode` which governs what is saved - `full`,
`params` or `none`. Full is obviously the
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
I agree that saving the initialModel may not be practical - since it can be
large. However, not saving that param at all also seems a bit contrary to me.
When we produce a model from an estimator,
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
@MLnick I tend to agree with your opinion about saving the initialModel
with the model.
We can remove saving the initial model entirely from the model. Though,
right now users could avoid
Github user dbtsai commented on the issue:
https://github.com/apache/spark/pull/9
I was also thinking that most people will use this for daily retraining
by passing in the previous model, which will cause the model to grow larger and
larger due to the model chain, which is unnecessary
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/9
For the `KMeansModel` saving the `initialModel` - it still seems weird to
me. Another factor to consider: the main use case in my view is continually
re-training a model with the result from the
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/9
For the `k` vs other params - I'm ok with just setting `k`. I do agree that
in most cases the other params may be different (in fact likely different from
both the defaults and initial model for
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
One minor comment, otherwise LGTM. Thanks a lot @yinxusen and reviewers.
---
Github user dbtsai commented on the issue:
https://github.com/apache/spark/pull/9
Please remove `WIP` in the description.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67051/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #67051 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67051/consoleFull)**
for PR 9 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #67051 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67051/consoleFull)**
for PR 9 at commit
Github user yinxusen commented on the issue:
https://github.com/apache/spark/pull/9
@sethah Thanks, I fixed these problems.
---
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
@yinxusen Thanks for the updates. Regarding save/load compatibility, I
tested it locally and had no issues.
There are still some issues to address, but I think we can sort them out
into
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #67050 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67050/consoleFull)**
for PR 9 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67050/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #67050 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67050/consoleFull)**
for PR 9 at commit
Github user yinxusen commented on the issue:
https://github.com/apache/spark/pull/9
@sethah The new behavior of `setK` is now implemented.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66994/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #66994 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66994/consoleFull)**
for PR 9 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66991/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #66991 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66991/consoleFull)**
for PR 9 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #66994 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66994/consoleFull)**
for PR 9 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #66991 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66991/consoleFull)**
for PR 9 at commit
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
@MLnick Good questions. For the setting of params - I'm sure it varies in
some cases, but for example, why would you want to use the same `maxIter` when
training with an initial model vs without? I
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/9
We should also just check save / load is backward compatible with older
versions. It should be, but subtle things can sneak in so let's be careful
about that.
---
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/9
I have a few high level questions on this:
Params
Why are we only setting `k` based on the `initialModel`? I had thought from
previous discussion above (it was a while ago now)
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/9
I misread DB's meaning in my previous comment.
I agree that the parameter settings of `initialModel`, if set, should take
precedence. If it conflicts with an existing `k` then log a warning.
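A minimal sketch of the precedence rule described here, written in plain Python rather than against Spark's Scala API. `ToyKMeans`, `ToyKMeansModel`, and their methods are hypothetical names chosen for illustration; the only behavior taken from the thread is that the initial model's `k` wins and a conflict is logged as a warning.

```python
import logging
from dataclasses import dataclass
from typing import List, Optional

logger = logging.getLogger("kmeans_sketch")


@dataclass
class ToyKMeansModel:
    """Hypothetical stand-in for a fitted model: only the cluster centers."""
    cluster_centers: List[List[float]]

    @property
    def k(self) -> int:
        # k is implied by the number of centers the model carries.
        return len(self.cluster_centers)


class ToyKMeans:
    """Hypothetical estimator; not the Spark API."""

    def __init__(self) -> None:
        self.k: Optional[int] = None
        self.initial_model: Optional[ToyKMeansModel] = None

    def set_k(self, k: int) -> "ToyKMeans":
        self.k = k
        return self

    def set_initial_model(self, model: ToyKMeansModel) -> "ToyKMeans":
        # The initial model takes precedence; warn if it overrides an earlier k.
        if self.k is not None and self.k != model.k:
            logger.warning(
                "k=%d conflicts with the initial model (k=%d); using %d",
                self.k, model.k, model.k,
            )
        self.k = model.k
        self.initial_model = model
        return self
```

With this shape, calling `set_k(5)` and then passing a two-center initial model leaves the estimator with `k == 2` and one warning in the log.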
Github user dbtsai commented on the issue:
https://github.com/apache/spark/pull/9
+1 on what @sethah proposed. We can log a warning when `k` is modified by
setting the initial model. Thanks.
---
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
@MLnick I'm not sure I understand what you're saying. Where are we
discarding cluster centers?
Maybe we should say that the `initialModel` always takes precedence over
`k`. So we can just
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/9
Isn't the point of an initial model mainly the cluster centers? If you
override k what happens to the cluster centers? Discard them? Why not then
just start again rather than have an initial
Github user yinxusen commented on the issue:
https://github.com/apache/spark/pull/9
@dbtsai @sethah I updated the code. Now we check the equivalence of K when
setting initialModel if K is set previously. We also check the equivalence when
fitting a model.
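The two checks mentioned here could look roughly like the following. This is an illustrative plain-Python sketch, not the PR's code: `StrictKMeans` and its methods are hypothetical, and `fit` only validates and returns the starting centers instead of running k-means.

```python
from typing import List, Optional

Centers = List[List[float]]


class StrictKMeans:
    """Hypothetical estimator that rejects a k / initial-model mismatch."""

    def __init__(self) -> None:
        self.k: Optional[int] = None
        self.initial_centers: Optional[Centers] = None

    def set_k(self, k: int) -> "StrictKMeans":
        self.k = k
        return self

    def set_initial_model(self, centers: Centers) -> "StrictKMeans":
        # Check 1: if k was set earlier, it must match the initial model.
        if self.k is not None and self.k != len(centers):
            raise ValueError(
                f"k={self.k} does not match initial model with {len(centers)} centers"
            )
        self.initial_centers = centers
        return self

    def fit(self, points: Centers) -> Centers:
        # Check 2: k may have been changed after the initial model was set.
        if self.initial_centers is not None:
            if self.k is not None and self.k != len(self.initial_centers):
                raise ValueError("k changed after the initial model was set")
            return [list(c) for c in self.initial_centers]
        if self.k is None:
            raise ValueError("either k or an initial model is required")
        # Placeholder initialization: take the first k points as centers.
        return [list(p) for p in points[: self.k]]
```

The second check matters because `set_k` itself does not validate, so a caller can still change `k` after supplying the initial model.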
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66524/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #66524 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66524/consoleFull)**
for PR 9 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #66524 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66524/consoleFull)**
for PR 9 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66446/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #66446 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66446/consoleFull)**
for PR 9 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #66446 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66446/consoleFull)**
for PR 9 at commit
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
ping @dbtsai
---
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
One minor comment, otherwise LGTM.
We should still consider overriding equals method as a follow up item.
Also, we should discuss changing the behavior so that if initial model is set,
then
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66380/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #66380 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66380/consoleFull)**
for PR 9 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #66380 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66380/consoleFull)**
for PR 9 at commit
Github user sethah commented on the issue:
https://github.com/apache/spark/pull/9
@yinxusen Looking good. I left a few small comments, and we should take
care of the checking of initial model params in the read/write test now. After
that, I think it will be ready to merge.
Github user yinxusen commented on the issue:
https://github.com/apache/spark/pull/9
Ping @dbtsai @sethah
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65953/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #65953 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65953/consoleFull)**
for PR 9 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #65953 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65953/consoleFull)**
for PR 9 at commit
Github user dbtsai commented on the issue:
https://github.com/apache/spark/pull/9
Ping @yinxusen on update. Would like to have it merged soon so we can work
on LiR and LoR parts. Thanks.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65507/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #65507 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65507/consoleFull)**
for PR 9 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9
**[Test build #65507 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65507/consoleFull)**
for PR 9 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/9
Merged build finished. Test FAILed.
---