Github user sethah commented on the issue:

    https://github.com/apache/spark/pull/11119

@MLnick I'm not sure I understand what you're saying. Where are we discarding cluster centers?

Maybe we should say that `initialModel` always takes precedence over `k`. So we can just ignore `k` when `initialModel` is set, and log a warning at train time that we are ignoring it. There are going to be tradeoffs either way, and I think that is reasonable behavior.

I vote to ignore `k` when `initialModel` is set. That also alleviates DB's concern about the following situation (which would fail given the current logic):

````scala
val km = new KMeans().setInitialModel(kEquals5Model)
val model1 = km.fit(df)
val model2 = km.setInitialModel(kEquals6Model).fit(df)
````

Again, I think we can all agree there are tradeoffs. Let's see if we can agree on something for now and go with it. If someone feels really strongly, then maybe we can discuss it in another JIRA.
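
To make the proposed precedence rule concrete, here is a minimal, self-contained sketch. It does not use Spark at all: `SketchModel`, `SketchKMeans`, and `resolveK` are hypothetical stand-ins invented for illustration, not the classes or methods in this PR. It also returns the warning as a value rather than logging it, and uses immutable copies instead of mutable params, purely to keep the example small.

````scala
// Hypothetical stand-in for a fitted model; only the number of centers matters here.
final case class SketchModel(centers: Vector[Vector[Double]]) {
  def k: Int = centers.length
}

// Hypothetical stand-in for the estimator. Setters return copies (an assumption
// made for brevity; the real estimator mutates its params in place).
final case class SketchKMeans(k: Int = 2, initialModel: Option[SketchModel] = None) {
  def setK(value: Int): SketchKMeans = copy(k = value)
  def setInitialModel(m: SketchModel): SketchKMeans = copy(initialModel = Some(m))

  // Resolved at train time: the initial model, if set, always wins over k.
  // Returns the effective k plus an optional warning describing the override.
  def resolveK: (Int, Option[String]) = initialModel match {
    case Some(m) if m.k != k =>
      (m.k, Some(s"Ignoring k = $k: initialModel has ${m.k} cluster centers."))
    case Some(m) => (m.k, None)
    case None    => (k, None)
  }
}

object PrecedenceSketch {
  def main(args: Array[String]): Unit = {
    val kEquals5Model = SketchModel(Vector.fill(5)(Vector(0.0, 0.0)))
    val kEquals6Model = SketchModel(Vector.fill(6)(Vector(0.0, 0.0)))

    val km = SketchKMeans().setInitialModel(kEquals5Model)
    println(km.resolveK)                                // effective k comes from the 5-center model
    println(km.setInitialModel(kEquals6Model).resolveK) // swapping the model just changes the effective k
  }
}
````

Because `k` is only reconciled when training starts, swapping in a second initial model with a different number of centers never conflicts with a stale `k`, which is the situation described above.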