GitHub user MLnick commented on the issue:

    https://github.com/apache/spark/pull/11119
  
    I have a few high-level questions on this:
    
    #### Params
    
    Why are we only setting `k` based on the `initialModel`? I had thought from 
previous discussion above (it was a while ago now) that it would set all 
parameters. Of course some will be ignored, like the init settings, but I think 
the default expectation of using `setInitialModel` would be that all params are 
set.
    
    For example, let's say I train a model with various `maxIter` and `tol` 
params. Then I want to use that model later for warm-starting. I want the same 
settings, but starting from the existing centroids. I have to remember to do 
`new KMeans().setInitialModel(model).setMaxIter(model.getMaxIter())...`.
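
    To make the difference concrete, here is a minimal sketch. It uses hypothetical stand-in classes (`ModelSketch`, `EstimatorSketch`) rather than Spark's actual `KMeans` / `KMeansModel`, and the two method names are made up for illustration; the defaults are assumed to mirror `KMeans` (`k = 2`, `maxIter = 20`, `tol = 1e-4`).

```scala
// Hypothetical stand-ins for the estimator and the fitted model.
// These are NOT Spark classes; they only illustrate the param-copying question.
final case class ModelSketch(
    k: Int,
    maxIter: Int,
    tol: Double,
    centers: Vector[Vector[Double]])

final case class EstimatorSketch(
    k: Int = 2,          // assumed defaults, mirroring KMeans
    maxIter: Int = 20,
    tol: Double = 1e-4,
    initialCenters: Option[Vector[Vector[Double]]] = None) {

  // Behaviour questioned above: only k is taken from the initial model.
  def setInitialModelKOnly(m: ModelSketch): EstimatorSketch =
    copy(k = m.k, initialCenters = Some(m.centers))

  // Alternative: also inherit the training params from the initial model.
  def setInitialModelAllParams(m: ModelSketch): EstimatorSketch =
    copy(k = m.k, maxIter = m.maxIter, tol = m.tol,
      initialCenters = Some(m.centers))
}

object WarmStartSketch {
  def main(args: Array[String]): Unit = {
    val trained = ModelSketch(
      k = 3, maxIter = 50, tol = 1e-6,
      centers = Vector(Vector(0.0), Vector(1.0), Vector(2.0)))

    val kOnly = EstimatorSketch().setInitialModelKOnly(trained)
    val all = EstimatorSketch().setInitialModelAllParams(trained)

    // kOnly keeps the estimator defaults for maxIter and tol;
    // all carries over the values the model was trained with.
    println(s"k only:     k=${kOnly.k} maxIter=${kOnly.maxIter} tol=${kOnly.tol}")
    println(s"all params: k=${all.k} maxIter=${all.maxIter} tol=${all.tol}")
  }
}
```

    With the first variant the caller has to re-apply `maxIter` and `tol` by hand; with the second, a warm start reuses the original settings unless they are explicitly overridden.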
    
    If there is a good argument against this I'm happy to hear it, but then we 
must document this behavior clearly.
    
    #### Saving initial model on Model
    
    What is the reasoning behind saving the `initialModel` on the `Model`? It 
makes sense for the `Estimator` - I may want to save my estimator, and when 
loading it of course I'd need the initial model to be loaded, if it was set, so 
that I can correctly fit my estimator.
    
    But once I've fit a model, why would I save two models?

