GitHub user sethah commented on the issue:

    https://github.com/apache/spark/pull/11119
  
    I agree that saving the initialModel may not be practical, since it can be 
large. However, not saving that param at all also seems a bit counterintuitive 
to me. When we produce a model from an estimator, we copy over the params that 
were used to create the model. These params indicate how the model was 
created. If we completely disregard the initialModel when we save the model, 
then it will appear as though the model was not created with an initialModel. 
In fact, for KMeans, it would look like the model was created using the 
`k-means||` initialization strategy, since that is the default. This is 
misleading.
    
    It would be nice to have a way to avoid saving the initial model data 
with the model, but still preserve the information about how the model was 
initialized. You could even argue that the initialModel should not be a param 
at all, because of the edge cases it seems to introduce. 
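
To make the idea concrete, here is a rough sketch in plain Python (not Spark code) of what "drop the data, keep the provenance" could look like. Everything here is an assumption for illustration: the `params_to_metadata` helper, the dict layout, and especially the `"initialModel"` value for `initMode`, which is hypothetical and not an existing init mode.

```python
import json

# Default initialization strategy mentioned above.
DEFAULT_INIT_MODE = "k-means||"


def params_to_metadata(params):
    """Serialize params to JSON, omitting the (possibly large) initial model.

    Hypothetical helper: instead of silently dropping initialModel, record a
    lightweight marker so the saved metadata still says how the model was
    initialized.
    """
    # Copy every param except the initial model data itself.
    meta = {k: v for k, v in params.items() if k != "initialModel"}
    if params.get("initialModel") is not None:
        # Assumed marker value; not a real init mode.
        meta["initMode"] = "initialModel"
    else:
        # No initial model supplied: fall back to the default if unset.
        meta.setdefault("initMode", DEFAULT_INIT_MODE)
    return json.dumps(meta, sort_keys=True)


# A model trained from an initial model: the centers are not persisted,
# but the metadata no longer claims the default strategy was used.
params = {"k": 2, "initialModel": {"centers": [[0.0, 0.0], [1.0, 1.0]]}}
saved = json.loads(params_to_metadata(params))
```

With this approach, `saved` holds `k` and an `initMode` marker but no center data, so a reloaded model would not look as though it had been initialized with `k-means||`.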


