[ https://issues.apache.org/jira/browse/SPARK-11136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193886#comment-15193886 ]
Xusen Yin commented on SPARK-11136:
-----------------------------------

This is a good point. In our current implementation, the new KMeans uses only the model itself (i.e. the array of cluster centers) and ignores the model's parameters. E.g.

{code}
if (isSet(initialModel)) {
  require($(initialModel).parentModel.clusterCenters.length == $(k), "mismatched cluster count")
  require(rdd.first().size == $(initialModel).clusterCenters.head.size, "mismatched dimension")
  algo.setInitialModel($(initialModel).parentModel)
}
{code}

But I think you're right. We should also carry over the initial model's parameters in some scenarios. IMHO, the parameter overriding order should be (initialModel parameter < default parameter < user-set parameter). What do you think?

> Warm-start support for ML estimator
> -----------------------------------
>
>                 Key: SPARK-11136
>                 URL: https://issues.apache.org/jira/browse/SPARK-11136
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>            Reporter: Xusen Yin
>            Priority: Minor
>
> The current implementation of Estimator does not support warm-start fitting,
> i.e. estimator.fit(data, params, partialModel). But first we need to add
> warm-start for all ML estimators. This is an umbrella JIRA to add support for
> the warm-start estimator.
> Treat the model as a special parameter, passing it through ParamMap, e.g. val
> partialModel: Param[Option[M]] = new Param(...). If a model exists, we use it
> to warm-start; otherwise we start the training process from the beginning.
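
For illustration only, the overriding order proposed in the comment (initialModel parameter < default parameter < user-set parameter) could be sketched roughly as below. This does not use the Spark ParamMap API: the object WarmStartParams, the method resolve, and the example keys and values are all hypothetical, and parameters are modeled as plain Scala maps.

```scala
// Hypothetical, Spark-free sketch of the proposed precedence:
//   initialModel parameter < default parameter < user-set parameter
object WarmStartParams {
  // Map ++ keeps the right-hand value on key collisions, so sources listed
  // later override sources listed earlier.
  def resolve(
      fromInitialModel: Map[String, Any],
      defaults: Map[String, Any],
      userSet: Map[String, Any]): Map[String, Any] =
    fromInitialModel ++ defaults ++ userSet

  def main(args: Array[String]): Unit = {
    // Assumed example values, not taken from any real model.
    val fromInitialModel = Map[String, Any]("maxIter" -> 50, "tol" -> 1e-3, "seed" -> 7L)
    val defaults = Map[String, Any]("maxIter" -> 20, "tol" -> 1e-4)
    val userSet = Map[String, Any]("maxIter" -> 100)

    println(resolve(fromInitialModel, defaults, userSet))
  }
}
```

With these assumed inputs, maxIter resolves to the user-set 100, tol to the default 1e-4, and seed to the initial model's 7. Under this reading of the order, a value carried by the initial model is visible only for keys that have neither a default nor a user-set value.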