Github user yinxusen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11119#discussion_r55281476
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala ---
    @@ -169,12 +182,29 @@ object KMeansModel extends MLReadable[KMeansModel] {
     
       /** [[MLWriter]] instance for [[KMeansModel]] */
       private[KMeansModel] class KMeansModelWriter(instance: KMeansModel) extends MLWriter {
    +    import org.json4s.JsonDSL._
     
         private case class Data(clusterCenters: Array[Vector])
     
         override protected def saveImpl(path: String): Unit = {
    -      // Save metadata and Params
    -      DefaultParamsWriter.saveMetadata(instance, path, sc)
    +      if (instance.isSet(instance.initialModel)) {
    +        val initialModelPath = new Path(path, "initial-model").toString
    +        val initialModel = instance.getInitialModel
    +        initialModel.save(initialModelPath)
    +
    +        // Remove the initialModel temporarily
    +        instance.clear(instance.initialModel)
    --- End diff --
    
    Thanks Fred, I'll fix it soon.
    
    On Monday, March 7, 2016, Fred Reiss <[email protected]> wrote:
    
    > In mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala
    > <https://github.com/apache/spark/pull/11119#discussion_r55276968>:
    >
    > >
    > >      private case class Data(clusterCenters: Array[Vector])
    > >
    > >      override protected def saveImpl(path: String): Unit = {
    > > -      // Save metadata and Params
    > > -      DefaultParamsWriter.saveMetadata(instance, path, sc)
    > > +      if (instance.isSet(instance.initialModel)) {
    > > +        val initialModelPath = new Path(path, "initial-model").toString
    > > +        val initialModel = instance.getInitialModel
    > > +        initialModel.save(initialModelPath)
    > > +
    > > +        // Remove the initialModel temporarily
    > > +        instance.clear(instance.initialModel)
    >
    > It's probably not a good idea for this serialization method to modify the
    > model. Two potential problem scenarios come to mind: (a) the call to
    > saveMetadata() below fails, leaving the entire KMeansModel object in an
    > inconsistent state; or (b) another thread accesses the initialModel field
    > while the current thread is inside saveImpl().
    >
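
For reference, one way to avoid both scenarios is to leave the instance untouched and have the writer only read from it: save the nested model under its own sub-path, and build the parent's metadata from the remaining fields. The following is a minimal sketch under stated assumptions, not the change made in the pull request. `ToyModel` and `ToyModelWriter` are hypothetical stand-ins for the Spark classes, and an in-memory map stands in for the file system.

```scala
// Hypothetical stand-in for a model that may carry a nested initial model.
final class ToyModel(val uid: String, val k: Int) {
  @volatile private var initial: Option[ToyModel] = None
  def setInitialModel(m: ToyModel): this.type = { initial = Some(m); this }
  def getInitialModel: Option[ToyModel] = initial
}

// Hypothetical writer: it never clears or resets anything on the instance.
object ToyModelWriter {
  // path -> metadata written at that path (stands in for files on disk)
  type Saved = Map[String, Map[String, String]]

  // Metadata is built only from the fields we want to persist here;
  // the initial model is deliberately left out instead of being cleared.
  private def metadataFor(instance: ToyModel): Map[String, String] =
    Map("uid" -> instance.uid, "k" -> instance.k.toString)

  def save(instance: ToyModel, path: String): Saved = {
    // Read the field once, so a concurrent setter cannot change what we save midway.
    val snapshot: Option[ToyModel] = instance.getInitialModel
    val nested: Saved = snapshot match {
      case Some(m) => save(m, s"$path/initial-model")
      case None    => Map.empty
    }
    nested + (path -> metadataFor(instance))
  }
}
```

Because `save` only reads from the instance, a failure partway through cannot leave the model without its initial model, and other threads never observe a temporarily cleared field.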
    
    
    -- 
    Cheers,
    Xusen Yin


