Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9678#discussion_r44731681
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala ---
    @@ -314,31 +314,31 @@ private[clustering] trait LDAParams extends Params 
with HasFeaturesCol with HasM
      * Model fitted by [[LDA]].
      *
      * @param vocabSize  Vocabulary size (number of terms or words in the vocabulary)
    - * @param oldLocalModel  Underlying spark.mllib model.
    - *                       If this model was produced by Online LDA, then 
this is the
    - *                       only model representation.
    - *                       If this model was produced by EM, then this local
    - *                       representation may be built lazily.
      * @param sqlContext  Used to construct local DataFrames for returning 
query results
      */
     @Since("1.6.0")
     @Experimental
    -class LDAModel private[ml] (
    +sealed abstract class LDAModel private[ml] (
         @Since("1.6.0") override val uid: String,
         @Since("1.6.0") val vocabSize: Int,
    -    @Since("1.6.0") protected var oldLocalModel: Option[OldLocalLDAModel],
         @Since("1.6.0") @transient protected val sqlContext: SQLContext)
       extends Model[LDAModel] with LDAParams with Logging {
     
    -  /** Returns underlying spark.mllib model */
    +  // NOTE to developers:
    +  //  This abstraction should contain all important functionality for 
basic LDA usage.
    +  //  Specializations of this class can contain expert-only functionality.
    +
    +  /**
    +   * Underlying spark.mllib model.
    +   * If this model was produced by Online LDA, then this is the only model 
representation.
    +   * If this model was produced by EM, then this local representation may 
be built lazily.
    +   */
       @Since("1.6.0")
    -  protected def getModel: OldLDAModel = oldLocalModel match {
    -    case Some(m) => m
    -    case None =>
    -      // Should never happen.
    -      throw new RuntimeException("LDAModel required local model format," +
    -        " but the underlying model is missing.")
    -  }
    +  protected def oldLocalModel: OldLocalLDAModel
    +
    +  /** Returns underlying spark.mllib model, which may be local or 
distributed */
    +  @Since("1.6.0")
    +  protected def getModel: OldLDAModel
    --- End diff --
    
    This does not trigger a collect; it is `oldLocalModel` that can trigger a collect.
    
    I'll add some more warnings in places where it can happen in the public API.
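    For illustration, here is a minimal, Spark-free sketch of the pattern this diff introduces: an abstract base exposing an abstract `oldLocalModel`, with the distributed specialization building the local representation lazily so the (potentially expensive) collect runs at most once. All names are simplified stand-ins, not the actual spark.ml API:

    ```scala
    // Stand-in for spark.mllib's local model (cheap to query). Hypothetical type.
    class OldLocalLDAModel(val topicWeights: Array[Double])

    // Stand-in for the distributed model; toLocal simulates the expensive collect.
    class OldDistributedLDAModel(val topicWeights: Array[Double]) {
      var collectCount = 0 // counts simulated collects, for demonstration only
      def toLocal: OldLocalLDAModel = {
        collectCount += 1
        new OldLocalLDAModel(topicWeights)
      }
    }

    // Abstract base: basic functionality lives here; accessing oldLocalModel
    // is the operation that may be expensive for the distributed case.
    sealed abstract class LDAModelSketch {
      protected def oldLocalModel: OldLocalLDAModel
      def topicWeights: Array[Double] = oldLocalModel.topicWeights
    }

    // Online-LDA case: the local model is the only representation.
    final class LocalLDAModelSketch(m: OldLocalLDAModel) extends LDAModelSketch {
      protected def oldLocalModel: OldLocalLDAModel = m // already local: no collect
    }

    // EM case: the local representation is built lazily, so the collect
    // happens at most once, and only if a caller actually needs it.
    final class DistributedLDAModelSketch(m: OldDistributedLDAModel) extends LDAModelSketch {
      protected lazy val oldLocalModel: OldLocalLDAModel = m.toLocal
    }
    ```

    The `lazy val` implementing the abstract `def` is what defers and memoizes the conversion; repeated queries against the distributed model pay for the collect only once.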


