Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/9678#discussion_r44731681
--- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala ---
@@ -314,31 +314,31 @@ private[clustering] trait LDAParams extends Params
with HasFeaturesCol with HasM
* Model fitted by [[LDA]].
*
* @param vocabSize Vocabulary size (number of terms or terms in the
vocabulary)
- * @param oldLocalModel Underlying spark.mllib model.
- * If this model was produced by Online LDA, then
this is the
- * only model representation.
- * If this model was produced by EM, then this local
- * representation may be built lazily.
* @param sqlContext Used to construct local DataFrames for returning
query results
*/
@Since("1.6.0")
@Experimental
-class LDAModel private[ml] (
+sealed abstract class LDAModel private[ml] (
@Since("1.6.0") override val uid: String,
@Since("1.6.0") val vocabSize: Int,
- @Since("1.6.0") protected var oldLocalModel: Option[OldLocalLDAModel],
@Since("1.6.0") @transient protected val sqlContext: SQLContext)
extends Model[LDAModel] with LDAParams with Logging {
- /** Returns underlying spark.mllib model */
+ // NOTE to developers:
+ // This abstraction should contain all important functionality for
basic LDA usage.
+ // Specializations of this class can contain expert-only functionality.
+
+ /**
+ * Underlying spark.mllib model.
+ * If this model was produced by Online LDA, then this is the only model
representation.
+ * If this model was produced by EM, then this local representation may
be built lazily.
+ */
@Since("1.6.0")
- protected def getModel: OldLDAModel = oldLocalModel match {
- case Some(m) => m
- case None =>
- // Should never happen.
- throw new RuntimeException("LDAModel required local model format," +
- " but the underlying model is missing.")
- }
+ protected def oldLocalModel: OldLocalLDAModel
+
+ /** Returns underlying spark.mllib model, which may be local or
distributed */
+ @Since("1.6.0")
+ protected def getModel: OldLDAModel
--- End diff --
This does not trigger a collect. It's oldLocalModel which can trigger a
collect.
I'll add some more warnings in places where it can happen in the public API.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]