Github user feynmanliang commented on a diff in the pull request:

    https://github.com/apache/spark/pull/7916#discussion_r36149524
  
    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala ---
    @@ -214,29 +214,61 @@ class LocalLDAModel private[clustering] (
           gammaShape)
       }
     
    +  /**
    +   * Calculates a lower bound on the log likelihood of the entire corpus and inferred topics.
    +   * Note that this bound sums 2 parts:
    +   *  - a bound on the log likelihood of the corpus, which scales with corpus size.
    +   *    See [[logLikelihood()]].
    +   *  - a bound on the log likelihood of the estimated topics (topic-term distributions),
    +   *    which does not scale with corpus size.
    +   *    See [[topicsLogLikelihood()]].
    +   *
    +   * See Equation (16) in original Online LDA paper.
    +   *
    +   * @param documents test corpus to use for calculating log likelihood
    +   * @return variational lower bound on the log likelihood of the entire corpus and inferred topics
    +   */
    +  def fullLogLikelihood(documents: RDD[(Long, Vector)]): Double = {
    --- End diff --
    
    Actually, this isn't the joint, sorry (though when I saw "full" that's what I was led to believe, until reading Eq (3) of https://www.cs.princeton.edu/~blei/papers/HoffmanBleiBach2010b.pdf).
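    For what it's worth, the decomposition the doc comment describes can be sketched with a toy class (illustrative names and numbers only, not the actual MLlib API):

    ```scala
    // Illustrative sketch -- these names mirror the doc comment above, not the
    // real Spark MLlib implementation. The point: the full variational bound is
    // the sum of a corpus term (which scales with corpus size) and a topics term
    // (which does not), per the structure of Eq (16).
    case class ToyLdaBound(perDocBound: Double, topicsBound: Double) {
      // bound on the corpus log likelihood: grows with the number of documents
      def logLikelihood(numDocs: Int): Double = perDocBound * numDocs

      // bound on the log likelihood of the estimated topic-term distributions:
      // independent of corpus size
      def topicsLogLikelihood: Double = topicsBound

      // full bound = corpus bound + topics bound
      def fullLogLikelihood(numDocs: Int): Double =
        logLikelihood(numDocs) + topicsLogLikelihood
    }
    ```

    E.g. `ToyLdaBound(-2.0, -3.0).fullLogLikelihood(10)` gives `-23.0`: the corpus part (`-20.0`) scaled with the 10 documents while the topics part (`-3.0`) stayed fixed.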


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
