[
https://issues.apache.org/jira/browse/MAHOUT-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109845#comment-13109845
]
Jake Mannix commented on MAHOUT-815:
------------------------------------
Yes, Yahoo primarily does Gibbs sampling (or "collapsed Gibbs sampling", to be
more precise), which is just a stochastic version of exactly the same update
equations used in collapsed variational Bayes.
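
To make the comparison concrete, here is a minimal sketch for a single token.
It assumes the zero-order "CVB0" form of collapsed variational Bayes (the full
CVB update adds variance correction terms that are left out here). The
function names, the count arrays and the example numbers are hypothetical and
are not taken from Mahout or from any Yahoo code. Both methods compute the
same per-topic weights; Gibbs then draws one hard topic from them, while CVB0
keeps the whole normalized vector as a soft assignment.

```python
import random


def topic_weights(doc_topic, topic_word, topic_total, alpha, beta, vocab_size):
    """Normalized weight of each topic for one token.

    The counts are assumed to already exclude the token being updated:
      doc_topic[k]   - count of topic k in this document
      topic_word[k]  - count of this word under topic k
      topic_total[k] - count of all words under topic k
    """
    raw = [
        (doc_topic[k] + alpha) * (topic_word[k] + beta)
        / (topic_total[k] + vocab_size * beta)
        for k in range(len(doc_topic))
    ]
    total = sum(raw)
    return [r / total for r in raw]


def gibbs_update(weights, rng):
    """Collapsed Gibbs: sample one topic, return it as a one-hot vector."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return [1.0 if j == k else 0.0 for j in range(len(weights))]


def cvb0_update(weights):
    """CVB0 (assumed zero-order form): keep the weights as a soft assignment."""
    return list(weights)


# Hypothetical counts for one token in a 2-topic model.
weights = topic_weights(
    doc_topic=[3.0, 1.0],
    topic_word=[2.0, 0.0],
    topic_total=[10.0, 8.0],
    alpha=0.5,
    beta=0.1,
    vocab_size=5,
)
rng = random.Random(0)
hard = gibbs_update(weights, rng)   # one sampled topic (stochastic)
soft = cvb0_update(weights)         # expected assignment (deterministic)
```

Averaged over many draws, the one-hot Gibbs assignments for a fixed set of
counts approach the soft CVB0 assignment, which is the sense in which one is a
stochastic version of the other.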
> LDA Inference Corrections, Alpha (Dirichlet) Estimation
> -------------------------------------------------------
>
> Key: MAHOUT-815
> URL: https://issues.apache.org/jira/browse/MAHOUT-815
> Project: Mahout
> Issue Type: Improvement
> Components: Clustering
> Affects Versions: 0.6
> Reporter: Christoph Boden
> Assignee: Sebastian Schelter
>
> Hi, I am a PhD student at TU Berlin DIMA. I am currently working on Mahout's
> LDA implementation together with Sebastian Schelter. We identified a couple
> of points that can be fixed or improved in the current version.
> We propose to fix the inference in the expectation step of EM in accordance
> with [1], to implement maximum likelihood estimation of the Dirichlet
> parameter (alpha) as presented in [1], and to do some refactoring.
> [1] Blei, David M.; Ng, Andrew Y.; Jordan, Michael I. (January 2003).
> Lafferty, John (ed.). "Latent Dirichlet allocation". Journal of Machine
> Learning Research 3 (4-5): pp. 993-1022. doi:10.1162/jmlr.2003.3.4-5.993