[
https://issues.apache.org/jira/browse/SPARK-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303737#comment-14303737
]
Joseph K. Bradley commented on SPARK-5556:
------------------------------------------
I believe [~mengxr] and [~witgo] have confirmed that's the plan. It would be
great to get this as another algorithm option; I suspect it will perform better
than EM.
One thought: There are several possibilities for Gibbs sampling algorithms:
* Collapsed Gibbs sampling (most common, but distributed implementations are
all non-ergodic). I believe this is what [~witgo]'s PR uses.
* Non-collapsed Gibbs sampling (not common, but easy to make an ergodic
distributed implementation)
* Modified versions of the LDA model designed for distributed computation, such
as HD-LDA in Newman et al. “Distributed Algorithms for Topic Models.” JMLR,
2009.
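
For concreteness, here is a minimal single-machine sketch of the collapsed
variant from the first bullet. It is not taken from the PR or from MLlib; the
function name, parameters, and default hyperparameters are all assumptions
made for illustration. Each token's topic is resampled from its conditional
given every other assignment, using the counts n_dk, n_kw, and n_k.

```python
import random


def collapsed_gibbs_lda(docs, num_topics, vocab_size,
                        alpha=0.1, beta=0.01, iters=50, seed=0):
    """Illustrative collapsed Gibbs sampler for LDA (hypothetical helper).

    docs: list of documents, each a list of word ids in [0, vocab_size).
    Returns (assignments, doc_topic, topic_word, topic_total).
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # n_dk
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # n_kw
    topic_total = [0] * num_topics                              # n_k

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from all counts.
                k_old = assignments[d][i]
                doc_topic[d][k_old] -= 1
                topic_word[k_old][w] -= 1
                topic_total[k_old] -= 1

                # Unnormalized conditional p(z = k | all other assignments).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                k_new = rng.choices(range(num_topics), weights=weights)[0]

                # Add the token back under its newly sampled topic.
                assignments[d][i] = k_new
                doc_topic[d][k_new] += 1
                topic_word[k_new][w] += 1
                topic_total[k_new] += 1

    return assignments, doc_topic, topic_word, topic_total
```

Every update here reads counts that already reflect all earlier updates in the
sweep. As I understand it, the usual distributed versions instead let each
partition sample against a stale copy of topic_word and topic_total and merge
the counts afterwards, which is the approximation the first bullet refers to.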
> Latent Dirichlet Allocation (LDA) using Gibbs sampler
> ------------------------------------------------------
>
> Key: SPARK-5556
> URL: https://issues.apache.org/jira/browse/SPARK-5556
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Reporter: Guoqiang Li
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)