[ 
https://issues.apache.org/jira/browse/SPARK-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303737#comment-14303737
 ] 

Joseph K. Bradley commented on SPARK-5556:
------------------------------------------

I believe [~mengxr] and [~witgo] have confirmed that's the plan.  It would be 
great to get this as another algorithm option; I suspect it will perform better 
than EM.

One thought: There are several possibilities for Gibbs sampling algorithms:
* Collapsed Gibbs sampling (most common, but distributed implementations are 
all non-ergodic). I believe this is what [~witgo]'s PR uses.
* Non-collapsed Gibbs sampling (not common, but easy to make an ergodic 
distributed implementation)
* Modified versions of the LDA model designed for distributed computation, such 
as HD-LDA in Newman et al. “Distributed Algorithms for Topic Models.” JMLR, 
2009.
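
For concreteness, here is a minimal single-machine sketch of the first option, 
collapsed Gibbs sampling, in Python. It is not taken from [~witgo]'s PR or from 
MLlib: the function name collapsed_gibbs_lda, its parameters, and the default 
hyperparameters are all assumptions made for illustration. Each token update 
reads and writes the shared topic-word counts, which is the part a distributed 
version has to approximate.

```python
import random


def collapsed_gibbs_lda(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
                        iterations=50, seed=0):
    """Illustrative single-machine collapsed Gibbs sampler for LDA.

    docs is a list of documents, each a list of integer word ids in
    [0, vocab_size). All names and defaults here are hypothetical.
    """
    rng = random.Random(seed)
    # The topic mixtures and topic-word distributions are integrated out,
    # so the sampler state is just the assignments plus these count tables.
    doc_topic = [[0] * num_topics for _ in docs]                # n[d][k]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # n[k][w]
    topic_total = [0] * num_topics                              # n[k]
    assignments = []

    # Start from a uniformly random topic for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove this token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # p(z = t | everything else) is proportional to
                # (n[d][t] + alpha) * (n[t][w] + beta) / (n[t] + V * beta).
                weights = [
                    (doc_topic[d][t] + alpha) * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                # Resample the topic and add the token back into the counts.
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return assignments, doc_topic, topic_word
```

Calling collapsed_gibbs_lda(docs, num_topics=2, vocab_size=5) on a toy corpus 
returns the per-token topic assignments along with the document-topic and 
topic-word count tables.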


> Latent Dirichlet Allocation (LDA) using Gibbs sampler 
> ------------------------------------------------------
>
>                 Key: SPARK-5556
>                 URL: https://issues.apache.org/jira/browse/SPARK-5556
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Guoqiang Li
>



