Joseph K. Bradley created SPARK-5563:
----------------------------------------

             Summary: LDA with online variational inference
                 Key: SPARK-5563
                 URL: https://issues.apache.org/jira/browse/SPARK-5563
             Project: Spark
          Issue Type: Test
          Components: MLlib
    Affects Versions: 1.3.0
            Reporter: Joseph K. Bradley


Latent Dirichlet Allocation (LDA) parameters can be inferred using online
variational inference, as in Hoffman, Blei and Bach, “Online Learning for
Latent Dirichlet Allocation,” NIPS, 2010.  Because this algorithm updates the
model from one mini-batch of documents at a time instead of passing over the
full corpus in every iteration, it should be efficient and should handle much
larger datasets than batch algorithms for LDA.
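
As a rough illustration of why the online algorithm scales, the sketch below
shows only the global update step described in the paper: each mini-batch
yields an estimate of the topic-word parameters as if the whole corpus looked
like that batch, and the estimate is blended into the current parameters with
a decaying weight rho_t = (tau0 + t)^(-kappa).  The per-document E-step that
produces the batch statistics is omitted, and every name here (learning_rate,
online_update, tau0, kappa, eta) is illustrative, not MLlib API.

```python
def learning_rate(t, tau0=1.0, kappa=0.7):
    """Step size rho_t = (tau0 + t) ** -kappa.

    The paper requires kappa in (0.5, 1] and tau0 >= 0 for convergence.
    """
    return (tau0 + t) ** (-kappa)


def online_update(lam, batch_stats, t, corpus_size, batch_size,
                  eta=0.01, tau0=1.0, kappa=0.7):
    """Blend mini-batch statistics into the topic-word parameters lam.

    lam and batch_stats are K x V nested lists (topics x vocabulary).
    batch_stats[k][w] is the expected count of word w assigned to topic k
    in this mini-batch, as computed by an E-step that is not shown here.
    """
    rho = learning_rate(t, tau0, kappa)
    # Rescale the batch counts as if the corpus were copies of this batch.
    scale = corpus_size / batch_size
    return [
        [(1.0 - rho) * lam_kw + rho * (eta + scale * s_kw)
         for lam_kw, s_kw in zip(lam_k, stats_k)]
        for lam_k, stats_k in zip(lam, batch_stats)
    ]
```

Each update touches only one mini-batch, so the cost per step does not depend
on the corpus size; the corpus size enters only through the rescaling factor.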

This algorithm will also be important for supporting Streaming versions of LDA.

The implementation will ideally use the same API as the existing LDA but use a 
different underlying optimizer.
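
One possible shape for this, purely as a sketch: the LDA entry point keeps its
parameters and run method, and the inference algorithm becomes a pluggable
object.  All class and method names below (LDAOptimizer, BatchOptimizer,
OnlineOptimizer, set_optimizer) are hypothetical and are not taken from the
current MLlib code; the optimizer bodies are placeholders that only report
what they would do.

```python
from abc import ABC, abstractmethod


class LDAOptimizer(ABC):
    """Hypothetical interface: anything that can fit k topics to a corpus."""

    @abstractmethod
    def fit(self, corpus, k):
        ...


class BatchOptimizer(LDAOptimizer):
    def fit(self, corpus, k):
        # Placeholder: a real batch optimizer would pass over the whole
        # corpus in every iteration.
        return {"algorithm": "batch", "k": k, "docs_seen": len(corpus)}


class OnlineOptimizer(LDAOptimizer):
    def __init__(self, batch_size=2):
        self.batch_size = batch_size

    def fit(self, corpus, k):
        # Placeholder: walk the corpus one mini-batch at a time; a real
        # online optimizer would update the model after each batch.
        num_batches = 0
        for start in range(0, len(corpus), self.batch_size):
            _batch = corpus[start:start + self.batch_size]
            num_batches += 1
        return {"algorithm": "online", "k": k, "batches": num_batches}


class LDA:
    """Same user-facing API regardless of which optimizer is plugged in."""

    def __init__(self, k=10, optimizer=None):
        self.k = k
        self.optimizer = optimizer or BatchOptimizer()

    def set_optimizer(self, optimizer):
        self.optimizer = optimizer
        return self

    def run(self, corpus):
        return self.optimizer.fit(corpus, self.k)
```

With this layout, existing callers of LDA(k=3).run(corpus) are unaffected, and
switching algorithms is a single call such as
LDA(k=3).set_optimizer(OnlineOptimizer(batch_size=2)).run(corpus).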

This will require hooking into the existing mllib.optimization framework.

Two points need discussion: whether batch versions of online variational
inference should be supported, and which variational approximation should be
used now or in the future.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
