[ https://issues.apache.org/jira/browse/SPARK-8555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15076488#comment-15076488 ]
Sean Owen commented on SPARK-8555:
----------------------------------
Have a look at the MLlib-specific contribution guidelines:
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-MLlib-specificContributionGuidelines
Generally speaking, there are a great many algorithms that could be implemented,
most of them aren't that useful or widely used, and so few really belong in MLlib
itself. I'm not commenting on HDP specifically here, though I don't think it's
that commonly used. The idea is that a new algorithm should first prove itself
out externally.
> Online Variational Inference for the Hierarchical Dirichlet Process
> -------------------------------------------------------------------
>
> Key: SPARK-8555
> URL: https://issues.apache.org/jira/browse/SPARK-8555
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Reporter: yuhao yang
> Priority: Minor
>
> This task is for exploring the online HDP algorithm described in
> http://jmlr.csail.mit.edu/proceedings/papers/v15/wang11a/wang11a.pdf.
> Major advantages of the algorithm: a single pass over the corpus,
> streaming-friendly, and automatic selection of K (the number of topics).
> Currently the scope is to support online HDP for topic modeling, i.e.
> probably as an optimizer for LDA.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)