[ https://issues.apache.org/jira/browse/SPARK-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615616#comment-14615616 ]
Pedro Rodriguez commented on SPARK-5556:
----------------------------------------
I am still interested, but was unsure of the status of other implementations.
Since there has not been much new activity, perhaps I should go ahead with it?
Last week I was also considering the possibility of making a Spark package for
LDA. The aims would be threefold: have more algorithms (I have been contacted
by a couple of researchers basing their work on the Gibbs LDA I worked on, plus
I will likely be using it in my own PhD starting this fall), provide a good
place for relatively new/unproven variants, and then pull the best into Spark.
I have been pretty busy so haven't gotten around to that, but it has been on
my mind.
When I get time this week, I will take a look at the current source and at
what I have, to see how much work it would take to get to something that could
be submitted as a pull request. Although FastLDA/LightLDA might be
algorithmically better, I think that what I have would be a good starting
place at least.
> Latent Dirichlet Allocation (LDA) using Gibbs sampler
> ------------------------------------------------------
>
> Key: SPARK-5556
> URL: https://issues.apache.org/jira/browse/SPARK-5556
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Reporter: Guoqiang Li
> Assignee: Pedro Rodriguez
> Attachments: LDA_test.xlsx, spark-summit.pptx
>
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)