[ 
https://issues.apache.org/jira/browse/SPARK-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615616#comment-14615616
 ] 

Pedro Rodriguez commented on SPARK-5556:
----------------------------------------

I am still interested, but was unsure of the status of other implementations. 
Given that not much is new, perhaps I should go ahead with it?

Last week I was also considering the possibility of making a Spark package for 
LDA. The aims would be threefold: offer more algorithms (I have been contacted 
by a couple of researchers basing their work on the Gibbs LDA I worked on, plus 
I will likely be using it in my own PhD starting this fall), provide a good 
place for relatively new/unproven variants, and then pull the best into Spark. 
I have been pretty busy so haven't gotten around to that, but it has been on my 
mind.

When I get time this week, I will compare the current source with what I have 
to see how much work it would take to get to something that could go into a 
pull request. Although FastLDA/LightLDA might be algorithmically better, I 
think that what I have would be a good starting place at least.

> Latent Dirichlet Allocation (LDA) using Gibbs sampler 
> ------------------------------------------------------
>
>                 Key: SPARK-5556
>                 URL: https://issues.apache.org/jira/browse/SPARK-5556
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Guoqiang Li
>            Assignee: Pedro Rodriguez
>         Attachments: LDA_test.xlsx, spark-summit.pptx
>
>




