Hi DB Tsai,

Not for now. My primary reference is http://jmlr.csail.mit.edu/proceedings/papers/v15/wang11a/wang11a.pdf .
I'm also looking for a way to maximize code reuse. Any suggestions are welcome. Thanks.

Regards,
yuhao

-----Original Message-----
From: DB Tsai [mailto:dbt...@dbtsai.com]
Sent: Thursday, June 4, 2015 1:01 PM
To: Yang, Yuhao
Cc: Joseph Bradley; Lorenz Fischer; dev@spark.apache.org
Subject: Re: MLlib: Anybody working on hierarchical topic models like HLDA?

Is your HDP implementation based on distributed Gibbs sampling? Thanks.

Sincerely,

DB Tsai
-------------------------------------------------------
Blog: https://www.dbtsai.com

On Wed, Jun 3, 2015 at 8:13 PM, Yang, Yuhao <yuhao.y...@intel.com> wrote:
> Hi Lorenz,
>
> I’m trying to build a prototype of HDP for a customer based on the
> current LDA implementations. An initial version will probably be ready
> within the next one or two weeks. I’ll share it and hopefully we can join
> forces.
>
> One concern is that I’m not sure how widely it will be used in the
> industry or community. Hope it’s popular enough to be accepted by
> Spark MLlib.
>
> http://www.cs.berkeley.edu/~jordan/papers/hierarchical-dp.pdf
> http://jmlr.csail.mit.edu/proceedings/papers/v15/wang11a/wang11a.pdf
>
> Regards,
> Yuhao
>
> From: Joseph Bradley [mailto:jos...@databricks.com]
> Sent: Thursday, June 4, 2015 7:17 AM
> To: Lorenz Fischer
> Cc: dev@spark.apache.org
> Subject: Re: MLlib: Anybody working on hierarchical topic models like HLDA?
>
> Hi Lorenz,
>
> I'm not aware of people working on hierarchical topic models for
> MLlib, but that would be cool to see. Hopefully other devs know more!
>
> Glad that the current LDA is helpful!
>
> Joseph
>
> On Wed, Jun 3, 2015 at 6:43 AM, Lorenz Fischer <lorenz.fisc...@gmail.com> wrote:
>
> Hi All
>
> I'm working on a project in which I use the current LDA implementation
> that has been contributed by Databricks' Joseph Bradley et al. for the
> recent 1.3.0 release (thanks guys!).
> While this is great, my project requires several levels of topics, as
> I would like to let users drill down into subtopics.
>
> As I understand it, Hierarchical Latent Dirichlet Allocation (HLDA)
> would offer such a hierarchy. Looking at the papers and talks by Blei
> [1,2] and Jordan [3], I think I should be able to implement HLDA in
> Spark using the Nested Chinese Restaurant Process (NCRP). However, as
> I have some time constraints, I'm not sure if I will have the time to do it
> 'the proper way'.
>
> In any case, I wanted to quickly ask around if anybody is already
> working on this or on some other form of a hierarchical topic model.
> Maybe I could contribute to these efforts instead of starting from scratch.
>
> Best,
> Lorenz
>
> [1] http://www.cs.princeton.edu/~blei/papers/BleiGriffithsJordan2009.pdf
> [2] http://papers.nips.cc/paper/2466-hierarchical-topic-models-and-the-nested-chinese-restaurant-process.pdf
> [3] https://www.youtube.com/watch?v=PxgW3lOrj60