LDA and evaluating topic number

Cody Buntain Thu, 28 Sep 2017 10:51:09 -0700

Hi, all!

        Is there an example somewhere of using LDA’s 
logPerplexity()/logLikelihood() functions to evaluate topic counts? The 
existing MLlib LDA examples show how to call them, but I can’t find any 
documentation on how to interpret the outputs. When I graph log perplexity and 
log likelihood against the number of topics, the results aren’t consistent with 
what I expected (perplexity increases and likelihood decreases as topics 
increase, which seems odd to me).
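
        For reference, here is a minimal sketch of the kind of sweep in 
question. It assumes the DataFrame-based pyspark.ml LDA API and input 
DataFrames that already carry a vectorized "features" column. The function 
names, the train/held-out split, and the default parameter values are 
illustrative assumptions, not taken from the linked notebook.

```python
def sweep_topic_counts(train_df, test_df, topic_counts, max_iter=50, seed=42):
    """Fit one LDA model per candidate k and score it on held-out data.

    Returns a list of (k, log_likelihood, log_perplexity) tuples.
    """
    # Imported lazily so the selection helper below works without Spark.
    from pyspark.ml.clustering import LDA

    scores = []
    for k in topic_counts:
        lda = LDA(k=k, maxIter=max_iter, seed=seed, featuresCol="features")
        model = lda.fit(train_df)
        # Per the Spark API docs, logLikelihood is a lower bound on the
        # corpus log likelihood (higher is better) and logPerplexity is an
        # upper bound on log perplexity (lower is better).
        scores.append(
            (k, model.logLikelihood(test_df), model.logPerplexity(test_df))
        )
    return scores


def lowest_perplexity_k(scores):
    """Return the k whose held-out logPerplexity is smallest."""
    best_k, _, _ = min(scores, key=lambda row: row[2])
    return best_k
```

Scoring on a held-out set rather than the training set is a design choice of 
this sketch; the two values should move in opposite directions, so either one 
can be plotted against k.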


        An example of what I’m doing is here: 
http://www.cs.umd.edu/~cbuntain/FindTopicK-pyspark-regex.html

        Thanks very much in advance! If I can figure this out, I can post 
example code online, so others can see how this process is done.

-Best regards,
Cody
_________________
Cody Buntain, PhD
Postdoc, @UMD_CS
Intelligence Community Postdoctoral Fellow
www.cs.umd.edu/~cbuntain
