The clustering is unsupervised, so it doesn't tell you what a topic stands for. It's up to you to label each topic based on its highest-scoring words.
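If it helps, here is a minimal Python sketch for pulling out the highest-scoring words of one topic so you can eyeball a label. It assumes each line of a topic_x file looks like the two sample lines in your message (`word [p(word|topic_N) = probability`, with or without a closing bracket); I have not checked that against the actual ldatopics output, and the function name `top_words` is just something I made up for illustration.

```python
import re

# Assumed line format, taken from the sample lines in the quoted message:
#   word [p(word|topic_N) = probability
# The closing "]" is treated as optional because the samples do not show one.
LINE_RE = re.compile(
    r"^\s*(\S+)\s+\[p\(.*?\|topic_\d+\)\s*=\s*([0-9.eE+-]+)\]?\s*$"
)

def top_words(lines, n=10):
    """Return the n (word, probability) pairs with the highest probability."""
    scored = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:  # silently skip lines that do not fit the assumed format
            scored.append((match.group(1), float(match.group(2))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]

# The two sample lines from the message; a real run would read a topic_x file.
sample = [
    "model [p(model|topic_0) = 0.010358664102351409",
    "tissues [p(tissues|topic_0) = 0.008870984984037485",
]

for word, prob in top_words(sample):
    print(f"{word}\t{prob:.4f}")
```

For a real file you would pass an open file handle instead of `sample`, e.g. `top_words(open("topic_0"))`, and then pick a label by reading the top ten or twenty words yourself.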
On Thu, Jun 30, 2011 at 11:08 AM, wine lover <[email protected]> wrote:

> Hello Everyone,
>
> I have two questions on the LDA analysis.
>
> After running the command of lda, under the generated directory of
> "testdata-lda", there have several folders: docTopics state-0 state-1
> ....
>
> It seems to me that those folders of "state-x" will be transferred into
> readable format after running "ldatopics". But what does the folder of
> "docTopics" stand for? How can I view it?
>
> Running the command of ldatopics generates 20 files, (topic_0, topic_1,
> etc), in total. For instance, in the file of topic_0, I get information
> such as follows:
>
> model [p(model|topic_0) = 0.010358664102351409
> tissues [p(tissues|topic_0) = 0.008870984984037485
>
> How can I tell what does topic_0 stand for? Where to find this kind of
> information? Moreover, is there any other procedures existed to generate
> the clustering result based on these topic_x files.
>
> Thank you very much for the help.
>
> Wenyia

--
Yee Yang Li Hector
http://hectorgon.blogspot.com/ (tech + travel)
http://hectorgon.com (book reviews)
