I think this requires a separate program which does not exist.

On Thu, Jun 30, 2011 at 12:02 PM, wine lover <[email protected]> wrote:
> Thanks, Hector, you are right, the exact meaning of topic_i is not necessary
> for unsupervised clustering.
>
> However, in order to cluster a set of documents, I still need to know the
> probabilistic relationship between topic and each document. I am not very
> clear how to get this kind of information from the generated result.
>
> For instance, model [p(model|topic_0) = 0.010358664102351409 Here, model is
> a word, but the result does not tell me anything between this word and a
> given document? Thanks.
>
>
> On Thu, Jun 30, 2011 at 2:08 PM, wine lover <[email protected]> wrote:
>
>> Hello Everyone,
>>
>> I have two questions on the LDA analysis.
>>
>> After running the command of lda, under the generated directory of
>> "testdata-lda", there have several folders: docTopics state-0 state-1
>> ....
>>
>> It seems to me that those folders of "state-x" will be transferred into
>> readable format after running "ldatopics". But what does the folder of
>> "docTopics" stand for? How can I view it?
>>
>> Running the command of ldatopics generates 20 files, (topic_0, topic_1,
>> etc), in total. For instance, in the file of topic_0, I get information such
>> as follows:
>> model [p(model|topic_0) = 0.010358664102351409
>> tissues [p(tissues|topic_0) = 0.008870984984037485
>>
>> How can I tell what does topic_0 stand for? Where to find this kind of
>> information? Moreover, is there any other procedures existed to generate
>> the clustering result based on these topic_x files.
>>
>>
>> Thank you very much for the help.
>>
>> Wenyia
>>
>
--
Lance Norskog
[email protected]
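
For what it's worth, here is a rough sketch of what such a separate program might look like. It is not part of Mahout and it is not real LDA inference. It only assumes that each topic_x file contains lines in the format quoted above ("word [p(word|topic_N) = value"), and it scores a tokenized document against each topic by summing log p(word|topic), which amounts to a naive Bayes style assignment with a uniform prior over topics. The names parse_topic_lines and score_document and the floor value for unseen words are made up for illustration.

```python
import math
import re
from collections import Counter

# Matches lines such as: model [p(model|topic_0) = 0.010358664102351409
LINE_RE = re.compile(r"^(\S+)\s+\[p\(.*\|topic_(\d+)\)\s*=\s*([0-9.eE+-]+)")


def parse_topic_lines(lines):
    """Return {word: p(word|topic)} parsed from the lines of one topic_x file."""
    probs = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            probs[match.group(1)] = float(match.group(3))
    return probs


def score_document(tokens, topics, floor=1e-9):
    """Heuristic per-topic weights for one document (an assumption, not LDA inference).

    tokens: list of words in the document.
    topics: list of {word: probability} dicts, one per topic.
    floor:  hypothetical probability given to words missing from a topic file.
    """
    counts = Counter(tokens)
    log_scores = [
        sum(n * math.log(word_probs.get(word, floor)) for word, n in counts.items())
        for word_probs in topics
    ]
    # Normalize in log space so the weights sum to 1 without underflow.
    top = max(log_scores)
    weights = [math.exp(score - top) for score in log_scores]
    total = sum(weights)
    return [weight / total for weight in weights]
```

Each document could then be assigned to the topic with the largest weight, which gives a crude clustering. Because the topic_x files list only the top words for each topic, the result depends heavily on the floor value, so treat it as a starting point only.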
