I'm using Mahout 0.6. I have run the "mahout lda..." command-line tool to apply the LDA method to a corpus. Now I want to do the same from my Java program, but I'm having a lot of problems because it crashes. Can someone give me an example of Java code that runs correctly?
Looking at the output of LDA, I have 2 folders:

- docTopics: contains a Text key (the document ID) and a vector value (the membership of that document in each topic).
- state-n: I assume the IntPairWritable key is (topicID, wordID), so each topic has one entry for every word ID in the corpus. I don't know what the DoubleWritable value is. I think it is the membership between the topic and the word, but I don't know what kind of measure is used.

For example, here is part of a split that I printed:

    ...
    (4, 17847) -28.424714110200803
    (4, 17848) -32.54168874531223
    (4, 17849) -51.954687480087074
    (4, 17850) -1.8811618929248652E-12
    (4, 17851) -7.102634146221668
    (4, 17852) 3.440324743165531
    (4, 17853) 1.118778127312393
    (4, 17854) 2.2973859313207385
    (4, 17855) 2.1602327860824015
    (4, 17856) -2.5362957334351677E-6
    (4, 17857) -32.80559170476965
    (4, 17858) -1.9791269423308222E-7
    ...

Can somebody help me by explaining this?
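To make my guess concrete: if the state-n values were unnormalized log-scale weights of a word within a topic (this is only my assumption, I have not confirmed it), then something like the sketch below would turn them into values that sum to 1 for a topic. The class name TopicWordWeights and the method normalize are hypothetical names of my own, not part of Mahout, and the sketch uses only plain Java. It normalizes over whatever entries it is given, so with just a few values from the split above the result is not a real per-topic distribution over the whole vocabulary.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TopicWordWeights {

    // Hypothetical helper (not a Mahout API): assumes the inputs are
    // log-scale weights and rescales them so that they sum to 1.
    static Map<Integer, Double> normalize(Map<Integer, Double> logWeights) {
        // Subtract the maximum before exponentiating to avoid overflow/underflow.
        double max = Double.NEGATIVE_INFINITY;
        for (double v : logWeights.values()) {
            max = Math.max(max, v);
        }
        double sum = 0.0;
        for (double v : logWeights.values()) {
            sum += Math.exp(v - max);
        }
        // Log of the total weight, computed in a numerically stable way.
        double logTotal = max + Math.log(sum);

        Map<Integer, Double> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, Double> e : logWeights.entrySet()) {
            result.put(e.getKey(), Math.exp(e.getValue() - logTotal));
        }
        return result;
    }

    public static void main(String[] args) {
        // A few (wordID, value) pairs for topic 4, copied from the split above.
        Map<Integer, Double> topic4 = new LinkedHashMap<>();
        topic4.put(17850, -1.8811618929248652E-12);
        topic4.put(17851, -7.102634146221668);
        topic4.put(17852, 3.440324743165531);
        topic4.put(17853, 1.118778127312393);

        for (Map.Entry<Integer, Double> e : normalize(topic4).entrySet()) {
            System.out.printf("word %d -> %.6f%n", e.getKey(), e.getValue());
        }
    }
}
```

If the values are something else (for example already normalized, or not on a log scale at all), this conversion would be meaningless, which is exactly what I would like someone to confirm.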
