Hello,

We are currently evaluating the LDA implementation in mahout-0.6 for a couple of our use cases. One of them is assigning topic probabilities to new documents, that is, documents not contained in the training corpus. After a bit of research we found that the LDADriver.computeDocumentTopicProbabilities method might be a good starting point, but this method is private. Another problem is creating a vector from a new document using the same dictionary: it seems that SparseVectorsFromSequenceFiles only supports "collective" creation of vectors for a whole corpus, not for a single document.
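
For the second step, what we have in mind is roughly the following. This is only a minimal sketch in plain Java and does not use any Mahout API: the class and method names are hypothetical, the dictionary is assumed to be already loaded as a term-to-index map, and the tokenization is a naive stand-in for whatever analyzer was used on the training corpus.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Mahout: turns one new document into a
// sparse term-frequency vector using an existing term -> index dictionary.
class NewDocumentVectorizer {

    // Returns index -> term count; terms missing from the dictionary are skipped.
    static Map<Integer, Double> vectorize(String text, Map<String, Integer> dictionary) {
        Map<Integer, Double> vector = new TreeMap<Integer, Double>();
        // Naive tokenization (assumption): lowercase, split on non-word characters.
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.isEmpty()) {
                continue;
            }
            Integer index = dictionary.get(token);
            if (index == null) {
                continue; // out-of-vocabulary term, unknown to the trained model
            }
            Double count = vector.get(index);
            vector.put(index, count == null ? 1.0 : count + 1.0);
        }
        return vector;
    }

    public static void main(String[] args) {
        // Toy dictionary standing in for the one written during training.
        Map<String, Integer> dictionary = new HashMap<String, Integer>();
        dictionary.put("topic", 0);
        dictionary.put("model", 1);
        dictionary.put("document", 2);

        String text = "A topic model assigns topic weights to a new document";
        System.out.println(vectorize(text, dictionary)); // prints {0=2.0, 1=1.0, 2=1.0}
    }
}
```

The resulting index/count pairs would then have to be copied into whatever vector type the inference step expects, and any weighting applied to the training vectors would need to be applied here in the same way.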

Is there perhaps already something implemented that I might have overlooked and that accomplishes one or both of these steps? I would be grateful for any suggestions or hints before I start implementing something myself.

Thanks,
    Dimitri

--
Neofonie GmbH
Robert-Koch-Platz 4
10115 Berlin
T +49.30 24627-241
F +49.30 24627 120
[email protected]
http://www.neofonie.de

Handelsregister
Berlin-Charlottenburg: HRB 67460

Geschäftsführung
Thomas Kitlitschko
