I already have (term, weight) data on which I want to run an LDA analysis to find the topic distribution.
How should I create the Mahout vectors from this? The documentation says I can use VectorWriter, but I'm not sure how to go about it. The relevant section reads:

*Converting existing vectors to Mahout's format*

If you are in the happy position to already own a document (as in: texts, images or whatever item you wish to treat) processing pipeline, the question arises of how to convert the vectors into the Mahout vector format. Probably the easiest way to go would be to implement your own Iterable<Vector> (called VectorIterable in the example below) and then reuse the existing VectorWriter classes:

    VectorWriter vectorWriter = SequenceFile.createWriter(filesystem, configuration, outfile, LongWritable.class, SparseVector.class);
    long numDocs = vectorWriter.write(new VectorIterable(), Long.MAX_VALUE);
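A minimal sketch of what the documentation's "implement your own Iterable" advice could look like for (term, weight) data. It uses only the Java standard library: the classes Dictionary and VectorIterable and the index-to-weight map that stands in for a vector are hypothetical names chosen for illustration, not Mahout API. The assumption is that each document arrives as a map from term to weight, and that each term must be given a stable integer index before the data can be stored as a sparse vector.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical names throughout; nothing in this file is Mahout API.
class TermWeightVectors {

    /** Assigns each distinct term a stable integer index, in first-seen order. */
    static final class Dictionary {
        private final Map<String, Integer> indexByTerm = new LinkedHashMap<>();

        int indexOf(String term) {
            Integer existing = indexByTerm.get(term);
            if (existing != null) {
                return existing;
            }
            int next = indexByTerm.size();
            indexByTerm.put(term, next);
            return next;
        }

        int size() {
            return indexByTerm.size();
        }
    }

    /** Lazily converts each document's (term, weight) map into index -> weight. */
    static final class VectorIterable implements Iterable<Map<Integer, Double>> {
        private final List<Map<String, Double>> documents;
        private final Dictionary dictionary;

        VectorIterable(List<Map<String, Double>> documents, Dictionary dictionary) {
            this.documents = documents;
            this.dictionary = dictionary;
        }

        @Override
        public Iterator<Map<Integer, Double>> iterator() {
            final Iterator<Map<String, Double>> source = documents.iterator();
            return new Iterator<Map<Integer, Double>>() {
                @Override
                public boolean hasNext() {
                    return source.hasNext();
                }

                @Override
                public Map<Integer, Double> next() {
                    // Sorted by term index; a real pipeline would copy these
                    // entries into a Mahout Vector instead (assumption).
                    Map<Integer, Double> row = new TreeMap<>();
                    for (Map.Entry<String, Double> e : source.next().entrySet()) {
                        row.put(dictionary.indexOf(e.getKey()), e.getValue());
                    }
                    return row;
                }
            };
        }
    }

    public static void main(String[] args) {
        Map<String, Double> d1 = new LinkedHashMap<>();
        d1.put("mahout", 2.0);
        d1.put("vector", 1.0);
        Map<String, Double> d2 = new LinkedHashMap<>();
        d2.put("vector", 3.0);
        d2.put("lda", 0.5);
        List<Map<String, Double>> docs = new ArrayList<>();
        docs.add(d1);
        docs.add(d2);

        Dictionary dictionary = new Dictionary();
        for (Map<Integer, Double> row : new VectorIterable(docs, dictionary)) {
            System.out.println(row);
        }
    }
}
```

Because indices are assigned on first sight, the total number of distinct terms is only known after a full pass over the data. If the target vector type needs its cardinality up front, one option is to build the dictionary in a first pass and emit vectors in a second.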
