Creating Mahout vector from existing vector

sampath Wed, 15 Aug 2012 09:30:15 -0700

I already have (term, weight) data on which I want to run an LDA
analysis to find the topic distribution.


How should I create the Mahout vectors from this?
The documentation says I can use VectorWriter, but I'm not sure how to go
about it.

*Converting existing vectors to Mahout's format*

If you are in the happy position to already own a document (as in: texts,
images or whatever item you wish to treat) processing pipeline, the question
arises of how to convert the vectors into the Mahout vector format. Probably
the easiest way to go would be to implement your own Iterable<Vector>
(called VectorIterable in the example below) and then reuse the existing
VectorWriter classes:

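// SequenceFile.createWriter returns a SequenceFile.Writer, so wrap it in a VectorWriter.
SequenceFile.Writer seqWriter = SequenceFile.createWriter(filesystem,
configuration, outfile, LongWritable.class, VectorWritable.class);
VectorWriter vectorWriter = new SequenceFileVectorWriter(seqWriter);
long numDocs = vectorWriter.write(new VectorIterable(), Long.MAX_VALUE);
vectorWriter.close(); // flushes and closes the underlying sequence file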
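
Since Mahout vectors are indexed by integer position rather than by term, (term, weight) data first needs a term-to-index dictionary. The following is only a minimal sketch of that step using plain Java collections. The class TermWeightVectors and its methods are hypothetical names invented for illustration, not part of Mahout. In a real pipeline each (index, weight) entry would presumably be copied into a Mahout sparse vector (for example with setQuick(index, weight) on a RandomAccessSparseVector) inside your own Iterable<Vector>.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: maps (term, weight) pairs onto integer-indexed sparse entries. */
class TermWeightVectors {

    // Term -> column index, assigned in order of first appearance.
    private final Map<String, Integer> dictionary = new LinkedHashMap<>();

    /** Returns the index for a term, assigning the next free index if the term is new. */
    public int indexOf(String term) {
        Integer idx = dictionary.get(term);
        if (idx == null) {
            idx = dictionary.size();
            dictionary.put(term, idx);
        }
        return idx;
    }

    /** Converts one document's (term, weight) map into sorted index -> weight entries. */
    public Map<Integer, Double> toSparse(Map<String, Double> termWeights) {
        Map<Integer, Double> sparse = new TreeMap<>();
        for (Map.Entry<String, Double> e : termWeights.entrySet()) {
            sparse.put(indexOf(e.getKey()), e.getValue());
        }
        return sparse;
    }

    /** Number of distinct terms seen so far; usable as the vector cardinality. */
    public int cardinality() {
        return dictionary.size();
    }
}
```

Because every vector in a data set should share one cardinality, it is probably simplest to run all documents through toSparse once to fill the dictionary, and only then build the actual vectors with cardinality() as their size.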





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Creating-Mahout-vector-from-existing-vector-tp4001436.html
Sent from the Mahout User List mailing list archive at Nabble.com.
