The input to KMeans is just a sequence file of vectors, so you can create a sequence file writer and output your own weights instead of the term counts.
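
A rough sketch of the step that comes before the writer, assuming you want one shared dictionary that maps each term to a vector dimension. TermWeightEncoder and its methods are hypothetical names, not part of Mahout. It only turns a <term,weight> map into index/weight pairs and copies the weights as-is. In Mahout you would then typically set those pairs on a sparse vector such as RandomAccessSparseVector, wrap it in a VectorWritable, and append it with a Hadoop SequenceFile.Writer. Those calls are left out here so the sketch has no dependencies.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Mahout: turns <term, weight> maps into
// (dimension index -> weight) entries that a sparse vector can hold.
final class TermWeightEncoder {
    // Shared dictionary: every distinct term gets one stable dimension index.
    private final Map<String, Integer> dictionary = new LinkedHashMap<>();

    // Returns the index of a term, assigning the next free index on first sight.
    int indexOf(String term) {
        Integer idx = dictionary.get(term);
        if (idx == null) {
            idx = dictionary.size();
            dictionary.put(term, idx);
        }
        return idx;
    }

    // Encodes one document. The supplied weights are copied unchanged;
    // no tf or tf-idf is computed.
    Map<Integer, Double> encode(Map<String, Double> termWeights) {
        Map<Integer, Double> sparse = new TreeMap<>();
        for (Map.Entry<String, Double> e : termWeights.entrySet()) {
            sparse.put(indexOf(e.getKey()), e.getValue());
        }
        return sparse;
    }

    // Number of distinct terms seen so far, i.e. the cardinality the
    // vectors would need once every document has been encoded.
    int cardinality() {
        return dictionary.size();
    }
}
```

All vectors in one clustering run need the same cardinality, so one option is to encode every document first and only then create and write the vectors, once cardinality() is final.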

Sent from my iPad

On Jun 13, 2011, at 9:52 PM, sharath jagannath <[email protected]> wrote:

> Hey All,
>
> I intend to build a KMeans clustering for my documents but do not want to
> use the tf/tf-idf based vectors.
> I have a map of <term,weight> associated with the document.
> I want to use these weights in place of the term counts for my computation.
> I was having a look at the DocumentProcessor class; it is primarily driven by
> the term count.
> I am wondering whether I need to do something on my own or whether there is
> built-in support for this.
>
> --
> Thanks,
> Sharath
