10x, that was what I needed :)

On Wed, Jan 6, 2010 at 4:58 AM, Drew Farris <[email protected]> wrote:
> Take a look at o.a.m.clustering.ClusterDumper in mahout-utils. The
> points file is a SequenceFile<Text,Text> where the key is the vector
> id and the value is a cluster id.
>
> On Tue, Jan 5, 2010 at 9:51 PM, Bogdan Vatkov <[email protected]>
> wrote:
> > I customized the lucene index-to-vector dumper already quite a lot (e.g.
> > applied stop-words (from file), stop-regex) but I am wondering how the input
> > vectors are later reachable if I start from cluster vectors, you say points
> > are somehow doing that, where can I read more or can you tell me more, or is
> > there a piece of code which would best guide me through the points format?

--
Best regards,
Bogdan
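
For anyone following the thread, here is a minimal sketch of how the points mapping described above (key = vector id, value = cluster id) can be inverted to find the input vectors that belong to each cluster. It assumes the pairs have already been read out of the points SequenceFile into a plain map (for example with Hadoop's SequenceFile.Reader). The class name PointsByCluster, the method groupByCluster, and the sample ids are hypothetical and are not part of Mahout.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a Mahout class.
final class PointsByCluster {

    // Inverts (vector id -> cluster id) pairs into (cluster id -> vector ids),
    // preserving the order in which the pairs were supplied.
    static Map<String, List<String>> groupByCluster(Map<String, String> points) {
        Map<String, List<String>> clusters = new LinkedHashMap<>();
        for (Map.Entry<String, String> point : points.entrySet()) {
            String vectorId = point.getKey();
            String clusterId = point.getValue();
            clusters.computeIfAbsent(clusterId, k -> new ArrayList<>()).add(vectorId);
        }
        return clusters;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the contents of a points file.
        Map<String, String> points = new LinkedHashMap<>();
        points.put("doc-1", "cluster-A");
        points.put("doc-2", "cluster-B");
        points.put("doc-3", "cluster-A");

        for (Map.Entry<String, List<String>> cluster : groupByCluster(points).entrySet()) {
            System.out.println(cluster.getKey() + " -> " + cluster.getValue());
        }
    }
}
```

Starting from a cluster id, looking it up in the returned map gives the ids of the input vectors assigned to that cluster, which can then be matched back to the documents in the Lucene index.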
