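You need to group the ratings by user before converting them to vectors to get sensible clustering. Each row of the ratings file is a single (user, movie, rating) record rather than a feature vector, so clustering the raw rows would not tell you much.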
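A rough sketch of the grouping step, in plain Python rather than Mahout code, just to show the shape of the data you want to end up with. It assumes each line looks like UserID::MovieID::Rating::Timestamp (which I believe is the layout of the 1M ratings file; pass sep="," if yours really is a CSV). The function names and the sample rows are made up for illustration. Turning the result into Mahout vectors in a sequence file is a separate step that is not shown here.

```python
from collections import defaultdict


def parse_rating(line, sep="::"):
    """Split one ratings line into (user_id, movie_id, rating).

    Assumes four fields per line: user, movie, rating, timestamp.
    """
    user, movie, rating, _timestamp = line.strip().split(sep)
    return int(user), int(movie), float(rating)


def group_by_user(lines, sep="::"):
    """Collect ratings into one sparse vector per user.

    Returns {user_id: {movie_id: rating}}. Movies a user has not rated
    are simply absent, which is why a sparse layout fits better than a
    dense row per user.
    """
    vectors = defaultdict(dict)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        user, movie, rating = parse_rating(line, sep)
        vectors[user][movie] = rating
    return dict(vectors)


# Made-up rows in the assumed layout, not real MovieLens data.
sample = [
    "1::10::5::1000000000",
    "1::20::3::1000000001",
    "2::10::4::1000000002",
]
user_vectors = group_by_user(sample)
print(user_vectors)
```

Each user then becomes one point to cluster, with movie IDs as dimensions and ratings as values.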

On Wed, Jun 12, 2013 at 1:06 PM, Grant Ingersoll <[email protected]> wrote:

> The CSVVectorIterator in the Integration package will take in a CSV file
> and produce vectors. It assumes that each row is the equivalent of a
> DenseVector (does MovieLens fit that?) If you need otherwise, I'd suggest
> starting with the code and modifying to fit your needs.
>
> -Grant
>
> On Jun 12, 2013, at 6:11 AM, Neetha <[email protected]> wrote:
>
> > Hi,
> >
> > I am using 1m movielens.
> >
> > I need to run the K-means clustering using mahout and hadoop. Actually,
> > 1st step in the clustering is to convert into a sequence file, then into
> > vector format and then apply the clustering algorithm. My doubt is, Is
> > there any need to convert the movielens rating.csv file into a sequence
> > file. If needed what are the commands for applying clustering technique
> > using mahout and the hadoop.
> >
> > Thanking you,
> > Neetha Suan Thampi
>
> --------------------------------------------
> Grant Ingersoll | @gsingers
> http://www.lucidworks.com
