The CSVVectorIterator in the Integration package takes a CSV file and produces vectors. It assumes that each row is the equivalent of a DenseVector (does MovieLens fit that?). If you need something else, I'd suggest starting with that code and modifying it to fit your needs.
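
To make the dense-row assumption concrete, here is a rough sketch in plain Python (not Mahout code) of one way to pivot rating triples into one dense row per user. It assumes comma-separated lines of the form user,item,rating with no header row, and it fills unrated items with 0.0. The function name ratings_to_dense_rows is made up for illustration, so adjust the delimiter and columns to whatever your file actually contains.

```python
import csv
import io


def ratings_to_dense_rows(lines, delimiter=","):
    """Pivot (user, item, rating) triples into one dense row per user.

    Assumes each input line starts with user id, item id, rating.
    Any extra columns (such as a timestamp) are ignored.
    """
    ratings = {}   # user id -> {item id -> rating}
    items = set()  # every item id seen, used to fix the column order
    for fields in csv.reader(lines, delimiter=delimiter):
        if not fields:
            continue  # skip blank lines
        user, item, rating = int(fields[0]), int(fields[1]), float(fields[2])
        ratings.setdefault(user, {})[item] = rating
        items.add(item)

    columns = sorted(items)
    # Assumption: an unrated item becomes 0.0 in the dense row.
    rows = {
        user: [by_item.get(item, 0.0) for item in columns]
        for user, by_item in sorted(ratings.items())
    }
    return columns, rows


# Tiny made-up sample, not real MovieLens data.
sample = io.StringIO("1,10,5\n1,20,3\n2,20,4\n")
columns, rows = ratings_to_dense_rows(sample)
for user, row in rows.items():
    # One line per user: every value in the row is numeric.
    print(",".join(str(value) for value in row))
```

Each printed line is then an all-numeric row of the kind a row-per-vector CSV reader expects. Whether dense rows are a sensible representation for your data is a separate question, since most users rate only a small fraction of the movies.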

-Grant

On Jun 12, 2013, at 6:11 AM, Neetha <[email protected]> wrote:

> Hi,
>
> I am using 1m movielens.
>
> I need to run the K-means clustering using mahout and hadoop. Actually,
> 1st step in the clustering is to convert into a sequence file, then into
> vector format and then apply the clustering algorithm. My doubt is, Is
> there any need to convert the movielens rating.csv file into a sequence
> file. If needed what are the commands for applying clustering technique
> using mahout and the hadoop.
>
> Thanking you,
> Neetha Suan Thampi

--------------------------------------------
Grant Ingersoll | @gsingers
http://www.lucidworks.com
