The CSVVectorIterator in the Integration package takes a CSV file and
produces vectors.  It assumes that each row is the equivalent of a DenseVector
(does MovieLens fit that?).  If you need something different, I'd suggest
starting with that code and modifying it to fit your needs.
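
To make the "each row is a DenseVector" assumption concrete, here is a minimal
sketch in plain Python.  It does not use Mahout or CSVVectorIterator at all.
The column layout (user, movie, rating, timestamp), the sample rows, and both
function names are assumptions made up for illustration.  It contrasts reading
each row as one vector with pivoting ratings into one vector per user, which is
probably closer to what clustering users would need.

```python
import csv
import io


def rows_to_dense_vectors(csv_text):
    """Treat every CSV row as one dense vector of floats (one row = one vector)."""
    reader = csv.reader(io.StringIO(csv_text))
    return [[float(cell) for cell in row] for row in reader if row]


def ratings_to_user_vectors(csv_text, num_movies):
    """Pivot (user, movie, rating, ...) rows into one dense vector per user.

    Assumes movie ids run from 1 to num_movies; unrated movies stay at 0.0.
    """
    vectors = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue
        user, movie, rating = int(row[0]), int(row[1]), float(row[2])
        vec = vectors.setdefault(user, [0.0] * num_movies)
        vec[movie - 1] = rating
    return vectors


# Hypothetical sample in an assumed user,movie,rating,timestamp layout.
sample = "1,1,5,978300760\n1,3,4,978302109\n2,2,3,978301968\n"

# Row-per-vector reading: ids and timestamps end up as vector components.
naive = rows_to_dense_vectors(sample)

# Per-user reading: one rating slot per movie.
by_user = ratings_to_user_vectors(sample, num_movies=3)
```

With the row-per-vector reading, the user id, movie id, and timestamp all become
features, which is rarely meaningful for k-means.  The pivoted form gives one
vector per user, at the cost of mostly-zero entries on a real ratings file.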


-Grant

On Jun 12, 2013, at 6:11 AM, Neetha <[email protected]> wrote:

> Hi,
> 
> 
> I am using the MovieLens 1M dataset.
> 
> I need to run k-means clustering using Mahout and Hadoop. As I understand
> it, the first step is to convert the data into a sequence file, then into
> vector format, and then apply the clustering algorithm. My question is: do
> I need to convert the MovieLens rating.csv file into a sequence file? If
> so, what are the commands for running the clustering with Mahout and
> Hadoop?
> 
> Thanking you,
> Neetha Suan Thampi

--------------------------------------------
Grant Ingersoll | @gsingers
http://www.lucidworks.com




