Hi, I want to vectorize the MovieLens 100K dataset as RandomAccessSparseVector instances and use them to run Mahout k-means clustering. Has anyone done this before? If not, any ideas on how this can be done? (BTW, the MovieLens dataset contains ~100K records/lines in this format: userid, itemid, rating, unix time.)
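To make the question concrete, here is a rough sketch of the grouping step I have in mind: one sparse row per user, indexed by item id, with the rating as the value and the timestamp dropped. It uses plain Java maps as a stand-in and does not touch Mahout at all. The class name MovieLensVectors, the parse helper, and the sample lines are made up for illustration, and it assumes the fields are separated by tabs, commas, or spaces.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Mahout: groups ratings into one sparse row per user.
public class MovieLensVectors {

    // Returns userId -> (itemId -> rating). The fourth field (unix time) is ignored.
    // Assumption: fields are separated by tabs, commas, or spaces.
    static Map<Integer, Map<Integer, Double>> parse(List<String> lines) {
        Map<Integer, Map<Integer, Double>> rows = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = trimmed.split("[,\\s]+");
            int user = Integer.parseInt(fields[0]);
            int item = Integer.parseInt(fields[1]);
            double rating = Double.parseDouble(fields[2]);
            // Only non-zero entries are stored, which is what makes the row sparse.
            rows.computeIfAbsent(user, u -> new TreeMap<>()).put(item, rating);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample records in the userid, itemid, rating, unix time layout.
        List<String> sample = Arrays.asList(
                "196\t242\t3\t881250949",
                "196\t302\t4\t881250950",
                "186\t302\t3\t891717742");
        System.out.println(parse(sample));
    }
}
```

My guess is that each inner map would then become a RandomAccessSparseVector (cardinality = number of items, filled with set(itemId, rating)), wrapped in a VectorWritable and written to a SequenceFile for k-means to read, but I have not tried that part.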
Thanks,
Carlos
