spark-itemsimilarity takes tuples of the form user-id,item-id.
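To make the tuple format concrete, here is a minimal sketch in plain Python of how user-id,item-id lines could be grouped into one row of items per user. This is only an illustration and not Mahout's actual implementation; the helper name collect_rows is hypothetical.

```python
from collections import defaultdict


def collect_rows(lines):
    """Group user-id,item-id tuples into one sorted list of item ids per user.

    Hypothetical helper for illustration only; not part of Mahout.
    """
    rows = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user_id, item_id = line.split(",")
        rows[user_id].add(item_id)
    # Each user becomes one row; the items are that row's non-zero columns.
    return {user: sorted(items) for user, items in rows.items()}


tuples = ["u1,item1", "u1,item10", "u1,item500", "u2,item2", "u2,item500"]
rows = collect_rows(tuples)
for user, items in rows.items():
    print(user, items)
```

Each key in the result corresponds to one row of the collected matrix, and each item id to a column that is set for that user.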
You are looking at the collected input as a matrix. It would be collected from something of the form:

u1,item1
u1,item10
u1,item500
u2,item2
u2,item500
...

On Mar 11, 2015, at 8:24 PM, Jeff Isenhart <jeffi...@yahoo.com.INVALID> wrote:

I am trying to run the example found here: http://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html

The data (demoItems.csv, added to HDFS) is just copied from the example:

u1,purchase,iphone
u1,purchase,ipad
u2,purchase,nexus
......

But when I run

mahout spark-itemsimilarity -i demoItems.csv -o output2 -fc 1 -ic 2

I get empty _SUCCESS and part-00000 files in output2/indicator-matrix.

Any ideas?