Hi, I'm trying to test Mahout 542 (ALS matrix factorization) on the KDD Cup track2 data set and would like some feedback.
I am using the latest Mahout 0.5 snapshot. I converted the trainIdx2.txt data using org.apache.mahout.cf.taste.example.kddcup.ToCSV.

When training on this data I got errors, which seemed to be because the ratings are in the range 0-100 and it did not like the zero values. As a workaround, I changed ratings of zero to 1. I then trained using:

--numFeatures 20 --numIterations 10 --lambda 0.065

The training seemed to succeed. As a simple way to get a result set for track2, I used predictFromFactorization to predict ratings for testIdx2.txt, then marked the top 3 predicted ratings as '1' in the result and the other 3 as '0'. However, the error for this was 49.9%, which seems equivalent to a random result.

Has anyone else tried Mahout 542 on this data set and can provide feedback?

Thanks,
Clive
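
For reference, the labelling step described above (top 3 predictions become '1', the other 3 become '0') can be sketched roughly as follows. This is only an illustration in Python, not the code that was actually run: the function names `label_top_k` and `error_rate`, the in-memory dictionary layout, and the assumption that predictions are grouped per user with six candidate items each are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: user id -> list of (item id, predicted rating).
Predictions = Dict[int, List[Tuple[int, float]]]


def label_top_k(predictions: Predictions, k: int = 3) -> Dict[int, List[Tuple[int, int]]]:
    """Mark each user's k highest predicted ratings as 1 and the rest as 0.

    The original item order is preserved in the output. Ties are broken by
    input order, because Python's sort is stable.
    """
    labels = {}
    for user, items in predictions.items():
        ranked = sorted(items, key=lambda pair: pair[1], reverse=True)
        top_items = {item for item, _ in ranked[:k]}
        labels[user] = [(item, 1 if item in top_items else 0) for item, _ in items]
    return labels


def error_rate(labels: Dict[int, List[Tuple[int, int]]],
               truth: Dict[int, Dict[int, int]]) -> float:
    """Fraction of (user, item) pairs whose label differs from the true label."""
    wrong = 0
    total = 0
    for user, items in labels.items():
        for item, label in items:
            total += 1
            if truth[user][item] != label:
                wrong += 1
    return wrong / total if total else 0.0


if __name__ == "__main__":
    # Made-up numbers purely for illustration; not taken from testIdx2.txt.
    preds = {1: [(10, 80.0), (11, 5.0), (12, 60.0), (13, 95.0), (14, 1.0), (15, 30.0)]}
    print(label_top_k(preds))
```

A check like `error_rate` on a held-out slice of the training data, where the true labels are known, may help tell whether the factorization itself is uninformative or whether the problem is in the conversion or ID mapping between the CSV and testIdx2.txt.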
