If you are using the distributed org.apache.mahout.cf.taste.hadoop.item.RecommenderJob, you should never use "0". If you do, then when the co-occurrence matrix is multiplied by the user's rating vector, the 0 cancels the corresponding elements of the matrix, which is the same as if the user had never interacted with the item.
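To make the multiplication above concrete, here is a rough sketch in plain Java. It is not Mahout's code: the class name, the score method and the co-occurrence counts are all made up for illustration, and the real job does more than this (for example, it does not recommend items the user has already rated). It only shows what a 0 or a -1 in the rating vector does to the product.

```java
import java.util.Arrays;

public class CooccurrenceSketch {

    // Multiply an item-item co-occurrence matrix by one user's rating vector.
    // Hypothetical helper for illustration, not part of Mahout.
    public static double[] score(double[][] cooccurrence, double[] ratings) {
        int n = ratings.length;
        double[] scores = new double[n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                scores[i] += cooccurrence[i][j] * ratings[j];
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        // Made-up co-occurrence counts for three items A, B, C.
        double[][] c = {
            {0, 3, 1},
            {3, 0, 2},
            {1, 2, 0}
        };

        // The user liked A, disliked B, and has not seen C.
        // With dislike = 0, the vector is identical to "never interacted with B".
        double[] dislikeAsZero = score(c, new double[] {1, 0, 0});
        // With dislike = -1, B's column is subtracted from the scores.
        double[] dislikeAsMinusOne = score(c, new double[] {1, -1, 0});

        System.out.println(Arrays.toString(dislikeAsZero));     // [0.0, 3.0, 1.0]
        System.out.println(Arrays.toString(dislikeAsMinusOne)); // [-3.0, 3.0, -1.0]
    }
}
```

In this toy example the score for C is 1.0 when the dislike is encoded as 0, exactly what it would be if the user had never touched B, and it drops to -1.0 when the dislike is encoded as -1, because C co-occurs with the disliked item B.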
For the same reason, "-1" should work, because it actually subtracts score from any item that is similar to one with a negative rating.

For CosineSimilarity, 0 has to be avoided for obvious reasons (the cosine is not defined at the origin of the axes), and 1 and 2 are probably the values I'd go for.

Tanimoto and LogLikelihood are True/False, but False means "not interacted". Having "dislike = False" would be extremely misleading.

For all the other algorithms, I'd say one should make similar considerations.

Cheers
Mario

On Mon, Jul 14, 2014 at 4:21 PM, Floris Devriendt wrote:

> Hey all,
>
> When using a discrete rating scale (e.g. likes / dislikes), what are the
> things that I should consider when using Mahout for Collaborative
> Filtering?
>
> If I'm not mistaken, I read a mail a week or two ago on this mailing
> list stating that one should avoid using 0 (dislike) and 1 (like) as
> scores, because Mahout would not be able to take the dislikes into account
> properly.
> If this is true, what scores should I give to my like/dislike scale? (e.g.
> is -1/1 better than 0/1, or should I use 1/2 with 1 = dislike and 2 =
> like?)
>
> Best regards,
> Floris Devriendt
