Long answer: Preferring a tag is an example of an action that would not lead to recommendations in any other type of recommender. A user takes many actions in your app, and not all of them have “purchase” intent behind them. Cross-cooccurrence finds the secondary actions (such as preferring a tag) that correlate with the primary action you want to recommend (such as a purchase). Don’t get too hung up on that before you understand the basics; it is simply a way to make better use of your data.
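To make the cross-cooccurrence idea a bit more concrete, here is a rough sketch in plain Python. It is not Mahout's API or code (spark-itemsimilarity does this at scale on Spark); the function names, the toy users, and the item and tag ids are all made up for illustration. It counts how often a secondary action (a preferred tag) occurs for the same user as a primary action (a purchase) and scores each co-occurring pair with a log-likelihood ratio. Note that only the presence of an action is used, never a rating.

```python
from collections import defaultdict
from math import log


def x_log_x(x):
    # Convention: 0 * log(0) is treated as 0.
    return 0.0 if x == 0 else x * log(x)


def entropy(*counts):
    # Unnormalized entropy of a list of counts.
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    # Log-likelihood ratio for a 2x2 contingency table:
    # k11 = both actions, k12 = primary only,
    # k21 = secondary only, k22 = neither.
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))


def cross_cooccurrence(primary, secondary):
    # primary / secondary: dict of user id -> set of ids (hypothetical layout).
    # Returns {(primary_id, secondary_id): llr score} for pairs that
    # co-occur for at least one user.
    users = set(primary) | set(secondary)
    primary_counts = defaultdict(int)
    secondary_counts = defaultdict(int)
    both = defaultdict(int)
    for user in users:
        p_items = primary.get(user, set())
        s_items = secondary.get(user, set())
        for p in p_items:
            primary_counts[p] += 1
        for s in s_items:
            secondary_counts[s] += 1
        for p in p_items:
            for s in s_items:
                both[(p, s)] += 1
    scores = {}
    for (p, s), k11 in both.items():
        k12 = primary_counts[p] - k11
        k21 = secondary_counts[s] - k11
        k22 = len(users) - k11 - k12 - k21
        scores[(p, s)] = llr(k11, k12, k21, k22)
    return scores


# Toy data, invented for this example.
purchases = {"u1": {"a"}, "u2": {"a"}, "u3": {"b"}, "u4": {"b"}}
tag_prefs = {"u1": {"x"}, "u2": {"x"}, "u3": {"y"}, "u4": {"y"}}

scores = cross_cooccurrence(purchases, tag_prefs)
for pair, score in sorted(scores.items()):
    print(pair, round(score, 3))
```

Pairs with a high score are the ones worth keeping as indicators: here tag "x" would become an indicator stored with item "a", and a user who prefers "x" would match it at query time.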
The cooccurrence recommender does not use ratings. In fact, any Mahout recommender that uses LLR ignores ratings. Ratings are very hard to use in practice, since no two people rate on the same scale and the same person is often inconsistent about ratings. It is more important to find an indicator of preference and focus on _ranking_ better. Ask yourself whether you want to predict a rating or show the user the things you think they will like, in the right order (you can only recommend a fixed number of things, after all). Even Netflix, which led us into thinking ratings were important, no longer uses rating predictions to make recommendations, and they have stated this publicly.

Short answer: Feed MovieLens in and you will get ranked recommendations out of the system (don’t forget that it requires a search engine to query). If you toss the very low ratings the answers might be a little better, but the fact that a user cared enough to watch the movie is the important thing.

On Feb 26, 2015, at 12:08 AM, Ferran Muñoz <ferran.mu...@gmail.com> wrote:

Hello,

I have read the "Intro to Cooccurrence Recommenders with Spark" page of the Mahout documentation and I have a question regarding the unified recommender query. What does "user's-tags-associated-with-purchases" exactly mean? Does it mean that I have to put tags or itemids? I understand that the "tags" field of each item document contains the tags of this particular item. Then, what query do I have to write in order to get recommended items using the content-based indicator?

On the other hand, how can I use ratings when computing the spark-itemsimilarity? For example, how can I use spark-itemsimilarity to get recommendations on the MovieLens dataset (it has ratings, not boolean)?

Thank you in advance.

Ferran