Hello Mahout Users, I'm relatively new to recommendations but have some experience with other ML techniques such as clustering.
I'm trying to generate recommendations from data that isn't in the conventional (user, item, rating) format. Here are some details on the problem I'm trying to solve; I'm hoping someone can suggest the best Mahout algorithm for it.

1) Users purchase items, potentially the same ones multiple times, but do not give explicit ratings to those items.
2) There is rich metadata for the items (names, categories, descriptions, etc.).
3) The data is very sparse. There may be 100,000 items, and on average a user may only ever purchase 1-10 of them.

Some of the approaches I've considered after reading the various Mahout documentation and discussions are:

A) Use an item-based recommender, with the rating being the number of times the user bought the item (perhaps normalized to a 1-10 scale).
B) Use the metadata to generate similarities between the items, then simply recommend to a user the top N items that are similar to one they've previously purchased. This could be implemented in Mahout by overriding the ItemSimilarity (as described in this post: http://lucene.472066.n3.nabble.com/Content-based-Recommender-Implementation-td913981.html). Obviously the challenging part here is figuring out how to generate a similarity score for two items from their metadata.
C) Use frequent item-sets to find other items that are usually bought together with a given item, and recommend those.

Any suggestions on this matter would be greatly appreciated.

Cheers,
Cam
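P.S. To make option B a bit more concrete, here is a rough sketch of the kind of metadata similarity I have in mind. It is plain Python rather than Mahout code, and everything in it is an assumption for illustration: the item fields ("name", "categories"), the equal weighting of the two parts, and the choice of Jaccard overlap as the score. In Mahout the same logic would presumably sit behind the overridden ItemSimilarity.

```python
def tokenize(text):
    """Lowercase a string and split it on whitespace into a set of tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def item_similarity(item_a, item_b, category_weight=0.5):
    """Weighted mix of category overlap and name-token overlap (hypothetical scheme)."""
    cat = jaccard(set(item_a["categories"]), set(item_b["categories"]))
    name = jaccard(tokenize(item_a["name"]), tokenize(item_b["name"]))
    return category_weight * cat + (1.0 - category_weight) * name


def top_n_similar(purchased_id, items, n=2):
    """Return up to n item ids most similar to a purchased item, skipping zero scores."""
    target = items[purchased_id]
    scored = [
        (item_similarity(target, other), other_id)
        for other_id, other in items.items()
        if other_id != purchased_id
    ]
    # Highest score first; ties broken by item id so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item_id for score, item_id in scored[:n] if score > 0]


# Made-up catalogue, purely for illustration.
ITEMS = {
    "i1": {"name": "red running shoes", "categories": ["shoes", "sport"]},
    "i2": {"name": "blue running shoes", "categories": ["shoes", "sport"]},
    "i3": {"name": "leather office shoes", "categories": ["shoes", "formal"]},
    "i4": {"name": "garden hose", "categories": ["garden"]},
}

print(top_n_similar("i1", ITEMS, n=3))
```

With 100,000 items, scoring every pair like this is probably too expensive, so I assume the similarities would need to be precomputed or restricted to items sharing at least one category.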
