Re: Newbie question
+ Mahout user

On Mar 8, 2014, at 10:42 AM, Mahmood Naderan nt_mahm...@yahoo.com wrote:

Hi,

Maybe this is a newbie question, but I want to know: does Hadoop/Mahout use the pthread model?

Regards,
Mahmood
Re: Newbie question on modeling a Recommender using Mahout when the matrix is sparse
Well, there are only 7 products in the universe! If you ask for 10 recommendations, you will always get all unrated items back in the recommendations. That's always true unless the algorithm can't actually establish a value for some items. What result were you expecting: fewer than 10 recs? Fewer than 7?

On Thu, Sep 13, 2012 at 6:55 AM, Gokul Pillai gokoolt...@gmail.com wrote:

I am trying out Mahout to come up with product recommendations for users, based on data that shows which products they use today. The data is not web-scale, just about 300,000 users and 7 products.

A few comments about the data:
1. Since users either have or do not have a particular product, the value in the matrix is either 1 or 0 for all the columns (rows being the user IDs).
2. All the users have one basic product, so I excluded it from the data model passed to the Mahout recommender, since I assume that if everyone has the same product, its effect on the recommendations is trivial.
3. The matrix itself is sparse. The total counts of users having each product are: A=31847, 54754, 1897, 23154, 2201, 2766, 33585

Steps followed:
1. Created a data source from the user-product table in the database:
   File ratingsFile = new File("datasets/products.csv");
   DataModel model = new FileDataModel(ratingsFile);
2. Created a recommender on this data:
   CachingRecommender recommender = new CachingRecommender(new SlopeOneRecommender(model));
3. Looped through all users to get the top ten recommendations:
   List<RecommendedItem> recommendations = recommender.recommend(userId, 10);

Issue faced:
The recommendations that come out are way too simple. It seems that if a user does not have product A, then A is recommended; if they don't have product B, then B is recommended; and so on. Basically, a simple inverse of their ownership status.

Obviously, I am not doing something right here. How can I do the modeling better to get the right recommendations? Or is it that my dataset (300,000 users times 7 products) is too small for Mahout to work with?

Look forward to your comments. Thanks.
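The point in the reply above (a catalog of 7 products cannot fill a request for 10 recommendations) can be illustrated with a small plain-Java sketch. It does not use Mahout; the class name TopNSketch, the method topN, and the item IDs are hypothetical and chosen only for illustration. Scoring is left out on purpose, since the only thing being shown is that the result size is bounded by the number of items the user does not yet have.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java illustration only; none of these names are Mahout APIs.
final class TopNSketch {

    // Return at most howMany items that the user does not already own.
    // Items are numbered 1..itemCount; no scoring or ranking is applied.
    static List<Integer> topN(Set<Integer> owned, int itemCount, int howMany) {
        List<Integer> candidates = new ArrayList<>();
        for (int item = 1; item <= itemCount; item++) {
            if (!owned.contains(item)) {
                candidates.add(item);
            }
        }
        // The result is capped by the number of unowned items, not by howMany.
        int limit = Math.min(howMany, candidates.size());
        return new ArrayList<>(candidates.subList(0, limit));
    }

    public static void main(String[] args) {
        // Hypothetical user who owns items 2 and 5 out of 7.
        Set<Integer> owned = new HashSet<>(Arrays.asList(2, 5));
        System.out.println(topN(owned, 7, 10)); // prints [1, 3, 4, 6, 7]
        System.out.println(topN(owned, 7, 3));  // prints [1, 3, 4]
    }
}
```

With howMany larger than the number of unowned items, every unowned item comes back, which looks exactly like an inverse of the ownership list. A smaller howMany is the only case in which the ranking decides what is returned.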
Re: Newbie question on modeling a Recommender using Mahout when the matrix is sparse
Very true, good catch. I think I was interpreting the results the wrong way. I expect only the top 5, so I changed the parameter to 5 instead of 10, and the results are as expected now. Thanks.

On Wed, Sep 12, 2012 at 11:36 PM, Sean Owen sro...@gmail.com wrote:

Well, there are only 7 products in the universe! If you ask for 10 recommendations, you will always get all unrated items back in the recommendations. That's always true unless the algorithm can't actually establish a value for some items. What result were you expecting: fewer than 10 recs? Fewer than 7?

On Thu, Sep 13, 2012 at 6:55 AM, Gokul Pillai gokoolt...@gmail.com wrote:

I am trying out Mahout to come up with product recommendations for users, based on data that shows which products they use today. The data is not web-scale, just about 300,000 users and 7 products.

A few comments about the data:
1. Since users either have or do not have a particular product, the value in the matrix is either 1 or 0 for all the columns (rows being the user IDs).
2. All the users have one basic product, so I excluded it from the data model passed to the Mahout recommender, since I assume that if everyone has the same product, its effect on the recommendations is trivial.
3. The matrix itself is sparse. The total counts of users having each product are: A=31847, 54754, 1897, 23154, 2201, 2766, 33585

Steps followed:
1. Created a data source from the user-product table in the database:
   File ratingsFile = new File("datasets/products.csv");
   DataModel model = new FileDataModel(ratingsFile);
2. Created a recommender on this data:
   CachingRecommender recommender = new CachingRecommender(new SlopeOneRecommender(model));
3. Looped through all users to get the top ten recommendations:
   List<RecommendedItem> recommendations = recommender.recommend(userId, 10);

Issue faced:
The recommendations that come out are way too simple. It seems that if a user does not have product A, then A is recommended; if they don't have product B, then B is recommended; and so on. Basically, a simple inverse of their ownership status.

Obviously, I am not doing something right here. How can I do the modeling better to get the right recommendations? Or is it that my dataset (300,000 users times 7 products) is too small for Mahout to work with?

Look forward to your comments. Thanks.
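A possible contributing factor to the "inverse of ownership" effect described in the original question, offered as a sketch rather than a diagnosis: if the input file contained only the 1-valued rows (an assumption; the thread does not say whether the 0 rows were written out), then a textbook weighted Slope One estimate is the same for every unowned item. All co-rated pairs differ by 0, so each estimate is just the user's own value of 1.0 and the ranking carries no information. The code below is plain Java, not Mahout's SlopeOneRecommender; the class SlopeOneBinarySketch, its methods, and the tiny data set are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Textbook weighted Slope One on a tiny in-memory data set.
// Illustration only; this is not Mahout's implementation.
final class SlopeOneBinarySketch {

    // Build a preference map in which every listed item has the value 1.0.
    static Map<Integer, Double> owns(int... items) {
        Map<Integer, Double> prefs = new HashMap<>();
        for (int item : items) {
            prefs.put(item, 1.0);
        }
        return prefs;
    }

    // Estimate the user's preference for target from the items the user has.
    // Returns NaN when no co-rated pair exists for any of the user's items.
    static double estimate(Map<Integer, Map<Integer, Double>> data, int user, int target) {
        double weightedSum = 0.0;
        int totalCount = 0;
        for (Map.Entry<Integer, Double> mine : data.get(user).entrySet()) {
            int item = mine.getKey();
            double diffSum = 0.0;
            int count = 0;
            // Sum of (target - item) differences over users who have both.
            for (Map<Integer, Double> other : data.values()) {
                if (other.containsKey(item) && other.containsKey(target)) {
                    diffSum += other.get(target) - other.get(item);
                    count++;
                }
            }
            if (count > 0) {
                // Weight each item's prediction by its number of co-raters.
                weightedSum += (mine.getValue() + diffSum / count) * count;
                totalCount += count;
            }
        }
        return totalCount == 0 ? Double.NaN : weightedSum / totalCount;
    }

    public static void main(String[] args) {
        Map<Integer, Map<Integer, Double>> data = new HashMap<>();
        data.put(1, owns(1, 2));      // hypothetical users and products
        data.put(2, owns(2, 3));
        data.put(3, owns(1, 3, 4));
        System.out.println(estimate(data, 1, 3)); // prints 1.0
        System.out.println(estimate(data, 1, 4)); // prints 1.0
    }
}
```

If this assumption holds for the real data, the order of the returned items is effectively arbitrary, so a similarity-based recommender designed for boolean (has / does not have) data may be worth evaluating instead.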