Hello Sebastian, thanks for the reply.
After adding the GenericBooleanPrefItemBasedRecommender instead of the
GenericItemBasedRecommender I obtain the following results:

FIRST
RecommendedItem[item:4140, value:2.7275915]
RecommendedItem[item:3982, value:2.7191503]
RecommendedItem[item:1377, value:2.7180452]
RecommendedItem[item:2706, value:2.7041116]
RecommendedItem[item:4010, value:2.702695]
-----------------
SECOND
RecommendedItem[item:4140, value:4.4948235]
RecommendedItem[item:2108, value:4.3325663]
RecommendedItem[item:1968, value:4.330123]
RecommendedItem[item:2835, value:4.3260937]
RecommendedItem[item:2902, value:4.3107653]

Could the difference be due to the pruning you're talking about?
If so, which of the two implementations do you think could be considered
better?

Thanks again
Davide

2012/9/20 Sebastian Schelter <[email protected]>

> You should also be aware that ItemSimilarityJob applies some pruning by
> default, that can also be a reason for different results.
>
> Best,
> Sebastian
>
> On 20.09.2012 15:19, Sean Owen wrote:
> > The problem is that you have boolean data with no ratings, so all the
> > ratings are 1. But you are using GenericItemBasedRecommender, which
> > expects ratings. Since it ranks on estimated ratings, but, all ratings
> > are 1, the result is essentially random.
> >
> > Use GenericBooleanPrefItemBasedRecommender.
> >
> > On Thu, Sep 20, 2012 at 2:04 PM, Davide Pozza <[email protected]> wrote:
> >> Hello
> >>
> >> I'm trying to understand how to develop an item-based recommendation
> >> module for an ecommerce website.
> >>
> >> Here's my input data.csv file format:
> >>
> >> USER_ID,ITEM_ID
> >>
> >> (data coming from the orders history, so I haven't any rating to use)
> >>
> >> If I correctly understand the documentation, the following
> >> implementations should be equivalent (the first one just uses the
> >> precomputed data), but they return different results.
> >> Could anyone help me to understand the reason?
> >>
> >> FIRST IMPLEMENTATION
> >> ====================
> >> DataModel dataModel = new FileDataModel(new File("data.csv")); //FORMAT user_id,item_id
> >>
> >> //precomputed data generated by ItemSimilarityJob with SIMILARITY_LOGLIKELIHOOD
> >> ItemSimilarity similarity = new FileItemSimilarity(new File("precomputed_data"));
> >>
> >> GenericItemBasedRecommender recommender =
> >>     new GenericItemBasedRecommender(dataModel, similarity);
> >>
> >> long userId = 8500003;
> >> List<RecommendedItem> recommendations = recommender.recommend(userId, 5);
> >> for (RecommendedItem recommendation : recommendations) {
> >>     System.out.println(recommendation);
> >> }
> >>
> >> ==RESULT==
> >> RecommendedItem[item:1653, value:1.0]
> >> RecommendedItem[item:14, value:1.0]
> >> RecommendedItem[item:1592, value:1.0]
> >> RecommendedItem[item:25, value:1.0]
> >> RecommendedItem[item:43, value:1.0]
> >>
> >> SECOND IMPLEMENTATION
> >> ======================
> >> DataModel dataModel = new FileDataModel(new File("data.csv")); //FORMAT user_id,item_id
> >>
> >> ItemSimilarity similarity = new LogLikelihoodSimilarity(dataModel);
> >>
> >> GenericItemBasedRecommender recommender =
> >>     new GenericItemBasedRecommender(dataModel, similarity);
> >>
> >> long userId = 8500003;
> >> List<RecommendedItem> recommendations = recommender.recommend(userId, 5);
> >> for (RecommendedItem recommendation : recommendations) {
> >>     System.out.println(recommendation);
> >> }
> >>
> >> ==RESULT==
> >> RecommendedItem[item:28, value:1.0]
> >> RecommendedItem[item:14, value:1.0]
> >> RecommendedItem[item:20, value:1.0]
> >> RecommendedItem[item:21, value:1.0]
> >> RecommendedItem[item:25, value:1.0]
> >>
> >> --
> >> Davide Pozza
> >

--
Davide Pozza
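
To make Sean's point in this thread concrete, here is a minimal, self-contained sketch of why every score was 1.0 with the rating-based recommender and why the scores differ once the boolean variant is used. It does not call Mahout: the class name, method names, and similarity values are hypothetical, and the two formulas are a simplified reading of how the two recommenders estimate a preference (a similarity-weighted average of the user's ratings versus a plain sum of similarities), not a copy of the library code.

```java
public class BooleanPrefEstimateSketch {

    // Rating-style estimate (simplified): similarity-weighted average of the
    // user's preference values for the items they already have.
    static double weightedAverage(double[] similarities, double[] prefs) {
        double weighted = 0.0;
        double total = 0.0;
        for (int i = 0; i < similarities.length; i++) {
            if (Double.isNaN(similarities[i])) {
                continue; // no similarity known for this pair
            }
            weighted += similarities[i] * prefs[i];
            total += similarities[i];
        }
        return total == 0.0 ? Double.NaN : weighted / total;
    }

    // Boolean-style estimate (simplified): sum of the known similarities
    // between the candidate item and the items the user already has.
    static double similaritySum(double[] similarities) {
        boolean found = false;
        double total = 0.0;
        for (double s : similarities) {
            if (!Double.isNaN(s)) {
                found = true;
                total += s;
            }
        }
        return found ? total : Double.NaN;
    }

    public static void main(String[] args) {
        // Boolean data: every "rating" of the user's three items is 1.0.
        double[] prefs = {1.0, 1.0, 1.0};

        // Hypothetical similarities of two candidate items to those three items.
        double[] simsToCandidateA = {0.9, 0.8, 0.7};
        double[] simsToCandidateB = {0.2, 0.1, Double.NaN};

        // A weighted average of identical ratings cannot separate the candidates.
        System.out.println("weighted average A: " + weightedAverage(simsToCandidateA, prefs));
        System.out.println("weighted average B: " + weightedAverage(simsToCandidateB, prefs));

        // The sum of similarities does separate them.
        System.out.println("similarity sum A:   " + similaritySum(simsToCandidateA));
        System.out.println("similarity sum B:   " + similaritySum(simsToCandidateB));
    }
}
```

Under this reading, the boolean score of an item depends on which similarities are available for it, so a precomputed similarity file that keeps only part of the item pairs (as happens with pruning) can plausibly give different totals and a different ranking than similarities computed on the fly from the full data model.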
