I had a sudden idea: would it be better to use an item-based recommender (with TanimotoCoefficientSimilarity instead of BooleanTanimotoCoefficientSimilarity, which doesn't implement ItemSimilarity)? That way I could overcome the scalability problem of having too many "users" (queries).
Another question: do you know of any example query log data I could use to test how the algorithm performs on large data sets?

Thanks again,
Claudia

-----Original Message-----
From: Sean Owen [mailto:[email protected]]
Sent: Monday, 20 July 2009 13:19
To: [email protected]
Subject: Re: Implement the related search feature with mahout

Actually I kind of misspoke. The fastest way to do this is in fact to use a GenericUserBasedRecommender -- not because you need recommendations, but because it exposes a nice mostSimilarUsers() method.

  DataModel model = new MySQLBooleanPrefJDBCDataModel(...); // or whatever you are using
  UserSimilarity similarity = new BooleanTanimotoCoefficientSimilarity(model);
  UserNeighborhood neighborhood = new NearestNUserNeighborhood(10, similarity, model);
  BooleanUserGenericUserBasedRecommender recommender =
      new BooleanUserGenericUserBasedRecommender(model, neighborhood, similarity);
  Rescorer<Pair<User,User>> rescorer = new Rescorer<Pair<User,User>>() {
    // implement your rescoring logic to affect how similar 'users' are --
    // boost the returned value for popular queries
  };
  Object userID = ...; // current query ID
  List<User> similarUsers = recommender.mostSimilarUsers(userID, 10, rescorer);
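For reference, the Tanimoto (Jaccard) coefficient that the similarity classes named in this thread are based on is the size of the intersection of two sets divided by the size of their union. The following is only a minimal, self-contained sketch of that formula, not Mahout's implementation: the class name TanimotoSketch, the method tanimoto(), the sample document IDs, and the choice to return 0.0 for two empty sets are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class TanimotoSketch {

    // Tanimoto (Jaccard) coefficient of two sets: |A ∩ B| / |A ∪ B|.
    // Returning 0.0 when both sets are empty is a convention chosen here.
    static <T> double tanimoto(Set<T> a, Set<T> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<T> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        // |A ∪ B| = |A| + |B| - |A ∩ B|
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        // Hypothetical data: items associated with two queries ("users").
        Set<String> q1 = new HashSet<>(Arrays.asList("doc1", "doc2", "doc3"));
        Set<String> q2 = new HashSet<>(Arrays.asList("doc2", "doc3", "doc4"));
        System.out.println(tanimoto(q1, q2)); // 2 shared / 4 distinct = 0.5
    }
}
```

The same function applies whether the sets are the items attached to each query (user-based view) or the queries attached to each item (item-based view); only the choice of what goes into each set changes.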
