Yeah, I'm a little perplexed. By low-rank items I mean items that have a low number of preferences, not a low average preference. Basically, if we don't have some level of confidence in our ItemSimilarity for an item, because not many people have expressed a preference for it, good or bad, we shouldn't recommend it. To your point, though, LogLikelihood may already account for that, which makes these results even more surprising.
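(As an aside, for anyone following along: the log-likelihood ratio behind LogLikelihoodSimilarity does grow with the amount of evidence, so thin co-occurrence counts score low by construction. Below is a stdlib-only sketch of the standard Dunning LLR computation, mirroring what Mahout's LogLikelihood class does internally; the class name here is made up for illustration.)

```java
// Sketch of the log-likelihood ratio (LLR) that LogLikelihoodSimilarity
// is built on. Stdlib-only reimplementation for illustration; in Mahout
// the real computation lives in org.apache.mahout.math.stats.LogLikelihood.
public class LlrDemo {

    // x * ln(x), defined as 0 at x = 0
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized Shannon entropy of a list of counts
    static double entropy(long... counts) {
        long sum = 0;
        double sumXLogX = 0.0;
        for (long c : counts) {
            sum += c;
            sumXLogX += xLogX(c);
        }
        return xLogX(sum) - sumXLogX;
    }

    // k11: users who rated both items; k12/k21: one but not the other;
    // k22: neither. LLR = 2 * (rowEntropy + colEntropy - matrixEntropy).
    static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double colEntropy = entropy(k11 + k21, k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        double llr = 2.0 * (rowEntropy + colEntropy - matrixEntropy);
        return llr < 0.0 ? 0.0 : llr; // guard against rounding
    }

    public static void main(String[] args) {
        // Same co-occurrence pattern, ten times the evidence:
        System.out.println(logLikelihoodRatio(1, 0, 0, 1));   // small
        System.out.println(logLikelihoodRatio(10, 0, 0, 10)); // 10x larger
    }
}
```

Scaling all four counts up scales the LLR up proportionally, which is exactly the "more evidence, more confidence" behavior being discussed.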
The ItemSimilarity I'm using is indeed LogLikelihoodSimilarity.

*It is repeatable. Using aaEvaluator.evaluate(new xxxRecommenderBuilder(), null, model, 0.9, 1.0);*

12/01/04 06:33:47 INFO eval.AbstractDifferenceRecommenderEvaluator: Beginning evaluation of 1226 users
12/01/04 06:33:48 INFO eval.AbstractDifferenceRecommenderEvaluator: Starting timing of 1226 tasks in 4 threads
12/01/04 06:33:48 INFO eval.StatsCallable: Average time per recommendation: 10ms
12/01/04 06:33:48 INFO eval.StatsCallable: Approximate memory used: 7MB / 126MB
12/01/04 06:33:48 INFO eval.StatsCallable: Unable to recommend in 131 cases
12/01/04 06:33:48 INFO eval.StatsCallable: Average time per recommendation: 3ms
12/01/04 06:33:48 INFO eval.StatsCallable: Approximate memory used: 12MB / 159MB
12/01/04 06:33:48 INFO eval.StatsCallable: Unable to recommend in 1264 cases
12/01/04 06:33:49 INFO eval.AbstractDifferenceRecommenderEvaluator: Evaluation result: 0.7553994047273064
12/01/04 06:35:32 INFO eval.AbstractDifferenceRecommenderEvaluator: Beginning evaluation of 1264 users
12/01/04 06:35:32 INFO eval.AbstractDifferenceRecommenderEvaluator: Starting timing of 1264 tasks in 4 threads
12/01/04 06:35:32 INFO eval.StatsCallable: Average time per recommendation: 10ms
12/01/04 06:35:32 INFO eval.StatsCallable: Approximate memory used: 29MB / 126MB
12/01/04 06:35:32 INFO eval.StatsCallable: Unable to recommend in 5 cases
12/01/04 06:35:32 INFO eval.StatsCallable: Average time per recommendation: 2ms
12/01/04 06:35:32 INFO eval.StatsCallable: Approximate memory used: 29MB / 126MB
12/01/04 06:35:32 INFO eval.StatsCallable: Unable to recommend in 1037 cases
12/01/04 06:35:32 INFO eval.AbstractDifferenceRecommenderEvaluator: Evaluation result: 0.7304665753551213
12/01/04 06:36:12 INFO eval.AbstractDifferenceRecommenderEvaluator: Beginning evaluation of 1223 users
12/01/04 06:36:12 INFO eval.AbstractDifferenceRecommenderEvaluator: Starting timing of 1223 tasks in 4 threads
12/01/04 06:36:12 INFO eval.StatsCallable: Average time per recommendation: 37ms
12/01/04 06:36:12 INFO eval.StatsCallable: Approximate memory used: 29MB / 126MB
12/01/04 06:36:12 INFO eval.StatsCallable: Unable to recommend in 0 cases
12/01/04 06:36:13 INFO eval.StatsCallable: Average time per recommendation: 2ms
12/01/04 06:36:13 INFO eval.StatsCallable: Approximate memory used: 29MB / 126MB
12/01/04 06:36:13 INFO eval.StatsCallable: Unable to recommend in 1028 cases
12/01/04 06:36:13 INFO eval.AbstractDifferenceRecommenderEvaluator: Evaluation result: 0.7783535436208079

*And then with aaEvaluator.evaluate(new xxxRecommenderBuilder(), null, model, 0.1, 1.0);*

12/01/04 06:37:27 INFO eval.AbstractDifferenceRecommenderEvaluator: Beginning evaluation of 1234 users
12/01/04 06:37:27 INFO eval.AbstractDifferenceRecommenderEvaluator: Starting timing of 1234 tasks in 4 threads
12/01/04 06:37:27 INFO eval.StatsCallable: Average time per recommendation: 10ms
12/01/04 06:37:27 INFO eval.StatsCallable: Approximate memory used: 26MB / 126MB
12/01/04 06:37:27 INFO eval.StatsCallable: Unable to recommend in 37 cases
12/01/04 06:37:28 INFO eval.StatsCallable: Average time per recommendation: 3ms
12/01/04 06:37:28 INFO eval.StatsCallable: Approximate memory used: 124MB / 227MB
12/01/04 06:37:28 INFO eval.StatsCallable: Unable to recommend in 7907 cases
12/01/04 06:37:28 INFO eval.AbstractDifferenceRecommenderEvaluator: Evaluation result: 0.2785222650491917
12/01/04 06:38:33 INFO eval.AbstractDifferenceRecommenderEvaluator: Beginning evaluation of 1223 users
12/01/04 06:38:33 INFO eval.AbstractDifferenceRecommenderEvaluator: Starting timing of 1223 tasks in 4 threads
12/01/04 06:38:33 INFO eval.StatsCallable: Average time per recommendation: 20ms
12/01/04 06:38:33 INFO eval.StatsCallable: Approximate memory used: 26MB / 126MB
12/01/04 06:38:33 INFO eval.StatsCallable: Unable to recommend in 21 cases
12/01/04 06:38:34 INFO eval.StatsCallable: Average time per recommendation: 2ms
12/01/04 06:38:34 INFO eval.StatsCallable: Approximate memory used: 124MB / 227MB
12/01/04 06:38:34 INFO eval.StatsCallable: Unable to recommend in 7887 cases
12/01/04 06:38:34 INFO eval.AbstractDifferenceRecommenderEvaluator: Evaluation result: 0.28219563687543997
12/01/04 06:39:03 INFO eval.AbstractDifferenceRecommenderEvaluator: Beginning evaluation of 1263 users
12/01/04 06:39:03 INFO eval.AbstractDifferenceRecommenderEvaluator: Starting timing of 1263 tasks in 4 threads
12/01/04 06:39:03 INFO eval.StatsCallable: Average time per recommendation: 24ms
12/01/04 06:39:03 INFO eval.StatsCallable: Approximate memory used: 26MB / 126MB
12/01/04 06:39:03 INFO eval.StatsCallable: Unable to recommend in 9 cases
12/01/04 06:39:04 INFO eval.StatsCallable: Average time per recommendation: 2ms
12/01/04 06:39:04 INFO eval.StatsCallable: Approximate memory used: 115MB / 227MB
12/01/04 06:39:04 INFO eval.StatsCallable: Unable to recommend in 7651 cases
12/01/04 06:39:04 INFO eval.AbstractDifferenceRecommenderEvaluator: Evaluation result: 0.24877444317466355

Thanks for looking.

Nick

On Tue, Jan 3, 2012 at 11:56 PM, Sean Owen <[email protected]> wrote:

> That is the opposite of what you'd expect, and I think that's a possible
> explanation you've identified, but still seems unlikely to me. Something
> else may be wrong. Is this repeatable, and not just a fluke of the random
> number generator? What are the exact args you're using, just to make sure
> you're really setting the percentages and such as you think?
>
> If you have more data available, indeed I'd use more data, especially if
> that more accurately reflects your real environment. You can try to exclude
> these low-rank items, though this makes the test less representative of
> reality, since those kinds of item do exist and are an issue. What
> ItemSimilarity? because some are by nature already accounting for these
> issues, like log-likelihood.
> But you can use IDRescorer if you like to exclude such items, if you do
> want to go that way, yes.
>
> On Wed, Jan 4, 2012 at 1:51 AM, Nick Jordan <[email protected]> wrote:
>
> > Hi All,
> >
> > I'm currently running an item based recommendation
> > using KnnItemBasedRecommender. My data set isn't very large at
> > approximately 30k preferences over 10k items. When running
> > a AverageAbsoluteDifferenceRecommenderEvaluator evaluation on a 0.9
> > training set the result is ~0.80 (on a preference scale of 1-5). When
> > tuning that training set down to only 0.1 the mean difference is closer
> > to 0.2.
> >
> > I assume that this number is actually lower because there are fewer
> > recommendations that can actually be made. Meaning that with the smaller
> > training set there isn't enough similarity to make recommendations, and
> > so those that it does make are more accurate. So the question for me
> > becomes, what does the evaluation look like when only providing
> > recommendations for items with more than x declared preferences? I'm
> > wondering what the best way to determine this. Should I create a new
> > recommender that only will return items with x or more preferences
> > (maybe using IDRescorer?) or should I create a new evaluator to do
> > something similar? Is there a native method to accomplish this that I've
> > missed? Is my hypothesis just likely wrong?
> >
> > Appreciate the feedback.
> >
> > Nick
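To make the IDRescorer route concrete, here is a minimal sketch of a rescorer that filters out items with fewer than a minimum number of preferences. The IDRescorer interface below is a stdlib-only stand-in mirroring Mahout's org.apache.mahout.cf.taste.recommender.IDRescorer so the sketch compiles without Mahout on the classpath; in real code you would implement Mahout's interface, look counts up with DataModel.getNumUsersWithPreferenceFor(itemID), and pass the rescorer to Recommender.recommend(userID, howMany, rescorer). The class name and threshold are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for org.apache.mahout.cf.taste.recommender.IDRescorer,
// declared locally so this sketch compiles without Mahout on the classpath.
interface IDRescorer {
    double rescore(long id, double originalScore);
    boolean isFiltered(long id);
}

// Filters out items that have fewer than minPreferences preferences.
// In real code the counts would come from
// DataModel.getNumUsersWithPreferenceFor(itemID) instead of a Map.
public class MinPreferenceCountRescorer implements IDRescorer {

    private final Map<Long, Integer> preferenceCounts;
    private final int minPreferences;

    public MinPreferenceCountRescorer(Map<Long, Integer> preferenceCounts,
                                      int minPreferences) {
        this.preferenceCounts = preferenceCounts;
        this.minPreferences = minPreferences;
    }

    @Override
    public double rescore(long id, double originalScore) {
        // Leave scores of surviving items untouched; isFiltered does the work.
        return originalScore;
    }

    @Override
    public boolean isFiltered(long id) {
        // Items absent from the map have zero known preferences.
        return preferenceCounts.getOrDefault(id, 0) < minPreferences;
    }

    public static void main(String[] args) {
        Map<Long, Integer> counts = new HashMap<>();
        counts.put(1L, 50); // well-rated item
        counts.put(2L, 2);  // barely-rated item
        IDRescorer rescorer = new MinPreferenceCountRescorer(counts, 5);
        System.out.println(rescorer.isFiltered(1L)); // false
        System.out.println(rescorer.isFiltered(2L)); // true
    }
}
```

As Sean notes, this makes the evaluation less representative of reality, since low-count items do exist in production; it is a way to probe the hypothesis, not necessarily something to ship.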
