Yeah, I'm a little perplexed. By low-rank items I mean items that have a low number of preferences, not a low average preference. Basically, if we don't have some level of confidence in our ItemSimilarity for an item, because not many people have expressed a preference for it, good or bad, we shouldn't recommend it. To your point, though, LogLikelihood may already account for that, which makes these results even more surprising.
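(As an aside, for anyone following along: the log-likelihood ratio behind LogLikelihoodSimilarity does grow with the amount of evidence, so thin co-occurrence counts score low by construction. Below is a stdlib-only sketch of the standard Dunning LLR computation, mirroring what Mahout's LogLikelihood class does internally; the class name here is made up for illustration.)

```java
// Sketch of the log-likelihood ratio (LLR) that LogLikelihoodSimilarity
// is built on. Stdlib-only reimplementation for illustration; in Mahout
// the real computation lives in org.apache.mahout.math.stats.LogLikelihood.
public class LlrDemo {

    // x * ln(x), defined as 0 at x = 0
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized Shannon entropy of a list of counts
    static double entropy(long... counts) {
        long sum = 0;
        double sumXLogX = 0.0;
        for (long c : counts) {
            sum += c;
            sumXLogX += xLogX(c);
        }
        return xLogX(sum) - sumXLogX;
    }

    // k11: users who rated both items; k12/k21: one but not the other;
    // k22: neither. LLR = 2 * (rowEntropy + colEntropy - matrixEntropy).
    static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double colEntropy = entropy(k11 + k21, k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        double llr = 2.0 * (rowEntropy + colEntropy - matrixEntropy);
        return llr < 0.0 ? 0.0 : llr; // guard against rounding
    }

    public static void main(String[] args) {
        // Same co-occurrence pattern, ten times the evidence:
        System.out.println(logLikelihoodRatio(1, 0, 0, 1));   // small
        System.out.println(logLikelihoodRatio(10, 0, 0, 10)); // 10x larger
    }
}
```

Scaling all four counts up scales the LLR up proportionally, which is exactly the "more evidence, more confidence" behavior being discussed.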
The ItemSimilarity I'm using is indeed LogLikelihoodSimilarity.

*It is repeatable. Using aaEvaluator.evaluate(new xxxRecommenderBuilder(), null, model, 0.9, 1.0);*

12/01/04 06:33:47 INFO eval.AbstractDifferenceRecommenderEvaluator: Beginning evaluation of 1226 users
12/01/04 06:33:48 INFO eval.AbstractDifferenceRecommenderEvaluator: Starting timing of 1226 tasks in 4 threads
12/01/04 06:33:48 INFO eval.StatsCallable: Average time per recommendation: 10ms
12/01/04 06:33:48 INFO eval.StatsCallable: Approximate memory used: 7MB / 126MB
12/01/04 06:33:48 INFO eval.StatsCallable: Unable to recommend in 131 cases
12/01/04 06:33:48 INFO eval.StatsCallable: Average time per recommendation: 3ms
12/01/04 06:33:48 INFO eval.StatsCallable: Approximate memory used: 12MB / 159MB
12/01/04 06:33:48 INFO eval.StatsCallable: Unable to recommend in 1264 cases
12/01/04 06:33:49 INFO eval.AbstractDifferenceRecommenderEvaluator: Evaluation result: 0.7553994047273064
12/01/04 06:35:32 INFO eval.AbstractDifferenceRecommenderEvaluator: Beginning evaluation of 1264 users
12/01/04 06:35:32 INFO eval.AbstractDifferenceRecommenderEvaluator: Starting timing of 1264 tasks in 4 threads
12/01/04 06:35:32 INFO eval.StatsCallable: Average time per recommendation: 10ms
12/01/04 06:35:32 INFO eval.StatsCallable: Approximate memory used: 29MB / 126MB
12/01/04 06:35:32 INFO eval.StatsCallable: Unable to recommend in 5 cases
12/01/04 06:35:32 INFO eval.StatsCallable: Average time per recommendation: 2ms
12/01/04 06:35:32 INFO eval.StatsCallable: Approximate memory used: 29MB / 126MB
12/01/04 06:35:32 INFO eval.StatsCallable: Unable to recommend in 1037 cases
12/01/04 06:35:32 INFO eval.AbstractDifferenceRecommenderEvaluator: Evaluation result: 0.7304665753551213
12/01/04 06:36:12 INFO eval.AbstractDifferenceRecommenderEvaluator: Beginning evaluation of 1223 users
12/01/04 06:36:12 INFO eval.AbstractDifferenceRecommenderEvaluator: Starting timing of 1223 tasks in 4 threads
12/01/04 06:36:12 INFO eval.StatsCallable: Average time per recommendation: 37ms
12/01/04 06:36:12 INFO eval.StatsCallable: Approximate memory used: 29MB / 126MB
12/01/04 06:36:12 INFO eval.StatsCallable: Unable to recommend in 0 cases
12/01/04 06:36:13 INFO eval.StatsCallable: Average time per recommendation: 2ms
12/01/04 06:36:13 INFO eval.StatsCallable: Approximate memory used: 29MB / 126MB
12/01/04 06:36:13 INFO eval.StatsCallable: Unable to recommend in 1028 cases
12/01/04 06:36:13 INFO eval.AbstractDifferenceRecommenderEvaluator: Evaluation result: 0.7783535436208079

*And then with aaEvaluator.evaluate(new xxxRecommenderBuilder(), null, model, 0.1, 1.0);*

12/01/04 06:37:27 INFO eval.AbstractDifferenceRecommenderEvaluator: Beginning evaluation of 1234 users
12/01/04 06:37:27 INFO eval.AbstractDifferenceRecommenderEvaluator: Starting timing of 1234 tasks in 4 threads
12/01/04 06:37:27 INFO eval.StatsCallable: Average time per recommendation: 10ms
12/01/04 06:37:27 INFO eval.StatsCallable: Approximate memory used: 26MB / 126MB
12/01/04 06:37:27 INFO eval.StatsCallable: Unable to recommend in 37 cases
12/01/04 06:37:28 INFO eval.StatsCallable: Average time per recommendation: 3ms
12/01/04 06:37:28 INFO eval.StatsCallable: Approximate memory used: 124MB / 227MB
12/01/04 06:37:28 INFO eval.StatsCallable: Unable to recommend in 7907 cases
12/01/04 06:37:28 INFO eval.AbstractDifferenceRecommenderEvaluator: Evaluation result: 0.2785222650491917
12/01/04 06:38:33 INFO eval.AbstractDifferenceRecommenderEvaluator: Beginning evaluation of 1223 users
12/01/04 06:38:33 INFO eval.AbstractDifferenceRecommenderEvaluator: Starting timing of 1223 tasks in 4 threads
12/01/04 06:38:33 INFO eval.StatsCallable: Average time per recommendation: 20ms
12/01/04 06:38:33 INFO eval.StatsCallable: Approximate memory used: 26MB / 126MB
12/01/04 06:38:33 INFO eval.StatsCallable: Unable to recommend in 21 cases
12/01/04 06:38:34 INFO eval.StatsCallable: Average time per recommendation: 2ms
12/01/04 06:38:34 INFO eval.StatsCallable: Approximate memory used: 124MB / 227MB
12/01/04 06:38:34 INFO eval.StatsCallable: Unable to recommend in 7887 cases
12/01/04 06:38:34 INFO eval.AbstractDifferenceRecommenderEvaluator: Evaluation result: 0.28219563687543997
12/01/04 06:39:03 INFO eval.AbstractDifferenceRecommenderEvaluator: Beginning evaluation of 1263 users
12/01/04 06:39:03 INFO eval.AbstractDifferenceRecommenderEvaluator: Starting timing of 1263 tasks in 4 threads
12/01/04 06:39:03 INFO eval.StatsCallable: Average time per recommendation: 24ms
12/01/04 06:39:03 INFO eval.StatsCallable: Approximate memory used: 26MB / 126MB
12/01/04 06:39:03 INFO eval.StatsCallable: Unable to recommend in 9 cases
12/01/04 06:39:04 INFO eval.StatsCallable: Average time per recommendation: 2ms
12/01/04 06:39:04 INFO eval.StatsCallable: Approximate memory used: 115MB / 227MB
12/01/04 06:39:04 INFO eval.StatsCallable: Unable to recommend in 7651 cases
12/01/04 06:39:04 INFO eval.AbstractDifferenceRecommenderEvaluator: Evaluation result: 0.24877444317466355

Thanks for looking.

Nick

On Tue, Jan 3, 2012 at 11:56 PM, Sean Owen <[email protected]> wrote:

> That is the opposite of what you'd expect, and I think that's a possible
> explanation you've identified, but still seems unlikely to me. Something
> else may be wrong. Is this repeatable, and not just a fluke of the random
> number generator? What are the exact args you're using, just to make sure
> you're really setting the percentages and such as you think?
>
> If you have more data available, indeed I'd use more data, especially if
> that more accurately reflects your real environment. You can try to exclude
> these low-rank items, though this makes the test less representative of
> reality, since those kinds of item do exist and are an issue. What
> ItemSimilarity? because some are by nature already accounting for these
> issues, like log-likelihood.
> But you can use IDRescorer if you like to exclude such items, if you do
> want to go that way, yes.
>
> On Wed, Jan 4, 2012 at 1:51 AM, Nick Jordan <[email protected]> wrote:
>
> > Hi All,
> >
> > I'm currently running an item based recommendation
> > using KnnItemBasedRecommender. My data set isn't very large at
> > approximately 30k preferences over 10k items. When running
> > a AverageAbsoluteDifferenceRecommenderEvaluator evaluation on a 0.9
> > training set the result is ~0.80 (on a preference scale of 1-5). When
> > tuning that training set down to only 0.1 the mean difference is closer
> > to 0.2.
> >
> > I assume that this number is actually lower because there are fewer
> > recommendations that can actually be made. Meaning that with the smaller
> > training set there isn't enough similarity to make recommendations, and
> > so those that it does make are more accurate. So the question for me
> > becomes, what does the evaluation look like when only providing
> > recommendations for items with more than x declared preferences? I'm
> > wondering what the best way to determine this. Should I create a new
> > recommender that only will return items with x or more preferences
> > (maybe using IDRescorer?) or should I create a new evaluator to do
> > something similar? Is there a native method to accomplish this that I've
> > missed? Is my hypothesis just likely wrong?
> >
> > Appreciate the feedback.
> >
> > Nick
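To make the IDRescorer route concrete, here is a minimal sketch of a rescorer that filters out items with fewer than a minimum number of preferences. The IDRescorer interface below is a stdlib-only stand-in mirroring Mahout's org.apache.mahout.cf.taste.recommender.IDRescorer so the sketch compiles without Mahout on the classpath; in real code you would implement Mahout's interface, look counts up with DataModel.getNumUsersWithPreferenceFor(itemID), and pass the rescorer to Recommender.recommend(userID, howMany, rescorer). The class name and threshold are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for org.apache.mahout.cf.taste.recommender.IDRescorer,
// declared locally so this sketch compiles without Mahout on the classpath.
interface IDRescorer {
    double rescore(long id, double originalScore);
    boolean isFiltered(long id);
}

// Filters out items that have fewer than minPreferences preferences.
// In real code the counts would come from
// DataModel.getNumUsersWithPreferenceFor(itemID) instead of a Map.
public class MinPreferenceCountRescorer implements IDRescorer {

    private final Map<Long, Integer> preferenceCounts;
    private final int minPreferences;

    public MinPreferenceCountRescorer(Map<Long, Integer> preferenceCounts,
                                      int minPreferences) {
        this.preferenceCounts = preferenceCounts;
        this.minPreferences = minPreferences;
    }

    @Override
    public double rescore(long id, double originalScore) {
        // Leave scores of surviving items untouched; isFiltered does the work.
        return originalScore;
    }

    @Override
    public boolean isFiltered(long id) {
        // Items absent from the map have zero known preferences.
        return preferenceCounts.getOrDefault(id, 0) < minPreferences;
    }

    public static void main(String[] args) {
        Map<Long, Integer> counts = new HashMap<>();
        counts.put(1L, 50); // well-rated item
        counts.put(2L, 2);  // barely-rated item
        IDRescorer rescorer = new MinPreferenceCountRescorer(counts, 5);
        System.out.println(rescorer.isFiltered(1L)); // false
        System.out.println(rescorer.isFiltered(2L)); // true
    }
}
```

As Sean notes, this makes the evaluation less representative of reality, since low-count items do exist in production; it is a way to probe the hypothesis, not necessarily something to ship.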
