Yep, weird that it is ratings only. Better still to have both ratings and playcounts.
From: Ted Dunning
Sent: Tuesday, February 15, 2011 7:41 AM
To: [email protected]
Subject: Re: Two learning competitions that might be of interest for Mahout

For music, there is an even bigger difference between ratings and what people want to listen to. It is, indeed, a pity that the data is ratings instead of listening histories.

On Tue, Feb 15, 2011 at 3:57 AM, Sean Owen <[email protected]> wrote:
> If I may guess at the answer --
>
> Yes, in theory it would be better to score output on the quality of its
> top recommendations rather than on the accuracy of predicted ratings,
> which are just one means to that goal. There are, of course, contexts
> where you have no ratings, so the winning technique here may not
> translate to those scenarios.
>
> Perhaps output would be scored on what proportion of the top k match
> the real top k preferred items, and so the test would withhold the
> user's top k rated items and ask recommenders to predict them. This
> has two problems I can see, however.
>
> The small problem is that chopping off the top ratings makes the test
> data systematically different from real data. There's a lot of
> "information" in those top ratings versus any arbitrary k.
>
> The bigger problem is that the user's top k ratings are not
> necessarily the same as the best k recommendations! Let's say I've
> never seen the movie Breathless, but, if I do, I'll find it's actually
> my favorite movie ever. A recommender would be right to make this a
> top recommendation, but a recommender evaluation framework such as
> this contest might use can't know that, so it would count that "wrong".
>
> Evaluating rating accuracy is at least unambiguous in comparison and
> so can form the basis of a competition.
>
> And to be fair, most people building production recommender systems
> would expect them to be able to estimate a rating in addition to
> making recommendations.
> On Tue, Feb 15, 2011 at 11:19 AM, Chen_1st <[email protected]> wrote:
> > Hi, Markus,
> >
> > I am curious why the competition still tries to predict rating
> > values, now that top-k recommendation is more practical in real-life
> > applications, and many papers have shown that rating-value
> > prediction is not so useful for discovering the top k items.
> >
> > Best Regards,
> >
> > Chen
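[The two evaluation styles being debated above can be made concrete with a small sketch. This is a hypothetical illustration with toy data, not code from the competition or from Mahout: `rmse` scores rating-prediction accuracy, while `precision_at_k` scores how many of a recommender's top-k items appear in the user's true top-k set.]

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error over aligned predicted/actual rating lists."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are truly relevant."""
    return sum(1 for item in recommended[:k] if item in relevant) / k

# Toy example: predicted vs. actual ratings for five items.
predicted = [4.0, 3.5, 5.0, 2.0, 4.5]
actual    = [4.0, 3.0, 4.5, 2.5, 5.0]
print(rmse(predicted, actual))  # unambiguous scalar score

# Top-5 recommendation list vs. the items the user actually rated highest.
# Note item "F" (the Breathless case): a genuinely good recommendation the
# held-out data can't confirm, so a top-k test would count it against us.
recs = ["A", "B", "F", "D", "E"]
true_top = {"A", "C", "D"}
print(precision_at_k(recs, true_top, 3))
```

[The sketch shows why rating accuracy "can form the basis of a competition": RMSE compares like with like, whereas precision@k penalizes any recommendation outside the held-out set, right or not.]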
