Hmm, not sure I understand. No, it's not true that real-life data regularly omits the user's top ratings. Why would that be?
How would you score the recommendations by holding out a random subset? That subset is definitely *not* representative of good recommendations -- you might be picking out items the user hates. Precision / recall don't really make sense unless you think you're holding out "good" recommendations, and those would have to be top-rated items.

Sean

On Tue, Feb 15, 2011 at 5:36 PM, Chen_1st <[email protected]> wrote:
> Hi, Sean,
>
> I cannot agree with you.
>
> The small problem you mentioned might indeed cause difficulties in
> prediction, but such problems also occur in real-life applications, right?
>
> As to the big problem you mentioned, of course we don't have the complete
> set of true results, but if the available subset of true results is randomly
> selected from the complete set, I think evaluation criteria like recall@k,
> precision@k, or NDCG are still meaningful.
>
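
To make the disagreement concrete, here is a minimal sketch (hypothetical code, not Mahout's evaluator) of precision@k and recall@k against a held-out set. Sean's point is that these numbers are only meaningful when the held-out items are ones the user actually liked (e.g. top-rated), since a random hold-out may contain items the user hates:

```python
def precision_at_k(recommended, held_out, k):
    """Fraction of the top-k recommendations that appear in the held-out set."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in held_out)
    return hits / k

def recall_at_k(recommended, held_out, k):
    """Fraction of the held-out set recovered within the top-k recommendations."""
    hits = sum(1 for item in recommended[:k] if item in held_out)
    return hits / len(held_out)

# Hypothetical user ratings on a 1-5 scale.
ratings = {"A": 5, "B": 5, "C": 4, "D": 2, "E": 1}

# Hold out the top-rated items (Sean's proposal), not a random subset:
# a random subset could hold out "D" or "E", which the user disliked,
# and then a recommender would be penalized for *not* recommending them.
held_out = {item for item, r in ratings.items() if r >= 4}  # {"A", "B", "C"}

recommended = ["B", "A", "E", "D"]  # a hypothetical ranked recommendation list

print(precision_at_k(recommended, held_out, 2))  # 1.0  (both of top-2 are liked)
print(recall_at_k(recommended, held_out, 2))     # 0.666... (2 of 3 liked items found)
```

This is only a sketch of the metric definitions under discussion; the function names and the rating threshold of 4 are illustrative assumptions, not anything specified in the thread.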
