I agree with both of you (Ted and Sean) about generating real-world recommendations, and I also think it makes sense to use a static neighbourhood when generating recommendations, because it would be too slow to estimate the target user's rating for every item in the system.
I still think it's worthwhile to compare the rating errors of different algorithms over all items, because in general the more accurate algorithm is likely to generate better recommendations. Comparing the actual generated recommendations is very important, but not as simple as comparing rating errors. As Ted said, you need to worry about the ordering of the top few items, and you typically don't want all the recommended items to be too similar to each other, or too obvious. Capturing all of these qualities in a single metric is hard, if not impossible.

In any case, I think the current behaviour should be documented, in the javadocs, on the wiki, or both. I'm happy to update the documentation if that's okay with you.

On Fri, Aug 13, 2010 at 06:09, Ted Dunning <ted.dunn...@gmail.com> wrote:
> Focussing on rating error is also problematic in that it causes us to worry
> about being correct about the estimated ratings for items that will *never*
> be shown to a user.
>
> In my mind, the only thing that matters in a practical system is the
> ordering of the top few items and the rough composition of the top few
> hundred items. Everything else makes no difference at all for real-world
> recommendations.
>
> On Thu, Aug 12, 2010 at 12:53 PM, Sean Owen <sro...@gmail.com> wrote:
>
> > I agree with your reading of what the Herlocker paper is saying. The
> > paper is focused on producing one estimated rating, not
> > recommendations. While those tasks are related -- recommendations are
> > those with the highest estimated ratings -- translating what's in
> > Herlocker directly to a recommendation algorithm is a significant
> > jump.
> >
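P.S. To make the rating-error-vs-ranking point concrete, here's a toy sketch (not Mahout code; the predictors and all numbers are invented for illustration). Predictor A has lower RMSE across all items but flips the two best items, so it recommends the wrong item first; predictor B has worse overall RMSE but gets the top of the ranking right:

```python
import math

# Hypothetical true ratings for ten items, sorted best-first.
true_ratings = [5.0, 4.5, 4.0, 3.5, 3.0, 2.5, 2.0, 1.5, 1.0, 0.5]

# Predictor A: small error everywhere, but it swaps the two best items.
pred_a = [4.4, 4.9, 3.9, 3.4, 2.9, 2.4, 1.9, 1.4, 0.9, 0.4]

# Predictor B: large error on the low-rated tail (items a user would
# never be shown), but the top items are ranked correctly.
pred_b = [5.0, 4.5, 4.0, 3.5, 2.0, 1.5, 1.0, 0.5, 0.0, 0.0]

def rmse(truth, pred):
    """Root mean squared error over all items."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth))

def best_item(pred):
    """Index of the item the predictor would recommend first."""
    return max(range(len(pred)), key=lambda i: pred[i])

# A wins on rating error (~0.24 vs ~0.72)...
print(rmse(true_ratings, pred_a), rmse(true_ratings, pred_b))

# ...but B makes the better recommendation: the true best item is item 0.
print(best_item(pred_a), best_item(pred_b))
```

So a metric over all items and a metric over the top of the list really can disagree, which is why I'd like both documented.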