On Thu, Jun 21, 2012 at 9:01 PM, Nimrod Priell <[email protected]> wrote:
> On a completely different subject: I wrote a simple RelevantItemsDataSplitter
> and RecommenderIRStatsEvaluator which take a list of item IDs, and run CF
> evaluation by hiding items only out of that list, and asking to recommend
> only out of that list of items (precision and recall are then also calculated
> only with that list of items as the universe).

Sure, if you know more specifically what the 'right answers' are in your use case, you can and should use that in the test. That's what the splitter class is for, and that's what you did, yes. The more important thing, of course, is to implement this in your actual recommender! You can use a Rescorer to penalize popular items, if that's what you believe improves the result quality.

> I realize an alternative to the example I proposed with the popularity is
> looking at the top-n recommendation for large n because only relatively few
> items are very popular so the precision-recall stats based on popularity
> become less skewed; But I still think it's a useful constraint for evaluation.

You mean you want to use a large "at" value in your test? This tends to increase recall but decrease precision. I don't know if it (necessarily) fixes something in this regard.
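
To make the trade-off concrete, here is a minimal sketch in plain Python, not Mahout code. It computes precision and recall at a given "at" value while treating a fixed list of item IDs as the whole universe, as described in the quoted message. The function name `precision_recall_at` and the toy item IDs are made up for illustration, and Mahout's own evaluator may differ in details such as how it chooses the held-out items.

```python
def precision_recall_at(ranked, held_out, universe, at):
    """Precision and recall at `at`, counting only items in `universe`.

    ranked   -- recommended item IDs, best first (hypothetical input)
    held_out -- item IDs hidden from the recommender for this user
    universe -- the only item IDs that count for this evaluation
    """
    allowed = set(universe)
    # Recommend only out of the allowed list, preserving rank order.
    candidates = [item for item in ranked if item in allowed]
    top = candidates[:at]
    # Held-out items outside the universe are ignored entirely.
    relevant = set(held_out) & allowed
    hits = sum(1 for item in top if item in relevant)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy data, invented for this example.
ranked = [10, 11, 12, 13, 14, 15, 16, 17]
universe = [11, 12, 13, 14, 15, 16]
held_out = [12, 15, 99]  # 99 is outside the universe, so it is not counted

for at in (2, 5):
    p, r = precision_recall_at(ranked, held_out, universe, at)
    print(f"at={at}: precision={p:.2f} recall={r:.2f}")
```

With this toy data, going from at=2 to at=5 lifts recall from 0.5 to 1.0 while precision drops from 0.5 to 0.4. That is the usual direction of the effect, though it is not guaranteed on every data set.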
