Yes, there should be an evaluator that lets the caller pass in which items are relevant. On the other hand, I am also trying to evaluate with all of the relevant items chosen at random. Maybe both implementations should exist.
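
To make the two variants concrete, here is a minimal sketch in plain Java. It does not use any Mahout classes: the class name PrecisionRecallSketch and all of its methods are hypothetical, item IDs are assumed to be distinct longs, and the recommendation list is assumed to be produced elsewhere. It only illustrates the precision/recall arithmetic discussed in this thread, not how RecommenderIRStatsEvaluator is actually implemented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical helper, not part of Mahout.
final class PrecisionRecallSketch {

    private PrecisionRecallSketch() {
    }

    /** Counts recommended items that are in the relevant set (IDs assumed distinct). */
    static int intersectionSize(List<Long> recommended, Set<Long> relevant) {
        int hits = 0;
        for (Long itemID : recommended) {
            if (relevant.contains(itemID)) {
                hits++;
            }
        }
        return hits;
    }

    /** precision = intersection / num_recommended_items; NaN if nothing was recommended. */
    static double precision(List<Long> recommended, Set<Long> relevant) {
        if (recommended.isEmpty()) {
            return Double.NaN;
        }
        return (double) intersectionSize(recommended, relevant) / recommended.size();
    }

    /** recall = intersection / num_relevant_items; NaN if there are no relevant items. */
    static double recall(List<Long> recommended, Set<Long> relevant) {
        if (relevant.isEmpty()) {
            return Double.NaN;
        }
        return (double) intersectionSize(recommended, relevant) / relevant.size();
    }

    /**
     * Random variant: picks up to 'at' of the user's preferred items at random
     * to act as the relevant set (suitable for boolean preferences, no threshold).
     */
    static Set<Long> pickRandomRelevant(List<Long> preferredItems, int at, Random random) {
        List<Long> shuffled = new ArrayList<>(preferredItems);
        Collections.shuffle(shuffled, random);
        int count = Math.min(at, shuffled.size());
        return new HashSet<>(shuffled.subList(0, count));
    }
}
```

In the first variant the caller builds the relevant set and passes it straight to precision() and recall(); in the second, pickRandomRelevant() builds it. Either way, whenever the relevant set and the recommendation list both have exactly "at" items, the two denominators are the same and precision equals recall, which is the effect described in the original question.
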

On 21 July 2011 15:59, Sean Owen <[email protected]> wrote:

> You mean, have the user specify all items that are considered relevant? yes
> that could be useful. Do you have a patch in mind?
>
> Your analysis is correct, and I would not call it a bug. It's a symptom of
> how little information the evaluation has to work with here without ratings.
> It has to pick random items as "relevant", for starters. It's another reason
> your idea is good, to let the user specify those relevant items.
>
> On Thu, Jul 21, 2011 at 1:49 PM, Marko Ciric <[email protected]> wrote:
>
> > Hi guys,
> >
> > I wonder if Mahout should have a "precision and recall" evaluator that
> > calculates the relevant items data set without looking to the relevance
> > threshold. This would be suitable for data sets with boolean preference
> > nature. In addition, the relevant items can be removed from the training
> > data set by random (removing first couple of preferred items every time
> > wouldn't be a great idea).
> >
> > On the other hand, having relevance threshold
> > with RecommenderIRStatsEvaluator set to 1.0 removes exactly "at" number of
> > items. As the recommender returns that number of items, the precision and
> > recall would have the same value. Is this Ok or is it a bug, given that
> > precision = intersection / num_recommended_items (where
> > num_recommended_items is almost always "at")
> > recall = intersection / num_relevant_items (also "at" as the previously
> > mentioned why relevanceThreshold is 1.0)?
> >
> >
> > --
> > Marko Ćirić
> > [email protected]
> >
>

--
Marko Ćirić
[email protected]
