Re: Evaluating boolean preference data sets

Sean Owen Thu, 21 Jul 2011 06:59:48 -0700

You mean, have the user specify all items that are considered relevant? Yes,
that could be useful. Do you have a patch in mind?


Your analysis is correct, and I would not call it a bug. It's a symptom of
how little information the evaluation has to work with here without ratings.
It has to pick random items as "relevant", for starters. It's another reason
your idea is good, to let the user specify those relevant items.
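
To make the arithmetic concrete, here is a minimal sketch in plain Python. It
is not Mahout code and does not reproduce RecommenderIRStatsEvaluator; the
function names, the toy item IDs and the fixed seed are all made up for
illustration. It only applies the two formulas from the quoted message: when
the number of held-out "relevant" items and the number of recommendations are
both "at", the two denominators are equal, so precision and recall must be
equal too.

```python
import random


def precision_recall_at(relevant, recommended):
    """Return (precision, recall) for one user.

    precision = intersection / number of recommended items
    recall    = intersection / number of relevant items
    """
    hits = len(set(relevant) & set(recommended))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def hold_out_random(preferred_items, at, seed=0):
    """Hypothetical helper: hold out `at` preferred items chosen at random.

    Returns (relevant, training); the two sets are disjoint.
    """
    rng = random.Random(seed)
    relevant = set(rng.sample(sorted(preferred_items), at))
    training = set(preferred_items) - relevant
    return relevant, training


# Toy boolean-preference user: every item is simply "preferred".
preferred = {"a", "b", "c", "d", "e", "f", "g", "h"}
relevant, training = hold_out_random(preferred, at=3)

# Pretend the recommender returned exactly `at` items, one of them relevant.
recommended = ["x", "y", sorted(relevant)[0]]

p, r = precision_recall_at(relevant, recommended)
print(p == r)  # same numerator, same denominator
```

If the user could supply the relevant set directly, its size would no longer
be tied to "at", and the two numbers could differ.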

On Thu, Jul 21, 2011 at 1:49 PM, Marko Ciric <[email protected]> wrote:

> Hi guys,
>
> I wonder if Mahout should have a "precision and recall" evaluator that
> calculates the relevant items data set without looking at the relevance
> threshold. This would be suitable for data sets with boolean
> preferences. In addition, the relevant items can be removed from the
> training data set at random (removing the first couple of preferred items
> every time wouldn't be a great idea).
>
> On the other hand, having the relevance threshold
> with RecommenderIRStatsEvaluator set to 1.0 removes exactly "at" number of
> items. As the recommender returns that number of items, the precision and
> recall would have the same value. Is this OK or is it a bug, given that
>  precision = intersection / num_recommended_items (where
> num_recommended_items is almost always "at")
>  recall = intersection / num_relevant_items (also "at", as mentioned
> above, when relevanceThreshold is 1.0)?
>
>
> --
> Marko Ćirić
> [email protected]
>
