The recommendation process ends with these steps:

1. Estimate a pref for each candidate item
2. (Optionally, rescore or filter those pref values)
3. Sort by estimated pref and return top items by pref
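
As a rough sketch of those three steps, here is some plain Java. It is not Mahout code: the class RecommendSketch, the Rescorer interface and the recommend method are made-up names for illustration, and it assumes step 1 has already produced a map from item ID to estimated pref, with NaN from the rescorer meaning "filter this item out".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RecommendSketch {

    /** Hypothetical stand-in for a rescorer: returns a new score, or NaN to drop the item. */
    interface Rescorer {
        double rescore(long itemID, double estimate);
    }

    /**
     * Step 1 is assumed done: estimates maps candidate item ID to estimated pref.
     * A null rescorer means "no rescoring".
     */
    static List<Long> recommend(Map<Long, Double> estimates, Rescorer rescorer, int howMany) {
        Map<Long, Double> scored = new HashMap<>();
        for (Map.Entry<Long, Double> e : estimates.entrySet()) {
            // Step 2: optionally rescore or filter the estimated pref.
            double score = rescorer == null
                    ? e.getValue()
                    : rescorer.rescore(e.getKey(), e.getValue());
            if (!Double.isNaN(score)) {
                scored.put(e.getKey(), score);
            }
        }
        // Step 3: sort by score, highest first, and keep the top items.
        List<Long> ids = new ArrayList<>(scored.keySet());
        ids.sort(Comparator.comparingDouble((Long id) -> scored.get(id)).reversed());
        return new ArrayList<>(ids.subList(0, Math.min(howMany, ids.size())));
    }
}
```

Anything that looks only at the estimates map, before the rescorer runs, will still see items that the rescorer would later drop.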

The evaluator is not evaluating the result at step #3, but at step #1 -- as
a proxy for evaluating the quality of the ultimate recommendations. It's
not necessarily any less valid to see how well it estimates the pref for an
item that happens to be expired. So yes I'd say the current behavior is
intended.

I take your point though. You could fairly easily
modify AbstractDifferenceRecommenderEvaluator to construct whatever test
and training data set you like. For example, you would probably put all
expired items in your training set and not in the test set.
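
Something along these lines is what I have in mind for the split itself. This is a minimal sketch in plain Java, not the evaluator's actual code: ExpiredAwareSplit, Pref and the expiredItemIDs set are hypothetical names, and it assumes you can list the expired item IDs up front.

```java
import java.util.List;
import java.util.Random;
import java.util.Set;

class ExpiredAwareSplit {

    /** One (item, rating) pair for a single user; a stand-in, not a Mahout type. */
    static final class Pref {
        final long itemID;
        final double value;

        Pref(long itemID, double value) {
            this.itemID = itemID;
            this.value = value;
        }
    }

    /**
     * Splits one user's prefs. Expired items always go to the training list;
     * every other pref goes to training with probability trainingPercentage
     * and to the test list otherwise.
     */
    static void split(List<Pref> prefs, Set<Long> expiredItemIDs, double trainingPercentage,
                      Random random, List<Pref> training, List<Pref> test) {
        for (Pref p : prefs) {
            if (expiredItemIDs.contains(p.itemID) || random.nextDouble() < trainingPercentage) {
                training.add(p);
            } else {
                test.add(p);
            }
        }
    }
}
```

With a split like this the similarity computation still learns from expired items, but the reported difference is computed only over items that could actually be recommended.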

If you're OK just modifying the code, go for that.
If you'd like to think of a clean way to incorporate a hook that lets you
replace the random test/training selection with custom logic, that's cool
too. I think it would be some work, if not a great deal, to cleanly
refactor out the random sampling.
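
To give an idea of what such a hook might look like, here is one possible shape. It is purely a sketch: SplitRule, randomRule and expiredAlwaysTraining are hypothetical names that do not exist in Mahout, and the real refactoring would need to thread something like this through the evaluator.

```java
import java.util.Random;
import java.util.Set;

class SplitHookSketch {

    /** Hypothetical hook: true means the preference is training data, false means test data. */
    interface SplitRule {
        boolean isTraining(long userID, long itemID);
    }

    /** A random choice per preference, standing in for the existing random selection. */
    static SplitRule randomRule(double trainingPercentage, Random random) {
        return (userID, itemID) -> random.nextDouble() < trainingPercentage;
    }

    /** Custom logic layered on top: expired items are never held out for testing. */
    static SplitRule expiredAlwaysTraining(Set<Long> expiredItemIDs, SplitRule fallback) {
        return (userID, itemID) ->
                expiredItemIDs.contains(itemID) || fallback.isTraining(userID, itemID);
    }
}
```

The evaluator would then ask the rule where each preference belongs instead of calling the random number generator directly, and the random rule would remain the default.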

On Tue, Nov 29, 2011 at 4:09 PM, Anatoliy Kats <[email protected]> wrote:

> Hi,
>
> I brought up this question in dev a few weeks ago.  I have a
> recommendation algorithm that learns the similarity matrix relying on both
> current items, and expired ones that should not be recommended.  However,
> AverageAbsoluteDifferenceRecommenderEvaluator compares the predicted
> and actual ratings for all items, expired or not.  I believe the evaluation
> would be more realistic if it did not -- it corresponds more closely to how
> the algorithm is normally deployed in production.  For example, the newer
> items generally have fewer clicks, so this kind of an evaluation emphasizes
> the cold start problem we would experience in production.
>
> The evaluation uses expired items even if I write a recommender class
> that forces all recommendations to use an IDRescorer that sets their scores
> to NaN.  The reason is that the ...Evaluator calls the
> Recommender::doEstimatePreference
> function to calculate the predicted rating, bypassing the recommend
> function.  I checked for the presence of expired items by running my
> recommender in the debugger, and checking the item IDs when
> doEstimatePreference is called.
>
> Do I understand the evaluator's behavior correctly?  Do you think this is
> considered a bug?
>
> Thanks,
>
> Anatoliy
>
