> The effect of downweighting the popular items is very similar to
> removing them from recommendations so I still suspect precision will
> go down using IDF. Obviously this can pretty easily be tested, I just
> wondered if anyone had already done it.
>
> This brings up a problem with holdout based precision. It measures
> the value of a model trained on a training set in predicting
> something that is in the holdout set. This may or may not correlate
> with affecting user behavior.
Indeed. The problem with holdout sets is that they only indicate what
users did with certain items. There's no way to know what they would
have done with items they were not exposed to.
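
To make the holdout-precision idea concrete, here is a rough sketch of
precision@k against held-out purchases. The data and the toy most-popular
recommender are purely hypothetical; a real evaluation would plug in the
actual recommender being compared.

    # Minimal sketch of holdout-based precision@k over purchase data.
    # Toy data and toy recommender, for illustration only.
    from collections import Counter

    train = {  # user -> items purchased in the training period
        "u1": {"a", "b", "c"},
        "u2": {"a", "c"},
        "u3": {"b", "d"},
    }
    test = {   # user -> items purchased in the held-out period
        "u1": {"d"},
        "u2": {"b", "e"},
        "u3": {"a"},
    }

    def recommend_most_popular(user, k):
        """Toy recommender: most purchased items the user hasn't bought yet."""
        popularity = Counter(i for items in train.values() for i in items)
        candidates = [i for i, _ in popularity.most_common()
                      if i not in train[user]]
        return candidates[:k]

    def precision_at_k(k):
        """Fraction of recommended items found in the user's holdout purchases."""
        scores = []
        for user, held_out in test.items():
            recs = recommend_most_popular(user, k)
            if recs:
                scores.append(len(set(recs) & held_out) / len(recs))
        return sum(scores) / len(scores)

    print(precision_at_k(2))

Note that an item the user was never exposed to can only show up as a
miss here, which is exactly the blind spot described above.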
>
> To use purchases as preference indicators, a precision metric would
> measure how well purchases in the training set predicted purchases in
> the test set. If IDF lowers precision, it may also affect user
> behavior strongly by recommending non-obvious (non-inevitable)
> items.
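
For reference, this is roughly what I understand by the IDF weighting
being discussed; the raw scores and the exact scheme (weight = log(N / n_i),
with n_i the number of users who bought item i) are just my assumption of
one common variant, not necessarily what the original poster has in mind.

    # One common way to apply IDF-style downweighting to item scores.
    import math

    user_items = {          # user -> purchased items (toy data)
        "u1": {"a", "b"},
        "u2": {"a", "c"},
        "u3": {"a", "d"},
        "u4": {"b", "d"},
    }
    num_users = len(user_items)

    def idf(item):
        """log(N / n_i): popular items (large n_i) get weights near zero."""
        n_i = sum(1 for items in user_items.values() if item in items)
        return math.log(num_users / n_i)

    raw_scores = {"a": 0.9, "b": 0.6, "d": 0.5}   # hypothetical recommender output
    weighted = {i: s * idf(i) for i, s in raw_scores.items()}

    # Item "a" (bought by 3 of 4 users) drops to the bottom of the ranking,
    # which is why the effect looks so much like removing popular items.
    print(sorted(weighted.items(), key=lambda kv: -kv[1]))

Whether that reshuffling actually hurts holdout precision is the
empirical question raised above.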
It's also a strategic decision: whether you want to use recommendations
to reinforce the "long tail" of your catalog or go with the sure thing.
This effect on user behavior AFAIK can't be measured from holdout
tests. I worry that precision-related measures may point us in the
wrong direction. Are A/B tests our only reliable metric for questions
like this?
I'm afraid I agree: A/B testing is the only truly valid proof that one
recommender configuration is better than another.
And even A/B testing may point us in the wrong direction. Say we find a
configuration that shows measurably better sales at a sufficient
significance level; that configuration is then the best one according to
an experimental A/B test, i.e. the Holy Grail of measures. But what if our
ultimate goal is customer retention? Maybe those short-term
recommendations of, say, very popular items (because we're not using the
IDF weights) are capturing sales we would have had anyway, while doing
nothing for customer loyalty because there's no perceived added value. So
in the long term we'll increase churn because our recommendations do not
differentiate us.
Life and business are complicated :-)
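
For what it's worth, the "sufficient significance level" part usually
comes down to something like a two-proportion test on conversion rate.
A sketch with invented counts (the numbers and the 5% threshold are
placeholders, not real data):

    # Rough sketch of an A/B significance check on purchase (conversion) rate.
    # It says nothing about retention or churn, which is exactly the problem.
    import math

    def z_test(conv_a, n_a, conv_b, n_b):
        p_a, p_b = conv_a / n_a, conv_b / n_b
        p = (conv_a + conv_b) / (n_a + n_b)            # pooled conversion rate
        se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
        return (p_b - p_a) / se                        # z-score of the uplift

    z = z_test(conv_a=480, n_a=10000, conv_b=560, n_b=10000)
    print(z)   # |z| > 1.96 would be "significant" at the 5% level
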
As for offline metrics, I consider them a hint that can help in pruning
the space of possible recommender configurations. But discarding one
system in favour of another based only on precision is risky; the
difference would need to be more than merely significant.