Re: "Direction" of co-occurence and log-likelihood ratio

Nimrod Priell Thu, 21 Jun 2012 14:42:18 -0700

On Jun 21, 2012, at 4:48 PM, Sean Owen wrote:

> On Thu, Jun 21, 2012 at 9:01 PM, Nimrod Priell <[email protected]> 
> wrote:
>> On a completely different subject: I wrote a simple 
>> RelevantItemsDataSplitter and RecommenderIRStatsEvaluator which take a list 
>> of item IDs, and run CF evaluation by hiding items only out of that list, 
>> and asking to recommend only out of that list of items (precision and recall 
>> are then also calculated only with that list of items as the universe).
> 
> Sure, if you know what the 'right answers' are more specifically in
> your use case, you can and should use that in the test. That's what
> the splitter class is for and that's what you did, yes.
> 
> The more important thing of course is to implement this in your actual
> recommender! you can use a Rescorer to penalize popular items, if
> that's what you believe improves the result quality.
>


In my real-time context, the recommender will never be asked to recommend 
items outside of the specified set, hence I want to evaluate only on that set. 
The reference to popularity is a little confusing, actually: it just so happens 
that I realized I should do the scoring differently because the items in my set 
are less popular. I didn't see any improvement from UBCF until I compared the 
results on these less popular items, rather than hiding items at random, a 
setup in which the popularity recommender does best.
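
To make the setup concrete, here is a minimal sketch of the restricted 
evaluation in plain Python. It is not the Mahout code or the patch itself: the 
function name, the `recommend` callback and the toy data are all made up for 
illustration, and for simplicity it hides every allowed item a user has rather 
than choosing "relevant" items by a threshold.

```python
def evaluate_restricted(recommend, user_items, allowed, at):
    """Mean precision and recall when only `allowed` items are hidden and recommended.

    `recommend(user, visible, candidates, at)` is a hypothetical callback that
    returns a ranked list of item IDs drawn from `candidates`.
    """
    precisions, recalls = [], []
    for user, items in user_items.items():
        hidden = items & allowed           # hide only items from the list
        if not hidden:
            continue                       # this user has nothing to test on
        visible = items - hidden           # what the recommender may see
        recs = recommend(user, visible, allowed, at)[:at]
        hits = len(set(recs) & hidden)
        precisions.append(hits / len(recs) if recs else 0.0)
        recalls.append(hits / len(hidden))
    if not precisions:
        return 0.0, 0.0
    return sum(precisions) / len(precisions), sum(recalls) / len(recalls)


# Toy data: items 1-3 are the allowed set, 10 and 11 are outside it.
user_items = {"u1": {1, 2, 10}, "u2": {2, 3, 11}, "u3": {10, 11}}
allowed = {1, 2, 3}


def lowest_id_first(user, visible, candidates, at):
    # Stand-in recommender for the demo; it ignores the user entirely.
    return sorted(candidates)[:at]


precision, recall = evaluate_restricted(lowest_id_first, user_items, allowed, at=2)
```

User u3 owns nothing from the allowed set and is skipped, so both averages are 
taken over u1 and u2 only.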

Cool. I'll try to look into submitting a patch this weekend and maybe others 
could gain from this.

> 
>> I realize an alternative to the example I proposed with the popularity is 
>> looking at the top-n recommendation for large n because only relatively few 
>> items are very popular so the precision-recall stats based on popularity 
>> become less skewed; But I still think it's a useful constraint for 
>> evaluation.
> 
> You mean you want to use as a large "at" value in your test? This
> tends to increase recall but decrease precision. I don't know if it
> (necessarily) fixes something in this regard.


What I do is scatter-plot top-n for several values of n (say, top-1, top-5, 
top-10, ...) in precision-recall space (or just compare F1 across several 
values of n). For small n, user-based CF is comparable to or even worse than 
"most popular" recommendation (in my test). However, as n becomes larger, the 
"most popular" recommender, which has no personalization, starts missing 
compared to UBCF, so its curve falls below UBCF's the farther out I go in 
recommendations. 
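
As a rough sketch of how the points for such a plot can be computed (plain 
Python, with a hypothetical `pr_curve` helper and made-up data, not anything 
from Mahout or recommenderlab): for one ranked recommendation list and a set of 
hidden items, take the top n for each n and compute precision, recall and F1.

```python
def pr_curve(ranked, relevant, ns=(1, 5, 10)):
    """Return [(n, precision, recall, f1)] for one ranked list of item IDs."""
    rows = []
    for n in ns:
        top = ranked[:n]
        hits = sum(1 for item in top if item in relevant)
        precision = hits / len(top) if top else 0.0
        recall = hits / len(relevant) if relevant else 0.0
        # F1 is the harmonic mean; define it as 0 when there are no hits.
        f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
        rows.append((n, precision, recall, f1))
    return rows


# Made-up example: ten ranked items, three of which were hidden.
ranked = list("abcdefghij")
relevant = {"a", "d", "j"}
rows = pr_curve(ranked, relevant)
```

Running this once per recommender (and averaging over users) gives one curve 
per algorithm, which can then be plotted or compared on F1.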

For an example (where I took this idea from), see the plot at the top of page 
22 of this guide to the R recommenderlab package:
http://cran.r-project.org/web/packages/recommenderlab/vignettes/recommenderlab.pdf

My rationale for this was that because 80% of the users have the most popular 
item, it is picked somewhat more often as the hidden item, and it's very easy 
for the popular recommender to guess it right. When I ask the recommender for 
many items, the effect of the popular ones is dampened. Does that make sense?
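
One way to sanity-check this reasoning on a data set is to compute how often a 
plain popularity recommender recovers a hidden item under leave-one-out. The 
sketch below is an assumption-laden toy, not Mahout's evaluator: the function 
name and data are hypothetical, each of a user's items is hidden in turn with 
equal weight, and popularity counts are taken over the full data (so the 
hidden item still contributes to its own count).

```python
from collections import Counter


def popular_hit_rate(user_items, n):
    """Expected leave-one-out hit rate of a most-popular recommender at top-n."""
    counts = Counter(item for items in user_items.values() for item in items)
    # Most popular first; ties broken by item ID so the ranking is deterministic.
    ranking = sorted(counts, key=lambda item: (-counts[item], item))
    total = 0.0
    for items in user_items.values():
        hits = 0
        for hidden in items:
            visible = items - {hidden}
            # Recommend the n most popular items the user does not already have.
            recs = [item for item in ranking if item not in visible][:n]
            hits += hidden in recs
        total += hits / len(items)
    return total / len(user_items)


# Toy data: 4 of 5 users (80%) have the most popular item, "hit".
toy = {
    "u1": {"hit", "a"},
    "u2": {"hit", "b"},
    "u3": {"hit", "c"},
    "u4": {"hit", "d"},
    "u5": {"a", "b"},
}
rate_at_1 = popular_hit_rate(toy, 1)
rate_at_5 = popular_hit_rate(toy, 5)
```

In this toy data most of the top-1 hits come from hiding "hit" itself, which is 
the easy case described above.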

Best,
Nimrod
