[ https://issues.apache.org/jira/browse/MAHOUT-305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837377#action_12837377 ]
Ted Dunning commented on MAHOUT-305:
------------------------------------

My own experience is that all that counts in recommendations is the probability of a click (interest) on a set of recommendations. As such, the best analog is probably precision at 10 or 20. I don't think that recall at 10 or 20 makes any sense at all: in a depth-limited situation like this, you have given up on recall and are only looking at precision.

Ankur's suggestion about keeping the most recent 4's and 5's as test data seems right to me. My only beefs are that you don't need rec...@10, and there is the question of what to do with unrated items. Presumably a new-style algorithm could surface items that the user hadn't thought of but really likes. In practice, I think that counting unrated items in the results as misses isn't a big deal in the Netflix data. In the real world, where test data is scarcer, I would count unrated items as misses in off-line evaluation, but try to run as many alternatives as possible against live users.

> Combine both cooccurrence-based CF M/R jobs
> -------------------------------------------
>
>                 Key: MAHOUT-305
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-305
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.2
>            Reporter: Sean Owen
>            Assignee: Ankur
>            Priority: Minor
>
> We have two different but essentially identical MapReduce jobs to make
> recommendations based on item co-occurrence:
> org.apache.mahout.cf.taste.hadoop.{item,cooccurrence}. They ought to be
> merged. Not sure exactly how to approach that but noting this in JIRA, per
> Ankur.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
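[Editorial sketch] The evaluation scheme discussed in the comment above — precision at a cutoff k, with held-out recent 4/5 ratings as the relevant set and unrated items counted as misses — can be sketched as follows. This is an illustrative example, not Mahout code; the function name and signature are hypothetical.

```python
def precision_at_k(recommended, relevant, k=10):
    """Fraction of the top-k recommendations that appear in the held-out set.

    `recommended` is an ordered list of item ids; `relevant` is the set of
    held-out items the user rated 4 or 5. Items the user never rated are
    simply absent from `relevant`, so they count as misses (hypothetical
    sketch of the policy discussed above).
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

# Example: 3 of the top 10 recommended items were held-out 4/5-rated items.
print(precision_at_k(list(range(10)), {1, 4, 7}, k=10))  # 0.3
```

Dividing by k rather than by the number of hits found is what makes this a precision measure: the depth limit fixes the denominator, which is why recall at the same cutoff carries little information, as noted above.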