Re: Problems with Mahout's RecommenderIRStatsEvaluator

Ted Dunning Sat, 16 Feb 2013 15:15:54 -0800

Sean

I think it is still a supervised learning problem in that there is a labeled
training data set and an unlabeled test data set.

Learning a ranking doesn't change the basic dichotomy between supervised and
unsupervised learning. It just changes the desired figure of merit.
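
To make the "figure of merit" point concrete, here is a minimal sketch that scores the same held-out ratings two ways: once with RMSE (prediction accuracy) and once with precision/recall at N (ranking quality). The item names, the ratings, the relevance threshold of 4.0, and both helper functions are hypothetical choices for illustration only. This is not Mahout's RecommenderIRStatsEvaluator.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error over held-out item -> rating pairs."""
    errors = [(predicted[item] - rating) ** 2 for item, rating in actual.items()]
    return math.sqrt(sum(errors) / len(errors))

def precision_recall_at_n(ranked_items, relevant_items, n):
    """Precision and recall of the first n ranked items against a held-out relevant set."""
    top_n = ranked_items[:n]
    hits = sum(1 for item in top_n if item in relevant_items)
    return hits / n, hits / len(relevant_items)

# Hypothetical held-out (labeled) ratings for one test user.
held_out = {"A": 5.0, "B": 4.0, "C": 2.0, "D": 1.0}
# Hypothetical predictions from some model trained on the remaining data.
predicted = {"A": 4.0, "B": 5.0, "C": 2.0, "D": 1.0}

# Assumption: an item counts as "relevant" if its held-out rating is >= 4.0.
relevant = {item for item, rating in held_out.items() if rating >= 4.0}

# Rank items by predicted rating, best first.
ranked = sorted(predicted, key=predicted.get, reverse=True)

precision, recall = precision_recall_at_n(ranked, relevant, n=2)
error = rmse(predicted, held_out)
```

In this toy case the top-2 list contains exactly the two relevant items even though two of the predicted ratings are each off by one, so the two measures disagree about how good the model is. The held-out labels are the same in both cases; only the measure applied to them differs.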


On Feb 16, 2013, at 1:32 PM, Sean Owen <[email protected]> wrote:

> Sure, if you were predicting ratings for one movie given a set of ratings
> for that movie and the ratings for many other movies. That isn't what the
> recommender problem is. Here, the problem is to list N movies most likely
> to be top-rated. The precision-recall test is, in turn, a test of top N
> results, not a test over prediction accuracy. We aren't talking about RMSE
> here or even any particular means of generating top N recommendations. You
> don't even have to predict ratings to make a top N list.
> 
> 
> On Sat, Feb 16, 2013 at 9:28 PM, Tevfik Aytekin <[email protected]> wrote:
> 
>> No, rating prediction is clearly a supervised ML problem.
>> 
>> On Sat, Feb 16, 2013 at 10:15 PM, Sean Owen <[email protected]> wrote:
>>> This is a good answer for evaluation of supervised ML, but this is
>>> unsupervised. Choosing randomly is choosing the 'right answers' randomly,
>>> and that's plainly problematic.
>>> 
>>> 
>>> On Sat, Feb 16, 2013 at 8:53 PM, Tevfik Aytekin <[email protected]> wrote:
>>> 
>>>> I think it is better to choose the ratings of the test user in a random
>>>> fashion.
>>>> 
>>>> On Sat, Feb 16, 2013 at 9:37 PM, Sean Owen <[email protected]> wrote:
>>>>> Yes. But: the test sample is small. Using 40% of your data to test is
>>>>> probably far too much.
>>>>> 
>>>>> My point is that it may be the least-bad thing to do. What test are
>>>>> you proposing instead, and why is it coherent with what you're testing?
>>>>> 
>>>> 
>>
