It may be true that the results are best with a neighborhood size of
2. Why is that surprising? Very similar people, by nature, rate
similar things, which makes the items you held out for a user's test
set likely to show up in the recommendations.
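
For concreteness, here is a minimal sketch of the evaluation in
question, using the non-distributed taste API (the ratings file name
is just a placeholder):

    import java.io.File;
    import org.apache.mahout.cf.taste.common.TasteException;
    import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
    import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
    import org.apache.mahout.cf.taste.impl.eval.RMSRecommenderEvaluator;
    import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
    import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
    import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
    import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
    import org.apache.mahout.cf.taste.model.DataModel;
    import org.apache.mahout.cf.taste.recommender.Recommender;

    public class NeighborhoodEval {
      public static void main(String[] args) throws Exception {
        DataModel model = new FileDataModel(new File("ratings.csv"));
        RecommenderBuilder builder = new RecommenderBuilder() {
          public Recommender buildRecommender(DataModel m) throws TasteException {
            PearsonCorrelationSimilarity similarity =
                new PearsonCorrelationSimilarity(m);
            // neighborhood of 2: only the two most similar users vote
            NearestNUserNeighborhood neighborhood =
                new NearestNUserNeighborhood(2, similarity, m);
            return new GenericUserBasedRecommender(m, neighborhood, similarity);
          }
        };
        RecommenderEvaluator evaluator = new RMSRecommenderEvaluator();
        // train on 90% of each user's prefs, test on the held-out 10%
        System.out.println(evaluator.evaluate(builder, null, model, 0.9, 1.0));
      }
    }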

The mapping you suggest is not that sensible, yes, since almost
everything maps to 1. Not surprisingly, most of your predictions are
then near 1. That's "better" in an absolute sense, but the RMSE is
worse relative to the variance of the data set. So either this is not
a good mapping, or RMSE is not a good metric for it -- don't do one of
those two things.
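
A toy illustration of the absolute-vs-relative point, with made-up
numbers:

    public class ScaleDemo {
      // RMSE of the best constant predictor (the mean); this equals
      // the standard deviation of the data
      static double rmseOfMeanPredictor(double[] xs) {
        double mean = 0;
        for (double x : xs) mean += x;
        mean /= xs.length;
        double se = 0;
        for (double x : xs) se += (x - mean) * (x - mean);
        return Math.sqrt(se / xs.length);
      }

      public static void main(String[] args) {
        // the same ratings on the raw 0..1000 scale, and linearly
        // mapped onto 1..5
        double[] raw = {100, 300, 500, 700, 900};
        double[] mapped = new double[raw.length];
        for (int i = 0; i < raw.length; i++) {
          mapped[i] = raw[i] / 250.0 + 1.0;
        }
        System.out.println(rmseOfMeanPredictor(raw));    // ~283: looks awful
        System.out.println(rmseOfMeanPredictor(mapped)); // ~1.13: looks fine
        // but RMSE divided by the data's standard deviation is exactly
        // 1.0 in both cases -- nothing actually improved
      }
    }

Compressing the scale shrank the RMSE by exactly the factor it shrank
the spread of the data, which is why the absolute number looks better
while nothing was gained.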

Try mean average precision for a metric that is not directly related
to the prediction values.
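
As far as I know the taste evaluators don't compute MAP directly, but
GenericRecommenderIRStatsEvaluator gives the closely related precision
and recall at N. Reusing the builder and model from the sketch above:

    import org.apache.mahout.cf.taste.eval.IRStatistics;
    import org.apache.mahout.cf.taste.eval.RecommenderIRStatsEvaluator;
    import org.apache.mahout.cf.taste.impl.eval.GenericRecommenderIRStatsEvaluator;

    RecommenderIRStatsEvaluator irEvaluator =
        new GenericRecommenderIRStatsEvaluator();
    // precision/recall at 10, letting Mahout pick the relevance threshold
    IRStatistics stats = irEvaluator.evaluate(
        builder, null, model, null, 10,
        GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1.0);
    System.out.println(stats.getPrecision());
    System.out.println(stats.getRecall());

These depend only on which items are recommended, not on the predicted
values, so a degenerate mapping can't game them in the same way.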

On Wed, May 8, 2013 at 2:45 PM, Zhongduo Lin <[email protected]> wrote:
> Thank you for your reply.
>
> I think the evaluation process involves randomly choosing the evaluation
> proportion. The problem is that I always get the best result when I set
> the number of neighbors to 2, which seems unreasonable to me, since there
> should be many test cases that the recommender couldn't predict at all.
> So why do I still get a valid result? How does Mahout handle this case?
>
> Sorry I didn't make myself clear on the second question. Here is the
> problem: I have a set of inferred preferences ranging from 0 to 1000, but
> I want to map them to 1 - 5. There are many possible ways to do the
> mapping. To take a simple example, suppose the mapping rule is the
> following:
>         if (inferred_preference < 995) preference = 1;
>         else preference = inferred_preference - 995;
>
> You can see that this is a really bad mapping algorithm, but if we feed
> the generated preferences to Mahout, it is going to give me a really nice
> result because most of the preferences are 1. So is there any other metric
> to evaluate this?
>
>
> Any help will be highly appreciated.
>
> Best Regards,
> Jimmy
>
>
> Zhongduo Lin (Jimmy)
> MASc candidate in ECE department
> University of Toronto
>
>
> On 2013-05-08 4:44 AM, Sean Owen wrote:
>>
>> It is true that a process based on user-user similarity only won't be
>> able to recommend item 4 in this example. This is a drawback of the
>> algorithm and not something that can be worked around. You could avoid
>> choosing such items for the test set, but then the test no longer quite
>> reflects reality.
>>
>> If you just mean that compressing the range of pref values improves
>> RMSE in absolute terms, yes it does of course. But not in relative
>> terms. There is nothing inherently better or worse about a small range
>> in this example.
>>
>> RMSE is a fine eval metric, but you can also consider mean average
>> precision.
>>
>> Sean
