Could you elaborate on your example? It's quite possible for an item
with an average rating of 2.5 to be a better recommendation for a user
than one whose average is 3.5 -- depends on the user of course.

Try the evaluation framework in org.apache.mahout.cf.taste.impl.eval
in order to figure out whether one or the other implementation is more
accurate, in the sense of correctly predicting ratings. It could be
that the results "look" wrong but are in fact accurate.
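As a rough sketch of how that framework can be used (this assumes the Mahout Taste jars are on the classpath and a ratings file in the usual userID,itemID,rating CSV format; depending on the Mahout version, evaluate() may not take the DataModelBuilder argument):

```java
// Sketch only -- requires the Mahout Taste classes, e.g.
// org.apache.mahout.cf.taste.impl.eval.AverageAbsoluteDifferenceRecommenderEvaluator
DataModel model = new FileDataModel(new File("ratings.csv"));

RecommenderBuilder builder = new RecommenderBuilder() {
  public Recommender buildRecommender(DataModel model) throws TasteException {
    ItemSimilarity similarity = new PearsonCorrelationSimilarity(model);
    return new GenericItemBasedRecommender(model, similarity);
  }
};

// Train on 90% of each user's preferences, evaluate on the rest.
// A lower score means a smaller average absolute error in predicted ratings.
RecommenderEvaluator evaluator =
    new AverageAbsoluteDifferenceRecommenderEvaluator();
double score = evaluator.evaluate(builder, null, model, 0.9, 1.0);
System.out.println(score);
```

Swapping in a different similarity or recommender in buildRecommender() lets you compare implementations on the same data.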

In what sense does the weighted average bias to the mean -- how does
the choice of weights have this effect? You can't use similarity
scores as weights directly since they're possibly negative, so
something must be done, and I don't know of a standard answer to this.
I'm open to a different way of patching this problem, but first want
to understand where the bias is.
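To make the negative-weight problem and the effect of the shift concrete, here is a small self-contained sketch (the ratings and similarities are invented, and this is not Taste's actual code):

```java
public class WeightedAverageBias {

    // Weighted average of neighbor ratings. With shift = 0.0 the raw
    // similarities are used as weights; with shift = 1.0 every weight
    // becomes (similarity + 1), i.e. the nonnegative-weights trick.
    public static double predict(double[] ratings, double[] sims, double shift) {
        double num = 0.0;
        double den = 0.0;
        for (int i = 0; i < ratings.length; i++) {
            double w = sims[i] + shift;
            num += w * ratings[i];
            den += w;
        }
        return num / den;
    }

    public static void main(String[] args) {
        // Hypothetical neighbor ratings and Pearson similarities.
        double[] ratings = {5.0, 1.0, 2.0};
        double[] sims = {0.9, -0.8, -0.7};

        // Raw weights sum to -0.6, so the "average" falls far outside
        // the 1-5 rating range (about -3.83).
        System.out.println(predict(ratings, sims, 0.0));

        // Shifted weights are 1.9, 0.2 and 0.3: every item now contributes,
        // which pulls the estimate from 5.0 (the one positively correlated
        // item's rating) toward the plain mean of all three ratings
        // (about 2.67); the result is about 4.29.
        System.out.println(predict(ratings, sims, 1.0));
    }
}
```

The second call illustrates the shrinkage being discussed: weakly or negatively correlated items all get small positive weights, so the prediction moves toward the mean of the neighbor ratings.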

For any given input, some algorithms will work well and some won't.
SVD is a great approach, and I wouldn't be surprised if it's working
better for whatever input you have.

On Mon, Feb 22, 2010 at 1:14 PM, Tamas Jambor <[email protected]> wrote:
> I'm doing some studies on the bias of recommender systems, and using this
> approach combined with Pearson correlation gives some very weird results.
> For example, if I take items that have a mean of less than 2.5, it is more
> likely that those items are ranked higher than items which have a very high
> mean (i.e. higher than 3.5). It took me a while to figure out why, and the
> reason is that the approach you take to calculate predictions always biases
> the score towards the mean. So I end up with a very low variance for the
> predicted items compared to, for example, SVD.
>
> Tamas
>
> On 22/02/2010 13:03, Sean Owen wrote:
>>
>> It's a good question. The bigger question here is, how do you create a
>> weighted average when weights can be negative? That leads to wacky
>> results like predicting ratings of -5 when ratings range from 1 to 5.
>>
>> My fix was to make all weights nonnegative in this way. If you ignore
>> items with similarity 0, what would you do with items with negative
>> similarity?
>>
>> You could ignore them, I suppose; that loses some key information, but
>> might be OK. It also presupposes that similarity 0 means no resemblance
>> at all, which isn't necessarily what 0 means -- at least in the context
>> of this framework. It holds for similarities built on things like the
>> Pearson correlation, but not for other metrics.
>>
>> Sean
>>
>>
>> On Mon, Feb 22, 2010 at 12:54 PM, Tamas Jambor<[email protected]>
>>  wrote:
>>
>>>
>>> hi,
>>>
>>> Just wondering how you justify adding +1 to the correlation when you
>>> calculate the score for a recommendation, so that items which are not
>>> correlated still contribute to the score. I think this biases the
>>> recommender towards the mean of the ratings of the target user (for
>>> item-based),
>>>
>>> Tamas
>>>
>>>
>>>
>>>
>
