On Mon, Feb 22, 2010 at 4:18 PM, Tamas Jambor <[email protected]> wrote:
>
>> What if the weights are 1,1,-1,-1? The estimate is -2 then. This is
>> why I say this won't work.
>>
>
> you could trim the result so you would get 1, which is the same as what you
> get with your approach.

What about the case where you have just 1-2 similar items, all with
similarity 0? The result is undefined -- the denominator sums to 0, so
you get 0/0. That could be patched too, but it's why defining it this
way feels problematic: you can't meaningfully take a weighted average
with negative weights.
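
To make that concrete, here's a toy sketch in plain Java (illustrative
names, not the actual recommender code) of the usual estimate
sum(w_i * p_i) / sum(w_i). With mixed-sign similarities the denominator
can cancel to 0, and with all-zero similarities you get 0/0:

// Toy sketch of the weighted-average preference estimate.
public final class WeightedAverageDemo {

  static double estimate(double[] prefs, double[] sims) {
    double num = 0.0;
    double den = 0.0;
    for (int i = 0; i < prefs.length; i++) {
      num += sims[i] * prefs[i];
      den += sims[i];
    }
    return num / den; // breaks down when similarities cancel or are all 0
  }

  public static void main(String[] args) {
    // Similarities 1, 1, -1, -1: the denominator is 0, so this prints
    // Infinity, and the numerator alone can easily go negative.
    System.out.println(estimate(new double[] {5, 4, 2, 1},
                                new double[] {1, 1, -1, -1}));
    // Two neighbors, both with similarity 0: 0/0 prints NaN.
    System.out.println(estimate(new double[] {3, 4},
                                new double[] {0, 0}));
  }
}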


> if you take cosine similarity, 0 means that the vectors are independent,
> which also implies that there is no mutual information, although cosine
> cannot get negative in this case.

Cosine similarity can be negative, but yes, 0 here also means no
relation. This isn't true of, say, a Euclidean distance-based measure.
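
For reference, a quick self-contained check (illustrative, not the
library's implementation) showing both behaviors -- 0 for orthogonal
vectors and a negative value for opposing ones:

// Cosine similarity ranges over [-1, 1].
public final class CosineDemo {

  static double cosine(double[] a, double[] b) {
    double dot = 0.0;
    double normA = 0.0;
    double normB = 0.0;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      normA += a[i] * a[i];
      normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
  }

  public static void main(String[] args) {
    // Orthogonal vectors: 0, i.e. "no relation".
    System.out.println(cosine(new double[] {1, 0}, new double[] {0, 1}));
    // Opposing vectors: -1, a perfectly negative relation.
    System.out.println(cosine(new double[] {1, 2}, new double[] {-1, -2}));
  }
}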


> evaluate. The first one is your approach, and the second one is the one I
> mentioned in the previous email.

How are you dealing with negative/undefined predictions, though? I am
also not sure what defining it this way does to the accuracy of the
estimated preference values, which is what the evaluator would test.
This tends to push estimates towards negative values.
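
If "trim" means clamping, I assume it looks something like this
(hypothetical helper; minPref/maxPref are the valid preference range):

final class TrimDemo {

  // Clamp the raw estimate into [minPref, maxPref]; NaN (the 0/0 case)
  // is mapped to the midpoint here, but that choice is arbitrary.
  static double trim(double estimate, double minPref, double maxPref) {
    if (Double.isNaN(estimate)) {
      return (minPref + maxPref) / 2.0; // arbitrary; could also skip the item
    }
    return Math.max(minPref, Math.min(maxPref, estimate));
  }

  public static void main(String[] args) {
    System.out.println(trim(-2.0, 1.0, 5.0));        // 1.0 -- the "trim to 1" case
    System.out.println(trim(Double.NaN, 1.0, 5.0));  // 3.0
  }
}

Clamping makes the numbers legal, but it doesn't remove the downward
bias; the low estimates just pile up at the minimum.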

I could believe it works a little better for your data set -- so you
should use your variation, especially if you only care about precision
and recall. I don't know that this is going to be better in general --
honestly, I don't know; I haven't studied it. Beyond that, I am just
having a hard time implementing this as an ill-defined weighted
average, no matter what it seems to do on one data set.

Is there no third way? Maybe someone can think of a standard solution to this.
