Multinomial likelihood ratios can handle any size contingency table. I haven't used them for this, though.
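As a rough sketch of what that would look like, here is the usual log-likelihood ratio statistic (G^2) computed for a table of counts with any number of rows and columns. The function name `g2_statistic` and the example counts are made up for illustration; this is not code from any particular library, and zero cells are assumed to contribute nothing to the sum.

```python
import math

def g2_statistic(table):
    """Log-likelihood ratio statistic (G^2) for an r x c table of counts.

    `table` is a list of rows, each a list of non-negative counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    g2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # empty cells contribute 0 by convention
                # Count expected in this cell if rows and columns were independent
                expected = row_totals[i] * col_totals[j] / total
                g2 += observed * math.log(observed / expected)
    return 2.0 * g2

# Hypothetical 3 x 2 table: rating level (negative / neutral / positive)
# against whether some other item was also viewed.
example = [
    [5, 20],
    [8, 25],
    [90, 60],
]
score = g2_statistic(example)
```

Larger scores mean the row and column variables look less independent. The same function handles the 2 x 2 case unchanged.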

Of course, it is commonly true that ratings break down as 80+% very positive, ~10% very negative and ~10% intermediate values. To my mind, this is just as well summarized as negative, positive or no strong value. Furthermore, there is very little loss in forgetting the negative ratings because they are so hard to interpret well (a negative rating often means "this is *exactly* what I wanted except for some tiny nit that drives me completely non-linear").

There is a long tradition going back to Shardanand of using multiple levels of scoring in collaborative filtering, but there is little evidence that it is useful.

Even more of a problem, though, is the fact that only a few percent of the users ever rate anything. That makes implicit observations much more useful for most recommendation tasks.

On Tue, Jun 23, 2009 at 4:20 PM, Sean Owen <[email protected]> wrote:

> Do any of the approaches you cite take into the account the value of
> the rating itself? I agree, seems like there should be some
> alternative to Pearson / cosine-measure to offer, but right now it's
> the only similarity metric that cares about the rating.
>
