I will have to read up on multinomial likelihood ratios. I don't see how mutual information applies to this problem, though; I'm still trying to figure out what the random variables are.
I think I have reached the same conclusion: rating data is typically noisy enough to make it hard to use. I agree about implicit observations too.

On Tue, Jun 23, 2009 at 8:12 PM, Ted Dunning <[email protected]> wrote:
> Multinomial likelihood ratios can handle any size contingency table. I
> haven't used them for this, though.
>
> Of course, it is commonly true that ratings break down as 80+% very
> positive, ~10% very negative and ~10% intermediate values. To my mind, this
> is just as well summarized as negative, positive or no strong value.
> Furthermore, there is very little loss in forgetting the negative ratings
> because it is so hard to interpret them well (a negative rating often means
> "this is *exactly* what I wanted except for some tiny nit that drives me
> completely non-linear"). There is a long tradition going back to Shardanand
> of using multiple levels of scoring in collaborative filtering, but there is
> little evidence that it is useful.
>
> Even more of a problem, though, is the fact that only a few percent of the
> users ever rate anything. That makes implicit observations much more useful
> for most recommendation tasks.
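
For reference, here is a minimal sketch of how a multinomial log-likelihood ratio (the G statistic) could be computed for a contingency table of any size, and how it relates to mutual information. The choice of random variables is an assumption made for illustration only and is not stated in the thread: the row variable might be "user interacted with item A" (yes/no) and the column variable "rating level given to item B" (negative, intermediate, positive). The function names `log_likelihood_ratio` and `mutual_information` are hypothetical. Under these assumptions, G = 2 * N * MI, where MI is the mutual information (in nats) of the empirical joint distribution of the two variables.

```python
import math


def log_likelihood_ratio(table):
    """G statistic for an r x c contingency table of counts.

    G = 2 * sum_ij k_ij * ln(k_ij * N / (row_i * col_j)),
    where N is the grand total and row_i, col_j are the marginal sums.
    Cells with a zero count contribute nothing to the sum.
    """
    n = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    g = 0.0
    for i, row in enumerate(table):
        for j, k in enumerate(row):
            if k > 0:
                g += k * math.log(k * n / (row_sums[i] * col_sums[j]))
    return 2.0 * g


def mutual_information(table):
    """Mutual information (nats) of the empirical joint distribution."""
    n = sum(sum(row) for row in table)
    return log_likelihood_ratio(table) / (2.0 * n)


# Hypothetical counts. Rows: saw item A / did not see item A.
# Columns: negative, intermediate, positive rating of item B.
counts = [
    [2, 3, 45],
    [8, 7, 35],
]
print(log_likelihood_ratio(counts), mutual_information(counts))
```

A table whose rows are exact multiples of each other (perfect independence) gives a score of zero; larger values indicate a stronger association between the row and column variables.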
