P Kishor wrote: > Good advice on storing often used values to speed up the computation, > but, as I noted above, I *am* looking for a Bayesian (weighted?) > ranking system, not the one I ended up inadvertently implying above.
So, Bayesian rating generally works like this: W = (R*v + C*m) / (v + m) where W is the desired weighted rating, R is the item's true rating (total number of stars / total number of votes for that item), v is the number of votes for the item, C is the average rating of all items (sum of all true ratings / number of items), and m is a magic number. The idea is that, with v << m, W ~ C (the null hypothesis in the absence of evidence is that all items are about average); with v >> m, W ~ R (when the number of votes is significant, the evidence is "believable", so we accept the hypothesis that the true rating is the correct one); with v=m, W = (R + C) / 2 (so m is the number of votes that is "half-believable", so to speak). The choice of m is more art than science. Two popular approaches are 1) just making m some constant, based on what you think is a sufficient number of votes to be trusted; 2) making m be some fraction of the average number of votes per item (let's call it A, calculated as total number of votes overall / number of items): m = p*A. http://www.thebroth.com/blog/118/bayesian-rating suggests m = A (p = 1), though a smaller p might add more dynamism to the ratings (when you have lots of votes for existing items, new items may take a long time accumulating enough votes to break away from the average rating, no matter how good or bad they might be). Note that, to calculate W, you don't need individual votes - just the total number of stars (S) and the number of votes (v) for each item (or, alternatively, R and v, but R is a bit harder to update when a new vote is recorded). So, suppose you have a table votes(itemId, S, v). You can calculate weighted ratings like this: select itemId, (S + C * m) / (v + m) as W from votes, (select :p * avg(v) as m, avg(S/v) as C from votes); Now, you seem to want to precalculate C and m (or C and A), perhaps for performance reasons. You would need to keep track of some global totals. Create a separate table totals(number_of_items, number_of_votes, sum_of_ratings), insert a single row with all zeros. Suppose you want to register a new vote (:id, :stars). That would go something like this: insert or replace into votes(itemId, S, v) values (:id, coalesce((select S from votes where itemId = :id), 0) + :stars, coalesce((select v from votes where itemId = :id), 0) + 1); update totals set number_of_items = number_of_items + (select v = 1 from votes where itemId=:id), number_of_votes = number_of_votes + 1, sum_of_ratings = sum_of_ratings + (select S / v - (case when v = 1 then 0 else (S - :stars) / (v - 1) end) from votes where itemId = :id); With this data, weighted ratings can be calculated as follows: select itemId, (S + C * m) / (v + m) as W from votes, (select :p * number_of_votes/number_of_items as m, sum_of_ratings/number_of_items as C from totals); None of the above is tested, everything is off the top of my head. Use with caution. Igor Tandetnik _______________________________________________ sqlite-users mailing list [email protected] http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

