On 1/17/09, Erik Jones <ejo...@engineyard.com> wrote:
> On Jan 15, 2009, at 8:06 PM, Tom Lane wrote:
>> "Jamie Tufnell" <die...@googlemail.com> writes:
>>> item_count int -- this is derived from (select count(*) from items
>>> where group_id = id)
>>> ...
>>>
>>> item_count would be updated by insert/update/delete triggers on the
>>> items table, hopefully that would ensure it is always correct?
>>
>> Concurrent updates to the items table make this much harder than
>> it might first appear.  If you're willing to serialize all your
>> updating transactions then you can make it work, but ...
>
> That was exactly the caveat I was about to point out.  That being
> said, keeping COUNT() values and other computed statistics derived
> from other data in the database *is* a fairly common tactic.  One
> method that I've used to great success to avoid the serialization
> problem is to have your triggers insert the necessary information
> into a separate "update queue" table.  You then have a separate
> process that routinely sweeps that queue, aggregates the updates,
> and applies the total for each group's entry to the count values
> in the groups table.
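A minimal sketch of that queue approach, assuming the schema implied by the
thread (groups(id, item_count), items(id, group_id) -- all names here are
hypothetical, not from the original post). The triggers record per-row
deltas instead of updating groups.item_count directly, so concurrent
writers never contend on the same groups row:

```sql
-- Queue of per-row deltas written by the triggers.
CREATE TABLE item_count_queue (
    group_id integer NOT NULL,
    delta    integer NOT NULL   -- +1 for an insert, -1 for a delete
);

CREATE OR REPLACE FUNCTION queue_item_count() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO item_count_queue VALUES (NEW.group_id, 1);
    ELSIF TG_OP = 'DELETE' THEN
        INSERT INTO item_count_queue VALUES (OLD.group_id, -1);
    ELSIF TG_OP = 'UPDATE' AND NEW.group_id <> OLD.group_id THEN
        -- Item moved between groups: decrement old, increment new.
        INSERT INTO item_count_queue VALUES (OLD.group_id, -1);
        INSERT INTO item_count_queue VALUES (NEW.group_id, 1);
    END IF;
    RETURN NULL;   -- AFTER trigger; return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER items_count_queue
    AFTER INSERT OR UPDATE OR DELETE ON items
    FOR EACH ROW EXECUTE PROCEDURE queue_item_count();

-- Sweeper, run periodically (e.g. from cron).  The brief lock keeps
-- rows enqueued mid-sweep from being deleted unapplied; writers block
-- only for the duration of the sweep rather than on every update.
BEGIN;
LOCK TABLE item_count_queue IN EXCLUSIVE MODE;
UPDATE groups g
   SET item_count = item_count + q.total
  FROM (SELECT group_id, sum(delta) AS total
          FROM item_count_queue
         GROUP BY group_id) q
 WHERE g.id = q.group_id;
DELETE FROM item_count_queue;
COMMIT;
```

Whether the lock-and-sweep is acceptable depends on how hot the queue is;
the point of the design is that the short, infrequent sweep replaces
serializing every writing transaction against the groups row.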
Fortunately our items table rarely sees concurrent writes.  It's over
99% reads and is typically updated by just one user.  We are already
caching these aggregates and other data in a separate layer, and my
goal is to see whether I can get rid of that layer.  In light of your
advice, though, I will think things through a bit more first.

Thanks for your help!
Jamie

-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql