I've been working on the same project and figured I'd chip in. A compromise that avoids synchronous replication would be to identify which functions in our code need "live" (recently modified) data, and ensure that the queries those functions issue are sent to the master database (where all INSERT and UPDATE operations are performed). For other functions, where a few seconds of staleness doesn't matter, queries would be directed to a replicated slave database.
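To make the idea concrete: stripped of pgpool2 entirely, the split is just "pick a connection per call site." The DSNs and helper below are hypothetical, a minimal sketch of the policy rather than anything we actually run:

    import psycopg2

    # Hypothetical connection strings: the master takes all writes plus any
    # reads that must see just-committed data; the slave serves reads that
    # can tolerate a few seconds of replication lag.
    MASTER_DSN = "host=db-master dbname=myapp user=myapp"
    SLAVE_DSN = "host=db-slave dbname=myapp user=myapp"

    def connect(needs_fresh_data=False):
        """Return a connection to the master or to a replicated slave."""
        return psycopg2.connect(MASTER_DSN if needs_fresh_data else SLAVE_DSN)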
It isn't clear how to achieve this. We have pgpool2 working in master/slave mode, but it doesn't offer very fine-grained control over how queries get directed. The only hook I could find at the application level was to step in and out of the "read committed" isolation level. According to the Postgres documentation this makes no difference in actual behavior, because "read committed" is the minimum isolation level Postgres implements: a request for "read uncommitted" is silently treated as "read committed". pgpool2, however, isn't aware of this; it directs all queries to the master at "read committed" and load-balances them when the isolation level is "read uncommitted". I was able to direct queries from individual functions to the master database by wrapping them in a decorator that sets the connection's isolation level to 1 and then back to 0 (a rough sketch is at the end of this message). However, this seems way too sketchy for us to be comfortable with, so I wonder if there is a better way.

Michael

On Nov 3, 10:19 pm, Adam Seering <[email protected]> wrote:
> Hi,
>         We're running a website that usually runs just fine on our server; but
> every now and then we get a big load burst (thousands of simultaneous
> users in an interactive Web 1.5-ish app), and our database server
> (PostgreSQL) just gets completely swamped.
>
> We'd like to set up some form of load-balancing. The workload is very
> SELECT-heavy, so this seems plausible. It looks like Slony is the
> recommended package for doing this. However, if we set up a Slony
> cluster and use pgpool to divide up queries among the nodes, the default
> isolation level requested by psycopg forces all the queries to go to the
> master database, which defeats the purpose of the cluster. If we force
> the system to a lower isolation level, all kinds of things start
> breaking, because data doesn't appear quickly enough in the slave
> databases, and various chunks of Django code (and our code) seem to rely
> on writing data and immediately reading it back.
>
> Does anyone else do this type of load-balancing? Any tips? In
> general, what (if anything) do folks here do for load-balancing?
>
> Thanks,
> Adam
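For anyone who wants to try the same trick, here's roughly what the decorator looks like. Treat it as a minimal sketch rather than our production code: it assumes Django's default django.db connection object, psycopg's set_isolation_level(), and the pgpool2 routing behavior described above; needs_master is just a name I made up.

    import functools
    from django.db import connection

    def needs_master(func):
        """Route all queries made inside func to the master database."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            connection.cursor()          # ensure the raw connection is open
            raw = connection.connection  # underlying psycopg connection
            raw.set_isolation_level(1)   # "read committed": pgpool2 -> master
            try:
                return func(*args, **kwargs)
            finally:
                raw.set_isolation_level(0)  # back to load-balanced routing
        return wrapper

You'd then put @needs_master on any function that has to read its own writes immediately. The caveat stands, though: this piggybacks on a pgpool2 quirk, so an upgrade on either side could silently change where queries go.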

