Stuart Bishop has proposed merging lp:~stub/launchpad/replication into lp:launchpad.
Requested reviews:
  Launchpad code reviewers (launchpad-reviewers)
Related bugs:
  Bug #307407 in Launchpad itself: "slave database should never be used when lag is too great"
  https://bugs.launchpad.net/launchpad/+bug/307407
  Bug #345835 in Launchpad itself: "Database load balancing should use slave lag, not cluster lag"
  https://bugs.launchpad.net/launchpad/+bug/345835
  Bug #447453 in Launchpad itself: "Changes made through the API (via javascript) aren't blacklisting the Slave DBs"
  https://bugs.launchpad.net/launchpad/+bug/447453
  Bug #461800 in Launchpad itself: "new-slave.py no longer works"
  https://bugs.launchpad.net/launchpad/+bug/461800
  Bug #504696 in Launchpad itself: "Replication lag checks can block"
  https://bugs.launchpad.net/launchpad/+bug/504696
  Bug #504751 in Launchpad itself: "Standalone slave not subscribed to the authdb replication set"
  https://bugs.launchpad.net/launchpad/+bug/504751
  Bug #504807 in Launchpad itself: "authdb replication set sequence values not being restored on staging"
  https://bugs.launchpad.net/launchpad/+bug/504807
  Bug #514267 in Launchpad itself: "InternalError on clusters under busy load"
  https://bugs.launchpad.net/launchpad/+bug/514267
  Bug #1014661 in Launchpad itself: "Replication lag checks do not understand PG 9.1 streaming replication"
  https://bugs.launchpad.net/launchpad/+bug/1014661

For more details, see:
https://code.launchpad.net/~stub/launchpad/replication/+merge/121410

= Summary =

We want systems that only need a hot standby database to be always available, even during database updates. To support this, we plan to have the fast downtime deployment scripts stagger how db changes get applied (master first while hot standby is available, then hot standby when the master is available).

== Proposed fix ==

If a client requests a hot standby Store, and the hot standby is down, return the master Store instead. I'm doing this in the BaseDatabasePolicy, so this logic affects everything.
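
A minimal sketch of the fallback described above, to make the intended behaviour concrete. It is not the code in lib/lp/services/webapp/dbpolicy.py: the class name, the StoreUnavailable exception, the flavor constants and the injected connect() callable are all hypothetical stand-ins, and stores are represented by plain strings.

```python
# Hypothetical flavor names; the real constants may differ.
MASTER_FLAVOR = "master"
STANDBY_FLAVOR = "standby"


class StoreUnavailable(Exception):
    """Raised by the hypothetical connector when a database is down."""


class FallbackDatabasePolicy:
    """Toy stand-in for a base database policy (assumed shape)."""

    def __init__(self, connect):
        # connect(name, flavor) returns a store or raises StoreUnavailable.
        self._connect = connect

    def getStore(self, name, flavor):
        try:
            return self._connect(name, flavor)
        except StoreUnavailable:
            if flavor != STANDBY_FLAVOR:
                # The master itself is down: nothing to fall back to.
                raise
            # The hot standby is down: hand back the master store instead.
            return self._connect(name, MASTER_FLAVOR)


def demo_connect(name, flavor):
    """Simulate a cluster whose hot standby is currently down."""
    if flavor == STANDBY_FLAVOR:
        raise StoreUnavailable(f"{name}-{flavor} is down")
    return f"{name}-{flavor}"


policy = FallbackDatabasePolicy(demo_connect)
store = policy.getStore("main", STANDBY_FLAVOR)
print(store)  # main-master
```

Because the fallback sits in the base policy in this sketch, every caller that asks for a standby store gets the same behaviour without changes at the call site. A request for the master still fails loudly when the master is down.
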
This means not only do we get the behavior we are after with the hot-standby-only clients (which don't exist yet), but other systems will also become available sooner during an FDT update, because they will only be down until the database updates have been applied on the master, not until those changes have propagated to the hot standbys.

== Pre-implementation notes ==

== LOC Rationale ==

== Implementation details ==

== Tests ==

== Demo and Q/A ==

= Launchpad lint =

Checking for conflicts and issues in changed files.

Linting changed files:
  lib/lp/services/webapp/dbpolicy.py
-- 
https://code.launchpad.net/~stub/launchpad/replication/+merge/121410
Your team Launchpad code reviewers is requested to review the proposed merge of lp:~stub/launchpad/replication into lp:launchpad.

