On Tue, 2010-01-05 at 18:54 -0200, Guilherme Salgado wrote:
> (CCing launchpad-dev as others might have ideas/suggestions)
>
> On Tue, 2010-01-05 at 08:13 +0000, Tom Haddon wrote:
> > On Mon, 2010-01-04 at 18:16 -0200, Guilherme Salgado wrote:
> > > On Mon, 2009-12-21 at 09:21 +0000, Tom Haddon wrote:
> > > > On Fri, 2009-12-18 at 10:37 -0500, Gary Poster wrote:
> > > > > I like the suggestions I've read. Thanks to all three of you. I'll
> > > > > summarize the proposals so far.
> > > > >
> > > > > - We will switch logrotation to use SIGHUP.
> > > > >
> > > > > - We will use SIGUSR2 as a flag for checking for the presence of a
> > > > >   "read-only.txt" at the top of the tree.
> > > > >
> > > > > - At application start, or when SIGUSR2 fires, if "read-only.txt" is
> > > > >   found at the top of the tree, the application will switch to (or stay
> > > > >   in) read-only mode. If it is not found, the application will switch
> > > > >   to (or stay in) normal read-write mode.
> > > > >
> > > > > - We will provide a key-value page to verify the read-only status of
> > > > >   (each) application.
> > > > >
> > > > > Here are my thoughts:
> > > > >
> > > > > - I think the key-value page would be very valuable for LOSA peace of
> > > > >   mind, so I like the idea. However, it is only pertinent for a given
> > > > >   application instance. Going to this page through the load-balancer
> > > > >   would not be valuable. LOSAs, would you immediately use this page
> > > > >   if we offered it, going to each instance in the cluster?
> > > >
> > > > It'd be nice, but I don't want to block on it.
> > > >
> > > > > If not, I'd like to push it out of the scope of this effort, until we
> > > > > can think about offering an aggregated view of information like this
> > > > > in a dashboard like the one Maris will hopefully be working on this
> > > > > cycle.
> > > > >
> > > > > - I think we should definitely log mode switches.
> > > > >   Then LOSAs can at least trail the logs for a given instance to
> > > > >   verify that the app noticed the signal and the presence or absence
> > > > >   of the file.
> > > >
> > > > +1
> > > >
> > > > > - If the LOSAs don't want to rock the boat with changing logrotation
> > > > >   to SIGHUP, we do have a swath of signals from SIGRTMIN to SIGRTMAX
> > > > >   that we could use. I'm in favor of the SIGHUP switch if the LOSAs
> > > > >   don't mind, though.
> > > > >
> > > > This switch is okay.
> > > >
> > >
> > > Today I started working on this, and following is my initial plan:
> > >
> > >     Currently, the way we switch to read-only is by changing the
> > >     read_only config to True *and* changing the main_master and
> > >     main_slave configs to point to standalone databases. What we
> > >     want is to get rid of the read_only config and collapse the
> > >     extra config files we have for read-only mode (lpnet1-db-update)
> > >     into the lpnet1 config.
> > >
> > >     In order to do this we will use the presence of a file
> > >     (read-only.txt) on the root of the tree to identify (upon
> > >     startup or SIGUSR2) whether or not we're in read-only mode, and
> > >     set the main_master and main_slave configs appropriately. As
> > >     we'll be overwriting these config variables, we'll need to store
> > >     all different values we might use for them in new variables
> > >     (e.g. rw_main_master, rw_main_slave, ro_main_master and
> > >     ro_main_slave). (we might even get rid of the main_master and
> > >     main_slave config variables as they will be computed values,
> > >     which can be moved somewhere else. although I'm not sure this
> > >     is a good idea because all other db names live in config
> > >     variables).
> > >
> > >     The plan:
> > >
> > >     • Change all places that use config.launchpad.read_only to use
> > >       another helper, which tells whether or not we're in read-only
> > >       mode by looking for a read-only.txt file.
> > >     • switch logrotation to use SIGHUP.
> > >     • Rename main_master and main_slave to rw_main_master and
> > >       rw_main_slave, adding new (and empty) main_master and main_slave
> > >       config variables, which get set upon startup/SIGUSR2 (with the
> > >       values of rw_*).
> > >     • log read-only/read-write switches
> > >
> > > However, after I started implementing it I realized that having two
> > > switches (the read-only.txt file and the SIGUSR2) to turn on read-only
> > > doesn't sound like a very good idea (as we may accidentally leave an app
> > > server in an inconsistent state), so we may want to use SIGUSR2 to
> > > create a read-only.txt file *and* trigger the code that sets the configs
> > > with the appropriate values.
> >
> > You don't need to worry about creating/deleting the read-only.txt file -
> > we'll manage that through external means (initscripts or other helper
> > scripts). I'd envisage you only need one signal which means "check again
> > whether we're in read-only or read-write mode".
> >
>
> As we discussed on IRC, my concern was that having a read-only.txt file
> did not mean we were in read-only mode -- the SIGUSR2 is needed, and if
> forgotten the server would be in an inconsistent state. In that state,
> the python code thinks we're running in read only (because it relies on
> read-only.txt for that) but we're still connecting to the rw db (because
> we rely on SIGUSR2 to change to the ro dbs).
>
> Anyway, that didn't seem to be a big deal as this is going to be handled
> by scripts, so I went ahead and tried to implement that. As usual, I've
> encountered some problems, and they seem to boil down to the way our
> config works -- the config variables are immutable so to make changes we
> need to push/pop overlays on top of the existing config.
>
> Since config.pop(name) removes the overlay with the given name and any
> others that were on top of it, we can't rely on config.push/pop to
> update the config values because we might end up inadvertently reverting
> others' changes and others might do the same to ours. I think this
> push/pop mechanism was meant only for testing purposes.
>
> After realizing that I came up with another approach, which relies only
> on the presence/absence of the read-only.txt file to figure out the mode
> we're in. In this approach, config.database.main_master/slave are gone
> and we use dbconfig.main_master/slave instead, which are properties in
> DatabaseConfig that return the appropriate value according to the mode
> we're in.
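[Illustrative aside] The property-based approach described in the quoted paragraph could look roughly like the sketch below. It is not the actual Launchpad code: the constructor arguments, the `tree_root` parameter and the `is_read_only` helper are assumptions made for illustration, and the real DatabaseConfig presumably reads the rw_*/ro_* values from the config rather than taking them as arguments.

```python
import os


class DatabaseConfig:
    """Hypothetical stand-in for Launchpad's DatabaseConfig (illustration only)."""

    def __init__(self, tree_root, rw_main_master, rw_main_slave,
                 ro_main_master, ro_main_slave):
        # Path of the flag file at the top of the tree.
        self._flag_path = os.path.join(tree_root, "read-only.txt")
        self._rw = (rw_main_master, rw_main_slave)
        self._ro = (ro_main_master, ro_main_slave)

    def is_read_only(self):
        # The file's presence is the only switch; no signal is involved.
        return os.path.isfile(self._flag_path)

    @property
    def main_master(self):
        # Computed on every access, so it always matches the current mode.
        return self._ro[0] if self.is_read_only() else self._rw[0]

    @property
    def main_slave(self):
        return self._ro[1] if self.is_read_only() else self._rw[1]
```

Because both the mode and the database names derive from the same file check, the code cannot believe it is read-only while still pointing at the read-write databases. The cost is one filesystem check per property access.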

Does this mean we're checking for the presence of this text file before
every database operation? That sounds quite IO intensive.

> Although that simplifies things for us and for LOSAs, it also means we
> can't easily log mode switches (because we don't have the signal
> anymore).

Surely the server knows a current state, and then if that changes you
could log it?

> We could easily work around that by pushing config changes,
> but I'd be very uncomfortable doing that, for the reasons I explained
> above.
>
> So, I'd like to know if this would be an acceptable solution, and
> whether or not we can live without logs of the mode switches?
>
> > That make sense?
> >
> > > Similarly, when starting up we'd check for
> > > the presence of read-only.txt and set the config variables with the
> > > appropriate values. That means we can't use SIGUSR2 to switch back to
> > > read-write mode, though.
> > >
> > > An alternative that would not have any of the problems described above
> > > would be to keep the existing code using config.launchpad.read_only and
> > > have the helper function (which looks for read-only.txt) just update
> > > that config variable upon startup/SIGUSR2. That way it'd be much harder
> > > to have an appserver in read-only mode using the wrong DB, and we'd be
> > > able to use SIGUSR2 to switch back to read-write mode.
> > >
> > > Any preferences/suggestions?
> > >

_______________________________________________
Mailing list: https://launchpad.net/~launchpad-dev
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~launchpad-dev
More help   : https://help.launchpad.net/ListHelp
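
[Illustrative aside] The two questions raised in the reply above (the IO cost of checking the file on every database operation, and logging a switch once the server notices its state changed) could both be handled along the lines of the sketch below. Everything here is hypothetical: `ModeTracker`, `check_interval`, the injectable `clock` and the logger name are invented for illustration and do not come from the Launchpad tree.

```python
import logging
import os
import time

log = logging.getLogger("read_only_mode")


class ModeTracker:
    """Hypothetical helper: caches the read-only check and logs switches."""

    def __init__(self, flag_path, check_interval=5.0, clock=time.monotonic):
        self._flag_path = flag_path
        self._check_interval = check_interval
        self._clock = clock
        # Remember the mode seen at startup so later changes can be detected.
        self._read_only = os.path.isfile(flag_path)
        self._last_check = clock()
        self.switches = []  # modes switched to, oldest first

    def is_read_only(self):
        now = self._clock()
        # Touch the filesystem at most once per check_interval seconds.
        if now - self._last_check >= self._check_interval:
            self._last_check = now
            current = os.path.isfile(self._flag_path)
            if current != self._read_only:
                self._read_only = current
                mode = "read-only" if current else "read-write"
                self.switches.append(mode)
                log.warning("Switched to %s mode", mode)
        return self._read_only
```

The trade-off is latency: a mode change may go unnoticed for up to `check_interval` seconds, so the interval would have to be chosen with the database-update procedure in mind.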

