On Tue, 2010-01-05 at 21:12 +0000, Tom Haddon wrote:
> On Tue, 2010-01-05 at 18:54 -0200, Guilherme Salgado wrote:
> > (CCing launchpad-dev as others might have ideas/suggestions)
> >
> > On Tue, 2010-01-05 at 08:13 +0000, Tom Haddon wrote:
> > > On Mon, 2010-01-04 at 18:16 -0200, Guilherme Salgado wrote:
> > > > On Mon, 2009-12-21 at 09:21 +0000, Tom Haddon wrote:
> > > > > On Fri, 2009-12-18 at 10:37 -0500, Gary Poster wrote:
> > > > > > I like the suggestions I've read. Thanks to all three of you. I'll
> > > > > > summarize the proposals so far.
> > > > > >
> > > > > > - We will switch logrotation to use SIGHUP.
> > > > > >
> > > > > > - We will use SIGUSR2 as a flag for checking for the presence of a
> > > > > > "read-only.txt" at the top of the tree.
> > > > > >
> > > > > > - At application start, or when SIGUSR2 fires, if "read-only.txt" is
> > > > > > found at the top of the tree, the application will switch to (or
> > > > > > stay in) read-only mode. If it is not found, the application will
> > > > > > switch to (or stay in) normal read-write mode.
> > > > > >
> > > > > > - We will provide a key-value page to verify the read-only status of
> > > > > > (each) application.
> > > > > >
> > > > > > Here are my thoughts:
> > > > > >
> > > > > > - I think the key-value page would be very valuable for LOSA peace
> > > > > > of mind, so I like the idea. However, it is only pertinent for a
> > > > > > given application instance. Going to this page through the
> > > > > > load-balancer would not be valuable. LOSAs, would you immediately
> > > > > > use this page if we offered it, going to each instance in the
> > > > > > cluster?
> > > > >
> > > > > It'd be nice, but I don't want to block on it.
> > > > >
> > > > > > If not, I'd like to push it out of the scope of this effort, until
> > > > > > we can think about offering an aggregated view of information like
> > > > > > this in a dashboard like the one Maris will hopefully be working on
> > > > > > this cycle.
> > > > > >
> > > > > > - I think we should definitely log mode switches. Then LOSAs can at
> > > > > > least trail the logs for a given instance to verify that the app
> > > > > > noticed the signal and the presence or absence of the file.
> > > > >
> > > > > +1
> > > > >
> > > > > > - If the LOSAs don't want to rock the boat with changing logrotation
> > > > > > to SIGHUP, we do have a swath of signals from SIGRTMIN to SIGRTMAX
> > > > > > that we could use. I'm in favor of the SIGHUP switch if the LOSAs
> > > > > > don't mind, though.
> > > > >
> > > > > This switch is okay.
> > > > >
> > > >
> > > > Today I started working on this, and following is my initial plan:
> > > >
> > > > Currently, the way we switch to read-only is by changing the
> > > > read_only config to True *and* changing the main_master and
> > > > main_slave configs to point to standalone databases. What we
> > > > want is to get rid of the read_only config and collapse the
> > > > extra config files we have for read-only mode (lpnet1-db-update)
> > > > into the lpnet1 config.
> > > >
> > > > In order to do this we will use the presence of a file
> > > > (read-only.txt) on the root of the tree to identify (upon
> > > > startup or SIGUSR2) whether or not we're in read-only mode, and
> > > > set the main_master and main_slave configs appropriately. As
> > > > we'll be overwriting these config variables, we'll need to store
> > > > all different values we might use for them in new variables
> > > > (e.g. rw_main_master, rw_main_slave, ro_main_master and
> > > > ro_main_slave).
> > > > (we might even get rid of the main_master and
> > > > main_slave config variables as they will be computed values,
> > > > which can be moved somewhere else. although I'm not sure this
> > > > is a good idea because all other db names live in config
> > > > variables).
> > > >
> > > > The plan:
> > > >
> > > > • Change all places that use config.launchpad.read_only to use
> > > >   another helper, which tells whether or not we're in read-only
> > > >   mode by looking for a read-only.txt file.
> > > > • switch logrotation to use SIGHUP.
> > > > • Rename main_master and main_slave to rw_main_master and
> > > >   rw_main_slave, adding new (and empty) main_master and
> > > >   main_slave config variables, which get set upon
> > > >   startup/SIGUSR2 (with the values of rw_*).
> > > > • log read-only/read-write switches
> > > >
> > > > However, after I started implementing it I realized that having two
> > > > switches (the read-only.txt file and the SIGUSR2) to turn on read-only
> > > > doesn't sound like a very good idea (as we may accidentally leave an app
> > > > server in an inconsistent state), so we may want to use SIGUSR2 to
> > > > create a read-only.txt file *and* trigger the code that sets the configs
> > > > with the appropriate values.
> > >
> > > You don't need to worry about creating/deleting the read-only.txt file -
> > > we'll manage that through external means (initscripts or other helper
> > > scripts). I'd envisage you only need one signal which means "check again
> > > whether we're in read-only or read-write mode".
> > >
> >
> > As we discussed on IRC, my concern was that having a read-only.txt file
> > did not mean we were in read-only mode -- the SIGUSR2 is needed, and if
> > forgotten the server would be in an inconsistent state. In that state,
> > the python code thinks we're running in read only (because it relies on
> > read-only.txt for that) but we're still connecting to the rw db (because
> > we rely on SIGUSR2 to change to the ro dbs).
> >
> > Anyway, that didn't seem to be a big deal as this is going to be handled
> > by scripts, so I went ahead and tried to implement that. As usual, I've
> > encountered some problems, and they seem to boil down to the way our
> > config works -- the config variables are immutable so to make changes we
> > need to push/pop overlays on top of the existing config.
> >
> > Since config.pop(name) removes the overlay with the given name and any
> > others that were on top of it, we can't rely on config.push/pop to
> > update the config values because we might end up inadvertently reverting
> > others' changes and others might do the same to ours. I think this
> > push/pop mechanism was meant only for testing purposes.
> >
> > After realizing that I came up with another approach, which relies only
> > on the presence/absence of the read-only.txt file to figure out the mode
> > we're on. On this approach, config.database.main_master/slave are gone
> > and we use dbconfig.main_master/slave instead, which are properties in
> > DatabaseConfig that return the appropriate value according to the mode
> > we're on.
>
> Does this mean we're checking for the presence of this text file before
> every database operation? That sounds quite IO intensive.
ISTM that the presence of the file would be checked only a couple times
(once for each of the properties in DatabaseConfig that look for that
file) for each handler thread, as a consequence of storm creating the DB
connections when they're first used. If that's correct, then we'll have
to find a way to reset the stores in all threads when we switch modes --
something I didn't realize before.

> > Although that simplifies things for us and for LOSAs, it also means we
> > can't easily log mode switches (because we don't have the signal
> > anymore).
>
> Surely the server knows a current state, and then if that changes you
> could log it?

Not in the current implementation, as it relies on a @property which
checks the presence of read-only.txt, but it's easy to change that. Not
sure what I had in mind when I wrote the above.

> > We could easily work around that by pushing config changes,
> > but I'd be very uncomfortable doing that, for the reasons I explained
> > above.
> >
> > So, I'd like to know if this would be an acceptable solution, and
> > whether or not we can live without logs of the mode switches?
> >
> > > That make sense?
> > >
> > > > Similarly, when starting up we'd check for
> > > > the presence of read-only.txt and set the config variables with the
> > > > appropriate values. That means we can't use SIGUSR2 to switch back to
> > > > read-write mode, though.
> > > >
> > > > An alternative that would not have any of the problems described above
> > > > would be to keep the existing code using config.launchpad.read_only and
> > > > have the helper function (which looks for read-only.txt) just update
> > > > that config variable upon startup/SIGUSR2. That way it'd be much harder
> > > > to have an appserver in read-only mode using the wrong DB, and we'd be
> > > > able to use SIGUSR2 to switch back to read-write mode.
> > > >
> > > > Any preferences/suggestions?

-- 
Guilherme Salgado <[email protected]>
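
For illustration only, here is a rough sketch of how the file-based
properties could remember the last mode they saw and log a switch when it
changes. This is not the Launchpad code: DatabaseConfigSketch,
is_read_only(), the logger name and the constructor arguments are all
hypothetical names made up for the example, and it does nothing about
thread safety or about resetting the stores in other threads.

```python
import logging
import os

logger = logging.getLogger("read_only_sketch")  # hypothetical logger name


class DatabaseConfigSketch:
    """Hypothetical stand-in for DatabaseConfig, not the real class."""

    def __init__(self, root, rw_main_master, rw_main_slave,
                 ro_main_master, ro_main_slave):
        # The marker file lives at the top of the tree.
        self._marker = os.path.join(root, "read-only.txt")
        self._rw = {"main_master": rw_main_master,
                    "main_slave": rw_main_slave}
        self._ro = {"main_master": ro_main_master,
                    "main_slave": ro_main_slave}
        self._last_mode = None  # unknown until the first check

    def is_read_only(self):
        """Check for the marker file and log if the mode has changed."""
        read_only = os.path.isfile(self._marker)
        if self._last_mode is not None and read_only != self._last_mode:
            logger.warning(
                "Switching to %s mode",
                "read-only" if read_only else "read-write")
        self._last_mode = read_only
        return read_only

    @property
    def main_master(self):
        values = self._ro if self.is_read_only() else self._rw
        return values["main_master"]

    @property
    def main_slave(self):
        values = self._ro if self.is_read_only() else self._rw
        return values["main_slave"]
```

With something like this the switch is only noticed (and logged) the next
time one of the properties is read, so the log entry can lag behind the
moment the file was created or removed.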
_______________________________________________
Mailing list: https://launchpad.net/~launchpad-dev
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~launchpad-dev
More help   : https://help.launchpad.net/ListHelp

