On Mon, 4 Oct 2010 12:22:57 +1300, Robert Collins <[email protected]> wrote:
> On Mon, Oct 4, 2010 at 11:40 AM, Michael Hudson
> <[email protected]> wrote:
> > On Mon, 27 Sep 2010 10:41:51 +1200, Robert Collins
> > <[email protected]> wrote:
> >> I've hit my tolerance level for stale librarian processes and am
> >> looking to address this in the test environment. I want to make sure I
> >> preserve existing use cases - please tell me if I have missed any.
> >
> > I freely admit that for whatever reason I'm not thinking very clearly
> > today, so I may be being dense and not understanding things.  But I hope
> > I'm making some kind of sense at least.  I also started to reply before
> > reading the whole mail, sorry about that...
>
> Thanks for digging in in detail.
NP.

> >> 1) Run an external process for the life of *a* test
> >
> > I think that there are two subcases here, one for a server-type process
> > where you start the process, get it to do some things and then shut it
> > down (the librarian would be a prototypical example, I guess) and the
> > case where you want to invoke a script to do something (e.g. the script
> > that performs a code import).
> >
> > I also think there are two cases when it comes to intent: one where the
> > external process is part of the test fixture and one where it's the
> > system under test.
> >
> > For all that, I don't think the implementation needs to take this into
> > account very much -- I think the problems each case faces are about the
> > same.
> >
> >> * create a working config for it
> >> * start it
> >
> > It's a trivial point, but one that AIUI requires some engineering time:
> > we need to start it, *using the previously created config*.  I guess we
> > can do this using the LPCONFIG variable -- maybe we establish a
> > convention that if LPCONFIG starts with a / it is interpreted as an
> > absolute path to a directory containing config files.  We also need to
> > make sure that lazr.config can serialize itself out to a config file
> > that it can read in again.
>
> We don't need lazr.config serialization, as long as we can hand-write
> (per helper) the fragment as a string template.

True.

> What we need is the ability to combine multiple fragments, e.g.
> inheriting (for a config on disk) / push() (for an in-memory fragment).
> Luckily this is easy.
>
> We can in the simplest fashion create a config called by the name of
> the working dir we create for the helper, if absolute paths don't Just
> Work - that will occasionally leave grot around on disk in the tree,
> but easily cleaned, and bugs can be filed to improve.

I guess, although I suspect making absolute paths Just Work will be
little effort.

> > These may both be already implemented for all I know.  But I don't think
> > so, and I'm pretty sure that they're not routinely utilized.
> >
> > Using a hidden parameter like an environment variable is somewhat gross
> > I guess, but I can't think of a better way.  We *could* be super regular
> > about having each of our scripts take a --config option and similarly
> > use a super regular API in all our tests to invoke each subprocess with
> > the appropriate value for this option, but that doesn't seem likely to
> > come to pass somehow.
>
> We need the main appserver and *everything* to take this option, and
> to apply it to any child processes they start. So I'm pretty sure an
> option isn't anywhere near as easy as an environment variable.

Yeah.

> >> * run test
> >> * kill it
> >> * clean up any overhead left in the work environment [that we care about]
> >>
> >> 2) For many tests
> >> * create working config
> >> * start it
> >> -run tests-
> >>  * check it's still ok and do per-test isolation
> >>  * run a test
> >> * kill it
> >> * clean up
> >
> > In some sense, this doesn't seem interestingly different to the above
> > problem, you just start and stop the process in different places.
>
> The main difference is needing a per-test step, which in layers is
> 'setUpTest' and 'tearDownTest'. I'm not convinced that having two
> separate calls is needed; I think one call (reset) is sufficient.

Right.

> > Again, there's the issue of propagating 'which config to use' around.
> >
> >> 3) For 'make run'
> >> * Use a small commandline tool that
> >>  * creates working config
> >>  * starts it
> >>  * waits for SIGINT
> >>  * stops it
> >>  * cleans up
> >>
> >> 4) for concurrent testing
> >> * As for many tests, but the creation of a working config needs to be
> >> safe in the presence of concurrent activity.
> >> * The created config needs to be sharable with *other* external
> >> processes (e.g. the buildmaster may want to talk to the librarian)
> >
> > Putting this bullet here doesn't really make sense to me, although I
> > agree it's necessary.  We already have many tests that invoke scripts
> > that need to talk to the librarian, for example.
>
> We don't have any helpers that depend on a tree of dynamically created
> config fragments at the moment, and we have to make sure that we can
> glue them together.

I hope we only need a stack, not a tree...

> For instance, without this point we might do:
> - start unique helpers by running an external process which returns a config
> - helpers don't get given a special config, we just apply the config
>   they spit out in-process.
>
> So yes, it's necessary, and we could easily implement in a way that
> fails this if we don't require it.

Ah yes.  Well, I think I get the point you're making anyway :)

> >> 5) For low-overhead iteration
> >> * Find an existing external process
> >> * Must 'know' its config a-priori
> >> -run tests-
> >>  * check the process is running, do per-test isolation
> >>  * run a test
> >
> > Hm.  This feels a bit tricky, but I guess you don't need this at the
> > same time as concurrent testing...
>
> Ideally starting things like appservers and the librarian wouldn't
> take 6-7 seconds. Removing zcml will probably fix that (and then we
> can remove the low-overhead iteration story completely).

Um, I'm not sure what you mean by removing the ZCML? It was my
impression (and hey, I might be wrong) that it was the work the ZCML
implied, not the ZCML itself, that takes up most of the time of setting
up the component architecture.

Although I agree -- it would be nice if services started really fast.

> >> If the above set is complete, then I am proposing to combine things in
> >> the following way.
> >> Firstly, because it's a good building block, the make run use case.
> >> Note that the current code is duplicative/overlapping with the test
> >> helper code - I'm proposing to consolidate this.  This shouldn't look
> >> too different to our runlaunchpad Service today, except that we'll
> >> have more entry points (to do cleanup etc).
> >>  - the small helper will do the following for a given service:
> >>   start up the instance
> >>   optionally print a bash snippet with variables (like ssh-agent
> >> does), including the helper's pid
> >>    - this is useful for running up isolated copies
> >>  - main process runs
> >
> > I think I agree that the code we use in the 'make run' case is nicer and
> > has a nicer API than the code we use in the tests, but I'm not sure that
> > the make run use case is very much like the concurrent testing case --
> > partly because of state that's embedded in things like the apache config
> > and partly because the way you interact with things in the 'make run'
> > case doesn't have a convenient parent process/child process relationship
> > to know which running instance we should be talking to.
>
> I don't quite follow you here. I'm saying that there is a DAG (nothing
> has mutual dependencies - this is true today, and if it stops being
> true we could separate 'allocate and execute' to get a DAG); given
> that we have a parent->child structure we can create (but it won't be
> process based, rather we'd bring up helpers, and inject their returned
> config into the next as per the dag).
>
> >> This lets us capture useful things from starting it off without
> >> needing a well known location a-priori.
> >
> > To state my above point again: in this case, sure you can start it off
> > without needing a well-known location, but how do you find it again?
>
> It returns it as shell variables - 'bash snippets'.

I'm slightly unconvinced by this interface -- but only slightly, and I
need to think a bit more. It'll work, I guess.

> >> We can even 'run' postgresql in this fashion, and have it return the
> >> DB name to use.
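To make the ssh-agent comparison concrete, here is a rough sketch of what the output side of such a helper could look like. Everything in it (the function name, the LP_* variable names) is an illustrative guess at the contract being discussed, not existing Launchpad code:

```python
import os
import tempfile


def emit_shell_snippet(service_name, pid, config_dir):
    """Build an eval-able bash snippet, the way ssh-agent advertises
    SSH_AUTH_SOCK/SSH_AGENT_PID.  Values are assumed shell-safe here;
    a real helper should quote them."""
    prefix = "LP_%s" % service_name.upper()
    lines = []
    for suffix, value in [("PID", str(pid)), ("CONFIGFILE", config_dir)]:
        var = "%s_%s" % (prefix, suffix)
        lines.append("%s=%s; export %s;" % (var, value, var))
    return "\n".join(lines)


if __name__ == "__main__":
    # A real helper would daemonise the service first; here we just
    # advertise our own pid and a scratch config directory.
    config_dir = tempfile.mkdtemp(prefix="librarian-")
    print(emit_shell_snippet("librarian", os.getpid(), config_dir))
```

A developer (or a parent fixture) would then do something like eval "$(bin/run -i test --daemonise)" so that LP_LIBRARIAN_PID and friends land in the environment, and a later stop command can find the daemonised wrapper through them.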
> >
> > Oh, I see, maybe you're thinking of this more at the level of how the
> > 'make run' code builds up the various interacting processes it needs to
> > start?
>
> Yes.

This means that some of my waffle above is a bit misdirected :-)

> >> Now, the cheap test iteration case can be addressed:
> >>  - devs run eval `bin/run -i test --daemonise`
> >>   - this outputs all the variables for all the servers started.
> >>  - test code looks for *per-service* information about pid files etc.
> >>   e.g. LP_LIBRARIAN_PIDFILE and LP_LIBRARIAN_CONFIGFILE rather than
> >> LP_PERSISTENT_TEST_SERVICES
> >>  - to kill, eval `bin/test-servers --stop`
> >>   (which will kill the daemonised wrapper, and unset the environment
> >> variables).
> >>  - If LP_PERSISTENT_TEST_SERVICES is set and a service isn't running,
> >> I propose to error, because I think it usefully indicates a bug in
> >> that external process, and this is easier than detecting both 'not
> >> started yet' and 'started but crashed' - especially given the test
> >> runner's tendency to fork sub-runners.
> >>
> >> Concurrent testing then is easy: as long as all the fixtures are
> >> meeting this contract, if the *default* behaviour is to bring up a
> >> unique instance, everything will come up fine.
> >
> > Well, "easy": I think we'll find we have a lot of absolute paths in our
> > testrunner config currently, and these will need to be shaken out before
> > we can reliably run tests in parallel on the same machine.  I think that
> > if we can solve the external process problem, the other problems will be
> > shallow even if they might be numerous, but maybe you are focusing a bit
> > too much on the process case here.
> >
> > To try to be super clear, here's an example/use case: the acceptance
> > tests for the scanner currently create branches in a known location
> > (/var/tmp/bazaar.launchpad.net/mirrors I think), invoke the scanner
> > script and check that the database has been updated.  In the new world,
> > I guess they would create some branches in a temporary directory and
> > then use some API to set the supermirror.hosted_branches_by_id (iirc)
> > config key to point at this temporary directory and invoke the scanner
> > script and check the database as before.
>
> Yes. Specifically I would probably layer it like this:
>   branches = BranchHostingFixture()
>   scanner = ScannerFixture(branches)
>   branches.setUp()
>   scanner.setUp()
>   # make branches in branches.mirror_dir
>   scanner.trigger_scan()

I think that makes sense. If BranchHostingFixture() is the thing that
sets LPCONFIG, I'm not sure it needs to be passed to ScannerFixture --
if they're communicating via a hidden channel (an env var) it's perhaps
best not to make it look like they're communicating over a visible one?
Minor detail though.

> > This mythical API would have to work in a way that crosses process
> > boundaries, possibly by creating yet another temporary directory,
> > creating a file in it called launchpad.conf that looks like this:
> >
> > extends: $previous-value-of-LPCONFIG/launchpad.conf
> >
> > [supermirror]
> > hosted_branches_by_id: file:///path/to/tmp/dir
>
> created by branches.setUp(); found by scanner.setUp()

Right.

> > and update the LPCONFIG environment variable to point at this directory
> > (as well as updating the in-process in-memory config).  It would
> > presumably be organized as a fixture to make undoing all this easy.
> >
> > Does this make sense?
>
> Yes.

Good.

> >> Note that in this model there is never a need to do more than 'kill
> >> helper-pid' to shut something down: that helper pid will encapsulate
> >> all the cleanup logic, kill-9ing, dropdb of temporary dbs etc, and the
> >> helper code should be simple and robust. This will help give us a
> >> simple, robust interface.
> >
> > Right, that bit sounds nice.  We should of course work on fixing it, but
> > I don't think we can rely on our processes shutting down nicely when
> > asked.  Having a wrapper that does that sounds good.
>
> [...]
>
> >> What do you think?
> >
> > I think it by and large makes sense.  The only part that I think might
> > have to change is that you talk about environment variables like
> > LP_LIBRARIAN_CONFIGFILE, but in fact in Launchpad all services are
> > configured from one config file.  So I think we'll end up managing a
> > stack of config files, and pushing and popping from the stack will be a
> > matter of updating where LPCONFIG points.
> >
> > Apologies for the slightly rambling reply, I hope it's useful.
>
> It is indeed useful.

Glad to hear it.

Cheers,
mwh

_______________________________________________
Mailing list: https://launchpad.net/~launchpad-dev
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~launchpad-dev
More help   : https://help.launchpad.net/ListHelp

