On Fri, 1 Jun 2007, Edmund Smith wrote:
There are problems with behavioral tests as well of course. Not only
are many of the things that you want to test inaccessible from shell script
probes as well, but shell tests inevitably occur after deployment, by which
time it may be too late to undo the damage you caused by doing the change in
the first place, and the question of what to do after deployment fails is a
hard one.

The 'what to do after deployment fails' question is pretty easily solved by keeping your configs in a revision control system and having some rollback scripts. (We have a fairly nice rollback system here; other places I've worked just said "make a backup copy before you do anything." Either way, rollback on config files is a solved problem, even at most smaller/backwards shops. Rollback on binaries is less solved, but it is much closer to being solved than the configuration testing problem.) Of course, you need good monitoring so you know right away when things are broken, and often the time required to deploy, detect the problem, and roll back is Not Good Enough.
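As a rough illustration of the "make a backup copy before you do anything" approach, here is a minimal Python sketch. The function name `deploy_with_rollback`, the `.bak` suffix, and the `healthy` callback are all made up for this example; `healthy` stands in for whatever monitoring check you actually trust. A real setup would more likely check the file out of revision control than copy it.

```python
import shutil
from pathlib import Path
from typing import Callable


def deploy_with_rollback(config: Path, new_text: str,
                         healthy: Callable[[], bool]) -> bool:
    """Write new_text to config; restore the previous copy if healthy() fails.

    Returns True if the new config was kept, False if it was rolled back.
    """
    backup = config.with_name(config.name + ".bak")  # hypothetical naming
    shutil.copy2(config, backup)   # the "make a backup copy" step
    config.write_text(new_text)    # the deployment itself
    if healthy():                  # stand-in for real monitoring
        return True
    shutil.copy2(backup, config)   # rollback: put the old file back
    return False
```

Note that the damage window is still the time between the write and the failed check, which is the "Not Good Enough" part.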

The point of the pre-production tests, behavioral or higher level, is to prevent the need for a rollback and the resulting downtime. In some environments, pre-production testing is to some extent exchangeable with better monitoring and quicker rollbacks.

Now, I've worked all my life in the command-line world. (Seriously. My parents' computers switched from DOS to Windows in '94 or so, and I got the hand-me-down 286; by '95, I had gotten my first job and a 486DLC, and was running Linux full-time. Ever since then I've been working in internet-land, where even the GUIs are text-based under the hood.) So I can't think of anything I've ever managed that I couldn't test with a shell script. But on the off chance that you have something that is difficult to test via shell, in theory you should be able to plug your client into the behavioral test network.
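To make "behavioral probe" concrete, here is about the simplest one I can think of: does the service answer on its port at all? This is a sketch in Python rather than shell, and the function name `port_answers` and the default timeout are my own choices for illustration. A real probe would go on to speak the protocol and check the reply.

```python
import socket


def port_answers(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, timed out, or name lookup failed.
        return False
```

The same function works unchanged against production or the test network; only the host name differs.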

The problem with behavioral tests (and why I think virtualization might help) is the sheer scale. The best behavioral test is an exact duplicate of production, which is often quite impractical. At this place, we have several duplicate datacenters; in theory, I could take one out of production, roll out to it, test, and then bring it back online. But even then, I imagine the bandwidth spikes could get expensive pretty quickly. (I'm not privy to what we pay for bandwidth, and I imagine a lot of it is settlement-free peering, but it would not take very many links billed at the 95th percentile getting activated to eat a couple years of my salary.)
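For anyone who hasn't dealt with 95th percentile billing, the arithmetic looks roughly like this. This sketch assumes the common scheme of sorting the period's rate samples, discarding the top 5%, and billing on the highest sample left; actual contracts vary in the details. The function name and all the numbers are invented for illustration.

```python
def billable_rate(samples_mbps: list[float]) -> float:
    """Return the 95th-percentile sample: sort, drop the top 5%, bill the max left."""
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    # ceil(0.95 * n) - 1, done in integer arithmetic to avoid float rounding
    index = (95 * len(ordered) + 99) // 100 - 1
    return ordered[index]


# Hypothetical numbers: 100 samples at a 10 Mbit/s baseline.
quiet = [10.0] * 96 + [1000.0] * 4    # spikes in 4% of samples: ignored
busy = [10.0] * 90 + [1000.0] * 10    # spikes in 10% of samples: billed
print(billable_rate(quiet), billable_rate(busy))  # prints: 10.0 1000.0
```

So short bursts are free, but a test run that keeps a link busy for more than 5% of the billing period sets the rate for the whole period.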

So, it's pretty easy for me to fit 100 standard Linux servers on one 4G box using Xen; by standard, I mean running Apache/MySQL/BIND and/or a few other things. The "hard problem", of course, is the data. But let's pretend it's easy to get subsets of that as well. This, of course, will make the system even more useless for detecting performance issues, but it should (assuming we sufficiently emulate the network setup) detect config problems, network problems, and even non-performance bug interactions before they hit stuff that customers see. We should be able to use the same monitoring system to detect problems in the behavioral test cluster as we use to detect problems in production. Failing a reasonable monitoring system, you can hook clients up to the test cluster; you will have less data and it will be slower, but it should work the same as production other than that.
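The "same monitoring for both" idea can be sketched as one set of check definitions run against two host lists. Everything named here (`Check`, `run_checks`, the check names in the usage note) is hypothetical and is not any particular monitoring system's API.

```python
from typing import Callable, Dict, List, Tuple

Check = Callable[[str], bool]   # takes a hostname, returns True if healthy


def run_checks(checks: Dict[str, Check],
               hosts: List[str]) -> List[Tuple[str, str]]:
    """Run every named check against every host; return (host, check) failures."""
    failures = []
    for host in hosts:
        for name, check in checks.items():
            if not check(host):
                failures.append((host, name))
    return failures
```

You would call `run_checks(checks, test_hosts)` after rolling a change out to the virtual cluster, and `run_checks(checks, production_hosts)` as usual. Since the check definitions are shared, a failure in the test cluster means the same thing it would mean in production.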

A 100:1 ratio of servers isn't free, but it does give you a fairly massive test system within a reasonable budget, and I think I can get that ratio up higher if I use lighter-weight virtualization like vserver, OpenVZ or FreeBSD jails. (Marko Zec has some awesome patches for FreeBSD 4.x, written for the IMUNES network emulation project, that give each jail its own network stack. They also make IPC work correctly; back before Xen, I used that in production because normal FreeBSD jails did not play nicely with IPC and thus PostgreSQL.)
_______________________________________________
lssconf-discuss mailing list
lssconf-discuss@inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/lssconf-discuss
