Re: [lssconf-discuss] testing large systems using virtualization

Luke S. Crawford Fri, 01 Jun 2007 14:30:41 -0700

On Fri, 1 Jun 2007, Edmund Smith wrote:

There are problems with behavioral tests as well of course. Not only
are many of the things that you want to test inaccessible from shell script
probes as well, but shell tests inevitably occur after deployment, by which
time it may be too late to undo the damage you caused by doing the change in
the first place, and the question of what to do after deployment fails is a
hard one.

The 'what to do after deployment fails' question is pretty easily solved by keeping your configs in a revision control system and some roll-back scripts. (We have a fairly nice rollback system here; other places I've worked just said "make a backup copy before you do anything." But either way, rollback on config files is a solved problem, even at most smaller/backwards shops. Rollback on binaries is less solved, but it is much closer to being solved than the configuration testing problem.) Of course, you need good monitoring so you know right away when things are broken, and often the time required to deploy + detect problem + rollback deploy is Not Good Enough.
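To make the "backup copy, deploy, detect, roll back" loop concrete, here is a minimal Python sketch. It is not the rollback system mentioned above; the function name `deploy_with_rollback`, the `.bak` suffix, and the `healthy` callback (a stand-in for whatever monitoring you have) are all assumptions made for illustration.

```python
import shutil
from pathlib import Path
from typing import Callable


def deploy_with_rollback(live: Path, candidate: Path,
                         healthy: Callable[[], bool]) -> bool:
    """Install candidate over live; restore the old file if healthy() fails.

    Returns True if the new config stayed in place, False if it was rolled back.
    """
    # Hypothetical naming convention: keep the previous version next to the
    # live file with a ".bak" suffix.
    backup = live.with_name(live.name + ".bak")
    shutil.copy2(live, backup)      # "make a backup copy before you do anything"
    shutil.copy2(candidate, live)   # deploy the new config
    if healthy():                   # stand-in for real monitoring
        return True
    shutil.copy2(backup, live)      # roll back to the saved copy
    return False
```

In a real setup the backup would more likely be a revision in the version control system than a `.bak` file, and `healthy` would take long enough that the deploy + detect + rollback window matters, which is the point made above.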

The point of the pre-production tests, behavioral or higher level, is to prevent the need for a rollback and the resulting downtime. To some extent, in some environments, the pre-production test is somewhat exchangeable with better monitoring and quicker rollback tests.

Now, I've worked all my life in the command-line world. (Seriously. My parents' computers switched from DOS to Windows in '94 or so, and I got the hand-me-down 286; by '95, I had gotten my first job and a 486DLC, and was running Linux full-time... ever since then I've been working in internet-land, where even the GUIs are text-based under the hood.) But I can't think of anything I've ever managed that I couldn't test with a shell script. On the off chance that you have something that is difficult to test via shell, in theory, you should be able to plug your client into the behavioral test network.

The problem with behavioral tests (and why I think virtualization might help) is the sheer scale. The best behavioral test is an exact duplicate of production, which is often quite impractical. At this place, we have several duplicate datacenters; in theory, I could take one out of production, roll out to it, test, and then bring it back online; but even then, I imagine the bandwidth spikes could get expensive pretty quick. (I'm not privy to what we pay for bandwidth, and I imagine a lot of it is settlement-free peering, but it would not take very many 95th percentile links getting activated to eat a couple years of my salary.)
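For anyone who hasn't dealt with 95th percentile billing, a rough Python sketch of the usual calculation follows. The details are assumptions, not anything from our contracts: I'm assuming one traffic sample per 5 minutes, that the top 5% of samples are discarded, and that the highest remaining sample is the billed rate. Providers differ on the exact rounding, and `billable_rate` is just a name made up for this example.

```python
def billable_rate(samples_mbps):
    """Return the 95th percentile of a billing period's traffic samples.

    Sorts the samples, ignores the top 5%, and returns the highest
    remaining value (nearest-rank method, an assumed convention).
    """
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    # Integer ceiling of 0.95 * n, converted to a zero-based index.
    index = (len(ordered) * 95 + 99) // 100 - 1
    return ordered[index]


# 100 samples: bursts that stay inside the top 5% cost nothing extra...
inside_allowance = [10] * 95 + [1000] * 5
# ...but one more bursty sample moves the billed rate to the burst level.
over_allowance = [10] * 94 + [1000] * 6
print(billable_rate(inside_allowance), billable_rate(over_allowance))
```

Under those assumptions a 30-day month has 8640 samples, so the free burst allowance is 432 samples, or about 36 hours. A test rollout that drives traffic over an otherwise idle link for longer than that sets the billed rate for the whole month.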

So, it's pretty easy for me to fit 100 standard Linux servers on one 4G box using Xen; by standard, I mean running Apache/MySQL/BIND and/or a few other things. The "hard problem", of course, is the data. But let's pretend it's easy to get subsets of that as well. This, of course, will make the system even more useless for detecting performance issues, but it should (assuming we sufficiently emulate the network setup) detect config problems, network problems, and even non-performance bug interactions before they hit stuff that customers see. We should be able to use the same monitoring system to detect problems in the behavioral test cluster as we use to detect problems in production, and failing a reasonable monitoring system, you can hook clients up to the test cluster; you will have less data and it will be slower, but it should work the same as production other than that.
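As a sketch of what "the same monitoring for test and production" could look like at its simplest, here is a Python probe that only checks whether a TCP port accepts connections. It is a deliberately tiny stand-in for a real monitoring system; the function names and any host lists you pass in are hypothetical.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def probe(hosts, port):
    """Run the same check against every host; returns {host: reachable}."""
    return {host: port_open(host, port) for host in hosts}
```

The idea is that the check itself does not know which environment it is pointed at: the same `probe` call would be run once with the list of virtualized test guests and once with the production list, so a config change that breaks the service shows up in the test cluster first.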

A 100:1 ratio of servers isn't free, but it does give you a fairly massive test system within a reasonable budget, and I think I can get that ratio up higher if I use lighter-weight virtualization like vserver, OpenVZ or FreeBSD jails. (Marko Zec has some awesome patches for FreeBSD 4.x, written for the IMUNES network emulation project, that give each jail its own network stack... it also makes IPC work correctly; back before Xen, I used that in production because normal FreeBSD jails did not play nicely with IPC and thus PostgreSQL.)

_______________________________________________
lssconf-discuss mailing list
lssconf-discuss@inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/lssconf-discuss
