Edward Ned Harvey observed:
> In the case of backups, I noticed two things. Failure modes.
> (1) ... some backups that couldn't
> maintain 5 seconds of power, and still didn't alert me to bad batteries.
> ...In other words, the backups caused more power outages than
> they prevented for me.
Alas, that's what I've concluded over a long career in data center management. If you want high-quality battery backup, you have to go for high-end units (usually of the rack-mount or central hardwired variety, costing $3000 to $300,000) and maintain the batteries far more often than most of us ever bother to. Just a few weeks ago, I had a similar episode: an APC-branded unit abruptly died with no advance warning, just a fault light, a high-pitched alarm and loss of power.

These days I'm leaning more toward the Google/Facebook route: go cheap on the hardware with consumer-grade stuff, and design in two of everything (or 3 or 4 or more), "Hal 9000" style, with diverse cable routing and geographic separation, so you can go into any data center, start yanking cables, and have utterly no impact on operations. Expect lots of failures behind the scenes each year, but it costs a heckuva lot less and provides equivalent overall reliability. My last dev/QA lab design included enough rack-mount UPS to operate only about 30% of the servers, letting the others die during power outages and forcing users/administrators to decide which machines are actually mission-critical.

Disk drives (even consumer-grade ones) are still sold with standard 5-year warranties. So if you make the machines double-redundant (i.e. RAID1/RAID10 on each machine, plus disk clustering across 2 or more machines in separate locations, all of which can be done with open-source software and obsolete hardware if you have a near-zero budget, or with high-end new gear if you have a big budget), then you just keep shipping cartons handy to RMA failed drives as they crap out. Swapping out the failed units takes very little labor, and if you standardize your drive capacities and keep some spares, it's even easier. Works whether you have 10 machines or 10,000.

My larger point is that hardware redundancy and battery backup serve two very different needs.
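To put rough numbers on the "two-of-everything" argument above, here is a minimal back-of-the-envelope sketch. It assumes each copy of a service fails independently and that any single surviving copy is enough to keep running. The availability figures are hypothetical illustrations, not measurements, and the function names are made up for this example. A shared power outage is exactly the kind of correlated failure that breaks the independence assumption, which is why the diverse cable routing and geographic separation matter.

```python
def combined_availability(single: float, replicas: int) -> float:
    """Availability of `replicas` copies when any one copy suffices.

    ASSUMPTION: copies fail independently of one another. Machines that
    share a power feed, rack, or building violate this assumption.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be a probability between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The service is down only if every copy is down at the same time.
    return 1.0 - (1.0 - single) ** replicas


def replicas_needed(single: float, target: float) -> int:
    """Smallest number of independent copies that reaches `target`."""
    if not 0.0 < single <= 1.0:
        raise ValueError("single must be greater than 0 and at most 1")
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be at least 0 and less than 1")
    n = 1
    while combined_availability(single, n) < target:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical figures: a cheap box that is up 95% of the time.
    for copies in (1, 2, 3, 4):
        print(copies, combined_availability(0.95, copies))
    print("copies needed for 99.99%:", replicas_needed(0.95, 0.9999))
```

The point of the arithmetic is only that unavailability shrinks geometrically with each independent copy, so several cheap machines can match one expensive one. Whether that comes out cheaper depends on your actual hardware and labor costs.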
If you need to maintain all your machines through power outages, then you need standby generation and high-end UPS. If you have a home or desktop computer, then you can get by with a low-end UPS, but you should probably at the very least install software RAID1 on it. If you're looking for the most cost-effective way to keep a roomful of computers in good repair, the cost of UPS (whether high-end or low-end) outweighs the cost of spares, provided you set up an efficient fault-tolerant design and keep track of equipment warranties.

-rich
_______________________________________________
bblisa mailing list
[email protected]
http://www.bblisa.org/mailman/listinfo/bblisa
