On Jul 13, 2006, at 3:03 PM, mos wrote:

> At 03:45 PM 7/12/2006, Jon Frisby wrote:
>> This REALLY should be a purely academic concern.  Either you have a
>> system that can tolerate the failure of a drive, or you do not.  The
>> failure rate itself is pretty much irrelevant: you can train
>> incredibly non-technical (inexpensive) people to respond to a pager
>> and hot-swap a bad drive.
>> If you are in the position where the typical failure rate of a class
>> of drive is of concern to you, then either: A) you have a different
>> problem causing all your drives to fail ultra-fast (heat, electrical
>> noise, etc.), or B) you haven't adequately designed your storage
>> subsystem.


> It all depends how valuable your uptime is. If you can double or triple the time between hard disk failures, most people would pay extra for that, so they buy SCSI drives. You wouldn't take your family car and race it in the Indy 500, would you? After a few laps at 150 mph (if you can get it going that fast), it will seize up, so you go into the pit stop and what? Get another family car and drive that? And keep doing that until you finish the race? Downtime is extremely expensive and embarrassing. Just talk to the guys at FastMail, who have had 2 outages even with hardware RAID in place. Recovery doesn't always work as smoothly as you think it should.

Again: Either your disk subsystem can TOLERATE (read: CONTINUE OPERATING IN THE FACE OF) a drive failure, or it cannot. If you can't hot-swap a dead drive, your system can't tolerate the failure of a drive.

Your analogy is flawed. The fact that companies like Google run with incredibly good uptimes on cheap, commodity hardware (including IDE drives!) demonstrates as much.

SCSI drives WILL NOT improve your uptime by a factor of 2x or 3x. Using a hot-swappable disk subsystem with hot spares WILL. Designing your systems without needless single points of failure WILL.
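To put rough numbers on that, consider the standard back-of-the-envelope estimate for a two-disk mirror: mean time to data loss is roughly MTTDL = MTBF^2 / (2 * MTTR). The sketch below is illustrative arithmetic only; the MTBF and repair-time figures are made-up assumptions, not measurements of any real drive.

    # Illustrative arithmetic: compare buying "better" drives against
    # shrinking repair time with hot spares.  All numbers are assumed
    # for the sake of the comparison, not taken from any spec sheet.

    def mttdl_mirror(mtbf_hours, mttr_hours):
        """Approximate mean time to data loss for a 2-disk mirror."""
        return mtbf_hours ** 2 / (2 * mttr_hours)

    ide  = mttdl_mirror(mtbf_hours=300_000, mttr_hours=24)  # spare on a shelf
    scsi = mttdl_mirror(mtbf_hours=600_000, mttr_hours=24)  # 2x MTBF, same MTTR
    hot  = mttdl_mirror(mtbf_hours=300_000, mttr_hours=1)   # hot spare, fast rebuild

    print(f"IDE,  24h swap:  {ide:.2e} hours to data loss")
    print(f"SCSI, 24h swap:  {scsi:.2e} hours to data loss")
    print(f"IDE,  hot spare: {hot:.2e} hours to data loss")

Under those assumptions, doubling the drive's MTBF buys you a 4x improvement, while cutting the repair window from a day to an hour buys you 24x. That's the whole point: the repair path matters more than the brand of drive.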


> Software RAID? Are you serious? No way!

You make a compelling case for your position, but I'm afraid I still disagree with you. *cough*

If you're using RAID10, or another form of RAID that doesn't involve computing parity (and the "write hole" that accompanies it), there's little need for hardware support. It won't make things dramatically faster unless you spend a ton of money on cache -- in which case you should seriously consider a SAN for the myriad other benefits it provides. The "reliability" added by hardware RAID with battery-backed cache is pretty negligible if you're doing your I/O right (i.e. you've made sure your drives aren't lying when they say a write has completed, AND you're using fsync -- which MySQL does).
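For the curious, the durable-write pattern boils down to this (a minimal sketch, not MySQL's actual code path):

    # Minimal sketch of a durable write: flush the user-space buffer,
    # then ask the OS to push its page cache out to the platter.
    # If the drive's own write cache lies about completion, even this
    # isn't enough -- hence "make sure your drives aren't lying."
    import os

    with open("datafile", "wb") as f:
        f.write(b"committed record\n")
        f.flush()              # user-space buffer -> kernel
        os.fsync(f.fileno())   # kernel page cache -> disk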

-JF


