Carl Wilhelm Soderstrom wrote:
> On 07/23 12:50 , Les Mikesell wrote:
>> I've never seen anything go wrong with Linux RAID1 over 6 or so years 
>> and dozens of machines and I've abused it with things like hot swapping 
>> SCA drives and rebuilding the replacement, cloning machines by splitting 
>> the pair and letting both rebuild a new mirror, and using a 3-member set 
>> with one normally missing for my backuppc archive.  What kind of 
>> problems have you seen?
> 
> - When one drive dies, the whole machine kernel panicking and locking up.
>   The data was still good on the second drive; but it required a site visit
>   to get things working again; and then another site visit to replace the
>   drive.

That usually comes from putting both IDE drives on the same controller:
a failing drive can hang the shared channel and take the healthy one
down with it.

> - One disk in an array sort-of kind-of dead, but still throwing bus errors
>   and slowing things down. Then causing the whole machine to panic when the
>   drive was poked with hdparm (new technician said "oh, it's running slow,
>   I'll just hdparm -X69 it" and did so before I could stop him).

Same cause here. I like to keep at least the boot and root partitions
on SCSI.

> - First disk in an array dying, and the second one not being bootable.
>   (You have to write boot records to the disks individually, and either
>   someone typoed the command in this case, or GRUB failed to write what it
>   should have, both of which are conceivable).

IDE drives can fail in ways that keep the machine from booting at all,
even from a drive on the other controller.  But I'd expect people
involved with doing backups and recovery to know how to open a case,
swap cables and jumpers, etc. The point about GRUB installers not
understanding RAID is valid, but the recovery is the same as for the
other failure modes that you may need to fix anyway.
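
A quick sanity check after writing the boot records can catch the case
where one mirror member never received one. The following is only a
rough sketch, not a test that the disk will really boot: it assumes a
classic MBR layout (boot code in bytes 0-439, signature 0x55AA in bytes
510-511), and the function names and the device names in the usage note
are made up for illustration.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"   # last two bytes of a valid MBR sector


def read_first_sector(device_path):
    """Read the first 512-byte sector of a disk (needs root on real devices)."""
    with open(device_path, "rb") as dev:
        return dev.read(SECTOR_SIZE)


def mbr_looks_bootable(sector):
    """Heuristic only: sector ends in 0x55AA and the boot code area is not empty."""
    if len(sector) < SECTOR_SIZE:
        return False
    has_signature = sector[510:512] == BOOT_SIGNATURE
    has_boot_code = any(sector[:440])   # bytes 0..439 hold the boot code
    return has_signature and has_boot_code


def check_mirror_members(sectors_by_disk):
    """Return the names of disks whose first sector fails the heuristic."""
    return [name for name, sector in sorted(sectors_by_disk.items())
            if not mbr_looks_bootable(sector)]
```

On a real machine you would feed it something like
check_mirror_members({d: read_first_sector(d) for d in ("/dev/hda",
"/dev/hdc")}), with the device names adjusted to your own pair. An empty
list only means both disks carry some boot code; the real proof is still
booting from each disk on its own.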

> - When building/rebuilding arrays it's *very* easy to transpose partition
>   numbers or device names and wipe out all your data. (Remember that when
>   this stuff goes wrong it's never at a good time; so you're stressed and
>   sometimes tired). (yes, this is a user issue not a technology issue).

mdadm is pretty good about knowing which partitions are already in use.
When you are adding the replacement mirror, I don't think it will let
you get it wrong.

> In our experience, 3ware controllers make the drive replacement process so
> much easier 90% of the time, that they more than pay for themselves in
> reduced labor costs when things go wrong. Software RAID is false economy in
> a corporate environment where labor costs 50-100s of $/hr. 

Just be sure to include the cost of a spare offsite controller, so that
you can still read the disks if they are all you have left after a
disaster, plus the time to learn the details of the vendor-specific
utilities.

-- 
   Les Mikesell
    [EMAIL PROTECTED]

_______________________________________________
BackupPC-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/backuppc-users
http://backuppc.sourceforge.net/