I have run both SW and HW RAID with OSTs and MDTs.

As part of your choice, look into what happens when you have to replace a 
failed disk in a SW configuration.  My negatives for SW RAID are all about 
management at this point.

When you pull a bad disk out of a Linux box (/dev/sde, for example) and insert 
a new disk, the new disk will not always come back as sde; it will come back 
as the first available device letter.  When you repartition it and add it 
back into your array, you will have to remember to tweak the partitions on 
the new drive letter.  When you reboot, your device letters will sort 
themselves back out and that new disk will again become sde, if that is its 
placement on the controller.  If your machine has been up for a long time 
with a few failed disks, you may have multiple holes in your dev lettering.  
Not a big deal for one or two machines, but when you have hundreds of them, 
the work will probably be done by an ops team, not by you.  
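A replacement cycle on a degraded md array looks roughly like this. All 
device and array names below are illustrative (your letters will differ, 
which is exactly the problem described above):

```shell
# After hot-swapping the failed disk, find the letter the kernel actually
# assigned to the replacement -- it is often NOT the old one (e.g. /dev/sdh
# instead of /dev/sde):
dmesg | tail
ls -ltr /dev/sd*

# Copy the partition layout from a surviving array member (here /dev/sda)
# onto the new disk:
sfdisk -d /dev/sda | sfdisk /dev/sdh

# Re-add the new partition to the degraded array and watch the rebuild:
mdadm /dev/md0 --add /dev/sdh1
cat /proc/mdstat
```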

When you reboot a machine that has a failed disk in its array (degraded), the 
array will not start by default in a degraded state.  If you have LVM on top 
of your RAID arrays, it will also not start.  You will need to log into the 
machine, manually force-start the array in its degraded state, and then 
manually start the LVM volumes on top of the SW RAID array.
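The manual recovery after such a reboot is along these lines (array, member, 
and volume group names are hypothetical):

```shell
# Assemble the array even though a member is missing; --run starts it
# in degraded mode instead of waiting for all members to appear:
mdadm --assemble --run /dev/md0 /dev/sda1 /dev/sdb1

# An array that is already assembled but sitting inactive can be
# started with:
mdadm --run /dev/md0

# Then activate the LVM volume group sitting on top of the array:
vgchange -ay myvg
```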

By default, GRUB does not install itself on multiple disks.  Assuming you 
also mirror your boot disks, you will need to manually put the boot loader 
on the front of each bootable disk.
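For a two-disk boot mirror, that means running the installer once per disk 
(disk names are examples):

```shell
# Install the boot loader into the MBR of every member of the boot
# mirror, not just the disk the OS installer happened to pick:
grub-install /dev/sda
grub-install /dev/sdb
```

Without this, losing the one disk that carries the boot loader leaves you 
with an intact mirror that will not boot.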

Some controllers have a memory of which disks are inserted into which slots. 
They will not present disks beyond a certain number to the BIOS for booting.  
If you replace the boot disks too many times, they will no longer present 
a bootable disk to the BIOS.  The only way to correct this on the controller 
I have worked with is to pull all but one non-bootable disk, then boot into 
the controller firmware and clear the device memory, then reconnect all of 
the disks.  (We only discovered this issue in the lab, and haven't seen it 
yet in production.)

In my experience, maintenance for Linux SW RAID is significantly more 
difficult than for HW RAID.



-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Brian O'Connor
Sent: Wednesday, March 23, 2011 8:55 PM
To: [email protected]
Subject: [Lustre-discuss] software raid


This has probably been asked and answered.

Is software raid(md) still considered bad practice?

I would like to use ssd drives for an mdt, but using fast ssd drives
behind a raid controller seems to defeat the purpose.

There was some thought that the decision not to support
software raid was mostly about Sun/Oracle trying to sell hardware
raid.

thoughts?

-- 
Brian O'Connor
-----------------------------------------------------------------------
SGI Consulting
Email: [email protected], Mobile +61 417 746 452
Phone: +61 3 9963 1900, Fax:  +61 3 9963 1902
357 Camberwell Road, Camberwell, Victoria, 3124
AUSTRALIA
http://www.sgi.com/support/services
-----------------------------------------------------------------------

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss