Re: [gentoo-user] A drive in my RAID6 has failed
On Fri, Sep 6, 2013 at 12:46 AM, Paul Hartman paul.hartman+gen...@gmail.com wrote:
> So, I simply inserted and partitioned the new drive, added it to the
> array and away we go!
>
> md0 : active raid6 sde1[6] sdd1[5] sdg1[4] sdh1[2] sdf1[1] sdi1[0]
>       11720009728 blocks super 1.2 level 6, 512k chunk, algorithm 2 [6/5] [UUU_UU]
>       [>....................]  recovery =  2.3% (69513216/2930002432) finish=428.7min speed=111206K/sec
>
> When I wake up in the morning, I hope there won't be any errors.

Success! It took 10 hours to rebuild the drive (speeds near the start of the disk are significantly faster than those near the end of the disk, so early estimates quoted by /proc/mdstat above were overly optimistic):

[3720270.120695] md: bind<sde1>
[3720270.162933] RAID conf printout:
[3720270.162942]  --- level:6 rd:6 wd:5
[3720270.162949]  disk 0, o:1, dev:sdi1
[3720270.162954]  disk 1, o:1, dev:sdf1
[3720270.162958]  disk 2, o:1, dev:sdh1
[3720270.162962]  disk 3, o:1, dev:sde1
[3720270.162965]  disk 4, o:1, dev:sdg1
[3720270.162969]  disk 5, o:1, dev:sdd1
[3720270.163060] md: recovery of RAID array md0
[3720270.163067] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
[3720270.163071] md: using maximum available idle IO bandwidth (but not more than 20 KB/sec) for recovery.
[3720270.163085] md: using 128k window, over a total of 2930002432k.
[3756293.459324] md: md0: recovery done.
[3756294.797961] RAID conf printout:
[3756294.797969]  --- level:6 rd:6 wd:6
[3756294.797974]  disk 0, o:1, dev:sdi1
[3756294.797979]  disk 1, o:1, dev:sdf1
[3756294.797982]  disk 2, o:1, dev:sdh1
[3756294.797986]  disk 3, o:1, dev:sde1
[3756294.797989]  disk 4, o:1, dev:sdg1
[3756294.797992]  disk 5, o:1, dev:sdd1
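For anyone wanting to check the 10-hour figure, the elapsed time can be worked out from the two kernel timestamps in the log above ("recovery of RAID array md0" and "recovery done"). This is only a minimal sketch; the function name rebuild_stats is made up for illustration and is not part of mdadm or the kernel tools.

```shell
#!/bin/sh
# Sketch: derive rebuild duration and average speed from dmesg timestamps.
# rebuild_stats is a hypothetical helper name, not an existing tool.
# Arguments: start_seconds end_seconds total_kib
rebuild_stats() {
    awk -v s="$1" -v e="$2" -v k="$3" 'BEGIN {
        d = e - s    # elapsed seconds between the two log lines
        printf "%.1f hours, %d K/sec average\n", d / 3600, k / d
    }'
}

# Values copied from the log above: recovery start, recovery done,
# and the per-device size reported by md ("over a total of ...k").
rebuild_stats 3720270.163060 3756293.459324 2930002432
```

With the values from the log this comes to roughly 10 hours at an average a little above 81000 K/sec, which fits the remark that the 111206K/sec seen near the start of the disk did not hold for the whole rebuild.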
Re: [gentoo-user] A drive in my RAID6 has failed
On Thu, Sep 5, 2013 at 11:52 AM, Michael Orlitzky mich...@orlitzky.com wrote:
> This is the process I always follow:
>
>   http://www.howtoforge.com/replacing_hard_disks_in_a_raid1_array
>
> The sfdisk trick will save you a bit of hassle.

Thanks, it looks like I was on the right path! Crossing my fingers...
Re: [gentoo-user] A drive in my RAID6 has failed
On 09/05/2013 12:49 PM, Paul Hartman wrote:
> Hi,
>
> I woke up this morning to see the dreaded email from mdadm telling me
> one of my drives failed overnight, while I was happily dreaming about
> cute puppies and kittens installing a rainbow-colored roof on my house.
>
> The array is a RAID6 (two parity drives) and this is the current state:
>
> md0 : active raid6 sdd1[5] sdg1[4] sde1[3](F) sdh1[2] sdf1[1] sdi1[0]
>       11720009728 blocks super 1.2 level 6, 512k chunk, algorithm 2 [6/5] [UUU_UU]
>
> I've been using RAID in Linux for years, but this is actually the first
> time I've had a disk fail in one. If I remember correctly, the process
> should be as simple as:
>
> #remove the failed disk from the array:
> mdadm /dev/md0 -r /dev/sde1
> #pull the drive, replace with new one, partition it, then add it to the array:
> mdadm /dev/md0 -a /dev/sde1
>
> and sit back and eat popcorn while I enjoy the blinkenlights for the
> next several hours/days? :)
>
> Any advice/suggestions for managing this process any differently?

This is the process I always follow:

  http://www.howtoforge.com/replacing_hard_disks_in_a_raid1_array

The sfdisk trick will save you a bit of hassle.
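For readers who don't follow the link: the sfdisk trick amounts to dumping the partition table of a healthy array member and feeding it to the new disk, so the replacement is partitioned identically. The sketch below is an assumption-laden illustration, not the article's exact text: the device names (/dev/sdd as the healthy member, /dev/sde as the new disk) are taken from this thread, the function name copy_partition_table is made up, and it only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: clone the partition layout from a healthy array member to the
# replacement disk. Device names are assumptions taken from this thread.
# DRY_RUN=1 (the default here) only prints the command instead of running it.
copy_partition_table() {
    src="$1"    # healthy member disk, e.g. /dev/sdd
    dst="$2"    # new, empty disk, e.g. /dev/sde
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "sfdisk -d $src | sfdisk $dst"
    else
        # sfdisk -d dumps the table in a format sfdisk can read back on stdin
        sfdisk -d "$src" | sfdisk "$dst"
    fi
}

copy_partition_table /dev/sdd /dev/sde
```

Getting source and destination the wrong way round overwrites the good disk's table, which is why the sketch defaults to printing. Older sfdisk releases did not understand GPT labels, so on large GPT-partitioned disks a GPT-aware tool may be needed instead.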
[gentoo-user] A drive in my RAID6 has failed
Hi,

I woke up this morning to see the dreaded email from mdadm telling me one of my drives failed overnight, while I was happily dreaming about cute puppies and kittens installing a rainbow-colored roof on my house.

The array is a RAID6 (two parity drives) and this is the current state:

md0 : active raid6 sdd1[5] sdg1[4] sde1[3](F) sdh1[2] sdf1[1] sdi1[0]
      11720009728 blocks super 1.2 level 6, 512k chunk, algorithm 2 [6/5] [UUU_UU]

I've been using RAID in Linux for years, but this is actually the first time I've had a disk fail in one. If I remember correctly, the process should be as simple as:

#remove the failed disk from the array:
mdadm /dev/md0 -r /dev/sde1
#pull the drive, replace with new one, partition it, then add it to the array:
mdadm /dev/md0 -a /dev/sde1

and sit back and eat popcorn while I enjoy the blinkenlights for the next several hours/days? :)

Any advice/suggestions for managing this process any differently? For now I have unmounted the filesystem that sits atop it, to prevent any more writes from occurring, just in case...

Thanks,
Paul
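The sequence described in this message can be sketched as two small helpers, one for before the physical swap and one for after. This is only an illustration under stated assumptions: the helper names run, remove_member and add_member are made up, the array and partition names are the ones from this thread, and the long options --fail, --remove and --add are the spelled-out forms of mdadm's -f, -r and -a. Nothing is executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of the replacement sequence, split around the physical drive swap.
# Prints the commands unless DRY_RUN=0 is set. Array and device names are
# the ones from this thread; the helper names are hypothetical.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: before pulling the drive.
remove_member() {
    array="$1"; part="$2"
    run mdadm "$array" --fail "$part"     # usually redundant if already marked (F)
    run mdadm "$array" --remove "$part"
}

# Step 2: after the new drive is installed and partitioned.
add_member() {
    array="$1"; part="$2"
    run mdadm "$array" --add "$part"      # md starts the recovery on its own
}

remove_member /dev/md0 /dev/sde1
add_member /dev/md0 /dev/sde1
```

Keeping the two steps separate leaves room for the manual part in between (swapping and partitioning the disk); progress after the add can then be followed in /proc/mdstat.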
Re: [gentoo-user] A drive in my RAID6 has failed
On Thu, Sep 5, 2013 at 12:11 PM, Paul Hartman paul.hartman+gen...@gmail.com wrote:
> On Thu, Sep 5, 2013 at 11:52 AM, Michael Orlitzky mich...@orlitzky.com wrote:
>> This is the process I always follow:
>>
>>   http://www.howtoforge.com/replacing_hard_disks_in_a_raid1_array
>>
>> The sfdisk trick will save you a bit of hassle.
>
> Thanks, it looks like I was on the right path! Crossing my fingers...

So, I probably should not have attempted to do this immediately after eating dinner. My brain was not operating at full speed, and I went ahead and pulled the drive before removing it from the array. Oops! As soon as I pulled the latch to release the drive, I had that "oh no!" moment.

Luckily, as it turns out, md (or mdadm? or udev?) was nice enough to automatically remove it for me when the drive ceased to exist.

So, I simply inserted and partitioned the new drive, added it to the array and away we go!

md0 : active raid6 sde1[6] sdd1[5] sdg1[4] sdh1[2] sdf1[1] sdi1[0]
      11720009728 blocks super 1.2 level 6, 512k chunk, algorithm 2 [6/5] [UUU_UU]
      [>....................]  recovery =  2.3% (69513216/2930002432) finish=428.7min speed=111206K/sec

When I wake up in the morning, I hope there won't be any errors.

BTW -- a couple of tips I found which speed up RAID building/recovery tremendously (season to taste):

echo 32768 > /sys/block/md0/md/stripe_cache_size
echo 20 > /proc/sys/dev/raid/speed_limit_max
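On the "season to taste" part: a larger stripe cache is paid for in RAM. The kernel's md documentation gives the cost as page size times number of disks times stripe_cache_size, so it is worth doing the arithmetic before picking a value. A minimal sketch, assuming 4096-byte pages and the six-disk array from this thread; stripe_cache_bytes is a made-up helper name.

```shell
#!/bin/sh
# Sketch: estimate RAM used by the md stripe cache.
# Formula from the kernel md documentation:
#   memory_consumed = system_page_size * nr_disks * stripe_cache_size
# stripe_cache_bytes is a hypothetical helper name.
stripe_cache_bytes() {
    entries="$1"            # value written to stripe_cache_size
    disks="$2"              # number of member disks in the array
    page="${3:-4096}"       # assumed page size in bytes
    echo $(( entries * disks * page ))
}

# The value from the tip above, on the 6-disk array in this thread.
bytes=$(stripe_cache_bytes 32768 6)
echo "$(( bytes / 1024 / 1024 )) MiB"
```

For 32768 entries on six disks this works out to 768 MiB, so a smaller value may be the better choice on a machine that is short on memory.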