RAID5's single redundancy (N-1 usable out of N drives) is risky now that capacity-to-write-speed ratios are as high as they have been for the past 5-10 years: rebuilds take long enough that a second failure before recovery completes, which is exactly your problem, is the main issue. See "RAID 6" in http://en.wikipedia.org/wiki/Standard_RAID_levels

The usual preventive solutions are double redundancy (RAID6: with 4 drives, any 2 can fail) or a stripe over mirrored pairs (two arrays of 2 mirrored discs), also known as RAID1+0, covered at the same link.
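If you're rebuilding from scratch later, either layout is a one-liner with mdadm. A sketch only: device names are assumed, and --create destroys whatever is on the partitions:

mdadm --create /dev/md0 --level=6 --raid-devices=4 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3   # double redundancy
mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3  # striped mirrors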

None of which will help you now.

Right now, you need luck and patience, or the constitution to say goodbye to your data. Keep trying to bring your RAID back up with redundancy (I'd give up after 3 failed rebuilds). Then re-run smartctl self-tests and replace anything that looks bad. Then move to a double-redundancy setup. You may need more drives.
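For the smartctl round, something like this on each drive (sdX is a placeholder):

smartctl -t long /dev/sdX      # kick off an extended self-test, takes hours
smartctl -l selftest /dev/sdX  # read the results once it's done
smartctl -H /dev/sdX           # overall SMART health verdict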

There really isn't much other choice. There are data recovery services in town used by the police and the like. Last I read, it was $300 per HD. Not sure if they can rebuild Linux software RAID5 arrays, but maybe.

I've moved away from Linux RAID because I don't have the time for this stuff anymore. I got a Drobo 5N and was able to move my discs in one at a time after copying the data in. Drobo Apps exist for ssh, rsync and nfs. It's not the fastest thing in the world, but it's fast enough to play 1080p h.264 video with 5.1 AAC, which is really the main workload for this device. It's a toaster: I set it and forget it.

On 05/12/14 12:39, Emery Guevremont wrote:
Hello,

I haven't written to this list in a while, but the situation I've put myself in requires me to ask this mailing list for some help.

The short story: I have a 4-disk RAID5 array where one of the drives died. The array became degraded, so I shut down my home server until I received a replacement drive. When it arrived, I used it to replace the broken drive and booted into single-user mode to resync the RAID5 array. At about the 12% mark, another drive returned a read error and mdadm marked it as failed, essentially leaving me with a spare, a failed drive, and only 2 drives out of 4 in the array. I know my chances of recovery are bleak, but I still believe there's hope. Forget about backups; only half of my important files are backed up. I was in the middle of building a bigger NAS to back up my home server.

The long story, and what I've done:

/dev/md0 is assembled from 4 partitions:
/dev/sda3
/dev/sdb3
/dev/sdc3
/dev/sdd3

2 weeks ago, mdadm marked /dev/sda3 as failed; cat /proc/mdstat showed _UUU. smartctl also confirmed that the drive was dying, so I shut down the server until I received a replacement drive.

This week, I replaced the dying drive with the new one, booted into single-user mode, and ran:

mdadm --manage /dev/md0 --add /dev/sda3

A cat of /proc/mdstat confirmed the resyncing process; the last time I checked, it was up to 11%. A few minutes later, I noticed that the syncing had stopped. A read error message for /dev/sdd3 (I have a pic of it if anyone is interested) appeared on the console, so it appears /dev/sdd3 might be going bad too. A cat of /proc/mdstat showed _U_U. At that point I panicked and decided to leave everything as-is and go to bed.

The next day, I shut down the server and rebooted with a live USB distro (Ubuntu Rescue Remix). After booting into the live distro, cat /proc/mdstat showed that my /dev/md0 was detected, but every drive had an (S) next to it, i.e. /dev/sda3 (S)..., meaning they had all been assembled as spares. Naturally, I don't like the looks of this.

I ran ddrescue to copy /dev/sdd onto my new replacement disk (/dev/sda). Everything worked; ddrescue hit only one read error, and it was eventually able to read the bad sector on a retry.

Tonight I plan to repeat this procedure with ddrescue to clone /dev/sdb and /dev/sdc.
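Roughly the same invocation as last time, with a map file so an interrupted run can resume (the destination /dev/sdX is a placeholder for whichever blank drive I clone onto):

ddrescue -f -n /dev/sdb /dev/sdX sdb.map   # first pass, skip slow/bad areas
ddrescue -f -r3 /dev/sdb /dev/sdX sdb.map  # second pass, retry bad sectors up to 3 times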

Now is where I need your help. How should I go about trying to rebuild my array? I will be using the cloned drives to do this. My goal is simply to assemble the array in a degraded state with sdb3, sdc3 and sdd3, mount /dev/md0, and back up as many files as I can. From various Google searches, I was going to first try:

mdadm --assemble --scan   # which I expect not to work
mdadm --assemble --force  # still not quite sure about the syntax, or whether drive ordering matters
I've also seen people using the --assume-clean option.
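Put together, I'm guessing the sequence would look something like this (device names assumed unchanged from above; mounted read-only so nothing writes to the array). From what I've read, --assume-clean actually belongs to --create rather than --assemble, and re-creating over an existing array is a last resort:

mdadm --stop /dev/md0   # clear the all-spares assembly first
mdadm --assemble --force /dev/md0 /dev/sdb3 /dev/sdc3 /dev/sdd3
mount -o ro /dev/md0 /mnt   # read-only, then copy files off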

But what commands should I run? If I mess up, starting over by re-cloning my drives is a time-consuming step.
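One idea I've seen for exactly this problem is to put a copy-on-write overlay over each clone with device-mapper, so a failed experiment only dirties the overlay file and the clones stay pristine. Untested on my end; the names and the 4GB overlay size are placeholders:

dd if=/dev/zero of=sdb3.ovl bs=1M count=0 seek=4096   # 4GB sparse overlay file
losetup /dev/loop1 sdb3.ovl
dmsetup create sdb3cow --table "0 $(blockdev --getsz /dev/sdb3) snapshot /dev/sdb3 /dev/loop1 P 8"

Then assemble the array from /dev/mapper/sdb3cow and friends instead of the real partitions.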


--
Jean-Luc Cooke
+1-613-263-2983

_______________________________________________
mlug mailing list
[email protected]
https://listes.koumbit.net/cgi-bin/mailman/listinfo/mlug-listserv.mlug.ca
