Hello,

I haven't written to this list in a while, but the situation I've put
myself in requires me to ask this mailing list for some help.

The short story: I have a 4-disk RAID 5 array in which one of the drives
died. The array became degraded, so I shut down my home server until a
replacement drive arrived. When it did, I used it to replace the broken
drive and booted into single-user mode to resync the RAID 5 array. At
about the 12% mark, another drive returned a read error and mdadm marked
it as failed, essentially leaving me with one spare, one failed drive,
and only two of the four drives active in the array. I know my chances
of recovering from this are bleak, but I still believe there's hope.
Forget about backups; only half of my important files are backed up. I
was in the middle of building a bigger NAS to back up my home server.

The long story, and what I've done so far:

/dev/md0 is assembled with 4 drives
/dev/sda3
/dev/sdb3
/dev/sdc3
/dev/sdd3

Two weeks ago, mdadm marked /dev/sda3 as failed. cat /proc/mdstat showed
_UUU, and smartctl confirmed that the drive was dying. So I shut down
the server until I received a replacement drive.
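For the record, the SMART check I ran was along these lines (the device
name is just the one from my setup):

```shell
# Print overall SMART health plus the attribute table; on a failing
# drive, the ones to watch are Reallocated_Sector_Ct,
# Current_Pending_Sector and Offline_Uncorrectable.
smartctl -H -A /dev/sda

# A long self-test gives a more thorough read of the disk surface...
smartctl -t long /dev/sda
# ...and the result can be checked later with:
smartctl -l selftest /dev/sda
```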

This week, I replaced the dying drive with the new one, booted into
single-user mode, and ran:

mdadm --manage /dev/md0 --add /dev/sda3

A cat of /proc/mdstat confirmed the resync was in progress; the last
time I checked, it was at 11%. A few minutes later, I noticed that the
syncing had stopped. A read error message for /dev/sdd3 (I have a pic of
it if anyone is interested) appeared on the console, so it appears that
/dev/sdd3 might be going bad as well. cat /proc/mdstat showed _U_U. At
that point I panicked and decided to leave everything as is and go to
bed.

The next day, I shut down the server and rebooted with a live USB distro
(Ubuntu Rescue Remix). After booting into the live distro, cat
/proc/mdstat showed that /dev/md0 was detected, but every drive had an
(S) next to it, i.e. /dev/sda3 (S)... Naturally, I don't like the look
of this.

I ran ddrescue to copy /dev/sdd onto my new replacement disk (/dev/sda).
Everything worked: ddrescue hit only one read error, and it was
eventually able to read the bad sector on a retry.

Tonight I plan to repeat this procedure with ddrescue and to clone /dev/sdb
and /dev/sdc.
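For reference, the invocation I used (and plan to reuse for the other
drives) was roughly this. The mapfile path is just what I picked; the
mapfile is what lets ddrescue resume and retry bad sectors:

```shell
# First pass: copy everything readable, skipping bad areas quickly
# (-n = no scraping). The mapfile records which sectors were read,
# so runs are resumable. -f is required to overwrite a block device.
ddrescue -f -n /dev/sdd /dev/sda sdd.mapfile

# Second pass: go back and retry only the bad sectors, three times.
ddrescue -f -r3 /dev/sdd /dev/sda sdd.mapfile
```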

Now is where I need your help. How should I go about trying to rebuild
my array? I will be using the cloned drives to do this. My goal is
simply to assemble the RAID array in a degraded state with sdb3, sdc3
and sdd3, mount /dev/md0, and back up as many files as I can. From
various Google searches, I was going to first try:

mdadm --assemble --scan   # which I expect not to work
mdadm --assemble --force  # still not quite sure about the syntax and
                          # whether drive ordering is important

I've also seen people using the --assume-clean option.

But what commands should I run? If I mess up, starting over by
re-cloning my drives is a time-consuming step.
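In case it helps anyone correct me, here is the rough sequence I was
planning to try, based on those searches. The device names are from my
setup, and I'd stop the moment anything looks wrong:

```shell
# Stop the half-assembled array the live distro created
# (the one showing every member as a spare).
mdadm --stop /dev/md0

# Record the current superblock state of each member before
# touching anything, so there's a reference to fall back on.
mdadm --examine /dev/sdb3 /dev/sdc3 /dev/sdd3

# Try a forced assemble from the three cloned members only,
# letting mdadm work out the drive order from the superblocks.
mdadm --assemble --force /dev/md0 /dev/sdb3 /dev/sdc3 /dev/sdd3

# If that works, mount read-only and copy files off immediately.
mount -o ro /dev/md0 /mnt
```

My understanding is that --assume-clean only applies when re-creating
the array with mdadm --create, which people seem to treat as a
last-resort step, so I'd rather avoid it if --assemble --force works.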
_______________________________________________
mlug mailing list
[email protected]
https://listes.koumbit.net/cgi-bin/mailman/listinfo/mlug-listserv.mlug.ca
