On Wed, Feb 22, 2012 at 4:22 PM, Bill Maidment <[email protected]> wrote:
> -----Original message-----
> From: Tom H <[email protected]>
> Sent: Thu 23-02-2012 01:12
> Subject: Re: Degraded array issues with SL 6.1 and SL 6.2
> To: SL Users <[email protected]>;
>> On Wed, Feb 22, 2012 at 7:58 AM, Bill Maidment <[email protected]> wrote:
>> > -----Original message-----
>> > From: Bill Maidment <[email protected]>
>> > Sent: Mon 20-02-2012 17:43
>> > Subject: Degraded array issues with SL 6.1 and SL 6.2
>> > To: [email protected] <[email protected]>;
>> >> I have had some issues with the last two kernel releases. When a degraded
>> >> array event occurs, I am unable to add a new disk back in to the array.
>> >> This has been reported on CentOS 6.1/6.2 and also RHEL 6.2 (see Bug 772926
>> >> - dracut unable to boot from a degraded raid1 array). I have found that I
>> >> need to revert to kernel 2.6.32-131.21.1.el6.x86_64 in order to be able to
>> >> add the new drive.
>> >
>> > The response from RH is as follows:
>> > 1) If you try to re-add a disk to a running raid1 after having failed it,
>> > mdadm correctly rejects it as it has no way of knowing which of the disks
>> > are authoritative. It clearly tells you that in the error message you
>> > pasted into the bug.
>> >
>> > 2) You reported a Scientific Linux bug against Red Hat Enterprise Linux.
>> > Red Hat does not support Scientific Linux, please report bugs against
>> > Scientific Linux to the people behind Scientific Linux.
>> >
>> > My response is:
>> > 1) a) It used to work it out. b) No, it does not clearly spell it out.
>> > c) Why was it not a problem in earlier kernels?
>> > 2) Is this an SL bug? I think not!
>>
>> Bug 772926 doesn't have anything about SL. Are you referring to another bug?
>>
>> In (1) above, are they saying that you can't "--fail", "--remove",
>> and then "--add" the same disk, or that you can't "--fail" and
>> "--remove" a disk, replace it, and then can't "--add" it because it's
>> got the same "X"/"XY" in "sdX"/"sdXY" as the previous, failed disk?
>
> Bug 772926 was reported by someone from CentOS, but it would affect SL
> too, and it seemed to be related:
> http://bugs.centos.org/view.php?id=5400
>
> I think they are saying that you NOW can't re-add the same disk without
> first zeroing out the disk superblock.
> I just find the wording of the error message a bit confusing:
>
> [root@ferguson ~]# mdadm /dev/md3 -a /dev/sdc1
> mdadm: /dev/sdc1 reports being an active member for /dev/md3, but a --re-add fails.
> mdadm: not performing --add as that would convert /dev/sdc1 in to a spare.
> mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sdc1" first.
> [root@ferguson ~]#
772926 doesn't have "You reported a Scientific Linux bug against Red Hat Enterprise Linux". The wording about a spare in the third line seems wrong. Anyway, I'd never re-add a failed and removed disk without zeroing the superblock; if you could do it previously, it was an oversight/bug that's now been fixed.
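For reference, a minimal sketch of the disk-replacement sequence being discussed, using the device names (/dev/md3, /dev/sdc1) from Bill's transcript purely as examples; exact devices and array state will differ, and this assumes root on a machine where the rebuild is actually wanted:

```shell
# Fail and remove the disk from the array (skip if mdadm already failed it):
mdadm /dev/md3 --fail /dev/sdc1
mdadm /dev/md3 --remove /dev/sdc1

# Wipe the old md superblock so the disk no longer claims to be an
# (possibly stale) active member of /dev/md3 -- this is the step the
# error message above is asking for:
mdadm --zero-superblock /dev/sdc1

# Add it back; it joins as a spare and resyncs from the surviving disk:
mdadm /dev/md3 --add /dev/sdc1

# Watch the rebuild progress:
cat /proc/mdstat
```

With the superblock zeroed, mdadm has no conflicting metadata to worry about, which is why the newer behaviour insists on it before converting the disk to a spare.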
