Bug#588965: marked as done (Please add support for replacing a failing but still usable drive with a spare without marking the first drive as failed)

Debian Bug Tracking System Fri, 05 Dec 2014 04:36:44 -0800

Your message dated Fri, 05 Dec 2014 15:33:50 +0300
with message-id <[email protected]>
and subject line Re: Please add support for replacing a failing but still 
usable drive with a spare without marking the first drive as failed
has caused the Debian Bug report #588965,
regarding Please add support for replacing a failing but still usable drive 
with a spare without marking the first drive as failed
to be marked as done.


This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)


-- 
588965: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=588965
Debian Bug Tracking System
Contact [email protected] with problems

--- Begin Message ---

Package: mdadm
Version: 3.1.2-2
Severity: wishlist
Tags: upstream

Hi,

Especially in the case of RAID5 arrays it would often be life-saving to be
able to activate a hot-spare and prepare to replace a live drive with it,
without marking that drive as failed first.

Consider the following scenario. Let's say we have a RAID5 array composed of
sdb, sdc and sdd, with sde added as a spare (i.e. 3 active drives).

sdc starts to noticeably fail. Unknown to the user, sdd has also developed a
bad sector. The user marks sdc as failed and waits for sde to be synced.
During the resync, however, the system hits the bad sector on sdd, so sdd is
also marked as failed, the resync fails and the array becomes unusable. (The
same can happen if an intermittent bit error occurs during the resync
operation.)
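
The failure mode described above can be sketched with a toy model. This is
only an illustration, not mdadm or kernel code: it assumes a three-member
array in which every stripe holds two data blocks plus their XOR parity
(parity rotation is ignored), and the names UNREADABLE, build_stripes and
rebuild_after_fail are made up for the example.

```python
from functools import reduce
from operator import xor

UNREADABLE = None  # hypothetical marker for a block that cannot be read


def build_stripes(pairs):
    """Toy 3-member array: two data blocks per stripe plus their XOR parity."""
    return {
        "sdb": [a for a, b in pairs],
        "sdc": [b for a, b in pairs],
        "sdd": [a ^ b for a, b in pairs],
    }


def rebuild_after_fail(drives, failed):
    """Rebuild the failed member onto a spare from the survivors only."""
    survivors = [blocks for name, blocks in drives.items() if name != failed]
    spare = []
    for stripe, blocks in enumerate(zip(*survivors)):
        if any(b is UNREADABLE for b in blocks):
            # With one member already failed there is no redundancy left,
            # so a single unreadable block elsewhere aborts the rebuild.
            raise IOError(f"stripe {stripe}: unreadable block, rebuild aborted")
        spare.append(reduce(xor, blocks))
    return spare


if __name__ == "__main__":
    drives = build_stripes([(1, 2), (3, 4), (5, 6)])
    drives["sdd"][1] = UNREADABLE  # the bad sector nobody knew about
    try:
        rebuild_after_fail(drives, "sdc")
    except IOError as err:
        print("resync failed:", err)
```

With all members healthy the same function returns the original contents of
sdc; the single unreadable block on sdd is what turns the rebuild into a
total loss.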

The algorithm I'd like to see implemented would work as follows:

sdc starts to noticeably fail. The user marks it for replacement. sde is
activated and the system copies everything from sdc to sde, using the
redundancy provided by the other drives if/when necessary. Temporarily,
while this operation is in progress, sdc and sde are both active and in the
same slot; any writes that hit the array get committed to both. When sde is
completely up to date, sdc gets deactivated and marked as failed. The bad
sector on sdd doesn't compromise our ability to sync the hotspare. At this
point, another spare could be added, sdd marked for replacement, and so on.
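
A minimal sketch of the proposed copy-then-replace behaviour, again as a toy
model rather than real md code. It assumes the same simplified layout (two
data blocks plus XOR parity per stripe, three members), does not model the
dual writes to sdc and sde while the copy is running, and uses the made-up
names UNREADABLE and hot_replace.

```python
from functools import reduce
from operator import xor

UNREADABLE = None  # hypothetical marker for a block that cannot be read


def hot_replace(drives, old):
    """Copy `old` onto a spare, using parity only where `old` is unreadable."""
    others = [blocks for name, blocks in drives.items() if name != old]
    spare = []
    for stripe, block in enumerate(drives[old]):
        if block is not UNREADABLE:
            spare.append(block)  # plain copy from the drive being replaced
            continue
        peers = [blocks[stripe] for blocks in others]
        if any(p is UNREADABLE for p in peers):
            # Only a stripe that is bad on two members at once is lost.
            raise IOError(f"stripe {stripe}: cannot copy or reconstruct")
        spare.append(reduce(xor, peers))  # fall back to the redundancy
    return spare


if __name__ == "__main__":
    drives = {
        "sdb": [1, 3, 5],
        "sdc": [UNREADABLE, 4, 6],  # failing drive, stripe 0 already bad
        "sdd": [3, UNREADABLE, 3],  # parity member with one bad sector
    }
    print(hot_replace(drives, "sdc"))
```

In this example the bad block on sdc and the bad block on sdd sit in
different stripes, so the spare ends up with the complete contents of sdc
even though neither drive is fully readable.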

I realise this also requires changes to the kernel. Apologies if it's
already planned; I haven't seen it discussed anywhere.

Best regards,

Andras

-- 
 Andras Korn <korn at elan.rulez.org> - <http://chardonnay.math.bme.hu/~korn/>
  All that glitters may not be gold, but it sure has a high refractive index.

--- End Message ---

--- Begin Message ---

Version: 3.3-1

On Tue, 13 Jul 2010 22:51:42 +0200 Andras Korn <[email protected]> 
wrote:
> Package: mdadm
> Version: 3.1.2-2
> Severity: wishlist
> Tags: upstream
> 
> Hi,
> 
> Especially in the case of RAID5 arrays it would often be life-saving to be
> able to activate a hot-spare and prepare to replace a live drive with it,
> without marking that drive as failed first.

This feature has been implemented in mdadm-3.3, which has been available
in the Debian archive for quite some time.  Closing this bug report
now.
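
For reference, a hedged sketch of how such a replacement might be requested.
It only builds and prints the command line and does not run it. The
--replace and --with options are the ones documented in mdadm(8) for
manage mode; the array name /dev/md0 and the /dev/ device paths are
assumptions for illustration, and replace_command is a made-up helper.
Check the man page of the installed version before relying on this.

```python
import shlex


def replace_command(array, old, spare=None):
    """Build (but do not run) an mdadm hot-replace command line."""
    cmd = ["mdadm", array, "--replace", old]
    if spare is not None:
        # --with names the spare that should take over from `old`.
        cmd += ["--with", spare]
    return cmd


if __name__ == "__main__":
    # /dev/md0 is an assumed array name; sdc and sde follow the report.
    print(shlex.join(replace_command("/dev/md0", "/dev/sdc", "/dev/sde")))
```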

BTW, I'm not sure which kernel version is needed for this to work
properly. ;)

Thanks,

/mjt

--- End Message ---
