martin f krafft wrote:
also sprach Dan Pascu <[EMAIL PROTECTED]> [2006.11.03.2238 +0100]:
But I'm glad you were able to at least see the problem I'm
experiencing. One thing that intrigues me is the difference between
our results: in my case, after failing a drive and stopping the
array, the failed drive had already been removed when I restarted it
(even though I never removed it myself) and the array started
degraded with 1 drive out of 2, while in your case the array started
with the failed drive still included and reported that it started
with 2 drives.

This only happens when the last update time stored in the failed
component's superblock is the same as the time in the other
components' superblocks. In that case mdadm says the drive is
failed, but it still looks okay. When I reproduced the problem,
I saw exactly your behaviour and did not have to remove the
component.
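
A quick way to check this on your own array is to compare the
Update Time field that mdadm --examine prints for each component
(/dev/sdd1 below is just a placeholder for the other component;
/dev/sde1 is taken from the log line further down):

      # Compare the superblock timestamps of the failed component
      # and a healthy one.
      mdadm --examine /dev/sde1 | grep 'Update Time'
      mdadm --examine /dev/sdd1 | grep 'Update Time'
      # If both show the same time, the failed component still
      # "looks okay" and will not be kicked at the next assembly.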

Here's what I think happens exactly:

  - while the array is running, writes result in updates to the
    superblocks
  - if you fail a component, its superblock is no longer updated
  - when you stop/start an array, mdadm checks all the superblocks
  - if they all seem as if they'd been stopped at the same time, it
    just assembles the array, failed component included.
  - however, if a superblock seems out of date, md writes:

      kernel: md: kicking non-fresh sde1 from array!

    and starts the array in degraded mode.
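
Spelled out as commands, that sequence would look roughly like this
(the RAID1 at /dev/md0 made of /dev/sdd1 and /dev/sde1 is just an
example):

      # Hypothetical RAID1 at /dev/md0 made of /dev/sdd1 and /dev/sde1.
      mdadm /dev/md0 --fail /dev/sde1   # sde1's superblock stops being
                                        # updated from here on
      mdadm --stop /dev/md0
      mdadm --assemble /dev/md0 /dev/sdd1 /dev/sde1
      # If nothing touched the array between the fail and the stop,
      # all superblocks carry the same update time and sde1 is taken
      # along.  If anything updated the superblocks in between, the
      # kernel logs "md: kicking non-fresh sde1 from array!" and
      # /dev/md0 comes up degraded.
      mdadm --detail /dev/md0
      dmesg | grep 'non-fresh'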

I think that explains it. Since I was using bitmaps, which were
updated every 5 seconds, it's very likely that the superblock was
updated at least once between the moment I failed the drive and the
moment I stopped the array. I guess you can obtain the same effect
without writing to the array after the failure, by manually removing
the drive after failing it and before stopping the array. The issue
seems to appear only when the array starts degraded with 1 drive
missing.
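
If that's right, the manual variant would be something along these
lines (same placeholder device names as above), with no writes needed
between the fail and the stop:

      mdadm /dev/md0 --fail /dev/sde1     # fail the component
      mdadm /dev/md0 --remove /dev/sde1   # remove it by hand instead
                                          # of relying on superblock
                                          # updates to mark it stale
      mdadm --stop /dev/md0
      mdadm --assemble /dev/md0 /dev/sdd1 /dev/sde1
      mdadm --detail /dev/md0             # expected: degraded array
                                          # with 1 of 2 drives active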

--
Dan



