martin f krafft wrote:
also sprach Dan Pascu <[EMAIL PROTECTED]> [2006.11.03.2238 +0100]:
But I'm glad you were able to at least see the problem I'm
experiencing. One thing that intrigues me is the difference between
our results: in my case, after failing a drive and stopping the
array, the failed drive had already been removed when I restarted it
(even though I never removed it myself) and the array started
degraded with 1 drive out of 2, while in your case the array started
with the failed drive still included and reported that it started
with 2 drives.

This only happens when the last update time stored in the failed
component's superblock is the same as the time in the other
components' superblocks. In that case mdadm says the drive is
failed, but it still looks okay. When I reproduced the problem,
I saw exactly your behaviour and did not have to remove the
component.
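
A quick way to check this on your own array is to compare the
Update Time field that mdadm --examine prints for each component
(/dev/sdd1 below is just a placeholder for the other component;
/dev/sde1 is taken from the log line further down):

      # Compare the superblock timestamps of the failed component
      # and a healthy one.
      mdadm --examine /dev/sde1 | grep 'Update Time'
      mdadm --examine /dev/sdd1 | grep 'Update Time'
      # If both show the same time, the failed component still
      # "looks okay" and will not be kicked at the next assembly.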

Here's what I think happens exactly:

  - while the array is running, writes result in updates to the
    superblocks
  - if you fail a component, its superblock is no longer updated
  - when you stop/start an array, mdadm checks all the superblocks
  - if they all seem as if they'd been stopped at the same time, it
    just assembles the array, failed component included.
  - however, if a superblock seems out of date, md writes:

      kernel: md: kicking non-fresh sde1 from array!

    and starts the array in degraded mode.
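
Spelled out as commands, that sequence would look roughly like this
(the RAID1 at /dev/md0 made of /dev/sdd1 and /dev/sde1 is just an
example):

      # Hypothetical RAID1 at /dev/md0 made of /dev/sdd1 and /dev/sde1.
      mdadm /dev/md0 --fail /dev/sde1   # sde1's superblock stops being
                                        # updated from here on
      mdadm --stop /dev/md0
      mdadm --assemble /dev/md0 /dev/sdd1 /dev/sde1
      # If nothing touched the array between the fail and the stop,
      # all superblocks carry the same update time and sde1 is taken
      # along.  If anything updated the superblocks in between, the
      # kernel logs "md: kicking non-fresh sde1 from array!" and
      # /dev/md0 comes up degraded.
      mdadm --detail /dev/md0
      dmesg | grep 'non-fresh'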

I think that explains it. Since I was using bitmaps, which were
updated every 5 seconds, it's very likely that the superblock was
updated at least once between the moment I failed the drive and the
moment I stopped the array. I guess you can obtain the same effect
without writing to the array after the failure, by manually removing
the drive after failing it and before stopping the array. The issue
seems to appear only when the array starts degraded with 1 drive
missing.
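
If that's right, the manual variant would be something along these
lines (same placeholder device names as above), with no writes needed
between the fail and the stop:

      mdadm /dev/md0 --fail /dev/sde1     # fail the component
      mdadm /dev/md0 --remove /dev/sde1   # remove it by hand instead
                                          # of relying on superblock
                                          # updates to mark it stale
      mdadm --stop /dev/md0
      mdadm --assemble /dev/md0 /dev/sdd1 /dev/sde1
      mdadm --detail /dev/md0             # expected: degraded array
                                          # with 1 of 2 drives active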

--
Dan



