On Tuesday 2007-03-06 at 16:26 +0100, Leen de Braal wrote:
> >> > Look at the logs... it's the only way. It could be a glitch:
> >> > sometimes there is a temporary problem, a disk is removed from the
> >> > array, and it waits for manual intervention. It will automatically
> >> > activate a spare if one is available, though.
> >> >
> >>
> >> Found:
> >>
> >> Mar 5 00:17:14 linux kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> >> Mar 5 00:17:14 linux kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=273480054, high=16, low=5044598, sector=273480053
> >> Mar 5 00:17:14 linux kernel: ide: failed opcode was: unknown
> >> Mar 5 00:17:14 linux kernel: end_request: I/O error, dev hda, sector 273480053
> >> Mar 5 00:17:14 linux kernel: raid1: Disk failure on hda3, disabling device.
> >> Mar 5 00:17:14 linux kernel: Operation continuing on 1 devices
> >> Mar 5 00:17:14 linux kernel: raid1: hda3: rescheduling sector 271343408
> >> Mar 5 00:17:14 linux kernel: RAID1 conf printout:
> >> Mar 5 00:17:14 linux kernel: --- wd:1 rd:2
> >> Mar 5 00:17:14 linux kernel: disk 0, wo:1, o:0, dev:hda3
> >> Mar 5 00:17:14 linux kernel: disk 1, wo:0, o:1, dev:hdb3
> >> Mar 5 00:17:14 linux kernel: RAID1 conf printout:
> >> Mar 5 00:17:14 linux kernel: --- wd:1 rd:2
> >> Mar 5 00:17:14 linux kernel: disk 1, wo:0, o:1, dev:hdb3
> >> Mar 5 00:17:14 linux kernel: raid1: hdb3: redirecting sector 271343408 to another mirror
> >
> > Is the above telling me that hda3 was removed from the mirror because
> > of a single bad sector?
Yes...
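
If you want to spot this event without reading the whole log by eye, a
rough sketch along these lines might do. It only assumes the raid1
message format shown in the log quoted above; the function name and the
sample text are mine, purely for illustration.

```python
import re

# Pattern for the md message quoted above, for example:
#   "raid1: Disk failure on hda3, disabling device."
FAILURE_RE = re.compile(r"raid1: Disk failure on (\w+), disabling")


def failed_members(log_text):
    """Return device names that raid1 reported as failed, in log order."""
    return FAILURE_RE.findall(log_text)


# Two lines taken from the quoted log, each joined back onto one line.
sample = (
    "Mar 5 00:17:14 linux kernel: end_request: I/O error, dev hda, sector 273480053\n"
    "Mar 5 00:17:14 linux kernel: raid1: Disk failure on hda3, disabling device.\n"
)
print(failed_members(sample))
```

In practice you would feed it the contents of your syslog file; where
that file lives depends on the syslog configuration.
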
> > That seems extremely aggressive.
Quite so.
> Me too
>
> >
> > I know there is some LKML discussion of needing to have MD
> > automatically detect the above and simply rewrite the failed sector
> > with data from the good mirrored sector.
> >
> > During the write, /dev/hda should remap the failed sector and continue
> > running fine (i.e., all disk sector remapping for failures happens on
> > writes, AIUI).
Yes, that should work: the disk firmware remaps a bad sector when it is
written to. Alternatively, software could remap the sector, but it would
do so in the layer above the mirror (at the ext3 level, for example),
which means on both disks. That is not automatic either, AFAIK.
> > If a disk is currently failed after a single-sector read error, I can
> > see why the kernel developers are looking into alternate ways to
> > handle the situation.
Seems so.
> It is running OK now, as far as I can see, all in sync.
> For me it means that I will have to pay more attention to monitoring
> this kind of error. I will look into mdadm; as far as I have seen, it
> has parameters that can make it do this and report to me by mail or
> something like that.
You can set it to email you, even to page or phone you, I think.
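
mdadm's own monitor mode is the usual tool for this. Just to illustrate
what such a check looks for, here is a minimal sketch that reads text in
the /proc/mdstat format and lists arrays running with fewer members than
configured. The sample text is invented for illustration (device names
and block counts are not from the machine discussed here).

```python
import re

# Status part of an array entry in /proc/mdstat, e.g. "... blocks [2/1] [_U]":
# [configured/active] followed by one U (up) or _ (missing) per member.
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return names of md arrays with fewer active members than configured."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            total, active = int(status.group(1)), int(status.group(2))
            if active < total or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


# Invented sample: md0 is healthy, md1 has lost one of its two members.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 hdb1[1] hda1[0]
      104320 blocks [2/2] [UU]

md1 : active raid1 hdb3[1] hda3[0](F)
      136448000 blocks [2/1] [_U]

unused devices: <none>
"""
print(degraded_arrays(SAMPLE_MDSTAT))
```

On a real system you would pass in the contents of /proc/mdstat and send
yourself a mail whenever the list is not empty.
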
Also, you can find the error in the SMART log of that HD, using smartctl.
It should be possible to deduce whether the sector was remapped by looking
at Reallocated_Sector_Ct.
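
As a sketch of how to read that counter from a script: the function below
picks the raw value of one attribute out of the attribute table that
"smartctl -A" prints. It assumes the usual ten-column table layout; the
function name and the sample rows (including their values) are made up
for illustration and are not output from the disk in question.

```python
def smart_raw_value(smartctl_output, attribute="Reallocated_Sector_Ct"):
    """Return the raw value of a SMART attribute from `smartctl -A` text.

    Returns None if the attribute is not listed.
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID followed by the name;
        # the raw value is the tenth column.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] == attribute:
            return int(fields[9])
    return None


# Invented sample rows in the attribute table layout.
SAMPLE_SMART = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       1
"""
print(smart_raw_value(SAMPLE_SMART))
```

If the raw value goes up after the bad sector has been rewritten, the
drive has most likely remapped it.
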
--
Cheers,
Carlos E. R.