-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Tuesday 2007-05-08 at 13:09 +0200, Wilfred van Velzen wrote:
> > Wow, that's a large disk. Or busy. Usually, it's about two hours or
> > so.
>
> It's big: 750GB
>
> Not too busy, but busy enough during work hours, because the 30% hasn't
> moved yet...
Probably busy enough that the test doesn't progress (much). If the test is
well designed, the normal disk activity has priority. Probably you just
have to wait longer and see.
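To see whether the test moves at all, you can poll the drive now and then. A minimal sketch, assuming the drive reports progress in the "% of test remaining" form that "smartctl -c" prints; the helper name selftest_remaining is made up for illustration:

```shell
# Hypothetical helper: reads "smartctl -c" output on stdin and prints the
# percentage of the self-test still remaining, if the drive reports one.
selftest_remaining() {
    sed -n 's/.*[^0-9]\([0-9][0-9]*\)% of test remaining.*/\1/p'
}

# Usage (needs root and a real device, so it is left commented out):
# smartctl -c /dev/sdb | selftest_remaining
```

If the number is the same after a few quiet hours (a night, say), then it is probably not just the load.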
There is a trick, although you may not like it. If the raid is in software,
you can deactivate one of the hard disks (simulate a failure). The other
disk(s) take over the load, the failed one goes idle, and the test can
happily progress on that one. However, if the other disk goes down in the
interval... ouch :-(
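For a Linux md array, the sequence would look roughly like this. It is only a sketch that prints the commands rather than running them; /dev/md0, /dev/sdb1 and /dev/sdb are assumed names (I don't know your layout), and plan_offline_selftest is a made-up helper:

```shell
# Dry run: prints the commands instead of executing them.
# All device names passed in are assumptions; adjust to the real array.
plan_offline_selftest() {
    array=$1    # md array, e.g. /dev/md0
    member=$2   # member partition to take out, e.g. /dev/sdb1
    disk=$3     # whole disk holding that member, e.g. /dev/sdb
    echo "mdadm $array --fail $member"
    echo "mdadm $array --remove $member"
    echo "smartctl -t long $disk"
    echo "# ... wait for the self-test to finish, then:"
    echo "mdadm $array --add $member"
}

plan_offline_selftest /dev/md0 /dev/sdb1 /dev/sdb
```

Keep in mind that adding the disk back triggers a resync, so the array stays without redundancy until that finishes, too.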
> > Not the system log, but the smart log that resides in the disk; you
> > can dig it out with "smartctl -a device".
>
> Yes, that was what I meant. I checked with:
>
> smartctl -l error /dev/sdb
>
> and:
>
> smartctl -l selftest /dev/sdb
>
> But that shows the same output as the -a option...
Ah...
I expected something like this (I see it with -a):
SMART Error Log Version: 1
ATA Error Count: 251 (device log contains only the most recent five errors)
...
Error 251 occurred at disk power-on lifetime: 3734 hours (155 days + 14 hours)
When the command that caused the error occurred, the device was active or
idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
84 51 00 e8 f6 83 f0 Error: ICRC, ABRT at LBA = 0x0083f6e8 = 8648424
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 f0 f9 f5 83 f0 00 00:04:45.606 READ DMA EXT
25 00 f0 f9 f5 83 f0 00 00:04:44.706 READ DMA EXT
10 00 3f 00 00 00 f0 00 00:04:44.705 RECALIBRATE [OBS-4]
25 00 f0 f9 f5 83 f0 00 00:04:44.421 READ DMA EXT
25 00 f0 f9 f5 83 f0 00 00:04:44.248 READ DMA EXT
I think these logs depend on the disk manufacturer.
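If your disk does produce such a log, the numbers in it can be pulled out or cross-checked with a little shell. A sketch, assuming the output format shown above; both function names are made up, and the register layout (SN, CL, CH as LBA bits 0-23, low nibble of DH as bits 24-27) is the 28-bit ATA convention, which agrees with the LBA smartctl printed for error 251:

```shell
# Hypothetical helper: reads "smartctl -l error" output on stdin and prints
# the total ATA error count.
ata_error_count() {
    sed -n 's/^ *ATA Error Count: \([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: rebuilds the 28-bit LBA from the SN CL CH DH register
# bytes (hex, without 0x prefix) of one error entry.
lba_from_regs() {
    sn=$1; cl=$2; ch=$3; dh=$4
    echo $(( ((0x$dh & 0x0f) << 24) | (0x$ch << 16) | (0x$cl << 8) | 0x$sn ))
}

# Registers of error 251 above: SN=e8 CL=f6 CH=83 DH=f0
lba_from_regs e8 f6 83 f0    # prints 8648424

# Usage against a real device (needs root, so left commented out):
# smartctl -l error /dev/sdb | ata_error_count
```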
> > Right. If I interpret it correctly, your sda has four sectors
> > remapped. It
>
> sdb!
Right, sdb, I got confused.
> I'll advise the one who controls the money to order a spare one in
> advance, so we can replace it if necessary. It's one of the disks in a
> raid 1 configuration, so it shouldn't be an immediate problem if one
> disk fails...
In the case of a production server that you consider important enough to
have a raid, it is always worth having a spare disk at hand, errors or
not ;-)
Also, you know that you can have an "active spare" (a hot spare) inside the
raid. If a member disk fails, the raid immediately activates the spare and
rebuilds onto it. The disadvantage is, obviously, that the spare is powered
up all the time, although idle. In those cases, I would have a spare
outside, too - maybe I'm too paranoid ;-)
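With Linux md, adding a disk to an array that already has all its members makes it a spare. A dry-run sketch (it only prints the commands; /dev/md0 and /dev/sdc1 are assumed names, and plan_hot_spare is a made-up helper):

```shell
# Dry run: prints the commands instead of executing them.
# Device names are assumptions; adjust to the real array and spare partition.
plan_hot_spare() {
    array=$1   # md array, e.g. /dev/md0
    spare=$2   # partition to keep as spare, e.g. /dev/sdc1
    echo "mdadm $array --add $spare"
    echo "mdadm --detail $array"
}

plan_hot_spare /dev/md0 /dev/sdc1
```

The second command is only there to check afterwards that the new disk is listed as a spare.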
> > > This isn't something that can be fixed on short notice ;), so I hope
> > > you will see this message!
> >
> > Yep, I noticed, because you sent also a CC to me: in those cases Pine
> > shows a yellow mark :-)
>
> I will keep doing this, then... ;)
No problem. Just remember that some people here do not like CCs at all -
I really don't mind, my filters work nicely ;-)
- --
Cheers,
Carlos E. R.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Made with pgp4pine 1.76
iD8DBQFGQG3+tTMYHG2NR9URAopmAJwPH+9oifhx6UZdRmWYdBcM7UA3+gCeKaYn
wHv5e9D4vePAc5Kw8eyTKPU=
=lHY7
-----END PGP SIGNATURE-----