On Tue, Sep 05, 2017 at 07:44:59PM +0200, Martin Husemann wrote:
> On Tue, Sep 05, 2017 at 05:35:07PM +0000, Steve Blinkhorn wrote:
> > I have discovered a problem on a live server (i386) I run - this 
> > is filling up /var/log/messages so that it has turned over more than
> > 10 times today.
> > 
> > The message:
> > 
> > Sep  5 16:56:49 trafalgar /netbsd: wd0a: error reading fsbn 1005056 of 
> > 1005056-1005087 (wd0 bn 1005119; cn 997 tn 2 sn 17), retrying
> > Sep  5 16:56:49 trafalgar /netbsd: wd0: (uncorrectable data error)
> > 
> > The fsbn is mostly 1005056 but sometimes 1005086.
> > 
> > Server response time is impacted.
> > 
> > I've never had, so never tackled, this kind of issue before.   Advice
> > much appreciated.
> 
> 1) backup your data ;-)
> 2) check the drive's SMART status with atactl smart status
> 3) try to write to the affected sectors, that usually will cause the drive
>    to remap it (if it still has spares available)
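
Before writing to anything for step 3, it helps to know exactly which blocks the kernel is complaining about. The following is only a minimal sketch in Python, not a tool shipped with NetBSD: it parses the error line quoted above and derives the offsets. The regular expression, the `locate` helper and the 512-byte sector size are my own assumptions, and it assumes that "fsbn" is relative to the partition while "wd0 bn" is the block number on the whole disk, so check the result against the disklabel before overwriting anything.

```python
import re

# Pattern for the error line quoted above. The mail wraps the line after
# "of", so whitespace there is matched loosely with \s+.
PATTERN = re.compile(
    r"(?P<part>\w+): error reading fsbn (?P<fsbn>\d+) of\s+"
    r"(?P<first>\d+)-(?P<last>\d+)\s+"
    r"\((?P<disk>\w+) bn (?P<bn>\d+);"
)

SECTOR_SIZE = 512  # assumption: 512-byte sectors; verify for the actual drive


def locate(message):
    """Return the block numbers named in one error line, or None if no match."""
    m = PATTERN.search(message)
    if m is None:
        return None
    fsbn = int(m["fsbn"])
    bn = int(m["bn"])
    return {
        "partition": m["part"],
        "disk": m["disk"],
        "fsbn": fsbn,                                   # failing block, partition-relative
        "range": (int(m["first"]), int(m["last"])),     # blocks of the whole transfer
        "disk_block": bn,                               # failing block, disk-relative
        "partition_start": bn - fsbn,                   # implied start of the partition
        "byte_offset": bn * SECTOR_SIZE,                # byte position on the raw disk
    }


# The message from the original report, rejoined into a single line.
LINE = (
    "Sep  5 16:56:49 trafalgar /netbsd: wd0a: error reading fsbn 1005056 of "
    "1005056-1005087 (wd0 bn 1005119; cn 997 tn 2 sn 17), retrying"
)

info = locate(LINE)
print(info)
```

For the message above the implied partition start is 1005119 - 1005056 = 63 sectors, which should match the offset of wd0a in the disklabel if the assumptions hold.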

FWIW, when I first saw this on a drive, NetBSD was able to recover and
let me back up the data (indeed the very first thing to do). Afterwards,
the SMART status was almost useless: the faults kept reappearing until
the disk finally failed (with no possibility of recovery), and the SMART
status only reflected the problem once the disk had failed, that is,
too late.
-- 
        Thierry Laronde <tlaronde +AT+ polynum +dot+ com>
                     http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C
