On 06/05/14 17:38, Christian Weisgerber wrote:
I have a 3TB disk here...

sd1 at scsibus1 targ 1 lun 0: <ATA, Hitachi HUA72303, MKAO> SCSI3 0/direct fixed naa.5000cca225c5fbeb
sd1: 2861588MB, 512 bytes/sector, 5860533168 sectors

... that's serving as a general media dump with a single FFS2 file
system on it.

Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/sd1d      2.7T    2.5T   63.7G    98%    /export

Yesterday, I experienced the odd effect that reading some files,
or parts of files, from that disk became excruciatingly slow.  We're
talking a few kB/s here.  Other files were fine.  There were no
kernel errors/warnings whatsoever.  There were no read errors, the
disk was just 100% busy and appeared to be returning data drip by
drip.

# atactl sd1 smartstatus
No SMART threshold exceeded

No change on reboot.  dd(1) from the raw device was initially fast,
then slowed to a crawl as it progressed.  I eventually "fixed" it
all by powering off the machine, jiggling the SATA connectors (all
fine), and powering the machine back up.

Tonight the problem is back.  Something is very wrong.  Given that
dd if=/dev/rsd1c also seems affected, the filesystem layer can be
excluded.  I won't cry too much over a dying disk, but why the heck
are there no error indications of any kind?

Any other ideas?


I think you are relying on SMART too much.  Certainly try what David
said, but the disk is obviously sick, whatever SMART may say.

I've had about seven disk failures in the last several years.  In three
or four of them SMART was absolutely correct; in the others it was less
informative.  I've also had a false warning that a disk was bad, yet the
disk worked for several more years, until it got too small for its task.

SMART is good, but it has its limitations.  It deals best with gradual
errors, not fast catastrophic ones.
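
Since the drive reports nothing useful here, one way to see where it is
slow is to time short raw reads at spaced offsets, along the lines of the
dd test described above.  What follows is only a rough sketch: the
function name scan_regions and its arguments are made up for
illustration, and timing uses date +%s, so it only resolves whole
seconds.

```shell
#!/bin/sh
# Hypothetical helper (not from this thread): time fixed-size reads at
# evenly spaced offsets of a file or raw device.
#
# scan_regions TARGET SAMPLE_MB STEP_MB COUNT
#   TARGET     file or raw device to read
#   SAMPLE_MB  megabytes read per sample
#   STEP_MB    distance between sample start offsets, in megabytes
#   COUNT      number of samples to take
scan_regions() {
    target=$1
    sample_mb=$2
    step_mb=$3
    count=$4
    i=0
    while [ "$i" -lt "$count" ]; do
        offset_mb=$((i * step_mb))
        start=$(date +%s)
        # Block size is given in plain bytes because the 1m/1M suffix
        # is not spelled the same way by every dd implementation.
        dd if="$target" of=/dev/null bs=1048576 \
            skip="$offset_mb" count="$sample_mb" 2>/dev/null
        end=$(date +%s)
        # One line per sample: where it was read and how long it took.
        echo "offset_mb=$offset_mb seconds=$((end - start))"
        i=$((i + 1))
    done
}
```

Run as root against the raw device, for example
"scan_regions /dev/rsd1c 1 100000 28" to read 1 MB roughly every 100 GB
across the disk.  A healthy region should report 0 or 1 seconds; a region
returning a few kB/s will take minutes for the same megabyte.  The sample
size and spacing are arbitrary choices, and a single megabyte is a coarse
probe, so treat the result as a hint about where to look, not a
diagnosis.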

--STeve Andre'
