On Fri, Feb 26, 2010 at 1:46 AM, Alex Schuster <[email protected]> wrote:
> Mark Knecht writes:
>
>> Do I just watch the logs looking for problems? I have no way of
>> knowing right now whether this was a disk problem that's going to come
>> back, a 1 time deal due to power, or something else entirely.
>>
>> As these are cheap machines that don't use RAID, what's the right way to
>> go? emerge -e @world and then wait for the next event? Do nothing and
>> wait?
>
> Emerge smartmontools, then:
>
> smartctl -H /dev/sda # get overview of what the drive thinks about itself
>
> smartctl -t short /dev/sda # start short self test
> Wait
> smartctl -l selftest /dev/sda # see results
>
> smartctl -t long /dev/sda # start long self test
> Wait a lot longer
> smartctl -l selftest /dev/sda # see results
>
> You can continue working in the meantime; there should be little
> performance impact. You will see something like this in the log:
>
> === START OF READ SMART DATA SECTION ===
> SMART Self-test log structure revision number 1
> Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
> # 1  Short offline       Completed without error       00%      2275         -
> # 2  Extended offline    Completed without error       00%      2270         -
> # 3  Extended offline    Completed without error       00%      1799         -
> # 4  Extended offline    Completed without error       00%       197         -
> # 5  Extended offline    Completed without error       00%        26         -
>
> If you have a '-' in the right column, the disk has found no errors. If
> there is a number, then it's the position of the first error.
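
A minimal sketch of scripting that check, assuming the log layout shown
above (entry lines start with "#", and the last column holds either "-" or
an LBA). The function name selftest_errors is made up for illustration; it
reads smartctl's output on stdin rather than calling smartctl itself.

```shell
# Hypothetical helper: print self-test log entries that report an error LBA.
# Reads the output of "smartctl -l selftest <device>" on stdin.
# Exit status: 0 if every entry ends in "-", 1 if any entry has an LBA.
selftest_errors() {
    awk '
        /^# *[0-9]+/ && $NF != "-" { print; bad = 1 }
        END { exit (bad ? 1 : 0) }
    '
}
```

Something like "smartctl -l selftest /dev/sda | selftest_errors" would then
print only the failing entries, if there are any.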
>
> There's also badblocks, which checks every block and outputs the bad
> ones: badblocks -sv /dev/sda
>
> badblocks -svn /dev/sda will do a non-destructive read-write test. In case
> of a bad block, the drive should swap in a spare one. Maybe this already
> happens in read-only mode, I am not sure.
>
> Also watch for errors in syslog or via dmesg; there should be some when
> bad blocks are accessed.
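
One rough way to pull such messages out of the kernel log, sketched as a
filter. The function name disk_errors and the two message patterns are
assumptions (they are meant to catch the "I/O error" and
"UncorrectableError" lines that older IDE setups tend to log); adjust them
to whatever your kernel actually prints.

```shell
# Hypothetical helper: keep only kernel log lines that mention the given
# device and look like a read/write failure.
# The patterns are assumptions, not an exhaustive list.
# Usage: dmesg | disk_errors sda
disk_errors() {
    dev=$1
    grep -F -- "$dev" | grep -E 'I/O error|UncorrectableError'
}
```

The exit status is grep's, so it is non-zero when nothing matched.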
>
> Wonko
>
>
Hi Wonko,
Yes, I do use smartctl on some other machines, although I'm not very
good about it. Your write-up is helpful, so thanks for that.
My wife's machine is older and I don't think SMART is
supported on her drive. Note the lack of a * on the SMART line in
hdparm -I:
dragonfly ~ # hdparm -I /dev/hda
/dev/hda:
ATA device, with non-removable media
Model Number: WDC WD1600BB-00FTA0
Serial Number: WD-WMAES2091586
Firmware Revision: 15.05R15
Standards:
Supported: 6 5 4
Likely used: 6
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 312581808
Logical/Physical Sector size: 512 bytes
device size with M = 1024*1024: 152627 MBytes
device size with M = 1000*1000: 160041 MBytes (160 GB)
cache/buffer size = 2048 KBytes (type=DualPortCache)
Capabilities:
LBA, IORDY(can be disabled)
Standby timer values: spec'd by Standard, with device specific minimum
R/W multiple sector transfer: Max = 16 Current = 16
Recommended acoustic management value: 128, current value: 254
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_BUFFER command
* DOWNLOAD_MICROCODE
SET_MAX security extension
Automatic Acoustic Management feature set
* 48-bit Address feature set
* Device Configuration Overlay feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
Security:
supported
not enabled
not locked
not frozen
not expired: security count
not supported: enhanced erase
HW reset results:
CBLID- above Vih
Device num = 0 determined by CSEL
Checksum: correct
dragonfly ~ #
dragonfly ~ # smartctl -H /dev/hda
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
SMART Disabled. Use option -s with argument 'on' to enable it.
dragonfly ~ # smartctl -s on /dev/hda
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF ENABLE/DISABLE COMMANDS SECTION ===
Error SMART Enable failed: Input/output error
Smartctl: SMART Enable Failed.
A mandatory SMART command failed: exiting. To continue, add one or
more '-T permissive' options.
dragonfly ~ #
I've not tried the -T permissive options.
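
For reference, a retry along the lines the error message suggests might
look like this untested sketch. The function name enable_smart is made up
for illustration; -T permissive is the option named in the smartctl output
above, and there is no guarantee this drive will accept it.

```shell
# Hypothetical helper: try to enable SMART normally, then retry once with
# -T permissive as suggested by smartctl's own error message.
# Untested against this drive; it may still fail.
enable_smart() {
    dev=$1
    if smartctl -s on "$dev"; then
        return 0
    fi
    echo "plain enable failed, retrying with -T permissive" >&2
    smartctl -T permissive -s on "$dev"
}
```

That would be run as "enable_smart /dev/hda", followed by
"smartctl -H /dev/hda" if it succeeds.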
I've never used badblocks as it seems I should only do that off-line.
This might be a good time to boot with a CD and try it out.
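
A sketch of what that session from the rescue CD could look like, using the
read-only mode Wonko described plus badblocks' -o option to save the list
of bad blocks. The function name scan_readonly and the output path are
placeholders, and the drive may show up under a different device name when
booted from the CD.

```shell
# Hypothetical helper: read-only badblocks scan that saves the bad block
# list to a file and reports how many entries it contains.
# Does not rely on badblocks' exit status.
scan_readonly() {
    dev=$1
    list=$2
    badblocks -sv -o "$list" "$dev"
    count=0
    if [ -f "$list" ]; then
        count=$(wc -l < "$list" | tr -d ' ')
    fi
    echo "$count bad block(s) listed in $list"
}
```

For example: "scan_readonly /dev/hda /tmp/hda.bad" (both names are only
examples).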
Maybe I should just get a new drive that supports SMART?
- Mark