On Feb 22, 2007, at 11:55 AM, Eric Schrock wrote:
[ ... ]


b) It is not uncommon for such successful reads of partially defective
media to happen only after several retries. It is somewhat unfortunate
that there is no simple way to tell the drive how many times to retry.
Firmware in ATA/SATA drives, used predominantly in single-disk PCs,
will typically make a heroic effort to retrieve the data: numerous
attempts to reposition the actuator, recalibrate the head current,
etc. This can take 20-40 seconds! Such a strategy is reasonable for a
desktop PC, but if it happens in a busy enterprise file server it
results in a temporary loss of availability (the drive freezes for up
to 20-40 seconds every time you try to read that sector). The strategy
also makes no sense if the RAID group in which the drive participates
has redundant data elsewhere, which is why SCSI/FC drives give up
after a few retries.

One can detect (and repair) such problematic areas on disk by monitoring
the SMART counters during scrubbing, and/or by monitoring physical
read timings (looking for abnormally slow ones).
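
For illustration, here is a rough sketch of that kind of check (not a
supported tool): it snapshots a few error-related SMART attributes before
and after a scrub and flags any disk whose counters grew. It assumes
smartmontools' smartctl is available, and the device paths are just
placeholders.

#!/usr/bin/env python
# Rough sketch only: snapshot a few error-related SMART attributes before
# and after a scrub, then flag disks whose counters grew.  Assumes
# smartmontools' smartctl is installed; the device paths are placeholders.
import subprocess

DISKS = ["/dev/rdsk/c0t0d0", "/dev/rdsk/c0t1d0"]        # hypothetical paths
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector",
           "Offline_Uncorrectable"}

def smart_counters(dev):
    """Return {attribute_name: raw_value} for the watched SMART attributes."""
    out = subprocess.run(["smartctl", "-A", dev],
                         capture_output=True, text=True).stdout
    counters = {}
    for line in out.splitlines():
        fields = line.split()
        # Attribute rows: ID# NAME FLAG VALUE WORST THRESH TYPE ... RAW_VALUE
        if len(fields) >= 10 and fields[1] in WATCHED:
            counters[fields[1]] = int(fields[9])
    return counters

before = {d: smart_counters(d) for d in DISKS}
# ... kick off "zpool scrub <pool>" here and wait for it to complete ...
after = {d: smart_counters(d) for d in DISKS}

for dev in DISKS:
    for name, new in after[dev].items():
        old = before[dev].get(name, 0)
        if new > old:
            print("%s: %s grew from %d to %d during the scrub"
                  % (dev, name, old, new))

The same kind of loop could also record per-read service times (from
iostat or a DTrace probe) to catch the abnormally-slow-read case.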

Solaris currently has a disk monitoring FMA module that is specific to
Thumper (x4500) and monitors only the most basic information (overtemp,
self-test fail, predictive failure).  I have separated this out into a
common FMA transport module which will bring this functionality to all
platforms (though support for SCSI devices will depend on the
aforementioned SCSI FMA portfolio).  This should be putback soon.
Future work could expand this beyond the simple indicators into more
detailed analysis of various counters.

All of this is really a common FMA problem, not ZFS-specific. All that is needed in ZFS is an agent actively responding to external diagnoses.
I am laying the groundwork for this as part of ongoing ZFS/FMA work
mentioned in other threads. For more information on ongoing FMA work, I
recommend visiting the FMA discussion forum.

- Eric

--
Eric Schrock, Solaris Kernel Development    http://blogs.sun.com/eschrock


I disagree.  Originally, I asked for the following:

- Objective performance reporting in a simple-to-parse format (similar to scrub)
- The ability to schedule non-data-intrusive disk tests to verify disk performance.
- The ability to compare two similar disks for performance.
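
To make the third item concrete, here is a rough sketch of the kind of
comparison I mean (not a proposal for the actual interface; device paths,
sample count and I/O sizes are all made up): issue identical read-only
probes to two disks and compare the latency distributions.

#!/usr/bin/env python
# Rough sketch only: issue the same pattern of read-only probes to two
# "similar" disks and compare the latency distributions.  Device paths,
# sample count, I/O size and span are made-up placeholders; this needs
# enough privilege to open the raw devices.
import os, random, statistics, time

DEV_A = "/dev/rdsk/c0t0d0"          # hypothetical device paths
DEV_B = "/dev/rdsk/c0t1d0"
SAMPLES = 200                        # number of probe reads per disk
IO_SIZE = 64 * 1024                  # bytes per read
SPAN = 100 * 1024 ** 3               # probe within the first ~100 GB

def read_latencies(dev, offsets):
    """Time an IO_SIZE read at each offset; return latencies in seconds."""
    fd = os.open(dev, os.O_RDONLY)
    latencies = []
    try:
        for off in offsets:
            start = time.time()
            os.lseek(fd, off, os.SEEK_SET)
            os.read(fd, IO_SIZE)
            latencies.append(time.time() - start)
    finally:
        os.close(fd)
    return latencies

# Use the same offsets on both disks so the comparison is apples to apples.
random.seed(0)
offsets = [random.randrange(0, SPAN, IO_SIZE) for _ in range(SAMPLES)]

for dev in (DEV_A, DEV_B):
    lat = read_latencies(dev, offsets)
    print("%s: mean %.1f ms, worst %.1f ms"
          % (dev, 1000 * statistics.mean(lat), 1000 * max(lat)))

Run periodically, something like this shows when one spindle starts
lagging its peers, which is exactly the degradation case I'm worried
about.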

In the above, you've taken proactive capabilities and turned them into failure mitigation, i.e., reactive capabilities.

From the paper, the problem isn't outright disk failure, but disk performance degradation. I asked for the above so that it is easy to determine whether a disk is performing similarly to its peers or is starting to degrade.

The need for ZFS to do this is two-fold:

1. ZFS can write to the disk non-intrusively. Any subsystem outside of the native filesystem will be able to execute read tests only, which is only part of the analysis.
2. If the command is available at the zfs (or pool) level, it becomes an easy method for diagnosis. When you must 'roll your own' via script or DTrace, the objectivity goes away and comparisons between systems become increasingly difficult.

My concern with moving this exclusively into FMA has to do with focus. I've found that most fault mitigation systems concentrate on just that: faults. Performance degradation isn't treated as a fault, and it usually falls out of any fault management system as a "we'd like to do that, but we've got bigger things to do."

-----
Gregory Shaw, IT Architect
IT CTO Group, Sun Microsystems Inc.
Phone: (303)-272-8817 (x78817)
500 Eldorado Blvd, UBRM02-157     [EMAIL PROTECTED] (work)
Broomfield, CO 80021                   [EMAIL PROTECTED] (home)
"When Microsoft writes an application for Linux, I've won." - Linus Torvalds



