geom(4)/gmirror(4) automatic device DEGRADED status demotion (WAS:Re: gmirror HD failure detection)

Brian A. Seklecki Wed, 14 Feb 2007 12:58:29 -0800

On Wed, 14 Feb 2007, Brian A. Seklecki wrote:

All:
For a while our strategy was to use NRPE2 + a custom nagios check (check_raid_fbsdgmirror -- ugly-as-hell Perl, but which I can make available to the public).
However, this morning a drive in a Dell PE1850 (one without a PERC4 controller) started erroring. It has just a regular old (bad) mpt(4) controller.
The problem is that gmirror(4) never marked the drive as failed.
I'd have to tear through the code to find where the logic is for automatic demotion of a failed mirror.
Either way, the original thinking behind the Nagios plugin check was that gmirror(4) would have some threshold of failed read/write attempts on a provider disk, beyond which it would flag the provider as "DEGRADED".
It's entirely possible that we never had a chance to test it.
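
For reference, the idea behind that kind of check can be sketched roughly as below. This is not the Perl plugin mentioned above, just a minimal Python illustration. The function name nagios_result and the choice to treat anything other than COMPLETE as CRITICAL are assumptions made for the example; the exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
# Nagios plugin exit codes (standard plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def nagios_result(mirror_states):
    """Map {mirror name: state word} to (exit code, status line).

    Assumption: the caller has already extracted one state word per
    mirror (e.g. "COMPLETE" or "DEGRADED") from gmirror's output.
    """
    if not mirror_states:
        return UNKNOWN, "GMIRROR UNKNOWN - no mirrors found"
    # Assumption for this sketch: any state other than COMPLETE is CRITICAL.
    bad = {name: state for name, state in mirror_states.items()
           if state != "COMPLETE"}
    if bad:
        detail = ", ".join(f"{name}={state}" for name, state in sorted(bad.items()))
        return CRITICAL, f"GMIRROR CRITICAL - {detail}"
    return OK, f"GMIRROR OK - {len(mirror_states)} mirror(s) COMPLETE"
```

Note that a check like this can only fire once gmirror(4) itself changes the reported state, which is exactly what did not happen here.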

Now I have to go back and re-visit all of that.

~BAS

On Wed, 20 Sep 2006, Alex Zbyslaw wrote:
Robin Becker wrote:
After using Dru Lavigne's excellent article http://tinyurl.com/da66a about Raid-1 I have a full Raid-1 mirror on a new rack server. I'm wondering if anyone can tell me how best to monitor the hardware status to detect imminent failure of one of the disks? Do I use something like smartctl in a cron or what?
Assuming that the disks support SMART then just read the man page for smartd. No need for cron. You can also schedule "short" and "long" tests to run in off hours. smartmontools is easy to uninstall if it doesn't work for you. However, this will tell you that a disk is failing (or failed) which is not quite the same as array status. An array (theoretically) might be sub-optimal for non-SMART reasons. Someone familiar with gmirror will have to answer that bit... but gmirror status -s looks from the man page like it might be interesting and *that* could be run from cron and parsed to weed out "status OK results".
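
A minimal sketch of that "run from cron and parse" idea, in Python. It assumes each line of the gmirror status -s output has the form "<mirror> <status> <component> (<component state>)"; check that against real output on your system before relying on it. The function name non_ok_lines is made up for the example.

```python
def non_ok_lines(status_text):
    """Return the lines of `gmirror status -s` style output that are not OK.

    Assumed line layout (verify on your system):
        <mirror name> <mirror status> <component> (<component state>)
    A line counts as OK only if the mirror status is COMPLETE and the
    component state, when present, is ACTIVE.
    """
    problems = []
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or unrecognised line
        mirror_status = fields[1]
        # Everything after the component name, minus the parentheses,
        # e.g. "ACTIVE" or "SYNCHRONIZING, 45%".
        component_state = " ".join(fields[3:]).strip("()") if len(fields) >= 4 else None
        if mirror_status != "COMPLETE" or component_state not in (None, "ACTIVE"):
            problems.append(" ".join(fields))
    return problems
```

Wired to read the command's output and print whatever this returns, a cron job would stay silent while everything is OK, and cron would mail any lines that are printed.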

_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
