On 01/12/11 19:32, Chris Forgeron wrote:
Interesting, I was just testing Solaris 11 Express's ability to handle a pulled
drive today. It handles it quite well. However, my Areca 1880 driver (arcmsr0)
crashes when you reinsert the drive, but that's another topic, and an issue
for Areca tech support.

Back to the point:

Solaris runs a separate process called the Fault Management Daemon (fmd) that
appears to handle this logic. This means the handling is not really inside the
ZFS code, and FreeBSD would need something similar, hopefully less kludgy
than a user script.

I wonder if anyone has been eyeing the CDDL-licensed FMA code with a thought
for porting it. It looks to be a really neat bit of code. I'm still quite new
to it, having only worked with Solaris for the last few months.

Here are two links with a bit of info on the Solaris daemon:

http://www.princeton.edu/~unix/Solaris/troubleshoot/fm.html
http://hub.opensolaris.org/bin/view/Community+Group+fm/


Here's my log of the event in Solaris 11 Express:

Jan 12 21:28:47 solaris fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
Jan 12 21:28:47 solaris EVENT-TIME: Wed Jan 12 21:28:47 UTC 2011
Jan 12 21:28:47 solaris PLATFORM: PowerEdge-T710, CSN: 39SLQN1, HOSTNAME: solaris
Jan 12 21:28:47 solaris SOURCE: zfs-diagnosis, REV: 1.0
Jan 12 21:28:47 solaris EVENT-ID: ccfa7a23-838b-ebc8-decf-c2607afb390d
Jan 12 21:28:47 solaris DESC: The number of I/O errors associated with a ZFS device exceeded
Jan 12 21:28:47 solaris              acceptable levels.  Refer to http://sun.com/msg/ZFS-8000-FD for more information.
Jan 12 21:28:47 solaris AUTO-RESPONSE: The device has been offlined and marked as faulted.  An attempt
Jan 12 21:28:47 solaris              will be made to activate a hot spare if available.
Jan 12 21:28:47 solaris IMPACT: Fault tolerance of the pool may be compromised.
Jan 12 21:28:47 solaris REC-ACTION: Run 'zpool status -x' and replace the bad device.
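
The fmd message above is a run of "LABEL: value" fields spread over several
syslog lines, so a monitoring script could pull out the severity or the
recommended action with very little code. The following is only a minimal
sketch in Python: the function name parse_fmd_message and both regular
expressions are my own assumptions based on this single log sample, not part of
fmd or any Solaris tool.

```python
import re

# Syslog prefix "Mon DD HH:MM:SS host ", optionally followed by "fmd: [ID ...] ".
# Assumption: derived from the one log sample in this thread.
PREFIX = re.compile(r"^\w{3}\s+\d+ \d\d:\d\d:\d\d \S+ (?:fmd: \[[^\]]*\] )?")
# Field labels such as "SEVERITY: " or "AUTO-RESPONSE: " (uppercase and hyphens).
LABEL = re.compile(r"\b([A-Z][A-Z-]+): ")


def parse_fmd_message(lines):
    """Return a dict of the LABEL: value fields in one fmd syslog message."""
    # Drop the syslog prefix from each line and join the pieces, so values
    # that continue on a following line are stitched back together.
    body = " ".join(PREFIX.sub("", line).strip() for line in lines)
    # With a capturing group, split() yields [lead, label1, value1, label2, ...].
    parts = LABEL.split(body)
    fields = {}
    for label, value in zip(parts[1::2], parts[2::2]):
        # Collapse runs of whitespace and drop the comma that separates fields.
        fields[label] = " ".join(value.split()).rstrip(",")
    return fields


# Two lines taken from the log above.
sample = [
    "Jan 12 21:28:47 solaris fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major",
    "Jan 12 21:28:47 solaris REC-ACTION: Run 'zpool status -x' and replace the bad device.",
]
fields = parse_fmd_message(sample)
print(fields["SEVERITY"], "-", fields["REC-ACTION"])
```

Lines that carry no syslog prefix are passed through unchanged, so the sketch
should also cope with a copy of the log that a mail client has hard-wrapped.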

After a cursory glance at their fault-management infrastructure, I noticed that it also deals with other kinds of faults, such as CPU and memory problems, which might make a port painful or impractical. Would the people with custom hot-spare scripts, or nothing automated at all, be content if the sysutils/geomWatch program grew support for hot spares in a future version? I already became somewhat familiar with the userland ZFS API when I added ZFS support to it.
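
To make the hot-spare idea concrete, here is a rough sketch in Python of the decision such a feature would have to make. It is not geomWatch code and not how fmd does it: the function name, the pool name "tank" and the device names are all hypothetical, and it only builds `zpool replace` command lines rather than running them.

```python
# ZFS vdev states treated as "needs a spare" in this sketch. OFFLINE is left
# out on purpose, since that state is normally set by the administrator.
FAILED_STATES = ("FAULTED", "UNAVAIL", "REMOVED")


def plan_spare_replacements(pool, device_states, spares):
    """Return one 'zpool replace' command per failed device, while spares last.

    device_states maps a device name to its vdev state; spares lists unused
    spare devices. How those are gathered is outside this sketch.
    """
    available = list(spares)  # copy, so the caller's list is left untouched
    commands = []
    for device, state in device_states.items():
        if state in FAILED_STATES and available:
            spare = available.pop(0)
            # zpool replace <pool> <old-device> <new-device>
            commands.append(["zpool", "replace", pool, device, spare])
    return commands


# Hypothetical pool with one faulted disk and one free spare.
states = {"da0": "ONLINE", "da1": "FAULTED", "da2": "ONLINE"}
for cmd in plan_spare_replacements("tank", states, ["da8"]):
    print(" ".join(cmd))
```

Keeping the decision separate from the code that gathers device states and runs commands means it can be tested without touching a real pool.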

-Boris
