Hi Russell,
I would assume that the resilvering is related to the checksum errors. From
the zpool(8) manpage:
Scrubbing and resilvering are very similar operations. The difference
is that resilvering only examines data that ZFS knows to be out of
date (for example, when attaching a new device to a mirror or
replacing an existing device), whereas scrubbing examines all data to
discover silent errors due to hardware faults or disk failure.
As for the messages: FreeBSD has a sysctl, vfs.zfs.debug. My Google
'research' (e.g.
http://askubuntu.com/questions/228386/how-do-you-apply-performance-tuning-settings-for-native-zfs)
indicates that the ZFS tunables were carried over to Linux, where they are
set as kernel module parameters rather than sysctls, so you may be able to
use an equivalent setting under Linux too.
BTW: There is a Nagios/Icinga check_zfs plugin.
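
In case it helps with your own monitoring module, here is a rough sketch of what such a check boils down to: scan the "zpool status" text for device rows with a non-zero READ, WRITE or CKSUM counter. This is not the code of check_zfs or of your mon module. The function name, the list of states and the sample text are my own assumptions, and it only handles plain integer counters.

```python
# Device states assumed to appear in the STATE column of "zpool status".
STATES = {"ONLINE", "DEGRADED", "FAULTED", "OFFLINE", "UNAVAIL", "REMOVED"}


def parse_error_counters(status_text):
    """Return {device: (read, write, cksum)} for rows with a non-zero counter.

    Hypothetical helper for illustration only.
    """
    bad = {}
    for line in status_text.splitlines():
        fields = line.split()
        # A config row looks like: NAME STATE READ WRITE CKSUM
        if len(fields) < 5 or fields[1] not in STATES:
            continue
        try:
            counts = tuple(int(f) for f in fields[2:5])
        except ValueError:
            # Abbreviated counters (e.g. "1.2K") are not handled in this sketch.
            continue
        if any(counts):
            bad[fields[0]] = counts
    return bad


# Sample modelled on the output quoted below (shortened).
SAMPLE = """\
        NAME          STATE     READ WRITE CKSUM
        server        ONLINE       0     0     0
          raidz1-0    ONLINE       0     0     0
            sdq       ONLINE       0     0     0
            sdr       ONLINE       0     0   121
"""

print(parse_error_counters(SAMPLE))
```

On a real system you would feed it the output of "zpool status" (for example via subprocess.run) and raise a warning whenever the returned dict is not empty.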
I did not know about "mon" before... How does it compare to Nagios/Icinga?
Regards
Peter
On Thu, Sep 22, 2016 at 10:54 PM, Russell Coker via luv-main <
luv-main@luv.asn.au> wrote:
> Below is part of the output of "zpool status". It seems that sdr is
> defective, it has a steadily increasing number of checksum errors.
>
> Would the "resilvered 763M" part be about the 121 checksum errors? If so does
> that mean each checksum error required resilvering on average 6M of data?
>
> The kernel message log has NOTHING about this. I'm used to Ext* and BTRFS
> which give kernel message log entries about filesystem errors. Can ZFS be
> configured to give similar logging?
>
> As an aside I've written a mon module for monitoring for such ZFS errors.
> I'll release it sometime soon. But I'd be happy to give a version that's
> quite usable although not ready for full release to anyone who wants it.
>
> status: One or more devices has experienced an unrecoverable error. An
> attempt was made to correct the error. Applications are
> unaffected.
> action: Determine if the device needs to be replaced, and clear the errors
> using 'zpool clear' or replace the device with 'zpool replace'.
> see: http://zfsonlinux.org/msg/ZFS-8000-9P
> scan: resilvered 763M in 0h0m with 0 errors on Thu Aug 18 14:48:53 2016
> config:
>
>         NAME          STATE     READ WRITE CKSUM
>         server        ONLINE       0     0     0
>           raidz1-0    ONLINE       0     0     0
>             sdj       ONLINE       0     0     0
>             sdk       ONLINE       0     0     0
>             sdl       ONLINE       0     0     0
>             sdm       ONLINE       0     0     0
>             sdn       ONLINE       0     0     0
>             sdo       ONLINE       0     0     0
>             sdp       ONLINE       0     0     0
>             sdq       ONLINE       0     0     0
>             sdr       ONLINE       0     0   121
>
> --
> My Main Blog http://etbe.coker.com.au/
> My Documents Blog http://doc.coker.com.au/
>
> ___
> luv-main mailing list
> luv-main@luv.asn.au
> https://lists.luv.asn.au/cgi-bin/mailman/listinfo/luv-main
>