Hi Russell,

I would assume that the resilvering is related to the checksum errors. From
the zpool(8) manpage:

Scrubbing and resilvering are very similar operations. The difference
is that resilvering only examines data that ZFS knows to be out of
date (for example, when attaching a new device to a mirror or
replacing an existing device), whereas scrubbing examines all data to
discover silent errors due to hardware faults or disk failure.


For the messages: FreeBSD has a sysctl, vfs.zfs.debug. My Google 'research'
(e.g.
http://askubuntu.com/questions/228386/how-do-you-apply-performance-tuning-settings-for-native-zfs)
indicates that this tunable approach was ported to Linux, so you may be able
to use it under Linux too.
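
To see which tunables a given Linux box actually exposes, something like the
following minimal sketch may help. It assumes the ZFS module publishes its
parameters under the usual sysfs location, /sys/module/zfs/parameters. I have
not checked whether a direct equivalent of vfs.zfs.debug is among them, and
the function name read_module_params is made up for this example.

```python
import os

# Hypothetical helper: reads whatever parameters a loaded kernel module
# exposes via sysfs. The default path is an assumption about the system.
def read_module_params(module="zfs", base="/sys/module"):
    """Return {parameter name: current value} for a loaded kernel module."""
    param_dir = os.path.join(base, module, "parameters")
    params = {}
    if not os.path.isdir(param_dir):
        return params  # module not loaded, or it exposes no parameters
    for name in sorted(os.listdir(param_dir)):
        try:
            with open(os.path.join(param_dir, name)) as fh:
                params[name] = fh.read().strip()
        except OSError:
            continue  # some parameters are not readable by every user
    return params
```

Filtering the returned names for "debug" or "dbg" would be one way to look
for a logging-related knob.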

BTW: There is a Nagios/Icinga check_zfs plugin.
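
For illustration only, the core of such a check can be quite small. The sketch
below is not taken from check_zfs or from your mon module. It just scans
"zpool status" text for config rows with a nonzero READ, WRITE or CKSUM
counter. The name devices_with_errors is hypothetical, and it assumes the
counters are printed as plain integers, so abbreviated values would be
skipped.

```python
import re

# Matches config rows such as "    sdr   ONLINE   0   0   121".
# The header row (NAME STATE READ WRITE CKSUM) does not match because
# its last three columns are not digits.
ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")

def devices_with_errors(status_text):
    """Return (name, read, write, cksum) for rows with a nonzero counter."""
    bad = []
    for line in status_text.splitlines():
        m = ROW.match(line)
        if not m:
            continue
        name, _state, read, write, cksum = m.groups()
        counts = (int(read), int(write), int(cksum))
        if any(counts):
            bad.append((name,) + counts)
    return bad
```

Fed the config section from your mail (without the "> " quoting), this should
report only sdr. In a real check the text would come from running
"zpool status", e.g. via subprocess.run(["zpool", "status"],
capture_output=True, text=True).stdout.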

I did not know about "mon" before... How does it compare to Nagios/Icinga?

Regards
Peter


On Thu, Sep 22, 2016 at 10:54 PM, Russell Coker via luv-main <
luv-main@luv.asn.au> wrote:

> Below is part of the output of "zpool status".  It seems that sdr is
> defective, it has a steadily increasing number of checksum errors.
>
> Would the "resilvered 763M" part be about the 121 checksum errors?  If so
> does that mean each checksum error required resilvering on average 6M of
> data?
>
> The kernel message log has NOTHING about this.  I'm used to Ext* and BTRFS
> which give kernel message log entries about filesystem errors.  Can ZFS be
> configured to give similar logging?
>
> As an aside I've written a mon module for monitoring for such ZFS errors.
> I'll release it sometime soon.  But I'd be happy to give a version that's
> quite usable although not ready for full release to anyone who wants it.
>
> status: One or more devices has experienced an unrecoverable error.  An
>         attempt was made to correct the error.  Applications are
>         unaffected.
> action: Determine if the device needs to be replaced, and clear the errors
>         using 'zpool clear' or replace the device with 'zpool replace'.
>    see: http://zfsonlinux.org/msg/ZFS-8000-9P
>   scan: resilvered 763M in 0h0m with 0 errors on Thu Aug 18 14:48:53 2016
> config:
>
>         NAME           STATE     READ WRITE CKSUM
>         server         ONLINE       0     0     0
>           raidz1-0     ONLINE       0     0     0
>             sdj        ONLINE       0     0     0
>             sdk        ONLINE       0     0     0
>             sdl        ONLINE       0     0     0
>             sdm        ONLINE       0     0     0
>             sdn        ONLINE       0     0     0
>             sdo        ONLINE       0     0     0
>             sdp        ONLINE       0     0     0
>             sdq        ONLINE       0     0     0
>             sdr        ONLINE       0     0   121
>
> --
> My Main Blog         http://etbe.coker.com.au/
> My Documents Blog    http://doc.coker.com.au/
>
> _______________________________________________
> luv-main mailing list
> luv-main@luv.asn.au
> https://lists.luv.asn.au/cgi-bin/mailman/listinfo/luv-main
>