On 10/08/2017 15:01, Alan Somers wrote:
Really interesting answer, Alan, thank you very much!
Slightly off-topic, but I'll take this opportunity to ask:
how do you check SAS drive health?
I personally cron a long background self-test every 2 weeks (using smartmontools).
I have not experienced a SAS drive error yet, so I'm not sure how this behaves.
Does the drive report to FreeBSD when its read or write error rate crosses
a threshold (so that we can replace it before it fails)?
Or perhaps smartd will do?
As an example, below is a SAS error counter log returned by smartctl:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0       49         0        49     233662      73743.588           0
write:         0        3         0         3      83996       9118.895           0
verify:        0        0         0         0      28712          0.000           0
Thank you!
Ben
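
For anyone who wants to track these counters over time rather than read them by eye, here is a minimal Python sketch that parses the read/write/verify rows of an error counter log like the one quoted above. The column keys and function names are this sketch's own (hypothetical) naming, not anything smartctl defines, and it assumes each data row has exactly seven numeric fields. Reporting rows with a non-zero "total uncorrected errors" count is just one possible policy, not a recommended threshold.

```python
from typing import Dict, List

# Keys follow the order of the columns in the log header; the names
# themselves are invented for this sketch.
COLUMNS = [
    "ecc_fast",
    "ecc_delayed",
    "rereads_rewrites",
    "total_corrected",
    "algorithm_invocations",
    "gigabytes_processed",
    "total_uncorrected",
]

ROW_LABELS = ("read", "write", "verify")


def parse_error_counters(text: str) -> Dict[str, Dict[str, float]]:
    """Return {row label: {column key: value}} for the data rows in text."""
    rows: Dict[str, Dict[str, float]] = {}
    for line in text.splitlines():
        parts = line.split()
        # A data row is a label followed by one value per column.
        if len(parts) != len(COLUMNS) + 1:
            continue
        label = parts[0].rstrip(":")
        if label not in ROW_LABELS:
            continue
        try:
            values = [float(p) for p in parts[1:]]
        except ValueError:
            continue  # not a numeric row, e.g. part of the header
        rows[label] = dict(zip(COLUMNS, values))
    return rows


def rows_with_uncorrected_errors(rows: Dict[str, Dict[str, float]]) -> List[str]:
    """List the row labels whose uncorrected error count is non-zero."""
    return [name for name, c in rows.items() if c["total_uncorrected"] > 0]


# The three data rows from the log quoted in this message.
SAMPLE = """\
read: 0 49 0 49 233662 73743.588 0
write: 0 3 0 3 83996 9118.895 0
verify: 0 0 0 0 28712 0.000 0
"""
```

Calling `rows_with_uncorrected_errors(parse_error_counters(SAMPLE))` gives an empty list for this drive, since all three rows show zero uncorrected errors.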
smartmontools is probably the best way to read SAS error logs.
Interpreting them can be hard, though. The Backblaze blog is probably
the best place to get current advice. But the easiest thing to do is
certainly to wait until something fails hard. With ZFS, you can have
up to 3 drives' worth of redundancy, and hotspares too.
I concur with Alan. Trying to predict drive failure is a mug's game.
Very thorough research (e.g. Google, 2007) has shown it's a waste of time
trying.
With ZFS (or geom mirror) a drive will be "failed" as soon as there's a
problem and you can get notification using a cron job that sends an
email if the output of zpool status (or gmirror status) contains
"DEGRADED".
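
As a rough illustration of that cron check, here is a minimal Python sketch. The function names are hypothetical, and it assumes the status command is on the PATH and that cron is configured to mail any output a job produces (its usual behaviour), so the script only needs to print when something is wrong.

```python
import subprocess
from typing import List


def degraded_lines(status_output: str) -> List[str]:
    """Return the lines of a status report that contain "DEGRADED"."""
    return [
        line.strip()
        for line in status_output.splitlines()
        if "DEGRADED" in line
    ]


def check(command: List[str]) -> List[str]:
    """Run a status command and return any lines mentioning DEGRADED."""
    result = subprocess.run(
        command, capture_output=True, text=True, check=False
    )
    return degraded_lines(result.stdout)


def report(command: List[str]) -> None:
    """Print a short report only when something is degraded.

    Staying silent in the healthy case means cron has nothing to mail.
    """
    hits = check(command)
    if hits:
        print("Degraded storage reported by: " + " ".join(command))
        for line in hits:
            print("  " + line)
```

A script run from cron could then call `report(["zpool", "status"])` or `report(["gmirror", "status"])`, depending on which is in use.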
That said, I've found it useful to use smartctl to pick up when a drive
is overheating, usually due to fan failure. You might also find the new
(11.0+?) sesutil handy to monitor components on a SAS expander IF YOU
HAVE ONE. Things like fans and temperature sensors are readable this way.
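
A minimal sketch of the overheating check, in Python. It assumes that smartctl's attribute output for a SAS drive contains lines of the form "Current Drive Temperature: NN C" and "Drive Trip Temperature: NN C"; check what your drives actually print before relying on it. The function names and the 10-degree margin are hypothetical choices for illustration.

```python
import re
from typing import Optional, Tuple

# Assumed line formats; adjust to match your smartctl output.
_CURRENT = re.compile(r"Current Drive Temperature:\s+(\d+)\s*C")
_TRIP = re.compile(r"Drive Trip Temperature:\s+(\d+)\s*C")


def drive_temperatures(smartctl_output: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (current, trip) temperatures in Celsius, or None if absent."""
    current = _CURRENT.search(smartctl_output)
    trip = _TRIP.search(smartctl_output)
    return (
        int(current.group(1)) if current else None,
        int(trip.group(1)) if trip else None,
    )


def is_overheating(smartctl_output: str, margin_c: int = 10) -> bool:
    """True when the drive is within margin_c degrees of its trip point.

    Returns False when either temperature is missing, so an unparsed
    report never raises a false alarm.
    """
    current, trip = drive_temperatures(smartctl_output)
    if current is None or trip is None:
        return False
    return current >= trip - margin_c
```

The text passed in would be the captured output of smartctl for one drive; printing a message when `is_overheating` returns True lets cron mail it in the same way as the DEGRADED check.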
Regards, Frank.
_______________________________________________
freebsd-hardware mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hardware