On Mon, 28 Jul 2008, Richard Elling wrote:
>
> But ZFS can do better.  I filed CR6674679 which basically says
> that if redundant copies of data have the same, wrong checksum,
> then ZFS should issue an e-report to that effect.  This will allow
> you to move suspicion away from the disks as a root cause towards
> a common cause, like memory, shared HBA or bus, etc. It won't
> be able to recover the data, but it can help debug the system.
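
The check proposed in the quoted CR could be sketched roughly as below.
This is only an illustration, not ZFS code: the names checksum() and
classify_copies() are hypothetical, and SHA-256 from Python's hashlib
stands in for whatever checksum the pool actually uses.

```python
import hashlib
from typing import List


def checksum(data: bytes) -> bytes:
    # Stand-in checksum (assumption): SHA-256 via hashlib.
    return hashlib.sha256(data).digest()


def classify_copies(expected: bytes, copies: List[bytes]) -> str:
    """Classify redundant copies of one block against its expected checksum."""
    sums = [checksum(c) for c in copies]
    if any(s == expected for s in sums):
        # At least one copy is good, so the block can be repaired from it.
        return "recoverable"
    if len(copies) > 1 and len(set(sums)) == 1:
        # Every copy is wrong in the same way: suspect memory, HBA or bus
        # rather than the individual disks.
        return "common-cause"
    # The copies are wrong in different ways.
    return "independent-damage"
```

A "common-cause" result is the case that would warrant the e-report: the
data is still lost, but suspicion moves away from the disks.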

A rather obvious thing to do is to run a low-priority task that 
validates the checksums of data cached in the ZFS ARC.  That way, 
cached content that has somehow been altered (by a memory glitch or a 
kernel bug) is detected, so someone can fix the problem.  Even ECC 
memory does not help when an adapter card writes to the wrong location 
or a device driver does something wrong.
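
As a rough sketch of the idea (not how the ARC is actually laid out), the
scrub amounts to re-hashing each cached buffer and comparing against the
checksum recorded when it was cached.  CachedBlock and scrub_cache() are
hypothetical names, SHA-256 is a stand-in checksum, and the low-priority
scheduling is not modelled here.

```python
import hashlib


class CachedBlock:
    """A cached buffer plus the checksum recorded when it was cached."""

    def __init__(self, data: bytes):
        self.data = bytearray(data)
        # Stand-in checksum (assumption): SHA-256 via hashlib.
        self.checksum = hashlib.sha256(data).digest()


def scrub_cache(cache: dict) -> list:
    """Return the keys of blocks whose contents no longer match their checksum."""
    damaged = []
    for key, block in cache.items():
        if hashlib.sha256(bytes(block.data)).digest() != block.checksum:
            damaged.append(key)
    return damaged
```

A single flipped bit in a cached buffer (for example,
cache["b"].data[0] ^= 0x01) makes scrub_cache() report that key, which is
the kind of silent in-memory damage ECC alone would not catch if the bad
write came from a device or driver.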

Bob
======================================
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
