> can you guess? <billtodd <at> metrocast.net> writes:
> > 
> > You really ought to read a post before responding to it:  the CERN study
> > did encounter bad RAM (and my post mentioned that) - but ZFS usually can't
> > do a damn thing about bad RAM, because errors tend to arise either
> > before ZFS ever gets the data or after it has already returned and checked
> > it (and in both cases, ZFS will think that everything's just fine).
> 
> According to the memtest86 author, corruption most often occurs at the moment
> memory cells are written to, as the write causes bit flips in adjacent cells.
> So when a disk DMAs data to RAM, and corruption occurs as the DMA operation
> writes to the memory cells, and ZFS then verifies the checksum, it will
> detect the corruption.
> 
> Therefore ZFS is perfectly capable of detecting (and even likely to detect)
> memory corruption during simple read operations from a ZFS pool.
> 
> Of course there are other cases where neither ZFS nor any other checksumming
> filesystem is capable of detecting anything (e.g. the sequence of events: data
> is corrupted, checksummed, written to disk).

Indeed - the latter was the first of the two scenarios that I sketched out.  
But at least on the read end of things ZFS should have a good chance of 
catching errors due to marginal RAM.

That must mean that most of the worrisome alpha-particle problems of yore have 
finally been put to rest (since they'd be similarly likely to trash data on the 
read side after ZFS had verified it).  I think I remember reading that 
somewhere at some point, but I'd never gotten around to reading that far in the 
admirably-detailed documentation that accompanies memtest:  thanks for 
enlightening me.
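
For anyone who wants the three cases in this thread spelled out, here is a
minimal sketch, not ZFS code. SHA-256 stands in for whatever block checksum
the pool uses, and checksum(), flip_bit() and verify() are hypothetical helper
names invented for the illustration. It only models the ordering of "bit flip"
versus "checksum computed/verified", which is all the argument depends on.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    # Stand-in for a block checksum; SHA-256 is an assumption for illustration.
    return hashlib.sha256(data).digest()


def flip_bit(data: bytes, bit: int) -> bytes:
    # Model a single-bit RAM error at the given bit offset.
    buf = bytearray(data)
    buf[bit // 8] ^= 1 << (bit % 8)
    return bytes(buf)


def verify(buf: bytes, stored_sum: bytes) -> bool:
    # True when the buffer matches the checksum stored with the block.
    return checksum(buf) == stored_sum


good = b"payload block contents"

# Case 1: the bit flips while the DMA transfer lands in RAM, before verification.
on_disk, stored = good, checksum(good)
in_ram = flip_bit(on_disk, 5)
caught_on_read = not verify(in_ram, stored)

# Case 2: the bit flips before the checksum is computed on the write path.
bad = flip_bit(good, 5)
on_disk, stored = bad, checksum(bad)
caught_on_write_path = not verify(on_disk, stored)

# Case 3: the bit flips after verification has already passed on the read path.
on_disk, stored = good, checksum(good)
passed = verify(on_disk, stored)
handed_to_app = flip_bit(on_disk, 5)
caught_after_verify = not passed  # the later flip is never rechecked
silently_wrong = handed_to_app != good

print(caught_on_read, caught_on_write_path, caught_after_verify, silently_wrong)
# prints: True False False True
```

Only the first case is caught; in the other two the checksum agrees with
whatever was in memory at the time it was computed or checked.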

- bill
 
 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
