> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Neil Perrin
> 
> This is a consequence of the performance-oriented design of the ZIL code.
> Intent log blocks are dynamically allocated and chained together.
> When reading the intent log we read each block and verify it against the
> checksum embedded within that same block. If we can't read a block due to
> an IO error then that is reported, but if the checksum does not match
> then we assume it's the end of the intent log chain.
> With this design, the minimum number of writes needed to add an intent
> log record is just one.
> 
> So corruption of an intent log is not going to generate any errors.

I didn't know that.  Very interesting.  This raises another question ...

It's commonly stated that, even with log device removal supported, the most
common failure mode for an SSD is to blindly accept writes without reporting
any errors, with the failure only discovered upon read.  So ... if an SSD is
in this failure mode, you won't detect it?  At bootup, the checksum will
simply mismatch, and we'll chug along, having lost the data ... (nothing can
prevent that) ... but we won't know that we've lost data?
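
To make sure I understand: if the replay loop works the way Neil describes,
then the silent-corruption case and a normal end of chain look identical.
In rough pseudo-C, with invented names (this is not the actual zil.c code,
just how I'm picturing it):

    #include <stdint.h>
    #include <errno.h>

    struct log_blk {
        uint64_t next_blkptr;   /* location of the next log block in the chain */
        uint64_t cksum;         /* checksum embedded in the block itself */
        /* ... intent log records ... */
    };

    /* Hypothetical helpers, purely for illustration. */
    extern int read_block(uint64_t blkptr, struct log_blk *blk);
    extern uint64_t compute_cksum(const struct log_blk *blk);  /* over payload */
    extern void replay_records(const struct log_blk *blk);

    int
    replay_chain(uint64_t blkptr)
    {
        struct log_blk blk;

        while (blkptr != 0) {
            if (read_block(blkptr, &blk) != 0)
                return (EIO);    /* hard I/O error: reported */

            if (compute_cksum(&blk) != blk.cksum)
                return (0);      /* mismatch: assumed end of chain, no error */

            replay_records(&blk);
            blkptr = blk.next_blkptr;
        }
        return (0);              /* clean end of chain: also success */
    }

If that's right, every readable-but-corrupt block comes back as "success",
which is exactly what worries me.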

Worse yet ... in preparation for the above SSD failure mode, it's commonly
recommended to still mirror your log device, even if you have log device
removal.  If you have a mirror, and the data on the two halves of the mirror
doesn't match (one device failed silently, and the other device is good) ...
do you read the data from *both* sides of the mirror, in order to discover
the corrupted log device and correctly move forward without data loss?
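
What I'm hoping happens is something like this, again with invented names
(read_child() is made up, and it reuses the struct and compute_cksum() from
the sketch above), purely speculative on my part:

    /* Speculative: try every mirror child and keep the first copy whose
     * embedded checksum verifies, rather than trusting whichever child
     * answers first. */
    extern int read_child(int child, uint64_t blkptr, struct log_blk *blk);

    int
    read_log_block_mirrored(uint64_t blkptr, int nchildren, struct log_blk *out)
    {
        struct log_blk copy;
        int c;

        for (c = 0; c < nchildren; c++) {
            if (read_child(c, blkptr, &copy) != 0)
                continue;        /* this child reported an I/O error */
            if (compute_cksum(&copy) != copy.cksum)
                continue;        /* this child returned silent garbage */
            *out = copy;         /* found a self-consistent copy */
            return (0);
        }
        return (ENOENT);         /* no child had a valid copy */
    }

If instead the code only reads one side and stops at the first checksum
mismatch, the good half of the mirror never gets a chance to save the day.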
