On 08/25/10 20:33, Edward Ned Harvey wrote:
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Neil Perrin

This is a consequence of the design for performance of the ZIL code.
Intent log blocks are dynamically allocated and chained together.
When reading the intent log we read each block and checksum it
with the embedded checksum within the same block. If we can't read
a block due to an IO error then that is reported, but if the checksum
does not match then we assume it's the end of the intent log chain.
Using this design means we use the minimum number of writes.

So corruption of an intent log is not going to generate any errors.
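
To make the behaviour described above concrete, here is a minimal sketch of a chained log reader. It is a toy model, not the actual ZIL code: the block layout, the use of SHA-256, the dict standing in for a device, and the names `make_block` and `read_chain` are all assumptions made for illustration. The tail of the chain is modelled as an allocated but never written block, so its embedded checksum cannot match.

```python
import hashlib
import struct

CHECKSUM_LEN = 32  # SHA-256 digest size; illustrative, not the ZIL's real checksum
HEADER_LEN = 8     # hypothetical "next block id" field


def make_block(payload: bytes, next_id: int) -> bytes:
    """Build a block: next pointer + payload, followed by an embedded checksum."""
    body = struct.pack("<q", next_id) + payload
    return body + hashlib.sha256(body).digest()


def read_chain(device: dict, head_id: int) -> list:
    """Walk the chain from head_id and return the payloads that verified.

    A missing block stands in for an I/O error and is reported by raising.
    A checksum mismatch is treated as the end of the chain, with no error.
    """
    payloads = []
    block_id = head_id
    while True:
        raw = device.get(block_id)
        if raw is None:
            raise OSError(f"I/O error reading log block {block_id}")
        if len(raw) < HEADER_LEN + CHECKSUM_LEN:
            return payloads  # too short to verify: same as a mismatch
        body, stored = raw[:-CHECKSUM_LEN], raw[-CHECKSUM_LEN:]
        if hashlib.sha256(body).digest() != stored:
            return payloads  # assumed to be the end of the chain
        payloads.append(body[HEADER_LEN:])
        block_id = struct.unpack("<q", body[:HEADER_LEN])[0]
```

In this model a block corrupted in the middle of the chain is indistinguishable from the unwritten tail: `read_chain` simply returns fewer records and raises nothing.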

I didn't know that.  Very interesting.  This raises another question ...

It's commonly stated that, even with log device removal supported, the most
common failure mode for an SSD is to blindly write without reporting any
errors, and only detect that the device has failed upon read.  So ... if an
SSD is in this failure mode, you won't detect it?  At bootup, the checksum
will simply mismatch, and we'll chug along forward, having lost the data ...
(nothing can prevent that) ... but we don't know that we've lost data?

- Indeed, we wouldn't know we lost data.

Worse yet ... In preparation for the above SSD failure mode, it's commonly
recommended to still mirror your log device, even if you have log device
removal.  If you have a mirror, and the data on each half of the mirror
doesn't match each other (one device failed, and the other device is good)
... Do you read the data from *both* sides of the mirror, in order to
discover the corrupted log device, and correctly move forward without data
loss?


Hmm, I need to check, but if we get a checksum mismatch then I don't think
we try other mirror(s). This is automatic for the 'main pool', but of course
the ZIL code is different by necessity. This problem can of course be fixed.
(It will be a week and a bit before I can report back on this, as I'm on
vacation).
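
As a rough illustration of what "trying the other mirror(s)" could look like, the sketch below checks every side of a mirrored log block before concluding that the chain has ended. This is a hypothetical model, not what the ZIL code does today and not a proposed patch: the dict-per-side device model, SHA-256, and the names `seal`, `verify` and `read_mirrored` are all assumptions.

```python
import hashlib

CHECKSUM_LEN = 32  # SHA-256 digest size; illustrative only


def seal(body: bytes) -> bytes:
    """Append an embedded checksum to a block body."""
    return body + hashlib.sha256(body).digest()


def verify(raw: bytes) -> bool:
    """Return True if the block's embedded checksum matches its body."""
    if len(raw) < CHECKSUM_LEN:
        return False
    body, stored = raw[:-CHECKSUM_LEN], raw[-CHECKSUM_LEN:]
    return hashlib.sha256(body).digest() == stored


def read_mirrored(sides: list, block_id: int):
    """Read block_id from every mirror side.

    Returns (body, bad_sides). body is the first copy that verified, or
    None if no side verified. bad_sides lists the indexes that failed.
    """
    good = None
    bad_sides = []
    for index, side in enumerate(sides):
        raw = side.get(block_id, b"")
        if verify(raw):
            if good is None:
                good = raw[:-CHECKSUM_LEN]
        else:
            bad_sides.append(index)
    return good, bad_sides
```

The point of returning both values is the distinction they allow: if no side verifies, that is consistent with a genuine end of the chain, whereas a valid copy on one side plus a mismatch on another suggests the mismatching device is the one that silently lost the write.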

Neil.
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
