Since these are not emails with patches, let's not disturb tech@ but have this thread moved to misc@, thanks.

On 2016-02-01 18:40, Janne Johansson wrote:
I did not oppose adding the sector number, just the "idea" that internal
relocations would make this number change.
If it did, then everything would break for all filesystems, so that is
obviously not how it is done.


2016-02-01 11:11 GMT+01:00 Tinker <ti...@openmailbox.org>:

On 2016-02-01 16:29, Janne Johansson wrote:

2016-01-31 9:24 GMT+01:00 Tinker <ti...@openmailbox.org>:

Q1:

My most important question to you is: for the data that you checksum, do you include the sector number (or other disk location info) of that data among the checksum function's inputs, so that if the underlying storage's logical-to-physical mapping table breaks down, or disk writes go to the wrong place for any other reason, the subsequent reads of that data will fail?



Whenever any underlying storage does migrations, it never changes the OS view of the sector number; all filesystems (RAID or not) would break if that happened.


Janne (and Karel),

The reason I suggested that location info, e.g. the sector number, be included in the checksum calculation's input data is that there is a real risk of a disk's logical-to-physical sector mapping table breaking down, whether because of physical failure, firmware errors in the disk controller or the disk itself, OS bugs, memory bugs, driver bugs, you name it.

While I agree that within RAID1C the probability is ridiculously small that such a failure would corrupt a certain sector X's location *and* corrupt its checksum in the checksum zone on the disk in a way symmetric with the first corruption, so that the checksum verification would not catch the problem either, on a level of (mathematical/system) symmetry it still makes sense for the checksum calculation to also take the data's location as input.

ZFS gives this guarantee: each block pointer stores the block's checksum next to the block's on-disk address, so a read is verified to return the data that really belongs there.
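
To make the idea concrete, here is a minimal sketch of what I mean (FNV-1a and the sector_cksum() name are mine, for illustration only; a real filesystem would feed the sector number into whatever checksum algorithm it already uses):

#include <stdint.h>
#include <stddef.h>

#define FNV64_OFFSET	0xcbf29ce484222325ULL
#define FNV64_PRIME	0x00000100000001b3ULL

/* Plain 64-bit FNV-1a over a byte buffer, continuing from state h. */
static uint64_t
fnv1a64(uint64_t h, const uint8_t *p, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= FNV64_PRIME;
	}
	return (h);
}

/*
 * Checksum a sector's payload, seeded with its logical sector number.
 * The same payload written to (or read back from) a different sector
 * yields a different checksum, so misdirected I/O fails verification
 * at read time.
 */
uint64_t
sector_cksum(uint64_t sector, const uint8_t *data, size_t len)
{
	uint8_t seed[sizeof(sector)];
	size_t i;

	/* Serialize the sector number in a fixed byte order. */
	for (i = 0; i < sizeof(seed); i++)
		seed[i] = (sector >> (i * 8)) & 0xff;

	return (fnv1a64(fnv1a64(FNV64_OFFSET, seed, sizeof(seed)),
	    data, len));
}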

And I would guess we are talking about something in the range of 50-100 extra CPU cycles per sector access to deliver this, with no extra storage needed, so my spontaneous feeling is that it could probably be implemented on a "why-not" basis -

What do you say?

Tinker


