Re: Karel, some followup Q:s on your RAID1C patch

2016-02-01 Thread Janne Johansson
2016-01-31 9:24 GMT+01:00 Tinker :

> Q1:
>
> My most important question to you: the DATA that you CHECKSUM, do you
> include the SECTOR NUMBER (or other disk-location info) among your
> checksum function's inputs, so that if the underlying storage's mapping
> table breaks down, or disk WRITEs for some other reason go to the WRONG
> place, the later READs of that data will FAIL?
>


Whenever any underlying storage does migrations, it never changes the
OS's view of the sector number; all filesystems (RAID or not) would break
if that happened.

-- 
May the most significant bit of your life be positive.


Re: Karel, some followup Q:s on your RAID1C patch

2016-02-01 Thread Janne Johansson
I did not oppose adding the sector number, just the "idea" that internal
relocations would make this number change.
If it did, then everything would break for all filesystems, so that is
obviously not how it is done.


2016-02-01 11:11 GMT+01:00 Tinker :

> On 2016-02-01 16:29, Janne Johansson wrote:
>
>> 2016-01-31 9:24 GMT+01:00 Tinker :
>>
>> Q1:
>>>
>>> My most important question to you: the DATA that you CHECKSUM, do you
>>> include the SECTOR NUMBER (or other disk-location info) among your
>>> checksum function's inputs, so that if the underlying storage's mapping
>>> table breaks down, or disk WRITEs for some other reason go to the WRONG
>>> place, the later READs of that data will FAIL?
>>>
>>>
>>
>> Whenever any underlying storage does migrations, it never changes the
>> OS's view of the sector number; all filesystems (RAID or not) would break
>> if that happened.
>>
>
> Janne (and Karel),
>
> The reason I suggested including location info, e.g. the sector number,
> in the checksum calculation's input data is that there is a real risk
> that a disk's logical-to-physical sector mapping table breaks down,
> whether through physical failure, firmware errors in the disk or its
> controller, OS bugs, memory bugs, driver bugs, you name it.
>
> While I agree that within RAID1C the probability is ridiculously small
> that such a failure would corrupt a certain sector X's location *and*
> corrupt its checksum in the on-disk checksum zone in a way symmetric
> with the first corruption, so that the checksum check would not catch
> the problem either, on the level of (mathematical/system) symmetry it
> still makes sense for the checksum calculation to take the data's
> location as input as well.
>
> ZFS does this to guarantee that the data read is the data that really
> belongs there.
>
> And I guess we're talking about on the order of 50-100 extra CPU cycles
> per sector access to deliver this, and no extra storage, so my
> spontaneous feeling is that it could probably be implemented on a
> "why-not" basis -
>
> What do you say?
>
> Tinker
>
>


-- 
May the most significant bit of your life be positive.


Re: Karel, some followup Q:s on your RAID1C patch

2016-02-01 Thread Tinker

On 2016-02-01 16:29, Janne Johansson wrote:

2016-01-31 9:24 GMT+01:00 Tinker :


Q1:

My most important question to you: the DATA that you CHECKSUM, do you
include the SECTOR NUMBER (or other disk-location info) among your
checksum function's inputs, so that if the underlying storage's mapping
table breaks down, or disk WRITEs for some other reason go to the WRONG
place, the later READs of that data will FAIL?




Whenever any underlying storage does migrations, it never changes the
OS's view of the sector number; all filesystems (RAID or not) would break
if that happened.


Janne (and Karel),

The reason I suggested including location info, e.g. the sector number,
in the checksum calculation's input data is that there is a real risk
that a disk's logical-to-physical sector mapping table breaks down,
whether through physical failure, firmware errors in the disk or its
controller, OS bugs, memory bugs, driver bugs, you name it.


While I agree that within RAID1C the probability is ridiculously small
that such a failure would corrupt a certain sector X's location *and*
corrupt its checksum in the on-disk checksum zone in a way symmetric
with the first corruption, so that the checksum check would not catch
the problem either, on the level of (mathematical/system) symmetry it
still makes sense for the checksum calculation to take the data's
location as input as well.


ZFS does this to guarantee that the data read is the data that really 
belongs there.


And I guess we're talking about on the order of 50-100 extra CPU cycles
per sector access to deliver this, and no extra storage, so my
spontaneous feeling is that it could probably be implemented on a
"why-not" basis -
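To illustrate what I mean, here is a rough sketch (this is NOT Karel's
actual code, and 64-bit FNV-1a is only a stand-in for whatever checksum
softraid would really use): fold the sector number into the checksum
input, so a block read back from the wrong location fails verification
even when its payload is intact.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch only: 64-bit FNV-1a as a stand-in checksum. */
static uint64_t
fnv1a64(uint64_t h, const uint8_t *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= buf[i];
		h *= 0x100000001b3ULL;		/* FNV-1a prime */
	}
	return h;
}

uint64_t
sector_cksum(uint64_t sector, const uint8_t *data, size_t len)
{
	uint64_t h = 0xcbf29ce484222325ULL;	/* FNV-1a offset basis */
	uint8_t sn[8];
	int i;

	/* Hash the (little-endian) sector number first, then the data,
	 * so identical payloads at different sectors get different
	 * checksums. */
	for (i = 0; i < 8; i++)
		sn[i] = (sector >> (8 * i)) & 0xff;
	h = fnv1a64(h, sn, sizeof(sn));
	return fnv1a64(h, data, len);
}
```

With that, the same data written to sector 1 and sector 2 carries
different checksums, so a misdirected write shows up as a checksum
mismatch on read.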


What do you say?

Tinker



Re: Karel, some followup Q:s on your RAID1C patch

2016-02-01 Thread Tinker
Since these are not emails with patches, let's not disturb tech@ but 
have this thread moved to misc@ , thanks.


On 2016-02-01 18:40, Janne Johansson wrote:
I did not oppose adding the sector number, just the "idea" that internal
relocations would make this number change.
If it did, then everything would break for all filesystems, so that is
obviously not how it is done.








Karel, some followup Q:s on your RAID1C patch

2016-01-31 Thread Tinker

Hi Karel,

Can you please tell me, about your RAID1C patch:

So basically, your RAID1C patch is just the ordinary softraid, BUT, with
checksums for each sector, located right at the end of the physical
disk.


Q1:

My most important question to you: the DATA that you CHECKSUM, do you
include the SECTOR NUMBER (or other disk-location info) among your
checksum function's inputs, so that if the underlying storage's mapping
table breaks down, or disk WRITEs for some other reason go to the WRONG
place, the later READs of that data will FAIL?



Q2:

When your RAID1C detects a checksum failure, will it return a READ
FAILURE on those reads?

(Obviously only if all the storage copies are broken, as detected by
checksum-check failure.)
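What I am hoping for is something like the following read-path behavior
(my own sketch, with a toy checksum and in-memory "mirror" buffers
standing in for whatever the patch actually does): try each copy in
turn, and only if every copy fails its checksum, surface EIO.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECSIZE 512

/* Toy stand-in checksum, NOT the patch's real one. */
static uint64_t
cksum(const uint8_t *buf)
{
	uint64_t h = 0;
	size_t i;

	for (i = 0; i < SECSIZE; i++)
		h = h * 31 + buf[i];
	return h;
}

/* Return 0 and fill *out with the first mirror copy whose stored
 * checksum verifies; return EIO only when every copy is bad. */
int
raid1c_read(const uint8_t copies[][SECSIZE], const uint64_t sums[],
    int ncopies, uint8_t *out)
{
	int i;

	for (i = 0; i < ncopies; i++) {
		if (cksum(copies[i]) == sums[i]) {
			memcpy(out, copies[i], SECSIZE);
			return 0;
		}
	}
	return EIO;
}
```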



Q3:

What checksumming algorithm do you use? I think anything 64-bit would be
fine, but 32-bit checksums have too many collisions.
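A back-of-envelope number behind that claim (assuming a well-mixed b-bit
checksum, so a corrupted sector passes verification undetected with
probability about 2^-b):

```c
#include <math.h>

/* Expected number of corrupted sectors that slip past verification,
 * given ncorrupt corruptions and a b-bit checksum: ncorrupt * 2^-b. */
double
expected_undetected(double ncorrupt, int bits)
{
	return ncorrupt * ldexp(1.0, -bits);
}
```

With a billion corrupted sectors, a 32-bit checksum is expected to let
a few through undetected, while a 64-bit one is expected to let through
essentially none.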



Q4:

What is your status on getting RAID1C included into OpenBSD?


Thanks!
Tinker



Re: Karel, some followup Q:s on your RAID1C patch

2016-01-31 Thread Tinker

Migrating this thread to misc@ .

