[zfs-code] fletcher2/4 implementations fundamentally flawed

2009-04-14 Thread paul
Agreed, seems like a prudent plan. -- This message posted from opensolaris.org

[zfs-code] fletcher2/4 implementations fundamentally flawed

2009-04-14 Thread Jonathan Adams
On Mon, Apr 13, 2009 at 06:34:36PM -0700, paul wrote: > Nice analysis. I understand you've found on the > processor/memory-subsystem architecture you've experimented with, that > if the data being summed isn't already cached, it's access will likely > dominate it's use in calculations: > > - Altho

[zfs-code] fletcher2/4 implementations fundamentally flawed

2009-04-13 Thread paul
Nice analysis. I understand you've found on the processor/memory-subsystem architecture you've experimented with, that if the data being summed isn't already cached, it's access will likely dominate it's use in calculations: - Although this makes sense (presuming no mechanism may used to pre-cac

[zfs-code] fletcher2/4 implementations fundamentally flawed

2009-04-13 Thread Jonathan Adams
On Tue, Mar 31, 2009 at 09:07:31AM -0700, paul wrote: > Nice. > > As a minor thought with respect to naming conventions, as most > aren't likely to bother attempting to understand the relative merits > of available checksum implementations, I can't help but wonder > if the naming conventions shoul

[zfs-code] fletcher2/4 implementations fundamentally flawed

2009-03-31 Thread paul
Nice. As a minor thought with respect to naming conventions, as most aren't likely to bother attempting to understand the relative merits of available checksum implementations, I can't help but wonder if the naming conventions should to be ordered relative to their performance/goodness. For ex

[zfs-code] fletcher2/4 implementations fundamentally flawed

2009-03-31 Thread Jeff Bonwick
Very cool, Mark -- thanks for posting this! I was discussing this general issue with Jonathan Adams this afternoon. He's currently coding and measuring the performance and effectiveness of a number of fletcher2 variants, which I'll let him describe in exquisite detail. Something we observed along

[zfs-code] fletcher2/4 implementations fundamentally flawed

2009-03-31 Thread Mark Butler
I ran tests of every dual bit error case on an 8KB block containing the first 8K of a typical ELF executable with six checksum algorithms. I included two trivial checksums for comparison purposes. The undetected error counts are as follows, out of a total of 214783648 cases: xor128:167

[zfs-code] fletcher2/4 implementations fundamentally flawed

2009-03-29 Thread Mark Butler
The first thing I tried was 128 bit additions of 64 bit quantities. The problem is that it doesn't run any faster than fletcher4 on a 32 bit machine, and fletcher4 on 32bit machines is almost 7 times slower than fletcher2. I did some further testing, and fletcher2 does not look as bad as in my

[zfs-code] fletcher2/4 implementations fundamentally flawed

2009-03-29 Thread Jeff Bonwick
On Sun, Mar 29, 2009 at 03:32:20PM -0700, Mark Butler wrote: > > a0 += ip[0] + (ip[0] >> 32); That has certain weaknesses too. What we really want is a 128-bit add across two 64-bit registers. If you write the code like this: value = ip[0]; a0 += value; if (a0 <

[zfs-code] fletcher2/4 implementations fundamentally flawed

2009-03-29 Thread Mark Butler
The code didn't come out right (presumably due to angle bracket handling), so I have included it as an attachment here. -- This message posted from opensolaris.org -- next part -- A non-text attachment was scrubbed... Name: prop_fletcher8.c Type: application/octet-stream S

[zfs-code] fletcher2/4 implementations fundamentally flawed

2009-03-29 Thread Mark Butler
Here is a better example. Suppose you are checksumming an 8 KB block that consists of all one bits. fletcher2 runs two parallel quasi-fletcher checksums on alternating 64 bit words, such that the linear term of each checksum over an 8KB block is the sum of 512 additions. The highest linear ter

[zfs-code] fletcher2/4 implementations fundamentally flawed

2009-03-27 Thread Mark Butler
The problem is that for 1 out of 64 bits, "fletcher2" is [i]not[/i] better than a simple XOR. 50% probability of a false negative for any number of bit errors in those bits. If you have a phantom write to a previously used block that differs only in bits that occupy that bit position, you only

[zfs-code] fletcher2/4 implementations fundamentally flawed

2009-03-27 Thread Richard Elling
Mark Butler wrote: > As I understand it, there is no way to fix a problem with the algorithm of a > defined checksum without invalidating existing zfs filesystems. Any fix to > to the fletcher2 will have to be given a new name. > Other than the name, fletcher, is there actually a problem he

[zfs-code] fletcher2/4 implementations fundamentally flawed

2009-03-27 Thread Mark Butler
As I understand it, there is no way to fix a problem with the algorithm of a defined checksum without invalidating existing zfs filesystems. Any fix to to the fletcher2 will have to be given a new name. Given how incredibly weak the current fletcher2 is, perhaps the first thing that should be

[zfs-code] fletcher2/4 implementations fundamentally flawed (correction)

2008-08-22 Thread Jonathan Adams
On Mon, Aug 18, 2008 at 04:41:35PM -0700, paul wrote: > CORRECTION: As fletcher4 actually sums 32bit (not 64bit) data using 64bit > accumulators, > the first two checksum terms (a and b) will not overflow for data block sizes > as large as > 128KB, and therefore should be considered at least as s

[zfs-code] fletcher2/4 implementations fundamentally flawed (correction)

2008-08-18 Thread paul
CORRECTION: As fletcher4 actually sums 32bit (not 64bit) data using 64bit accumulators, the first two checksum terms (a and b) will not overflow for data block sizes as large as 128KB, and therefore should be considered at least as strong as that expected of a 32/64bit Fletcher checksum (althoug

[zfs-code] fletcher2/4 implementations fundamentally flawed

2008-08-17 Thread paul
As a heads up, Fletcher class of checksum algorithms are reasonably strong because they are specified to perform 1's complement sums (i.e. end around carries [(a + b) mod (2^n -1)]) such that sums don't overflow into oblivion and thereby loose critical information. Unsigned additions in "C" are no