Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-18 Thread Orvar Korvar
...If this is a general rule, maybe it will be worth considering using
SHA512 truncated to 256 bits to get more speed...

Doesn't it need more investigation whether truncating a 512-bit hash to 256 bits 
gives security equivalent to a plain 256-bit hash? Maybe truncation would 
introduce some bias?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-18 Thread Orvar Korvar
Totally Off Topic:
Very interesting. Did you publish any papers on this? Where do you work? It 
seems like a very fun place to work! 


BTW, I thought about this. What do you say?

Assume I want to compress data and I succeed in doing so. And then I transfer 
the compressed data. So all the information I transferred is the compressed 
data. But, then you don't count all the information: knowledge about which 
algorithm was used, which number system, laws of math, etc. So there is a lot 
of other information that is implicit when compressing/decompressing - not 
just the data.

So, if you add data and all implicit information you get a certain bit size X. 
Do this again on the same set of data, with another algorithm and you get 
another bit size Y. 

You compress the data, using lots of implicit information. If you use less 
implicit information (simple algorithm relying on simple math), will X be 
smaller than if you use lots of implicit information (advanced algorithm 
relying on a large body of advanced math)? What can you say about the numbers X 
and Y? Advanced math requires many math books that you need to transfer as well.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-18 Thread Nicolas Williams
On Tue, Jan 18, 2011 at 07:16:04AM -0800, Orvar Korvar wrote:
 BTW, I thought about this. What do you say?
 
 Assume I want to compress data and I succeed in doing so. And then I
 transfer the compressed data. So all the information I transferred is
 the compressed data. But, then you don't count all the information:
 knowledge about which algorithm was used, which number system, laws of
 math, etc. So there is a lot of other information that is implicit
 when compressing/decompressing - not just the data.
 
 So, if you add data and all implicit information you get a certain bit
 size X. Do this again on the same set of data, with another algorithm
 and you get another bit size Y. 
 
 You compress the data, using lots of implicit information. If you use
 less implicit information (simple algorithm relying on simple math),
 will X be smaller than if you use lots of implicit information
 (advanced algorithm relying on a large body of advanced math)? What
 can you say about the numbers X and Y? Advanced math requires many
 math books that you need to transfer as well.

Just as the laws of thermodynamics preclude perpetual motion machines,
so do they preclude infinite, loss-less data compression.  Yes,
thermodynamics and information theory are linked, amazingly enough.

Data compression algorithms work by identifying certain types of
patterns, then replacing the input with notes such as "pattern 1 is ...
and appears at offsets 12345 and 1234567" (I'm simplifying a lot).  Data
that has few or no observable patterns (observable by the compression
algorithm in question) will not compress, but will expand if you insist
on compressing -- randomly-generated data (e.g., the output of
/dev/urandom) will not compress at all and will expand if you insist.
Even just one bit needed to indicate whether a file is compressed or not
will mean expansion when you fail to compress and store the original
instead of the compressed version.  Data compression reduces
repetition, thus making it harder to further compress compressed data.

Try it yourself.  Try building a pipeline of all the compression tools
you have, see how many rounds of compression you can apply to typical
data before further compression fails.
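
For illustration, here's one quick way to run that experiment with Python's
zlib (any file and any compressor will show the same effect; the exact sizes
depend entirely on the input):

import zlib

# Start from any "typical" file; /etc/services is just a convenient example.
data = open("/etc/services", "rb").read()
for i in range(5):
    compressed = zlib.compress(data, 9)
    print("round %d: %d -> %d bytes" % (i + 1, len(data), len(compressed)))
    data = compressed

The first round shrinks the data; every later round operates on nearly
pattern-free input and typically grows it slightly.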

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-17 Thread Nicolas Williams
On Sat, Jan 15, 2011 at 10:19:23AM -0600, Bob Friesenhahn wrote:
 On Fri, 14 Jan 2011, Peter Taps wrote:
 
 Thank you for sharing the calculations. In lay terms, for Sha256,
 how many blocks of data would be needed to have one collision?
 
 Two.

Pretty funny.

In this thread some of you are treating SHA-256 as an idealized hash
function.  The odds of accidentally finding collisions in an idealized
256-bit hash function are minute because the distribution of hash
function outputs over inputs is random (or, rather, pseudo-random).

But cryptographic hash functions are generally only approximations of
idealized hash functions.  There's nothing to say that there aren't
pathological corner cases where a given hash function produces lots of
collisions that would be semantically meaningful to people -- i.e., a
set of inputs over which the outputs are not randomly distributed.  Now,
of course, we don't know of such pathological corner cases for SHA-256,
but not that long ago we didn't know of any for SHA-1 or MD5 either.

The question of whether disabling verification would improve performance
is pretty simple: if you have highly deduplicatious, _synchronous_ (or
nearly so, due to frequent fsync()s or NFS close operations) writes, and
the working set did not fit in the ARC nor L2ARC, then yes, disabling
verification will help significantly, by removing an average of at least
half a disk rotation from the write latency.  Or if you have the same
work load but with asynchronous writes that might as well be synchronous
due to an undersized cache (relative to the workload).  Otherwise the
cost of verification should be hidden by caching.

Another way to put this would be that you should first determine that
verification is actually affecting performance, and only _then_ should
you consider disabling it.  But if you want to have the freedom to
disable verification, then you should be using SHA-256 (or switch to it
when disabling verification).

Safety features that cost nothing are not worth turning off,
so make sure their cost is significant before even thinking
of turning them off.

Similarly, the cost of SHA-256 vs. Fletcher should also be lost in the
noise if the system has enough CPU, but if the choice of hash function
could make the system CPU-bound instead of I/O-bound, then the choice of
hash function would make an impact on performance.  The choice of hash
functions will have a different performance impact than verification: a
slower hash function will affect non-deduplicatious workloads more than
highly deduplicatious workloads (since the latter will require more I/O
for verification, which will overwhelm the cost of the hash function).
Again, measure first.

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-15 Thread Pawel Jakub Dawidek
On Fri, Jan 14, 2011 at 11:32:58AM -0800, Peter Taps wrote:
 Ed,
 
 Thank you for sharing the calculations. In lay terms, for Sha256, how many 
 blocks of data would be needed to have one collision?
 
 Assuming each block is 4K in size, we can probably calculate the final data 
 size beyond which the collision may occur. This would enable us to make the 
 following statement:
 
 With Sha256, you need verification to be turned on only if you are dealing 
 with more than xxxT of data.

Except that this is the wrong question to ask. The question you can ask is:
"How many blocks of data do I need for the collision probability to be X%?"
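
For a rough sense of scale, the standard birthday approximation can be inverted
to answer exactly that question; a small Python sketch (the 1e-15 figure is
just the usual ballpark for disk error rates):

import math

d = 2.0 ** 256   # number of possible SHA-256 values

def blocks_needed(p):
    # Invert the birthday approximation p ~= 1 - exp(-n^2 / (2d)):
    # n ~= sqrt(2 * d * ln(1 / (1 - p)))
    return math.sqrt(-2.0 * d * math.log1p(-p))

print(blocks_needed(0.5))     # ~4.0e38 blocks for a 50% chance of a collision
print(blocks_needed(1e-15))   # ~1.5e31 blocks to reach ~disk-error-rate odds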

 Also, another related question: why was 256 bits chosen and not 128 bits or 
 512 bits? I guess Sha512 may be overkill. In your formula, how many blocks 
 of data would be needed to have one collision using Sha128?

There is no SHA128, and SHA512's hash is too long. Currently the maximum
checksum ZFS can handle is 32 bytes (256 bits). Wasting another 32 bytes
without improving anything in practice probably wasn't worth it.

BTW, as for SHA512 being slower, it looks like it depends on the
implementation, or SHA512 is simply faster to compute on a 64-bit CPU.
On my laptop OpenSSL computes SHA256 55% _slower_ than SHA512.
If this is a general rule, maybe it will be worth considering using
SHA512 truncated to 256 bits to get more speed.
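
For anyone who wants a quick local data point, a minimal Python timing sketch
using hashlib (results depend heavily on the CPU and the OpenSSL build behind
hashlib, so treat the numbers as illustrative only):

import hashlib
import time

buf = b"\0" * (64 * 1024 * 1024)   # 64 MB buffer to hash

for name in ("sha256", "sha512"):
    start = time.time()
    hashlib.new(name, buf).hexdigest()
    elapsed = time.time() - start
    print("%s: %.1f MB/s" % (name, 64 / elapsed))

On many 64-bit CPUs SHA-512 does come out faster per byte, because it processes
128-byte blocks using 64-bit operations.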

-- 
Pawel Jakub Dawidek   http://www.wheelsystems.com
p...@freebsd.org   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!


pgpXQHlrciD1Y.pgp
Description: PGP signature
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-15 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Peter Taps
 
 Thank you for sharing the calculations. In lay terms, for Sha256, how many
 blocks of data would be needed to have one collision?

There is no point in making a generalization and a recommended best
practice, because it's all just a calculation of probabilities and
everyone will draw the line differently.  My personal recommendation is to
enable verification at work, just so you can never be challenged or have to
try convincing your boss about something that is obvious to you.  The main
value of verification is that you don't need to explain anything to anyone.


Two blocks would be needed to have one collision, at a probability of 2^-256,
which is about 8.6*10^-78.  This is coincidentally very near the probability of
randomly selecting the same atom in the observable universe twice
consecutively.  The more blocks, the higher the probability, and that's all
there is to it (unless someone is intentionally trying to cause collisions
with data generated specifically and knowledgeably for that express
purpose).  When you start reaching 2^128 (about 3.4*10^38) blocks it becomes
likely you have a collision.

Every data pool in the world is someplace in between 2 and 2^128 blocks.
Smack in the middle of the region where the probability is distinctly more
likely than randomly selecting the same atom of the universe twice, and less
likely than an armageddon caused by an asteroid colliding with Earth.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-15 Thread Bob Friesenhahn

On Fri, 14 Jan 2011, Peter Taps wrote:

Thank you for sharing the calculations. In lay terms, for Sha256, 
how many blocks of data would be needed to have one collision?


Two.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-14 Thread Peter Taps
Ed,

Thank you for sharing the calculations. In lay terms, for Sha256, how many 
blocks of data would be needed to have one collision?

Assuming each block is 4K in size, we can probably calculate the final data 
size beyond which the collision may occur. This would enable us to make the 
following statement:

With Sha256, you need verification to be turned on only if you are dealing 
with more than xxxT of data.

Also, another related question: why was 256 bits chosen and not 128 bits or 512 
bits? I guess Sha512 may be overkill. In your formula, how many blocks of 
data would be needed to have one collision using Sha128?

Appreciate your help.

Regards,
Peter
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-14 Thread Peter Taps
I am posting this once again as my previous post went into the middle of the 
thread and may go unnoticed.

Ed,

Thank you for sharing the calculations. In lay terms, for Sha256, how many 
blocks of data would be needed to have one collision?

Assuming each block is 4K in size, we can probably calculate the final data 
size beyond which the collision may occur. This would enable us to make the 
following statement:

With Sha256, you need verification to be turned on only if you are dealing 
with more than xxxT of data.

Also, another related question: why was 256 bits chosen and not 128 bits or 512 
bits? I guess Sha512 may be overkill. In your formula, how many blocks of 
data would be needed to have one collision using Sha128?

Appreciate your help.

Regards,
Peter
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-14 Thread David Magda
On Jan 14, 2011, at 14:32, Peter Taps wrote:

 Also, another related question: why was 256 bits chosen and not 128 bits or 
 512 bits? I guess Sha512 may be overkill. In your formula, how many blocks 
 of data would be needed to have one collision using Sha128?


There are two ways to get 128 bits: use a 128-bit function (e.g., MD5), or use 
a longer function and truncate its output.
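
As a sketch of the "truncate a longer function" idea (here truncating SHA-512
to 256 bits, as suggested elsewhere in the thread), using Python's hashlib;
note that the standardized SHA-512/256 variant uses different initialization
values, so a plain truncation like this is not the same function:

import hashlib

data = b"example block contents"

full = hashlib.sha512(data).digest()      # 64-byte SHA-512 digest
truncated = full[:32]                     # keep the first 256 bits

print(hashlib.sha256(data).hexdigest())   # a native 256-bit hash
print(truncated.hex())                    # 256 bits taken from SHA-512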

In the case of MD5, it has been deprecated for a while now because of 
collisions. [1] Similarly, 160-bit hash functions are getting collisions as 
well (SHA-1). [2] So the next step up is generally 256 bits (though there are 
a few 224-bit-output hashes out there).

However, if you're going to use a 256-bit hash function, why throw away half 
of the security if you've already done all the work to get those 256 bits? 
Might as well use all the bits and get the extra security.

Using a 512-bit hash function was probably deemed too expensive for CPUs 
at this time. There's also the fact that things are a bit in flux currently, as 
there's a competition to find the official (US) 'next generation' hash 
function. [3] And while its official-ness is mostly a US (military) thing, it 
will probably become a de facto standard in many other countries and industries.

For the proverbial sha128, the number of blocks you'd need before a collision 
becomes likely is roughly the square root of the SHA-256 figure (around 2^64 
rather than 2^128 blocks; the exponent halves, not the block count). The math 
is left as an exercise for the reader.

[1] http://en.wikipedia.org/wiki/MD5#Security
[2] http://en.wikipedia.org/wiki/SHA-1#SHA-1
[3] http://en.wikipedia.org/wiki/SHA-3
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-12 Thread Enrico Maria Crisostomo
Edward, this is OT, but may I suggest you use something like Wolfram Alpha to 
perform your calculations a bit more comfortably?

-- 
Enrico M. Crisostomo

On Jan 12, 2011, at 4:24, Edward Ned Harvey 
opensolarisisdeadlongliveopensola...@nedharvey.com wrote:

 For anyone who still cares:
 
 I'm calculating the odds of a sha256 collision in an extremely large zpool,
 containing 2^35 blocks of data, and no repetitions.
 
 The formula on wikipedia for the birthday problem is:
 p(n;d) ~= 1-( (d-1)/d )^( 0.5*n*(n-1) )
 
 In this case, 
 n=2^35
 d=2^256
 
 The problem is, this formula does not compute because n is so large.
 Fortunately x = e^ln(x), and so we're able to use this identity to turn the
 huge exponent into a huge multiplication instead.
 
 (Using the bc mathlib notation, the l(x) function is the natural log of x,
 and e(x) is e raised to the power of x)
 p(n;d) ~= 1-e(   (  0.5*n*(n-1)*l((d-1)/d)  )   )
 
 Using bc to calculate the answer:
 bc -l
 
 n=2^35
 d=2^256
 scale=1024
 1-e(   (  0.5*n*(n-1)*l((d-1)/d)  )   )
 .50978941154
 I manually truncated here (precision goes out 1024 places).  This is
 5.1*10^-57
 
 Note: I had to repeat the calculation many times, setting a larger and
 larger scale.  The default scale of 20, and even 64 and 70 and 80 were not
 precise enough to produce a convergent answer around the -57th decimal
 place.  So I just kept going larger, and in retrospect, anything over 100
 would have been fine.  I wrote 1024 above, so who cares.
 
 If you've been paying close attention you'll recognize this is the same
 answer I originally calculated, but my original equation was in fact wrong.
 It just so happens that my original equation neglects the probability of
 collisions from previous events, so it is actually accurate whenever the
 probability of previous events is insignificant.  It is merely luck that for
 the data size in question, my equation produced something that looked
 correct.  It would produce a wrong calculation of probability for a larger
 value of n.
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-12 Thread Edward Ned Harvey
 Edward, this is OT, but may I suggest you use something like Wolfram Alpha
 to perform your calculations a bit more comfortably?

Wow, that's pretty awesome.  Thanks.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-11 Thread Lassi Tuura
Hey there,

 ~= 5.1E-57
 
 Bah.  My math is wrong.  I was never very good at PS.  I'll ask someone at
 work tomorrow to look at it and show me the folly.  Wikipedia has it right,
 but I can't evaluate numbers to the few-hundredth power in any calculator
 that I have handy.

bc -l <<EOF
scale=150
define bday(n, h) { return 1 - e(-(n^2)/(2*h)); }
bday(2^35, 2^256)
bday(2^35, 2^256) * 10^57
EOF

Basically, ~5.1 * 10^-57.

Seems your number was correct, although I am not sure how you arrived at it.
It appears that number is ~ consistent with the tables on Wikipedia article
on birthday attack (http://en.wikipedia.org/wiki/Birthday_attack): p=10^−15
for 256-bit hash collision requires 1.5*10^31 (O(2^103)) hashes, assuming
the hashes are randomly distributed.

That's definitely improbable.

For anecdotal evidence to the contrary, we once had apps using GUIDs. We did
believe collisions would be extremely improbable. Yet GUID conflicts turned
into recurring events, at worst several collisions a week. It may or may not
be related - in our case it turned out the GUIDs weren't anywhere near as
random as people thought they would or should be. In retrospect we clearly
used a flawed GUID generator which rendered our assumptions about randomness
invalid. SHA256 on (even non-random) data is different, but do be careful
about what you assume.

Regards,
Lassi
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-11 Thread Edward Ned Harvey
 From: Lassi Tuura [mailto:l...@cern.ch]
 
 bc -l <<EOF
 scale=150
 define bday(n, h) { return 1 - e(-(n^2)/(2*h)); }
 bday(2^35, 2^256)
 bday(2^35, 2^256) * 10^57
 EOF
 
 Basically, ~5.1 * 10^-57.
 
 Seems your number was correct, although I am not sure how you arrived at
 it.

The number was calculated correctly; unfortunately, the formula was not.  :-(
The correct formula is the one on wikipedia...

p(n;d) ~= 1-(   ((d-1)/d)^(0.5*n*(n-1))   )
Where n=2^35 and d=2^256

Unfortunately, bc isn't good at exponentiating huge numbers.  Observe:
exponent=0.5*2^35*(2^35-1)
exponent
590295810341525782528
(2^256-1)^exponent
Runtime error (func=(main), adr=35): exponent too large in raise
Even if I chop that down to something much much much smaller ... 9 ... then 
bc simply hangs.  It's apparently looping 9 times which will take forever.

It's so easy to calculate when I use the wrong formula, and difficult to 
calculate when I use the right formula.   ;-)  I'm hoping somebody at work 
knows how to do this in matlab or octave or something...  Will try later today.
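
For what it's worth, plain double-precision floating point handles this fine
once the formula is pushed through logarithms; a Python sketch (not
matlab/octave, but the same idea):

import math

n = 2 ** 35    # blocks in the hypothetical pool
d = 2 ** 256   # possible SHA-256 values

# p(n;d) ~= 1 - ((d-1)/d)^(n(n-1)/2); taking logs turns the huge
# exponent into an ordinary multiplication: ln((d-1)/d) = log1p(-1/d).
exponent = n * (n - 1) // 2              # 590295810341525782528
p = -math.expm1(exponent * math.log1p(-1.0 / d))

print(p)   # ~5.1e-57, matching the value quoted above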


 It appears that number is ~ consistent with the tables on Wikipedia article
 on birthday attack (http://en.wikipedia.org/wiki/Birthday_attack): p=10^−15

This is why, at first, I thought my formula (which is wrong) was equal to the 
formula on wikipedia.  Because at least roughly speaking, they were coming up 
with believably similar numbers.  But you can clearly see mine is wrong, for 
example, if you set n to something larger than the square root of d.  According 
to my formula, if you choose 2^160 hashes (for example) out of the space of 
2^256 hashes then you're 100% certain to have a collision, which is blatantly 
wrong.  An easier error to see is ... if you choose 4 out of a space of 10, 
then the probability of collision is 1.0.  Which again, is clearly wrong.


 For anecdotal evidence to the contrary, we once had apps using GUIDs. We
 did
 believe collisions would be extremely improbable. Yet GUID conflicts turned
 into recurring events, at worst several collisions a week. It may or may not
 be related - in our case it turned out the GUIDs weren't anywhere near as
 random as people thought they would or should be. In retrospect we clearly
 used a flawed GUID generator which rendered our assumptions about
 randomness
 invalid. SHA256 on (even non-random) data is different, but do be careful
 about what you assume.

Whenever I hear anyone saying they have experienced collisions, just as your 
GUID problem, I question what precisely was colliding, and the answer is never 
any cryptographic hash.  It's either poor randomness, or a weak 
non-cryptographic hash, or something like that.  Just as ... earlier in this 
thread I believe someone said they experienced rsync collisions...  Which I 
consider to be irrelevant to the discussion of sha256.

I certainly acknowledge that sha256 is not random, and yet it is fundamental to 
assume it's random, or more accurately, unpredictable and evenly distributed.  
But since even the most dedicated specialized mathematicians thus far haven't 
been able to show anything to the contrary (at least not in public), the 
conclusion I'm drawing is:  I trust the distribution characteristics as long as 
I don't have malicious users or friendly users intentionally trying to 
specifically generate collisions specifically for sha256.  Someday sha256 might 
become at least partially broken, but until that time, I trust it.  I'd bet 
anything short of my life on it preventing accidental collisions...  In fact, 
every day I risk my life on things that are more likely to kill me, so I guess 
I'd even bet my life on it, if I had anything to gain.  Give me $1 and I'll bet 
my life on sha256 not generating an accidental collision.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-11 Thread Edward Ned Harvey
For anyone who still cares:

I'm calculating the odds of a sha256 collision in an extremely large zpool,
containing 2^35 blocks of data, and no repetitions.

The formula on wikipedia for the birthday problem is:
p(n;d) ~= 1-( (d-1)/d )^( 0.5*n*(n-1) )

In this case, 
n=2^35
d=2^256

The problem is, this formula does not compute because n is so large.
Fortunately x = e^ln(x), and so we're able to use this identity to turn the
huge exponent into a huge multiplication instead.

(Using the bc mathlib notation, the l(x) function is the natural log of x,
and e(x) is e raised to the power of x)
p(n;d) ~= 1-e(   (  0.5*n*(n-1)*l((d-1)/d)  )   )

Using bc to calculate the answer:
bc -l
 
n=2^35
d=2^256
scale=1024
1-e(   (  0.5*n*(n-1)*l((d-1)/d)  )   )
.50978941154
I manually truncated here (precision goes out 1024 places).  This is
5.1*10^-57

Note: I had to repeat the calculation many times, setting a larger and
larger scale.  The default scale of 20, and even 64 and 70 and 80 were not
precise enough to produce a convergent answer around the -57th decimal
place.  So I just kept going larger, and in retrospect, anything over 100
would have been fine.  I wrote 1024 above, so who cares.

If you've been paying close attention you'll recognize this is the same
answer I originally calculated, but my original equation was in fact wrong.
It just so happens that my original equation neglects the probability of
collisions from previous events, so it is actually accurate whenever the
probability of previous events is insignificant.  It is merely luck that for
the data size in question, my equation produced something that looked
correct.  It would produce a wrong calculation of probability for a larger
value of n.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-11 Thread Edward Ned Harvey
In case you were wondering "how big is n before the probability of collision
becomes remotely possible, slightly possible, or even likely?":

Given a fixed probability of collision p, the formula to calculate n is:
n = 0.5 + sqrt(   (  0.25 + 2*l(1-p)/l((d-1)/d)  )   )
(That's just the same equation as before, solved for n)

p=0.000001   n=4.8*10^35  ~= 2^118
p=0.00001    n=1.5*10^36  ~= 2^120
p=0.0001     n=4.8*10^36  ~= 2^122
p=0.001      n=1.5*10^37  ~= 2^123
p=0.01       n=4.8*10^37  ~= 2^125
p=0.1        n=1.5*10^38  ~= 2^127
p=0.5        n=4.0*10^38  ~= 2^128
p=0.9        n=7.3*10^38  ~= 2^129
p=0.99       n=1.0*10^39  ~= 2^130
p=0.999      n=1.3*10^39  ~= 2^130

Recall that 2^256 ~= 1.15*10^77
Something somewhere says the n for expected collision happens around when
the exponent is halved...  Half of 77 is around 38, which is supported
above.  Half of 256 is 128, which is supported above.

So as n is reduced exponentially from the expected point, the probability
of collision exponentially approaches 0.
You'll notice for n larger than the expected point, the probability even
more dramatically approaches 1.  I cannot get my poor little computer to
compute anything higher than 5.1*10^39, no matter how many 9's I put on
there.  At this point, it becomes exponentially near impossible to avoid a
collision.
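
Purely as a cross-check, the same table can be reproduced with that inverse
formula in ordinary double precision; a Python sketch:

import math

d = 2.0 ** 256   # possible SHA-256 values

def n_for_probability(p):
    # n = 0.5 + sqrt(0.25 + 2*ln(1-p)/ln((d-1)/d)), the formula above
    return 0.5 + math.sqrt(0.25 + 2.0 * math.log1p(-p) / math.log1p(-1.0 / d))

for p in (1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 0.1, 0.5, 0.9, 0.99, 0.999):
    print("p=%-8g  n=%.1e" % (p, n_for_probability(p)))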

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-10 Thread Robert Milkowski

 On 01/ 8/11 05:59 PM, Edward Ned Harvey wrote:


Has anybody measured the cost of enabling or disabling verification?

The cost of disabling verification is an infinitesimally small number
multiplied by possibly all your data.  Basically lim→0 times lim→infinity.
This can only be evaluated on a case-by-case basis and there's no use in
making any more generalizations in favor or against it.

The benefit of disabling verification would presumably be faster
performance.  Has anybody got any measurements, or even calculations or
vague estimates or clueless guesses, to indicate how significant this is?
How much is there to gain by disabling verification?



Exactly my point - there isn't one answer which fits all environments.
In the testing I've done so far, enabling/disabling verification doesn't 
make any noticeable difference, so I'm sticking with verify. But I have 
enough memory and such a workload that I see few physical reads going 
on.



--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-10 Thread Pawel Jakub Dawidek
On Sat, Jan 08, 2011 at 12:59:17PM -0500, Edward Ned Harvey wrote:
 Has anybody measured the cost of enabling or disabling verification?

Of course there is no easy answer:)

Let me first explain exactly how verification works.

You try to write a block. You see that the block is already in the dedup
table (it is already referenced). You read the existing block (maybe it is
in the ARC or L2ARC). You compare the read block with what you want to write.

Based on the above:
1. If you have dedup on, but your blocks are not deduplicable at all,
   you will pay no price for verification, as there will be no need to
   compare anything.
2. If your data is highly deduplicable you will verify often. Now it
   depends if the data you need to read fits into your ARC/L2ARC or not.
   If it can be found in ARC, the impact will be small.
   If your pool is very large and you can't count on ARC help, each
   write will be turned into a read.
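
To make those cases concrete, here is a toy in-memory model of that write path
(illustrative Python only, with made-up names; it is not how ZFS is actually
implemented):

import hashlib

class ToyDedupPool:
    def __init__(self, verify=True):
        self.verify = verify
        self.blocks = {}   # address -> block contents (the "disk")
        self.ddt = {}      # checksum -> (address, refcount)
        self.reads = 0     # verification reads (ARC/L2ARC may hide these)

    def write(self, block):
        key = hashlib.sha256(block).digest()        # checksum as dedup key
        if key not in self.ddt:
            addr = len(self.blocks)                 # new data: a real write
            self.blocks[addr] = block
            self.ddt[key] = (addr, 1)
            return addr
        addr, refs = self.ddt[key]
        if self.verify:
            self.reads += 1                         # the extra read
            if self.blocks[addr] != block:          # collision: don't dedup
                addr = len(self.blocks)
                self.blocks[addr] = block
                return addr
        self.ddt[key] = (addr, refs + 1)            # dedup hit: bump refcount
        return addr

pool = ToyDedupPool(verify=True)
pool.write(b"A" * 4096)
pool.write(b"A" * 4096)                # deduplicated; costs one verify read
print(len(pool.blocks), pool.reads)    # 1 block stored, 1 verification read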

Also note an interesting property of dedup: if your data is highly
deduplicable you can actually improve performance by avoiding data
writes (and just increasing reference count).
Let me show you three degenerated tests to compare options.
I'm writing 64GB of zeros to a pool with dedup turned off, with dedup turned on
and with dedup+verification turned on (I use SHA256 checksum everywhere):

# zpool create -O checksum=sha256 tank ada{0,1,2,3}
# time sh -c 'dd if=/dev/zero of=/tank/zero bs=1m count=65536; sync; zpool export tank'
      254,11 real         0,07 user        40,80 sys

# zpool create -O checksum=sha256 -O dedup=on tank ada{0,1,2,3}
# time sh -c 'dd if=/dev/zero of=/tank/zero bs=1m count=65536; sync; zpool export tank'
      154,60 real         0,05 user        37,10 sys

# zpool create -O checksum=sha256 -O dedup=sha256,verify tank ada{0,1,2,3}
# time sh -c 'dd if=/dev/zero of=/tank/zero bs=1m count=65536; sync; zpool export tank'
      173,43 real         0,02 user        38,41 sys

As you can see in second and third test the data is of course in ARC, so the
difference here is only because of data comparison (no extra reads are needed)
and verification is 12% slower.

This is of course silly test, but as you can see dedup (even with verification)
is much faster than nodedup case, but this data is highly deduplicable:)

# zpool list
NAME   SIZE  ALLOC   FREE   CAP  DEDUP       HEALTH  ALTROOT
tank   149G  8,58M   149G    0%  524288.00x  ONLINE  -

-- 
Pawel Jakub Dawidek   http://www.wheelsystems.com
p...@freebsd.org   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!


pgp3iTC1h5dwE.pgp
Description: PGP signature
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-10 Thread David Magda
On Mon, January 10, 2011 02:41, Eric D. Mudama wrote:
 On Sun, Jan  9 at 22:54, Peter Taps wrote:
 Thank you all for your help. I am the OP.

 I haven't looked at the link that talks about the probability of
 collision. Intuitively, I still wonder how the chances of collision
 can be so low. We are reducing a 4K block to just 256 bits. If the
 chances of collision are so low, *theoretically* it is possible to
 reconstruct the original block from the 256-bit signature by using a
 simple lookup. Essentially, we would now have world's best
 compression algorithm irrespective of whether the data is text or
 binary. This is hard to digest.

 simple lookup isn't so simple when there are 2^256 records to
 search, however, fundamentally your understanding of hashes is
 correct.
[...]

It should also be noted that ZFS itself can only address 2^128 bytes
(not even 4K 'records'), and supposedly to fill those 2^128 bytes it would
take as much energy as it would take to boil the Earth's oceans:

http://blogs.sun.com/bonwick/entry/128_bit_storage_are_you

So recording and looking up 2^256 records would be quite an
accomplishment. It's a lot of data.

If the OP wants to know why the chances are so low, he'll have to learn a
bit about hash functions (which is what SHA-256 is):

http://en.wikipedia.org/wiki/Hash_function
http://en.wikipedia.org/wiki/Cryptographic_hash_function

Knowing exactly how the math (?) works is not necessary, but understanding
the principles would be useful if one wants to have a general picture as
to why SHA-256 doesn't need a verification step, and why it was chosen as
one of the ZFS (dedupe) checksum options.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-10 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Peter Taps
 
 I haven't looked at the link that talks about the probability of collision.
 Intuitively, I still wonder how the chances of collision can be so low. We are
 reducing a 4K block to just 256 bits. If the chances of collision are so low,
 *theoretically* it is possible to reconstruct the original block from the
 256-bit signature by using a simple lookup. Essentially, we would now have
 world's best compression algorithm irrespective of whether the data is text or
 binary. This is hard to digest.

BTW, at work we do a lot of theoretical mathematics, and one day a few
months ago, I was given the challenge to explore the concept of using a
hashing algorithm as a form of compression, exactly as you said.  The
conclusion was:  You can't reverse-hash in order to reconstruct unknown
original data, but you can do it (theoretically) if you have enough
additional information about what constitutes valid original data.  If you
have a huge lookup table of all the possible original data blocks, then the
hash can only be used to identify 2^(N-M) of them as possible candidates,
and some additional technique is necessary to figure out precisely which one
of those is the original data block.  (N is the length of the data block in
bits, and M is the length of the hash, in bits.)

Hashing discards some of the original data.  In fact, random data is
generally uncompressible, so if you try to compress random data and end up
with something smaller than the original, you can rest assured you're not
able to reconstruct.  However, if you know something about the original...
For example if you know the original is a valid text document written in
English, then in all likelihood there is only one possible original block
fitting that description and yielding the end hash result.  Even if there is
more than one original block which looks like valid English text and
produces the same end hash, it is easy to choose which one is correct based
on context...  Since you presumably know the previous block and the
subsequent block, you just choose the intermediate block which seamlessly
continues to produce valid English grammar at the junctions with adjacent
blocks.  This technique can be applied to most types of clearly structured
original data, but it cannot be applied to unstructured or unknown original
data.  So at best, hashing could be a special-case form of compression.

To decompress would require near-infinite compute hours or a large lookup
table to scan all the possible sets of inputs to find one which produces the
end hash...  So besides the fact that hashing is at best a specific form of
compression requiring additional auxiliary information, it's also
impractical.  To get this down to something reasonable, I considered using a
48MB lookup table for a 24-bit block of data (that's 2^24 entries of 24 bits
each), or a 16GB lookup table for a 32-bit block of data (2^32 entries of 32
bits each).  Well, in order to get a compression ratio worth talking about,
the hash size would have to be 3 bits or smaller.  That's a pretty big
lookup table to decompress 3 bits into 24 or 32...  And let's face it ...
9:1 compression isn't stellar for a text document.

And the final nail in the coffin was:  In order for this technique to be
viable, as mentioned, the original data must be structured.  For any set of
structured original data, all the information which is necessary for the
reverse-hash to identify valid data from the lookup table, could have been
used instead to create a specialized compression algorithm which is equal or
better than the reverse-hash.  

So reverse-hash decompression is actually the worst case algorithm for all
the data types which it's capable of working on.

But yes, you're right, it's theoretically possible for specific cases, but
not theoretically possible for the general case.
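
A scaled-down sketch of the lookup-table idea in Python, shrunk to a 16-bit
"block" so the table actually fits in memory; the point, of course, is that
this does not scale to real block sizes, since the table needs an entry per
possible block:

import hashlib

# Build the lookup table: hash -> original 16-bit block (65,536 entries).
table = {}
for value in range(2 ** 16):
    block = value.to_bytes(2, "big")
    table[hashlib.sha256(block).digest()] = block

original = (12345).to_bytes(2, "big")
digest = hashlib.sha256(original).digest()
print(table[digest] == original)   # True: recovered, but only by brute force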

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-10 Thread Edward Ned Harvey
 From: Pawel Jakub Dawidek [mailto:p...@freebsd.org]
 
 Well, I find it quite reasonable. If your block is referenced 100 times,
 it is probably quite important. 

If your block is referenced 1 time, it is probably quite important.  Hence
redundancy in the pool.


 There are many corruption possibilities
 that can destroy your data. Imagine memory error, which corrupts
 io_offset in write zio structure and corrupted io_offset points at your
 deduped block referenced 100 times. It will be overwritten and
 redundancy won't help you. 

All of the corruption scenarios which allow you to fail despite pool
redundancy, also allow you to fail despite copies+N.


 Note, that deduped data is not alone
 here. Pool-wide metadata are stored 'copies+2' times (but no more than
 three) and dataset-wide metadata are stored 'copies+1' times (but no
 more than three), so by default pool metadata have three copies and
 dataset metadata have two copies, AFAIR. When you lose root node of a
 tree, you lose all your data, are you really, really sure only one copy
 is enough?

Interesting.  But no.  There is not only one copy as long as you have pool
redundancy.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-10 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of David Magda

 Knowing exactly how the math (?) works is not necessary, but understanding

Understanding the math is not necessary, but it is pretty easy.  And
unfortunately it becomes kind of necessary because even when you tell
somebody the odds of a collision are a zillion times smaller than the odds
of our Sun exploding and destroying Earth, they still don't believe you.

The explanation of the math, again, is described in the wikipedia article
Birthday Problem, or stated a little more simply here:

Given a finite pool of N items, pick one at random and return it to the
pool.
Pick another one.  The odds of it being the same as the first are 1/N.
Pick another one.  The odds of it being the same as the first are 1/N, and
the odds of it being the same as the 2nd are 1/N.  So the odds of it
matching any of the prior picks are 2/N.
Pick another one.  The odds of it being the same as any previous pick are
3/N.

If you repeatedly draw M items out of the pool (plus the first draw),
returning them each time, then the odds of any draw matching any other draw
are:
P = 1/N + 2/N + 3/N + ... + M/N
P = ( sum(1 to M) ) / N

Note:  If you google for "sum positive integers", you'll find sum(1 to N) =
N * (N+1) / 2

P = M * (M+1) / (2N)

In the context of hash collisions in a zpool, M would be the number of data
blocks in your zpool, and N would be all the possible hashes.  A sha-256
hash has 256 bits, so N = 2^256

I described an excessively large worst-case zpool in my other email, which
had 2^35 data blocks in it.  So...
M = 2^35

So the probability of any block hash colliding with any other hash in that
case is
2^35 * (2^35+1) / (2*2^256)
= ( 2^70 + 2^35 ) * 2^-257
= 2^-187 + 2^-222
~= 5.1E-57

There are an estimated 8.87E49 atoms in planet Earth.  (
http://pages.prodigy.net/jhonig/bignum/qaearth.html )

The probability of a collision in your worst-case unrealistic dataset as
described, is even 100 million times less likely than randomly finding a
single specific atom in the whole planet Earth by pure luck.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-10 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
 
 ~= 5.1E-57

Bah.  My math is wrong.  I was never very good at PS.  I'll ask someone at
work tomorrow to look at it and show me the folly.  Wikipedia has it right,
but I can't evaluate numbers to the few-hundredth power in any calculator
that I have handy.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-09 Thread Pawel Jakub Dawidek
On Fri, Jan 07, 2011 at 03:06:26PM -0800, Brandon High wrote:
 On Fri, Jan 7, 2011 at 11:33 AM, Robert Milkowski mi...@task.gda.pl wrote:
  end-up with the block A. Now if B is relatively common in your data set you
  have a relatively big impact on many files because of one corrupted block
  (additionally from a fs point of view this is a silent data corruption).
  Without dedup if you get a single block corrupted silently an impact usually
  will be relatively limited.
 
 A pool can be configures so that a dedup'd block will only be
 referenced a certain number of times. So if you write out 10,000
 identical blocks, it may be written 10 times with each duplicate
 referenced 1,000 times. The exact number is controlled by the
 dedupditto property for your pool, and you should set it as your risk
 tolerance allows.

Dedupditto doesn't work exactly that way. You can have at most 3 copies
of your block. The dedupditto minimal value is 100. The first copy is
created on the first write, the second copy is created at 'dedupditto'
references, and the third copy is created at 'dedupditto * dedupditto'
references. So once you reach 10,000 references of your block, ZFS will
create three physical copies, not earlier and never more than three.
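
To make that policy concrete, a tiny illustrative helper (hypothetical Python,
not ZFS code) that returns the number of physical copies after a given number
of references:

def physical_copies(references, dedupditto=100):
    # Per the description above: 1 copy on the first write, a 2nd copy at
    # 'dedupditto' references, a 3rd at dedupditto*dedupditto, never more.
    if references >= dedupditto * dedupditto:
        return 3
    if references >= dedupditto:
        return 2
    return 1

print(physical_copies(99), physical_copies(100), physical_copies(10000))   # 1 2 3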

-- 
Pawel Jakub Dawidek   http://www.wheelsystems.com
p...@freebsd.org   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!


pgp8xQSJhrMH1.pgp
Description: PGP signature
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-09 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Pawel Jakub Dawidek
 
 Dedupditto doesn't work exactly that way. You can have at most 3 copies
 of your block. Dedupditto minimal value is 100. The first copy is
 created on first write, the second copy is created on dedupditto
 references and the third copy is created on 'dedupditto * dedupditto'
 references. So once you reach 10,000 references of your block ZFS will
 create three physical copies, not earlier and never more than three.

What is the point of dedupditto?  If there is a block on disk, especially on
a pool with redundancy so it can safely be assumed good now and for the
future...   Why store the multiples?  Even if it is a maximum of 3, I
presently only see the sense in a maximum of 1.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-09 Thread Peter Taps
Thank you all for your help. I am the OP.

I haven't looked at the link that talks about the probability of collision. 
Intuitively, I still wonder how the chances of collision can be so low. We are 
reducing a 4K block to just 256 bits. If the chances of collision are so low, 
*theoretically* it is  possible to reconstruct the original block from the 
256-bit signature by using a simple lookup. Essentially, we would now have 
world's best compression algorithm irrespective of whether the data is text or 
binary. This is hard to digest.

Peter
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-09 Thread Eric D. Mudama

On Sun, Jan  9 at 22:54, Peter Taps wrote:

Thank you all for your help. I am the OP.

I haven't looked at the link that talks about the probability of
collision. Intuitively, I still wonder how the chances of collision
can be so low. We are reducing a 4K block to just 256 bits. If the
chances of collision are so low, *theoretically* it is possible to
reconstruct the original block from the 256-bit signature by using a
simple lookup. Essentially, we would now have world's best
compression algorithm irrespective of whether the data is text or
binary. This is hard to digest.

Peter


"simple lookup" isn't so simple when there are 2^256 records to
search, however, fundamentally your understanding of hashes is
correct.

That being said, while at some point people might identify two
commonly-used blocks with the same hash (e.g. system library files or
other) the odds of it happening are extremely low.  Random
google-result website calculates you as needing ~45 exabytes in your
pool of 4KB chunk deduped data before you get to a ~10^-17 chance of a
hash collision:

http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/145-de-dupe-hash-collisions.html

Now, obviously the above is in the context of having to restore from
backup, which is rare, however in live usage I don't think the math
changes a whole lot.



--
Eric D. Mudama
edmud...@mail.bounceswoosh.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-08 Thread Robert Milkowski

 On 01/ 7/11 09:02 PM, Pawel Jakub Dawidek wrote:

On Fri, Jan 07, 2011 at 07:33:53PM +, Robert Milkowski wrote:


Now what if block B is a meta-data block?

Metadata is not deduplicated.


Good point, but then it depends on perspective.
What if you are storing lots of VMDKs?
One corrupted block which is shared among hundreds of VMDKs will affect 
all of them.

And it might be a block containing meta-data information within vmdk...

Anyway, green or not, imho if in a given environment turning 
verification on still delivers acceptable performance then I would 
basically turn it on.


In other environments it is about risk assessment.

Best regards,
 Robert Milkowski
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-08 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Robert Milkowski
 
 What if you are storing lots of VMDKs?
 One corrupted block which is shared among hundreds of VMDKs will affect
 all of them.
 And it might be a block containing meta-data information within vmdk...

Although the probability of hash collision is astronomically infinitesimally
small, if it were to happen, the damage could be expansive and
unrecoverable.  Even backups could not protect you, because the corruption
would be replicated undetected into your backups too.  Just like other
astronomical events (meteors, supernova, etc) which could destroy all your
data, all your backups, and kill everyone you know, if these risks are not
acceptable to you, you need to take precautions against it.  But you have to
weigh the odds of damage versus the cost of protection.  Admittedly,
precautions against nuclear strike are more costly to implement than
precaution against hash collision (enabling verification is a trivial task).
But that does not mean enabling verification comes without cost.

Has anybody measured the cost of enabling or disabling verification?

The cost of disabling verification is an infinitesimally small number
multiplied by possibly all your data.  Basically lim→0 times lim→infinity.
This can only be evaluated on a case-by-case basis and there's no use in
making any more generalizations in favor or against it.

The benefit of disabling verification would presumably be faster
performance.  Has anybody got any measurements, or even calculations or
vague estimates or clueless guesses, to indicate how significant this is?
How much is there to gain by disabling verification?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-08 Thread Bob Friesenhahn

On Thu, 6 Jan 2011, David Magda wrote:


If you're not worried about disk read errors (and/or are not experiencing
them), then you shouldn't be worried about hash collisions.


Except for the little problem that if there is a collision then there 
will always be a collision for the same data and it is unavoidable. 
:-)


Bit rot is a different sort of problem.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-07 Thread Bakul Shah
On Thu, 06 Jan 2011 22:42:15 PST Michael DeMan sola...@deman.com  wrote:
 To be quite honest, I too am skeptical about using de-dupe just based
 on SHA256.  In prior posts it was asked that the potential adopter of the
 technology provide the mathematical reason to NOT use SHA-256 only.  However,
 if Oracle believes that it is adequate to do that, would it be possible for
 somebody to provide:
 
 (A) The theoretical documents and associated mathematics specific to say one 
 simple use case?

See http://en.wikipedia.org/wiki/Birthday_problem -- in
particular see section 5.1 and the probability table of
section 3.4.

 On Jan 6, 2011, at 10:05 PM, Edward Ned Harvey wrote:
 
  I have been told that the checksum value returned by Sha256 is almost
  guaranteed to be unique. In fact, if Sha256 fails in some case, we have a
  bigger problem such as memory corruption, etc. Essentially, adding
  verification to sha256 is an overkill.

Agreed.

  Someone please correct me if I'm wrong.

OK :-)

  Suppose you have 128TB of data.  That is ...  you have 2^35 unique 4k blocks
  of uniformly sized data.  Then the probability you have any collision in
  your whole dataset is (sum(1 thru 2^35))*2^-256 
  Note: sum of integers from 1 to N is  (N*(N+1))/2
  Note: 2^35 * (2^35+1) = 2^35 * 2^35 + 2^35 = 2^70 + 2^35
  Note: (N*(N+1))/2 in this case = 2^69 + 2^34
  So the probability of data corruption in this case, is 2^-187 + 2^-222 ~=
  5.1E-57 + 1.5E-67
  
  ~= 5.1E-57

I believe this is wrong. See the wikipedia article referenced
above.

p(n,d) = 1 - d!/(d^n*(d-n)!)

In your example n = 2^35, d = 2^256.  If you extrapolate the
256 bits row of the probability table of section 3.1, it is
somewhere between 10^-48 and 10^-51. 

This may be easier to grasp: to get a 50% probability of a
collision with sha256, you need 4*10^38 blocks. For a
probability similar to disk error rates (10^-15), you need
1.5*10^31 blocks.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-07 Thread Darren J Moffat

On 06/01/2011 23:07, David Magda wrote:

On Jan 6, 2011, at 15:57, Nicolas Williams wrote:


Fletcher is faster than SHA-256, so I think that must be what you're
asking about: can Fletcher+Verification be faster than
Sha256+NoVerification?  Or do you have some other goal?


Would running on recent T-series servers, which have on-die crypto units, 
help any in this regard?


The on chip SHA-256 implementation is not yet used see:

http://blogs.sun.com/darren/entry/improving_zfs_dedup_performance_via

Note that the fix I integrated only uses a software implementation of 
SHA256 on the T5120 (UltraSPARC T2) and is not (yet) using the on CPU 
hardware implementation of SHA256.  The reason for this is to do with 
boot time availability of the Solaris Cryptographic Framework and the 
need to have ZFS as the root filesystem.


Not yet changed it turns out to be quite complicated to fix due to very 
early boot issues.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-07 Thread Sašo Kiselkov
On 01/07/2011 10:26 AM, Darren J Moffat wrote:
 On 06/01/2011 23:07, David Magda wrote:
 On Jan 6, 2011, at 15:57, Nicolas Williams wrote:

 Fletcher is faster than SHA-256, so I think that must be what you're
 asking about: can Fletcher+Verification be faster than
 Sha256+NoVerification?  Or do you have some other goal?

 Would running on recent T-series servers, which have on-die
 crypto units, help any in this regard?

 The on chip SHA-256 implementation is not yet used see:

 http://blogs.sun.com/darren/entry/improving_zfs_dedup_performance_via

 Note that the fix I integrated only uses a software implementation of
 SHA256 on the T5120 (UltraSPARC T2) and is not (yet) using the on CPU
 hardware implementation of SHA256.  The reason for this is to do with
 boot time availability of the Solaris Cryptographic Framework and the
 need to have ZFS as the root filesystem.

 Not yet changed it turns out to be quite complicated to fix due to
 very early boot issues.

Would it be difficult to implement both methods and allow ZFS to switch
to the hardware-accelerated crypto backend at runtime after it has been
brought up and initialized? It seems like one heck of a feature
(essentially removing most of the computational complexity of dedup).

--
Saso
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-07 Thread Darren J Moffat

On 07/01/2011 11:56, Sašo Kiselkov wrote:

On 01/07/2011 10:26 AM, Darren J Moffat wrote:

On 06/01/2011 23:07, David Magda wrote:

On Jan 6, 2011, at 15:57, Nicolas Williams wrote:


Fletcher is faster than SHA-256, so I think that must be what you're
asking about: can Fletcher+Verification be faster than
Sha256+NoVerification?  Or do you have some other goal?


Would running on recent T-series servers, which have on-die
crypto units, help any in this regard?


The on chip SHA-256 implementation is not yet used see:

http://blogs.sun.com/darren/entry/improving_zfs_dedup_performance_via

Note that the fix I integrated only uses a software implementation of
SHA256 on the T5120 (UltraSPARC T2) and is not (yet) using the on CPU
hardware implementation of SHA256.  The reason for this is to do with
boot time availability of the Solaris Cryptographic Framework and the
need to have ZFS as the root filesystem.

Not yet changed it turns out to be quite complicated to fix due to
very early boot issues.


Would it be difficult to implement both methods and allow ZFS to switch
to the hardware-accelerated crypto backend at runtime after it has been
brought up and initialized? It seems like one heck of a feature


Whether it is difficult or not depends on your level of familiarity with 
ZFS, boot and the cryptographic framework ;-)


For me no it wouldn't be difficult but it still isn't completely trivial.


(essentially removing most of the computational complexity of dedup).


Most of the data I've seen on the performance impact of dedup is not 
coming from the SHA256 computation; it is mostly about the additional IO 
to deal with the DDT.  Though lowering the overhead that SHA256 does 
add is always a good thing.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-07 Thread Sašo Kiselkov
On 01/07/2011 01:15 PM, Darren J Moffat wrote:
 On 07/01/2011 11:56, Sašo Kiselkov wrote:
 On 01/07/2011 10:26 AM, Darren J Moffat wrote:
 On 06/01/2011 23:07, David Magda wrote:
 On Jan 6, 2011, at 15:57, Nicolas Williams wrote:

 Fletcher is faster than SHA-256, so I think that must be what you're
 asking about: can Fletcher+Verification be faster than
 Sha256+NoVerification?  Or do you have some other goal?

 Would running on recent T-series servers, which have on-die
 crypto units, help any in this regard?

 The on chip SHA-256 implementation is not yet used see:

 http://blogs.sun.com/darren/entry/improving_zfs_dedup_performance_via

 Note that the fix I integrated only uses a software implementation of
 SHA256 on the T5120 (UltraSPARC T2) and is not (yet) using the on CPU
 hardware implementation of SHA256.  The reason for this is to do with
 boot time availability of the Solaris Cryptographic Framework and the
 need to have ZFS as the root filesystem.

 Not yet changed it turns out to be quite complicated to fix due to
 very early boot issues.

 Would it be difficult to implement both methods and allow ZFS to switch
 to the hardware-accelerated crypto backend at runtime after it has been
 brought up and initialized? It seems like one heck of a feature

 Whether it is difficult or not depends on your level of familiarity
 with ZFS, boot and the cryptographic framework ;-)

 For me no it wouldn't be difficult but it still isn't completely trivial.

 (essentially removing most of the computational complexity of dedup).

 Most of the data I've seen on the performance impact of dedup is not
 about the SHA256 computation; it is mostly about the additional
 I/O needed to deal with the DDT.  Though lowering whatever overhead
 SHA256 does add is always a good thing.

Well, seeing as all mainline ZFS development is now happening behind
closed doors, all I can really do is ask for features and hope Oracle
implements them :-). Nevertheless, thanks for the clarification.

BR,
--
Saso
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-07 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Bakul Shah
 
 See http://en.wikipedia.org/wiki/Birthday_problem -- in
 particular see section 5.1 and the probability table of
 section 3.4.

They say "The expected number of n-bit hashes that can be generated before
getting a collision is not 2^n, but rather only 2^(n/2)."

While that is not necessarily incorrect, the operative word is "expected."
They're looking for some probability threshold...  I did not assign any
particular threshold to call something expected or not expected.  I
simply calculated the probability, and compared it to the number of
molecules in the Earth.  ;-)

Two paragraphs earlier, they show a formula, "The generic [probability] can
be derived using..."
p(n;d) ~= 1 - ((d-1)/d)^(n*(n-1)/2)

In the case that I looked at, d=2^256 and n=2^35, and p(n;d) comes out so
small that Excel simply rounds it down to 0.
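
For anyone who wants to reproduce the number instead of watching it
underflow, here is a quick Python sketch of the same formula (purely
illustrative, just re-doing the arithmetic above at higher precision):

    # Birthday-bound approximation p(n;d) ~= 1 - ((d-1)/d)^(n*(n-1)/2),
    # evaluated with the decimal module so the tiny result doesn't
    # underflow to 0 the way a spreadsheet double does.
    from decimal import Decimal, getcontext

    getcontext().prec = 120     # keep enough digits that 1/d ~ 8.6e-78 survives
    d = Decimal(2) ** 256       # possible SHA-256 values
    n = Decimal(2) ** 35        # number of blocks hashed
    pairs = n * (n - 1) / 2     # pairs of blocks that could collide
    p = 1 - (((d - 1) / d).ln() * pairs).exp()
    print(p)                    # ~5.1E-57

Since pairs is so tiny compared to d, the whole expression collapses to
roughly pairs/d anyway, which you can do in your head as 2^69 / 2^256.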


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-07 Thread David Magda
On Fri, January 7, 2011 04:26, Darren J Moffat wrote:
 On 06/01/2011 23:07, David Magda wrote:

 Would running on recent T-series servers, which have on-die crypto
 units, help any in this regard?

 The on chip SHA-256 implementation is not yet used see:

 http://blogs.sun.com/darren/entry/improving_zfs_dedup_performance_via

 Note that the fix I integrated only uses a software implementation of
 SHA256 on the T5120 (UltraSPARC T2) and is not (yet) using the on CPU
 hardware implementation of SHA256.  The reason for this is to do with
 boot time availability of the Solaris Cryptographic Framework and the
 need to have ZFS as the root filesystem.

 Not yet changed it turns out to be quite complicated to fix due to very
 early boot issues.

Out of curiosity, would a two-stage system be practical: use the
software-only code for boot, and then switch to hardware assistance (if
available) later on?

Generally a few extra CPU cycles wouldn't be a big deal at start-up (not
much else is running), but once you get to multi-user mode, the hardware
should be ready to go and the CPU offloading would be more pertinent. It's
necessary to start off in software-only mode to a certain extent anyway,
since there are still plenty of machines out there without crypto (the
M-series servers, as well as x86).

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-07 Thread David Magda
On Fri, January 7, 2011 01:42, Michael DeMan wrote:
 Then - there is the other side of things.  The 'black swan' event.  At
 some point, given percentages on a scenario like the example case above,
 one simply has to make the business justification case internally at their
 own company about whether to go SHA-256 only or Fletcher+Verification?
 Add Murphy's Law to the 'black swan event' and of course the only data
 that is lost is that .01% of your data that is the most critical?

The other thing to note is that by default (with de-dupe disabled), ZFS
uses Fletcher checksums to detect data corruption. Add to that the fact
that most other file systems don't have any data checksums at all, and
simply rely on disks having a bit error rate of (at best) 10^-16.

Given the above: most people are content enough to trust Fletcher to guard
against data corruption, but are worried about SHA-256 causing 'data
corruption' when it comes to de-dupe? The entire rest of the computing world
is content to live with 10^-15 (for SAS disks), and yet one wouldn't be
prepared to accept 10^-30 (or better) for dedupe?

I certainly can understand caution, but given the numbers involved, you're
more likely to be hit by lighting a few times in a lifetime before a
collision occurs for a reasonably sized ZFS pool. :)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-07 Thread Michael DeMan

On Jan 7, 2011, at 6:13 AM, David Magda wrote:

 On Fri, January 7, 2011 01:42, Michael DeMan wrote:
 Then - there is the other side of things.  The 'black swan' event.  At
 some point, given percentages on a scenario like the example case above,
 one simply has to make the business justification case internally at their
 own company about whether to go SHA-256 only or Fletcher+Verification?
 Add Murphy's Law to the 'black swan event' and of course the only data
 that is lost is that .01% of your data that is the most critical?
 
 The other thing to note is that by default (with de-dupe disabled), ZFS
 uses Fletcher checksums to prevent data corruption. Add also the fact all
 other file systems don't have any checksums, and simply rely on the fact
 that disks have a bit error rate of (at best) 10^-16.
 
Agreed - but I think it is still missing the point of what the original poster 
was asking about.

In all honesty I think the debate is a business decision - the highly 
improbable vs. certainty.

Somebody somewhere must have written this stuff up, along with simple use cases?
Perhaps even a new acronym?  MTBC - mean time before collision?

And even with the 'certainty' factor being the choice - other things like
human error come into play and are far riskier?




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-07 Thread Nicolas Williams
On Fri, Jan 07, 2011 at 06:39:51AM -0800, Michael DeMan wrote:
 On Jan 7, 2011, at 6:13 AM, David Magda wrote:
  The other thing to note is that by default (with de-dupe disabled), ZFS
  uses Fletcher checksums to prevent data corruption. Add also the fact all
  other file systems don't have any checksums, and simply rely on the fact
  that disks have a bit error rate of (at best) 10^-16.
 
 Agreed - but I think it is still missing the point of what the
 original poster was asking about.
 
 In all honesty I think the debate is a business decision - the highly
 improbable vs. certainty.

The OP seemed to be concerned that SHA-256 is particularly slow, so the
business decision here would involve a performance vs. error rate
trade-off.

Now, unless you have highly deduplicatious data, a workload with a high
cache hit ratio in the ARC for DDT entries, and a fast ZIL device, I
suspect that the I/O costs of dedup dominate the cost of the hash
function, which means: the above business trade-off is not worthwhile, as
one would be trading a tiny uptick in error rates for a small uptick in
performance.  Before you even get to where you're making such a decision
you'll want to have invested in plenty of RAM, L2ARC and fast ZIL device
capacity -- and for those making such an investment I suspect that the
OP's trade-off won't seem worthwhile.

BTW, note that verification isn't guaranteed to have a zero error
rate...  Imagine a) a block being written collides with a different
block already in the pool, b) bit rot on disk in that colliding block
such that the on-disk block matches the new block, c) on a mirrored vdev
such that you might get one or another version of the block in question,
randomly.  Such an error requires monumentally bad luck to happen at
all.

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-07 Thread Casper . Dik

On Fri, January 7, 2011 01:42, Michael DeMan wrote:
 Then - there is the other side of things.  The 'black swan' event.  At
 some point, given percentages on a scenario like the example case above,
 one simply has to make the business justification case internally at their
 own company about whether to go SHA-256 only or Fletcher+Verification?
 Add Murphy's Law to the 'black swan event' and of course the only data
 that is lost is that .01% of your data that is the most critical?

The other thing to note is that by default (with de-dupe disabled), ZFS
uses Fletcher checksums to prevent data corruption. Add also the fact all
other file systems don't have any checksums, and simply rely on the fact
that disks have a bit error rate of (at best) 10^-16.

Given the above: most people are content enough to trust Fletcher to not
have data corruption, but are worried about SHA-256 giving 'data
corruption' when it comes de-dupe? The entire rest of the computing world
is content to live with 10^-15 (for SAS disks), and yet one wouldn't be
prepared to have 10^-30 (or better) for dedupe?


I would; we're not talking about flipping bits but about the OS comparing
data using just the checksums and replacing one set with another.

You might want to create a file to show how weak fletcher really is,
but two such files wouldn't be stored correctly on a de-dup zpool
unless you verify.




Casper

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-07 Thread Robert Milkowski

 On 01/ 7/11 02:13 PM, David Magda wrote:


Given the above: most people are content enough to trust Fletcher to not
have data corruption, but are worried about SHA-256 giving 'data
corruption' when it comes de-dupe? The entire rest of the computing world
is content to live with 10^-15 (for SAS disks), and yet one wouldn't be
prepared to have 10^-30 (or better) for dedupe?



I think you do not entirely understand the problem.
Let's say two different blocks A and B have the same sha256 checksum; A
is already stored in a pool and B is being written. With dedup enabled
and without verify, B won't be written. The next time you ask for block
B you will actually end up with block A. Now, if B is relatively common
in your data set, you have a relatively big impact on many files because
of one corrupted block (and from a fs point of view this is silent data
corruption). Without dedup, if you get a single block corrupted silently,
the impact will usually be relatively limited.


Now what if block B is a meta-data block?

The point is that the potential impact of a hash collision is much bigger
than that of a single silent corruption of one block. Not to mention that,
dedup or not, all the other possible causes of data corruption are there
anyway; adding yet another one might or might not be acceptable.



--
Robert Milkowski
http://milek.blogspot.com


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-07 Thread David Magda
On Fri, January 7, 2011 14:33, Robert Milkowski wrote:
   On 01/ 7/11 02:13 PM, David Magda wrote:

 Given the above: most people are content enough to trust Fletcher to not
 have data corruption, but are worried about SHA-256 giving 'data
 corruption' when it comes de-dupe? The entire rest of the computing
 world
 is content to live with 10^-15 (for SAS disks), and yet one wouldn't be
 prepared to have 10^-30 (or better) for dedupe?


 I think you do not entirely understand the problem.
[...]

I was aware of all of that.

 Now what if block B is a meta-data block?

 The point is that a potential impact of a hash collision is much bigger
 than a single silent data corruption to a block, not to mention that
 dedup or not all the other possible cases of data corruption are there
 anyway, adding yet another one might or might not be acceptable.

And my point is that people are walking around the file hierarchy tree
completely blind on most other file systems and do not worry about it at
all. They could be grabbing inappropriate data at a rate that is several
orders of magnitude more likely than a hash collision and sleeping fine at
night, and yet are worried about something that is roughly 10^20 times
less likely.

Personally the potential impact of being hit by lightning is much bigger
to me than anything that happens at my job, and the odds of that are
about 1 in 170,000 (2^-17 to 2^-18). If I'm not worried about that, why
the hell should I be worried about something that is 2^-128?

If then-Sun/now-Oracle thought it was a problem they wouldn't have made
'on' an alias for sha256.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-07 Thread Pawel Jakub Dawidek
On Fri, Jan 07, 2011 at 07:33:53PM +, Robert Milkowski wrote:
  On 01/ 7/11 02:13 PM, David Magda wrote:
 
 Given the above: most people are content enough to trust Fletcher to not
 have data corruption, but are worried about SHA-256 giving 'data
 corruption' when it comes de-dupe? The entire rest of the computing world
 is content to live with 10^-15 (for SAS disks), and yet one wouldn't be
 prepared to have 10^-30 (or better) for dedupe?
 
 
 I think you do not entirely understand the problem.
 Lets say two different blocks A and B have the same sha256 checksum, A 
 is already stored in a pool, B is being written. Without verify and 
 dedup enabled B won't be written. Next time you ask for block B you will 
 actually end-up with the block A. Now if B is relatively common in your 
 data set you have a relatively big impact on many files because of one 
 corrupted block (additionally from a fs point of view this is a silent 
 data corruption). [...]

All true; that's why verification was mandatory for fletcher, which is
not a cryptographically strong hash. As long as SHA256 is not broken,
burning power on verification is just a waste of resources, which isn't
green :) Once SHA256 is broken, verification can be turned on.

 [...] Without dedup if you get a single block corrupted 
 silently an impact usually will be relatively limited.

Except when corruption happens on write, not read, i.e. you write data,
it is corrupted on the fly, but the corrupted version still matches the
fletcher checksum (the default now). From then on, every read of this
block will return silently corrupted data.

 Now what if block B is a meta-data block?

Metadata is not deduplicated.

 The point is that a potential impact of a hash collision is much bigger 
 than a single silent data corruption to a block, not to mention that 
 dedup or not all the other possible cases of data corruption are there 
 anyway, adding yet another one might or might not be acceptable.

I'm more of the opinion that it was a mistake that the verification
feature wasn't removed along with the fletcher-for-dedup removal. It is
good to be able to turn verification on if/when SHA256 is broken - that's
the only reason I'd leave it - but I somehow feel there is a bigger chance
of corrupting your data because of the extra code complexity that comes
with verification than because of a SHA256 collision.

-- 
Pawel Jakub Dawidek   http://www.wheelsystems.com
p...@freebsd.org   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-07 Thread Brandon High
On Fri, Jan 7, 2011 at 11:33 AM, Robert Milkowski mi...@task.gda.pl wrote:
 end-up with the block A. Now if B is relatively common in your data set you
 have a relatively big impact on many files because of one corrupted block
 (additionally from a fs point of view this is a silent data corruption).
 Without dedup if you get a single block corrupted silently an impact usually
 will be relatively limited.

A pool can be configured so that a dedup'd block will only be
referenced a certain number of times. So if you write out 10,000
identical blocks, the block may be written 10 times, with each copy
referenced 1,000 times. The threshold is controlled by the
dedupditto property for your pool, and you should set it as your risk
tolerance allows.
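
(For reference, dedupditto is a pool-level property, so it is set with
something along the lines of 'zpool set dedupditto=100 tank' -- the pool
name and the value 100 here are only examples.)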

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-06 Thread Peter Taps
Folks,

I have been told that the checksum value returned by Sha256 is almost 
guaranteed to be unique. In fact, if Sha256 fails in some case, we have a 
bigger problem, such as memory corruption, etc. Essentially, adding 
verification to sha256 is overkill.

Perhaps (Sha256+NoVerification) would work 99.99% of the time. But 
(Fletcher+Verification) would work 100% of the time.

Which one of the two is a better deduplication strategy?

If we do not use verification with Sha256, what is the worst-case scenario?
Is it just more disk space occupied (because of failure to detect duplicate
blocks), or is there a chance of actual data corruption (because two blocks
were assumed to be duplicates although they are not)?

Or, if I go with (Sha256+Verification), how much is the overhead of 
verification on the overall process?

If I do go with verification, it seems (Fletcher+Verification) is more 
efficient than (Sha256+Verification). And both are 100% accurate in detecting 
duplicate blocks.

Thank you in advance for your help.

Peter
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-06 Thread David Magda
On Thu, January 6, 2011 14:44, Peter Taps wrote:
 I have been told that the checksum value returned by Sha256 is almost
 guaranteed to be unique. In fact, if Sha256 fails in some case, we have a
 bigger problem such as memory corruption, etc. Essentially, adding
 verification to sha256 is an overkill.

 Perhaps (Sha256+NoVerification) would work 99.99% of the time. But
 (Fletcher+Verification) would work 100% of the time.

 Which one of the two is a better deduplication strategy?

The ZFS default is what you should be using unless you can explain
(technically, and preferably mathematically) why you should use something
else.

I'm guessing you're using 99.99% as a 'literary gesture', and
haven't done the math. The above means that you have a 0.001% or 10^-7
chance of having a collision.

The reality is that the odds are actually 10^-77 (~ 2^-256; see [1] though):

http://blogs.sun.com/bonwick/entry/zfs_dedup

As a form of comparison, the odds of having a non-recoverable bit error
from a hard disk are about 10^-15 for SAS disks and 10^-14 for SATA disks.
So you're roughly sixty orders of magnitude more likely to have a disk read
error than to get a collision from SHA-256.

If you're not worried about disk read errors (and/or are not experiencing
them), then you shouldn't be worried about hash collisions.

TL;DR: set dedup=on and forget about it.

Some more discussion as it relates to some backup dedupe appliances (the
principles are the same):

http://tinyurl.com/36369pb
http://www.backupcentral.com/mr-backup-blog-mainmenu-47/13-mr-backup-blog/145-de-dupe-hash-collisions.html


[1] It may actually be 10^-38 (2^-128) or so because of the birthday
paradox, but we're still talking unlikely. You have a better chance of
dying from lightning or being attacked by a mountain lion:

http://www.blog.joelx.com/odds-chances-of-dying/877/
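
For anyone wondering where the 10^-77 and 10^-38 exponents come from, it is
just a change of base from powers of two to powers of ten; a two-line Python
check, purely for illustration:

    from math import log10

    print(256 * log10(2))   # ~77.06, so 2^-256 is roughly 10^-77
    print(128 * log10(2))   # ~38.53, so 2^-128 is roughly 10^-38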

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-06 Thread Robert Milkowski

 On 01/ 6/11 07:44 PM, Peter Taps wrote:

Folks,

I have been told that the checksum value returned by Sha256 is almost 
guaranteed to be unique. In fact, if Sha256 fails in some case, we have a 
bigger problem such as memory corruption, etc. Essentially, adding verification 
to sha256 is an overkill.

Perhaps (Sha256+NoVerification) would work 99.99% of the time. But 
(Fletcher+Verification) would work 100% of the time.

Which one of the two is a better deduplication strategy?

If we do not use verification with Sha256, what is the worst case scenario? Is 
it just more disk space occupied (because of failure to detect duplicate 
blocks) or there is a chance of actual data corruption (because two blocks were 
assumed to be duplicate although they are not)?


Yes, there is a possibility of data corruption.


Or, if I go with (Sha256+Verification), how much is the overhead of 
verification on the overall process?


It really depends on your specific workload.
If your application is mostly reading data then it may well be that you 
won't even notice verify.


Sha256 is supposed to be almost bullet-proof, but...
At the end of the day it is all about how much you value your data.
As I wrote before, try with verify and see if the performance is 
acceptable. It may well be.

You can always disable verify at any time.
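
(For reference, the dedup property takes the values on, off, verify, sha256
and sha256,verify, so e.g. 'zfs set dedup=sha256,verify tank' turns
verification on and 'zfs set dedup=sha256 tank' turns it back off -- the
dataset name is only an example, and as with any ZFS property change it
only affects newly written blocks.)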


If I do go with verification, it seems (Fletcher+Verification) is more 
efficient than (Sha256+Verification). And both are 100% accurate in detecting 
duplicate blocks.

I don't believe that fletcher is still allowed for dedup - right now it 
is only sha256.


--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-06 Thread Richard Elling
On Jan 6, 2011, at 11:44 AM, Peter Taps wrote:

 Folks,
 
 I have been told that the checksum value returned by Sha256 is almost 
 guaranteed to be unique. In fact, if Sha256 fails in some case, we have a 
 bigger problem such as memory corruption, etc. Essentially, adding 
 verification to sha256 is an overkill.

I disagree. I do not believe you can uniquely identify all possible 
permutations of 1 million
bits using only 256 bits.

 Perhaps (Sha256+NoVerification) would work 99.99% of the time. But 
 (Fletcher+Verification) would work 100% of the time.
 
 Which one of the two is a better deduplication strategy?

If you love your data, always use verify=on

 If we do not use verification with Sha256, what is the worst case scenario? 
 Is it just more disk space occupied (because of failure to detect duplicate 
 blocks) or there is a chance of actual data corruption (because two blocks 
 were assumed to be duplicate although they are not)?

If you do not use verify=on, you risk repeatable data corruption.  

In some postings you will find claims of the odds of a collision being
1 in 2^256, plus or minus.  This is correct.  However, they will then
compare this to the odds of a disk read error.  There is an important
difference: the disk error is likely to be noticed, but a collision is
completely silent without the verify option.  This is why it is a
repeatable problem, different from hardware failures, which are not
repeatable.  Accepting repeatable and silent data corruption is a very
bad tradeoff, IMNSHO.

 Or, if I go with (Sha256+Verification), how much is the overhead of 
 verification on the overall process?

In my experience, I see little chance that a verification will be used. As 
above,
you might run into a collision, but it will be rare.

 If I do go with verification, it seems (Fletcher+Verification) is more 
 efficient than (Sha256+Verification). And both are 100% accurate in detecting 
 duplicate blocks.

Yes.  Fletcher with verification will be more performant than sha-256.
However, that option is not available in the Solaris releases.
 -- richard


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-06 Thread Nicolas Williams
On Thu, Jan 06, 2011 at 11:44:31AM -0800, Peter Taps wrote:
 I have been told that the checksum value returned by Sha256 is almost
 guaranteed to be unique.

All hash functions are guaranteed to have collisions [for inputs larger
than their output anyways].

  In fact, if Sha256 fails in some case, we
 have a bigger problem such as memory corruption, etc. Essentially,
 adding verification to sha256 is an overkill.

What makes a hash function cryptographically secure is not impossibility
of collisions, but computational difficulty of finding arbitrary
colliding input pairs, collisions for known inputs, second pre-images,
and first pre-images.  Just because you can't easily find collisions on
purpose doesn't mean that you can't accidentally find collisions.

That said, if the distribution of SHA-256 is even enough then your
chances of finding a collision by accident are so remote (one in 2^128)
that you could reasonably decide that you don't care.

 Perhaps (Sha256+NoVerification) would work 99.99% of the time. But
 (Fletcher+Verification) would work 100% of the time.

Fletcher is faster than SHA-256, so I think that must be what you're
asking about: can Fletcher+Verification be faster than
Sha256+NoVerification?  Or do you have some other goal?

Assuming I guessed correctly...  The speed of the hash function isn't
significant compared to the cost of the verification I/O, period, end of
story.  So, SHA-256 w/o verification will be faster than Fletcher +
Verification -- lots faster if you have particularly deduplicatious data
to write.  Moreover, SHA-256 + verification will likely be somewhat
faster than Fletcher + verification because SHA-256 will likely have
fewer collisions than Fletcher, and the cost of I/O dominates the cost
of the hash functions.
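
To put rough numbers on that -- and these are my own back-of-envelope
assumptions, not measurements from this thread -- compare hashing one block
to the extra read that verification implies:

    # Assumed ballpark figures: ~200 MB/s for software SHA-256 on one core
    # and ~5 ms for a random read from a 7200rpm disk.  Both are assumptions.
    block_bytes = 128 * 1024                # one default-recordsize block
    hash_ms = block_bytes / 200e6 * 1000    # ~0.66 ms to hash the block
    verify_read_ms = 5.0                    # ~5 ms for the verification I/O
    print(hash_ms, verify_read_ms)

Even with generous assumptions the verification read costs several times
more than computing the hash, which is the point being made above.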

 Which one of the two is a better deduplication strategy?
 
 If we do not use verification with Sha256, what is the worst case
 scenario? Is it just more disk space occupied (because of failure to
 detect duplicate blocks) or there is a chance of actual data
 corruption (because two blocks were assumed to be duplicate although
 they are not)?

If you don't verify then you run the risk of corruption on collision,
NOT the risk of using too much disk space.

 Or, if I go with (Sha256+Verification), how much is the overhead of
 verification on the overall process?
 
 If I do go with verification, it seems (Fletcher+Verification) is more
 efficient than (Sha256+Verification). And both are 100% accurate in
 detecting duplicate blocks.

You're confused.  Fletcher may be faster to compute than SHA-256, but
the run-time of both is as nothing compared to latency of the disk I/O
needed for verification, which means that the hash function's rate of
collisions is more important than its computational cost.

(Now, Fletcher is thought to not be a cryptographically secure hash
function, while SHA-256 is, for now, considered cryptographically
secure.  That probably means that the distribution of Fletcher's outputs
over random inputs is not as even as that of SHA-256, which probably
means you can expect more collisions with Fletcher than with SHA-256.
Note that I made no absolute statements in the previous sentence --
that's because I've not read any studies of Fletcher's performance
relative to SHA-256, thus I'm not certain of anything stated in the
previous sentence.)

David Magda's advice is spot on.

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-06 Thread David Magda
On Jan 6, 2011, at 15:57, Nicolas Williams wrote:

 Fletcher is faster than SHA-256, so I think that must be what you're
 asking about: can Fletcher+Verification be faster than
 Sha256+NoVerification?  Or do you have some other goal?

Would running on recent T-series servers, which have on-die crypto units, 
help any in this regard?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-06 Thread Nicolas Williams
On Thu, Jan 06, 2011 at 06:07:47PM -0500, David Magda wrote:
 On Jan 6, 2011, at 15:57, Nicolas Williams wrote:
 
  Fletcher is faster than SHA-256, so I think that must be what you're
  asking about: can Fletcher+Verification be faster than
  Sha256+NoVerification?  Or do you have some other goal?
 
  Would running on recent T-series servers, which have on-die
 crypto units, help any in this regard?

Yes, particularly for larger blocks.

Hash collisions don't matter as long as ZFS verifies dups, so the real
question is: what is the false positive dup rate (i.e., the accidental
collision rate).  But that's going to vary a lot by {hash function,
working data set}, thus it's not possible to make exact determinations,
just estimates.

For me the biggest issue is that as good as Fletcher is for a CRC, I'd
rather have a cryptographic hash function because I've seen incredibly
odd CRC failures before.  There's a famous case from within SWAN a few
years ago where a switch flipped pairs of bits such that all too often
the various CRCs that applied to the moving packets failed to detect the
bit flips, and we discovered this when an SCCS file in a clone of the ON
gate got corrupted.  Such failures (collisions) wouldn't affect dedup,
but they would mask corruption of non-deduped blocks.

Nico
-- 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-06 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Peter Taps
 
 Perhaps (Sha256+NoVerification) would work 99.99% of the time. But

Append 50 more 9's on there. 
99.%

See below.


 I have been told that the checksum value returned by Sha256 is almost
 guaranteed to be unique. In fact, if Sha256 fails in some case, we have a
 bigger problem such as memory corruption, etc. Essentially, adding
 verification to sha256 is an overkill.

Someone please correct me if I'm wrong.  I assume ZFS dedup matches both the
blocksize and the checksum right?  A simple checksum collision (which is
astronomically unlikely) is still not sufficient to produce corrupted data.
It's even more unlikely than that.

Using the above assumption, here's how you calculate the probability of
corruption if you're not using verification:

Suppose every single block in your whole pool is precisely the same size
(which is unrealistic in the real world, but I'm trying to calculate worst
case.)  Suppose the block is 4K, which is again, unrealistically worst case.
Suppose your dataset is purely random or sequential ... with no duplicated
data ... which is unrealistic because if your data is like that, then why in
the world are you enabling dedupe?  But again, assuming worst case
scenario...  At this point we'll throw in some evil clowns, spit on a voodoo
priestess, and curse the heavens for some extra bad luck.

If you have astronomically large quantities of data, then your
probability of corruption approaches 100%.  With infinite data, eventually
you're guaranteed to have a collision.  So the probability of corruption is
directly related to the total amount of data you have, and the new question
is:  For anything Earthly, how near are you to 0% probability of collision
in reality?

Suppose you have 128TB of data.  That is ...  you have 2^35 unique 4k blocks
of uniformly sized data.  Then the probability you have any collision in
your whole dataset is (sum(1 thru 2^35))*2^-256 
Note: sum of integers from 1 to N is  (N*(N+1))/2
Note: 2^35 * (2^35+1) = 2^35 * 2^35 + 2^35 = 2^70 + 2^35
Note: (N*(N+1))/2 in this case = 2^69 + 2^34
So the probability of data corruption in this case, is 2^-187 + 2^-222 ~=
5.1E-57 + 1.5E-67

~= 5.1E-57
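
For what it's worth, the same arithmetic can be checked exactly in a couple
of lines of Python (again just an illustration, using the worst-case numbers
above):

    # Exact version of the bound above: (sum of 1..2^35) * 2^-256, kept as a
    # rational number and only converted to a float at the very end.
    from fractions import Fraction

    n = 2 ** 35                       # unique 4K blocks in 128TB
    pairs = n * (n + 1) // 2          # sum of the integers 1..n
    p = Fraction(pairs, 2 ** 256)     # each pair collides with probability 2^-256
    print(float(p))                   # ~5.1e-57, i.e. 2^-187 + 2^-222

Keeping it as a Fraction until the final print is what avoids the underflow
problem a spreadsheet runs into.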

In other words, even in the absolute worst case, cursing the gods, running
without verification, using data that's specifically formulated to try and
cause errors, on a dataset that I bet is larger than what you're doing, ...

Before we go any further ... The total number of bits stored on all the
storage in the whole planet is a lot smaller than the total number of
molecules in the planet.

There are an estimated 8.87 * 10^49 molecules in planet Earth.

The probability of a collision in your worst-case unrealistic dataset as
described is still over a million times smaller than the probability of
randomly finding a single specific molecule in the whole planet Earth by
pure luck.



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-06 Thread Michael DeMan
At the end of the day this issue essentially is about mathematical 
improbability versus certainty?

To be quite honest, I too am skeptical about using de-dupe just based on 
SHA256.  In prior posts it was asked that the potential adopter of the 
technology provide the mathematical reason to NOT use SHA-256 only.  However, 
if Oracle believes that it is adequate to do that, would it be possible for 
somebody to provide:

(A) The theoretical documents and associated mathematics specific to say one 
simple use case?
(A1) Total data size is 1PB (lets say the zpool is 2PB to not worry about that 
part of it).
(A2) Daily, 10TB of data is updated, 1TB of data is deleted, and 1TB of data is 
'new'.
(A3) Out of the dataset, 25% of the data is capable of being de-duplicated
(A4) Between A2 and A3 above, the 25% rule from A3 also applies to everything 
in A2.


I think the above would be a pretty 'soft' case for justifying that 
SHA-256 works?  I would presume some simple scenario like this was run 
through the math by somebody inside Oracle/Sun long ago, when first 
proposing that ZFS be funded internally at all?


Then - there is the other side of things.  The 'black swan' event.  At some 
point, given percentages on a scenario like the example case above, one simply 
has to make the business justification case internally at their own company 
about whether to go SHA-256 only or Fletcher+Verification?  Add Murphy's Law to 
the 'black swan event' and of course the only data that is lost is that .01% of 
your data that is the most critical?



Not trying to be aggressive or combative here at all against people's opinions 
and understandings of it all - I would just like to see some hard information 
about it all - it must exist somewhere already?

Thanks,
 
- Mike




On Jan 6, 2011, at 10:05 PM, Edward Ned Harvey wrote:

 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Peter Taps
 
 Perhaps (Sha256+NoVerification) would work 99.99% of the time. But
 
 Append 50 more 9's on there. 
 99.%
 
 See below.
 
 
 I have been told that the checksum value returned by Sha256 is almost
 guaranteed to be unique. In fact, if Sha256 fails in some case, we have a
 bigger problem such as memory corruption, etc. Essentially, adding
 verification to sha256 is an overkill.
 
 Someone please correct me if I'm wrong.  I assume ZFS dedup matches both the
 blocksize and the checksum right?  A simple checksum collision (which is
 astronomically unlikely) is still not sufficient to produce corrupted data.
 It's even more unlikely than that.
 
 Using the above assumption, here's how you calculate the probability of
 corruption if you're not using verification:
 
 Suppose every single block in your whole pool is precisely the same size
 (which is unrealistic in the real world, but I'm trying to calculate worst
 case.)  Suppose the block is 4K, which is again, unrealistically worst case.
 Suppose your dataset is purely random or sequential ... with no duplicated
 data ... which is unrealisic because if your data is like that, then why in
 the world are you enabling dedupe?  But again, assuming worst case
 scenario...  At this point we'll throw in some evil clowns, spit on a voodoo
 priestess, and curse the heavens for some extra bad luck.
 
 If you have astronomically infinite quantities of data, then your
 probability of corruption approaches 100%.  With infinite data, eventually
 you're guaranteed to have a collision.  So the probability of corruption is
 directly related to the total amount of data you have, and the new question
 is:  For anything Earthly, how near are you to 0% probability of collision
 in reality?
 
 Suppose you have 128TB of data.  That is ...  you have 2^35 unique 4k blocks
 of uniformly sized data.  Then the probability you have any collision in
 your whole dataset is (sum(1 thru 2^35))*2^-256 
 Note: sum of integers from 1 to N is  (N*(N+1))/2
 Note: 2^35 * (2^35+1) = 2^35 * 2^35 + 2^35 = 2^70 + 2^35
 Note: (N*(N+1))/2 in this case = 2^69 + 2^34
 So the probability of data corruption in this case, is 2^-187 + 2^-222 ~=
 5.1E-57 + 1.5E-67
 
 ~= 5.1E-57
 
 In other words, even in the absolute worst case, cursing the gods, running
 without verification, using data that's specifically formulated to try and
 cause errors, on a dataset that I bet is larger than what you're doing, ...
 
 Before we go any further ... The total number of bits stored on all the
 storage in the whole planet is a lot smaller than the total number of
 molecules in the planet.
 
 There are estimated 8.87 * 10^49 molecules in planet Earth.
 
 The probability of a collision in your worst-case unrealistic dataset as
 described, is even 100 million times less likely than randomly finding a
 single specific molecule in the whole planet Earth by pure luck.
 
 
 
 ___