...If this is a general rule, maybe it would be worth considering using
SHA-512 truncated to 256 bits to gain more speed...
Doesn't it need more investigation whether truncating a 512-bit hash to 256
bits gives security equivalent to a plain 256-bit hash? Maybe truncation will
introduce some bias?
--
This
Totally Off Topic:
Very interesting. Did you produce any papers on this? Where do you work? It
seems like a very fun place to work!
BTW, I thought about this. What do you say?
Assume I want to compress data and I succeed in doing so. And then I transfer
the compressed data. So all the information
On Tue, Jan 18, 2011 at 07:16:04AM -0800, Orvar Korvar wrote:
BTW, I thought about this. What do you say?
Assume I want to compress data and I succeed in doing so. And then I
transfer the compressed data. So all the information I transferred is
the compressed data. But, then you don't count
On Sat, Jan 15, 2011 at 10:19:23AM -0600, Bob Friesenhahn wrote:
On Fri, 14 Jan 2011, Peter Taps wrote:
Thank you for sharing the calculations. In lay terms, for Sha256,
how many blocks of data would be needed to have one collision?
Two.
Pretty funny.
In this thread some of you are
On Fri, Jan 14, 2011 at 11:32:58AM -0800, Peter Taps wrote:
Ed,
Thank you for sharing the calculations. In lay terms, for Sha256, how many
blocks of data would be needed to have one collision?
Assuming each block is 4K in size, we can probably calculate the final data
size beyond which
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Peter Taps
Thank you for sharing the calculations. In lay terms, for Sha256, how many
blocks of data would be needed to have one collision?
There is no point in making a generalization and a
On Fri, 14 Jan 2011, Peter Taps wrote:
Thank you for sharing the calculations. In lay terms, for Sha256,
how many blocks of data would be needed to have one collision?
Two.
Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick
Ed,
Thank you for sharing the calculations. In lay terms, for Sha256, how many
blocks of data would be needed to have one collision?
Assuming each block is 4K in size, we can probably calculate the final data
size beyond which the collision may occur. This would enable us to make the
I am posting this once again as my previous post went into the middle of the
thread and may go unnoticed.
Ed,
Thank you for sharing the calculations. In lay terms, for Sha256, how many
blocks of data would be needed to have one collision?
Assuming each block is 4K in size, we can probably
On Jan 14, 2011, at 14:32, Peter Taps wrote:
Also, another related question: why was 256 bits chosen and not 128 bits or
512 bits? I guess Sha512 may be overkill. In your formula, how many blocks of
data would be needed to have one collision using Sha128?
There are two ways to get 128
Edward, this is OT, but may I suggest you use something like Wolfram Alpha to
perform your calculations a bit more comfortably?
--
Enrico M. Crisostomo
On Jan 12, 2011, at 4:24, Edward Ned Harvey
opensolarisisdeadlongliveopensola...@nedharvey.com wrote:
For anyone who still cares:
I'm
Edward, this is OT, but may I suggest you use something like Wolfram Alpha
to perform your calculations a bit more comfortably?
Wow, that's pretty awesome. Thanks.
Hey there,
~= 5.1E-57
Bah. My math is wrong. I was never very good at PS. I'll ask someone at
work tomorrow to look at it and show me the folly. Wikipedia has it right,
but I can't evaluate numbers to the few-hundredth power in any calculator
that I have handy.
bc -l <<EOF
scale=150
From: Lassi Tuura [mailto:l...@cern.ch]
bc -l <<EOF
scale=150
define bday(n, h) { return 1 - e(-(n^2)/(2*h)); }
bday(2^35, 2^256)
bday(2^35, 2^256) * 10^57
EOF
Basically, ~5.1 * 10^-57.
Seems your number was correct, although I am not sure how you arrived at
it.
The number was
For anyone who still cares:
I'm calculating the odds of a sha256 collision in an extremely large zpool,
containing 2^35 blocks of data, and no repetitions.
The formula on wikipedia for the birthday problem is:
p(n;d) ~= 1-( (d-1)/d )^( 0.5*n*(n-1) )
In this case,
n=2^35
d=2^256
The problem
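(Evaluating that power form directly is awkward in most calculators, but bc
can do it through exp/log at high scale. This is just the formula above, and
should agree with the ~5.1E-57 figure quoted earlier:

bc -l <<EOF
scale=150
n = 2^35; d = 2^256
/* p(n;d) = 1 - ((d-1)/d)^(n(n-1)/2), computed as 1 - e(exponent * l(base)) */
1 - e( (n*(n-1)/2) * l((d-1)/d) )
EOF
)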
In case you were wondering: how big does n have to be before the probability
of collision becomes remotely possible, slightly possible, or even likely?
Given a fixed probability of collision p, the formula to calculate n is:
n = 0.5 + sqrt( ( 0.25 + 2*l(1-p)/l((d-1)/d) ) )
(That's just the same equation
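(A sketch of that inverse in bc, with nfor() being just the formula above
wrapped in a function. For d = 2^256 and p = 0.5 it lands around 4*10^38,
i.e. roughly 2^128 blocks:

bc -l <<EOF
scale=150
d = 2^256
/* n at which the collision probability reaches p, per the inverse above */
define nfor(p, d) { return 0.5 + sqrt(0.25 + 2*l(1-p)/l((d-1)/d)); }
nfor(0.5, d)
EOF
)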
On 01/ 8/11 05:59 PM, Edward Ned Harvey wrote:
Has anybody measured the cost of enabling or disabling verification?
The cost of disabling verification is an infinitesimally small number
multiplied by possibly all your data. Basically lim→0 times lim→∞.
This can only be evaluated on a
On Sat, Jan 08, 2011 at 12:59:17PM -0500, Edward Ned Harvey wrote:
Has anybody measured the cost of enabling or disabling verification?
Of course there is no easy answer:)
Let me explain how verification works exactly first.
You try to write a block. You see that block is already in dedup
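(For reference, both behaviours are exposed through the dedup property; a
minimal example, assuming a pool named tank:

# trust the SHA-256 checksum alone:
zfs set dedup=sha256 tank
# byte-for-byte verification whenever checksums match:
zfs set dedup=sha256,verify tank
)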
On Mon, January 10, 2011 02:41, Eric D. Mudama wrote:
On Sun, Jan 9 at 22:54, Peter Taps wrote:
Thank you all for your help. I am the OP.
I haven't looked at the link that talks about the probability of
collision. Intuitively, I still wonder how the chances of collision
can be so low. We
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Peter Taps
I haven't looked at the link that talks about the probability of
collision.
Intuitively, I still wonder how the chances of collision can be so low. We
are
reducing a 4K block to
From: Pawel Jakub Dawidek [mailto:p...@freebsd.org]
Well, I find it quite reasonable. If your block is referenced 100 times,
it is probably quite important.
If your block is referenced 1 time, it is probably quite important. Hence
redundancy in the pool.
There are many corruption
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of David Magda
Knowing exactly how the math (?) works is not necessary, but understanding
Understanding the math is not necessary, but it is pretty easy. And
unfortunately it becomes kind of
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
~= 5.1E-57
Bah. My math is wrong. I was never very good at PS. I'll ask someone at
work tomorrow to look at it and show me the folly. Wikipedia has it right,
but I can't
On Fri, Jan 07, 2011 at 03:06:26PM -0800, Brandon High wrote:
On Fri, Jan 7, 2011 at 11:33 AM, Robert Milkowski mi...@task.gda.pl wrote:
end-up with the block A. Now if B is relatively common in your data set you
have a relatively big impact on many files because of one corrupted block
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Pawel Jakub Dawidek
Dedupditto doesn't work exactly that way. You can have at most 3 copies
of your block. The minimal dedupditto value is 100. The first copy is
created on the first write, the
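(dedupditto is a pool-wide property; a minimal example, again assuming a pool
named tank:

# keep an extra copy of any deduped block once it reaches 100 references
zpool set dedupditto=100 tank
)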
Thank you all for your help. I am the OP.
I haven't looked at the link that talks about the probability of collision.
Intuitively, I still wonder how the chances of collision can be so low. We are
reducing a 4K block to just 256 bits. If the chances of collision are so low,
*theoretically* it
On Sun, Jan 9 at 22:54, Peter Taps wrote:
Thank you all for your help. I am the OP.
I haven't looked at the link that talks about the probability of
collision. Intuitively, I still wonder how the chances of collision
can be so low. We are reducing a 4K block to just 256 bits. If the
chances of
On 01/ 7/11 09:02 PM, Pawel Jakub Dawidek wrote:
On Fri, Jan 07, 2011 at 07:33:53PM +, Robert Milkowski wrote:
Now what if block B is a meta-data block?
Metadata is not deduplicated.
Good point, but then it depends on the perspective.
What if you are storing lots of VMDKs?
One
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Robert Milkowski
What if you are storing lots of VMDKs?
One corrupted block which is shared among hundreds of VMDKs will affect
all of them.
And it might be a block containing meta-data
On Thu, 6 Jan 2011, David Magda wrote:
If you're not worried about disk read errors (and/or are not experiencing
them), then you shouldn't be worried about hash collisions.
Except for the little problem that if there is a collision then there
will always be a collision for the same data and
On Thu, 06 Jan 2011 22:42:15 PST Michael DeMan sola...@deman.com wrote:
To be quite honest, I too am skeptical about using de-dupe based solely on
SHA256. In prior posts it was asked that the potential adopter of the
technology provide the mathematical reason to NOT use SHA-256 only.
On 06/01/2011 23:07, David Magda wrote:
On Jan 6, 2011, at 15:57, Nicolas Williams wrote:
Fletcher is faster than SHA-256, so I think that must be what you're
asking about: can Fletcher+Verification be faster than
Sha256+NoVerification? Or do you have some other goal?
Would running on
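(One quick, if crude, way to compare raw digest throughput in userland; this
benchmarks OpenSSL's implementation, not the in-kernel ZFS one:

openssl speed sha256 sha512

On many 64-bit CPUs, SHA-512 is actually faster per byte than SHA-256, which
is what prompts the truncation idea earlier in the thread.)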
On 01/07/2011 10:26 AM, Darren J Moffat wrote:
On 06/01/2011 23:07, David Magda wrote:
On Jan 6, 2011, at 15:57, Nicolas Williams wrote:
Fletcher is faster than SHA-256, so I think that must be what you're
asking about: can Fletcher+Verification be faster than
Sha256+NoVerification? Or do
On 07/01/2011 11:56, Sašo Kiselkov wrote:
On 01/07/2011 10:26 AM, Darren J Moffat wrote:
On 06/01/2011 23:07, David Magda wrote:
On Jan 6, 2011, at 15:57, Nicolas Williams wrote:
Fletcher is faster than SHA-256, so I think that must be what you're
asking about: can Fletcher+Verification be
On 01/07/2011 01:15 PM, Darren J Moffat wrote:
On 07/01/2011 11:56, Sašo Kiselkov wrote:
On 01/07/2011 10:26 AM, Darren J Moffat wrote:
On 06/01/2011 23:07, David Magda wrote:
On Jan 6, 2011, at 15:57, Nicolas Williams wrote:
Fletcher is faster than SHA-256, so I think that must be what
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Bakul Shah
See http://en.wikipedia.org/wiki/Birthday_problem -- in
particular see section 5.1 and the probability table of
section 3.4.
They say The expected number of n-bit hashes that can
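(The rule of thumb from that table is about 2^(n/2) hashes before a collision
is expected; for n = 256, in bc:

bc <<EOF
/* birthday bound: roughly 2^(n/2) samples before an expected collision */
2^(256/2)
EOF

i.e. about 3.4*10^38.)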
On Fri, January 7, 2011 04:26, Darren J Moffat wrote:
On 06/01/2011 23:07, David Magda wrote:
Would running on recent T-series servers, which have on-die crypto
units, help any in this regard?
The on-chip SHA-256 implementation is not yet used; see:
On Fri, January 7, 2011 01:42, Michael DeMan wrote:
Then - there is the other side of things. The 'black swan' event. At
some point, given percentages on a scenario like the example case above,
one simply has to make the business justification case internally at their
own company about
On Jan 7, 2011, at 6:13 AM, David Magda wrote:
On Fri, January 7, 2011 01:42, Michael DeMan wrote:
Then - there is the other side of things. The 'black swan' event. At
some point, given percentages on a scenario like the example case above,
one simply has to make the business justification
On Fri, Jan 07, 2011 at 06:39:51AM -0800, Michael DeMan wrote:
On Jan 7, 2011, at 6:13 AM, David Magda wrote:
The other thing to note is that by default (with de-dupe disabled), ZFS
uses Fletcher checksums to prevent data corruption. Add also the fact that all
other file systems don't have any
On Fri, January 7, 2011 01:42, Michael DeMan wrote:
Then - there is the other side of things. The 'black swan' event. At
some point, given percentages on a scenario like the example case above,
one simply has to make the business justification case internally at their
own company about
On 01/ 7/11 02:13 PM, David Magda wrote:
Given the above: most people are content enough to trust Fletcher to not
have data corruption, but are worried about SHA-256 giving 'data
corruption' when it comes to de-dupe? The entire rest of the computing world
is content to live with 10^-15 (for SAS
On Fri, January 7, 2011 14:33, Robert Milkowski wrote:
On 01/ 7/11 02:13 PM, David Magda wrote:
Given the above: most people are content enough to trust Fletcher to not
have data corruption, but are worried about SHA-256 giving 'data
corruption' when it comes to de-dupe? The entire rest of the
On Fri, Jan 07, 2011 at 07:33:53PM +, Robert Milkowski wrote:
On 01/ 7/11 02:13 PM, David Magda wrote:
Given the above: most people are content enough to trust Fletcher to not
have data corruption, but are worried about SHA-256 giving 'data
corruption' when it comes to de-dupe? The entire
On Fri, Jan 7, 2011 at 11:33 AM, Robert Milkowski mi...@task.gda.pl wrote:
end-up with the block A. Now if B is relatively common in your data set you
have a relatively big impact on many files because of one corrupted block
(additionally from a fs point of view this is a silent data
Folks,
I have been told that the checksum value returned by Sha256 is almost
guaranteed to be unique. In fact, if Sha256 fails in some case, we have a
bigger problem such as memory corruption, etc. Essentially, adding verification
to sha256 is overkill.
Perhaps (Sha256+NoVerification)
On Thu, January 6, 2011 14:44, Peter Taps wrote:
I have been told that the checksum value returned by Sha256 is almost
guaranteed to be unique. In fact, if Sha256 fails in some case, we have a
bigger problem such as memory corruption, etc. Essentially, adding
verification to sha256 is an
On 01/ 6/11 07:44 PM, Peter Taps wrote:
Folks,
I have been told that the checksum value returned by Sha256 is almost
guaranteed to be unique. In fact, if Sha256 fails in some case, we have a
bigger problem such as memory corruption, etc. Essentially, adding verification
to sha256 is an
On Jan 6, 2011, at 11:44 AM, Peter Taps wrote:
Folks,
I have been told that the checksum value returned by Sha256 is almost
guaranteed to be unique. In fact, if Sha256 fails in some case, we have a
bigger problem such as memory corruption, etc. Essentially, adding
verification to sha256
On Thu, Jan 06, 2011 at 11:44:31AM -0800, Peter Taps wrote:
I have been told that the checksum value returned by Sha256 is almost
guaranteed to be unique.
All hash functions are guaranteed to have collisions [for inputs larger
than their output anyways].
In fact, if
On Jan 6, 2011, at 15:57, Nicolas Williams wrote:
Fletcher is faster than SHA-256, so I think that must be what you're
asking about: can Fletcher+Verification be faster than
Sha256+NoVerification? Or do you have some other goal?
Would running on recent T-series servers, which have
On Thu, Jan 06, 2011 at 06:07:47PM -0500, David Magda wrote:
On Jan 6, 2011, at 15:57, Nicolas Williams wrote:
Fletcher is faster than SHA-256, so I think that must be what you're
asking about: can Fletcher+Verification be faster than
Sha256+NoVerification? Or do you have some other
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Peter Taps
Perhaps (Sha256+NoVerification) would work 99.99% of the time. But
Append 50 more 9's on there.
99.%
See below.
I
At the end of the day, this issue is essentially about mathematical
improbability versus certainty?
To be quite honest, I too am skeptical about using de-dupe based solely on
SHA256. In prior posts it was asked that the potential adopter of the
technology provide the mathematical reason to