Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 02:18 AM, John Martin wrote: On 07/10/12 19:56, Sašo Kiselkov wrote: Hi guys, I'm contemplating implementing a new fast hash algorithm in Illumos' ZFS implementation to supplant the currently utilized sha256. On modern 64-bit CPUs SHA-256 is actually much slower than SHA-512

[zfs-discuss] Recall: Repairing Faulted ZFS pool and missing disks

2012-07-11 Thread Kwang Whee Lee
Kwang Whee Lee would like to recall the message, [zfs-discuss] Repairing Faulted ZFS pool and missing disks.

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 05:20 AM, Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Sašo Kiselkov I'm contemplating implementing a new fast hash algorithm in Illumos' ZFS implementation to supplant the currently utilized

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Ferenc-Levente Juhos
Hello all, what about the fletcher2 and fletcher4 algorithms? According to the zfs man page on oracle, fletcher4 is the current default. Shouldn't the fletcher algorithms be much faster than any of the SHA algorithms? On Wed, Jul 11, 2012 at 9:19 AM, Sašo Kiselkov skiselkov...@gmail.com wrote:

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 09:58 AM, Ferenc-Levente Juhos wrote: Hello all, what about the fletcher2 and fletcher4 algorithms? According to the zfs man page on oracle, fletcher4 is the current default. Shouldn't the fletcher algorithms be much faster than any of the SHA algorithms? On Wed, Jul 11, 2012

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Ferenc-Levente Juhos
I was under the impression that the hash (or checksum) used for data integrity is the same as the one used for deduplication, but now I see that they are different. On Wed, Jul 11, 2012 at 10:23 AM, Sašo Kiselkov skiselkov...@gmail.com wrote: On 07/11/2012 09:58 AM, Ferenc-Levente Juhos wrote:

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 10:41 AM, Ferenc-Levente Juhos wrote: I was under the impression that the hash (or checksum) used for data integrity is the same as the one used for deduplication, but now I see that they are different. They are the same in use, i.e. once you switch dedup on, that implies

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Joerg Schilling
Sašo Kiselkov skiselkov...@gmail.com wrote: write in case verify finds the blocks are different). With hashes, you can leave verify off, since hashes are extremely unlikely (~10^-77) to produce collisions. This is how a lottery works. The chance is low but some people still win.
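For readers wondering where a figure like ~10^-77 comes from, here is a rough sketch in Python. It assumes an idealised hash whose 256-bit outputs are uniformly distributed, which is a modelling assumption and not something the thread proves about SHA-256. The pool size used for the second number (2**35 unique 128 KiB blocks) is a made-up example, not a figure from the discussion.

```python
import math
from fractions import Fraction

HASH_BITS = 256  # SHA-256 output width


def pair_collision_probability(bits: int = HASH_BITS) -> Fraction:
    """Chance that two specific, different blocks share a hash value,
    assuming (idealised) uniformly distributed hash outputs."""
    return Fraction(1, 2 ** bits)


def birthday_bound(num_blocks: int, bits: int = HASH_BITS) -> Fraction:
    """Upper bound on the chance of any collision among num_blocks
    distinct blocks: number of pairs divided by number of hash values."""
    pairs = num_blocks * (num_blocks - 1) // 2
    return Fraction(pairs, 2 ** bits)


pair_exponent = math.log10(pair_collision_probability())
# Hypothetical pool: 2**35 unique 128 KiB blocks, i.e. 4 PiB of unique data.
pool_exponent = math.log10(birthday_bound(2 ** 35))

print(f"one pair:   about 10^{pair_exponent:.1f}")
print(f"whole pool: about 10^{pool_exponent:.1f}")
```

The second number is the one that matters for a whole pool, since every pair of blocks is a chance to collide. Under the same assumptions it is larger than the single-pair figure but still extremely small.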

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Ferenc-Levente Juhos
Actually, although as you pointed out the chance of a sha256 collision is minimal, it can still happen, and that would mean that the dedup algorithm discards a block that it thinks is a duplicate. It's probably better anyway to do a byte-to-byte comparison if the hashes match, to be sure

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Ferenc-Levente Juhos
I'm pushing the send button too often, but yes, considering what was said before, byte-to-byte comparison should be mandatory when deduplicating, and therefore a lighter hash or checksum algorithm would suffice to reduce the number of dedup candidates. And overall deduping would be bulletproof and

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Darren J Moffat
On 07/11/12 00:56, Sašo Kiselkov wrote: * SHA-512: simplest to implement (since the code is already in the kernel) and provides a modest performance boost of around 60%. FIPS 180-4 introduces SHA-512/t support and explicitly SHA-512/256.

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 10:47 AM, Joerg Schilling wrote: Sašo Kiselkov skiselkov...@gmail.com wrote: write in case verify finds the blocks are different). With hashes, you can leave verify off, since hashes are extremely unlikely (~10^-77) to produce collisions. This is how a lottery works. The

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 11:02 AM, Darren J Moffat wrote: On 07/11/12 00:56, Sašo Kiselkov wrote: * SHA-512: simplest to implement (since the code is already in the kernel) and provides a modest performance boost of around 60%. FIPS 180-4 introduces SHA-512/t support and explicitly SHA-512/256.

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 10:50 AM, Ferenc-Levente Juhos wrote: Actually, although as you pointed out the chance of a sha256 collision is minimal, it can still happen, and that would mean that the dedup algorithm discards a block that it thinks is a duplicate. It's probably better anyway to do

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Joerg Schilling
Sašo Kiselkov skiselkov...@gmail.com wrote: On 07/11/2012 10:47 AM, Joerg Schilling wrote: Sašo Kiselkov skiselkov...@gmail.com wrote: write in case verify finds the blocks are different). With hashes, you can leave verify off, since hashes are extremely unlikely (~10^-77) to produce

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Casper . Dik
You do realize that the age of the universe is only on the order of around 10^18 seconds, do you? Even if you had a trillion CPUs each chugging along at 3.0 GHz for all this time, the number of processor cycles you will have executed cumulatively is only on the order 10^40, still 37 orders of
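The arithmetic in this snippet can be checked in a few lines. This is only a sketch using the round numbers quoted above (10^18 seconds, a trillion CPUs, 3.0 GHz) and comparing them against the 2^256 possible SHA-256 outputs. The variable names are illustrative.

```python
import math

UNIVERSE_AGE_SECONDS = 10 ** 18   # order of magnitude quoted in the post
CPU_COUNT = 10 ** 12              # "a trillion CPUs"
CLOCK_HZ = 3 * 10 ** 9            # 3.0 GHz, one unit of work per cycle
HASH_SPACE = 2 ** 256             # number of distinct SHA-256 outputs

total_cycles = UNIVERSE_AGE_SECONDS * CPU_COUNT * CLOCK_HZ
cycles_exponent = math.log10(total_cycles)   # log10 of cumulative cycles
space_exponent = math.log10(HASH_SPACE)      # log10 of the hash space
gap = space_exponent - cycles_exponent       # orders of magnitude short

print(f"cycles executed: about 10^{cycles_exponent:.1f}")
print(f"hash space:      about 10^{space_exponent:.1f}")
print(f"shortfall:       about {gap:.1f} orders of magnitude")
```

With these inputs the cumulative cycle count is 3 x 10^39, which rounds to the "order 10^40" in the post, and the shortfall against 2^256 lands between 37 and 38 orders of magnitude.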

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Justin Stringfellow
You do realize that the age of the universe is only on the order of around 10^18 seconds, do you? Even if you had a trillion CPUs each chugging along at 3.0 GHz for all this time, the number of processor cycles you will have executed cumulatively is only on the order 10^40, still 37 orders of

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Casper . Dik
Sorry, but isn't this what dedup=verify solves? I don't see the problem here. Maybe all that's needed is a comment in the manpage saying hash algorithms aren't perfect. The point is that hash functions are many to one and I think the point was that verify wasn't really needed if the hash

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Ferenc-Levente Juhos
Saso, I'm not flaming at all, I happen to disagree, but still I understand that chances are very very very slim, but as one poster already said, this is how the lottery works. I'm not saying one should make an exhaustive search with trillions of computers just to produce a sha256 collision. If I

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Ferenc-Levente Juhos
Precisely, I said the same thing a few posts before: dedup=verify solves that. And as I said, one could use dedup=hash algorithm,verify with an inferior hash algorithm (that is much faster) with the purpose of reducing the number of dedup candidates. For that matter one could use a trivial CRC32,
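The idea of pairing a cheap checksum with verify can be sketched as a toy in-memory store. This is not how ZFS implements dedup. The class name, the reference format, and the choice of zlib.crc32 as the weak hash are all assumptions made for illustration. The weak hash only narrows down candidates, and a byte-for-byte comparison decides whether a block is really a duplicate.

```python
import zlib


class VerifyingDedupStore:
    """Toy block store: weak hash selects candidates, bytes are verified."""

    def __init__(self, weak_hash=zlib.crc32):
        self._weak_hash = weak_hash
        self._buckets = {}  # weak hash value -> list of distinct blocks

    def write(self, block: bytes):
        """Store a block; return (reference, was_deduplicated)."""
        key = self._weak_hash(block)
        bucket = self._buckets.setdefault(key, [])
        for index, stored in enumerate(bucket):
            if stored == block:          # the "verify" step
                return (key, index), True
        bucket.append(block)             # hash collision or new block
        return (key, len(bucket) - 1), False

    def read(self, reference) -> bytes:
        key, index = reference
        return self._buckets[key][index]


store = VerifyingDedupStore()
ref_a, dup_a = store.write(b"A" * 4096)
ref_b, dup_b = store.write(b"A" * 4096)   # identical content
ref_c, dup_c = store.write(b"B" * 4096)   # different content
print(dup_a, dup_b, dup_c)
```

Because every match is verified, a collision in the weak hash costs an extra comparison but never returns the wrong data. The trade-off raised elsewhere in the thread is that each verify implies reading the stored candidate block.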

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 11:53 AM, Tomas Forsman wrote: On 11 July, 2012 - Sašo Kiselkov sent me these 1,4K bytes: Oh jeez, I can't remember how many times this flame war has been going on on this list. Here's the gist: SHA-256 (or any good hash) produces a near uniform random distribution of output.

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 12:00 PM, casper@oracle.com wrote: You do realize that the age of the universe is only on the order of around 10^18 seconds, do you? Even if you had a trillion CPUs each chugging along at 3.0 GHz for all this time, the number of processor cycles you will have executed

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 12:24 PM, Justin Stringfellow wrote: Suppose you find a weakness in a specific hash algorithm; you use this to create hash collisions and now imagine you store the hash collisions in a zfs dataset with dedup enabled using the same hash algorithm. Sorry, but isn't this

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Casper . Dik
On 07/11/2012 12:24 PM, Justin Stringfellow wrote: Suppose you find a weakness in a specific hash algorithm; you use this to create hash collisions and now imagine you store the hash collisions in a zfs dataset with dedup enabled using the same hash algorithm. Sorry, but isn't this

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Justin Stringfellow
The point is that hash functions are many to one and I think the point was that verify wasn't really needed if the hash function is good enough. This is a circular argument really, isn't it? Hash algorithms are never perfect, but we're trying to build a perfect one? It seems to me

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 12:32 PM, Ferenc-Levente Juhos wrote: Saso, I'm not flaming at all, I happen to disagree, but still I understand that chances are very very very slim, but as one poster already said, this is how the lottery works. I'm not saying one should make an exhaustive search with

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 12:37 PM, Ferenc-Levente Juhos wrote: Precisely, I said the same thing a few posts before: dedup=verify solves that. And as I said, one could use dedup=hash algorithm,verify with an inferior hash algorithm (that is much faster) with the purpose of reducing the number of dedup

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 01:09 PM, Justin Stringfellow wrote: The point is that hash functions are many to one and I think the point was that verify wasn't really needed if the hash function is good enough. This is a circular argument really, isn't it? Hash algorithms are never perfect, but

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Casper . Dik
This assumes you have low volumes of deduplicated data. As your dedup ratio grows, so does the performance hit from dedup=verify. At, say, dedupratio=10.0x, on average, every write results in 10 reads. I don't follow. If dedupratio == 10, it means that each item is *referenced* 10 times but it

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Justin Stringfellow
This assumes you have low volumes of deduplicated data. As your dedup ratio grows, so does the performance hit from dedup=verify. At, say, dedupratio=10.0x, on average, every write results in 10 reads. Well you can't make an omelette without breaking eggs! Not a very nice one, anyway.   Yes

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 01:36 PM, casper@oracle.com wrote: This assumes you have low volumes of deduplicated data. As your dedup ratio grows, so does the performance hit from dedup=verify. At, say, dedupratio=10.0x, on average, every write results in 10 reads. I don't follow. If dedupratio

[zfs-discuss] Solaris derivate with the best long-term future

2012-07-11 Thread Eugen Leitl
As a napp-it user who needs to upgrade from NexentaCore, I recently saw "preferred for OpenIndiana live but running under Illumian, NexentaCore and Solaris 11 (Express)" as a system recommendation for napp-it. I wonder about the future of OpenIndiana and Illumian: which fork is likely to

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 01:42 PM, Justin Stringfellow wrote: This assumes you have low volumes of deduplicated data. As your dedup ratio grows, so does the performance hit from dedup=verify. At, say, dedupratio=10.0x, on average, every write results in 10 reads. Well you can't make an omelette without

Re: [zfs-discuss] Solaris derivate with the best long-term future

2012-07-11 Thread Hung-Sheng Tsao Ph.D.
hi if U have not check this page please do http://en.wikipedia.org/wiki/ZFS interesting info about the status of ZFS in various OS regards my 2c 1)if you have the money buy ZFS appliance 2)if you want to build your self napp-it get solaris 11 support, it only charge the SW/socket and not

Re: [zfs-discuss] Solaris derivate with the best long-term future

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 01:51 PM, Eugen Leitl wrote: As a napp-it user who needs to upgrade from NexentaCore, I recently saw "preferred for OpenIndiana live but running under Illumian, NexentaCore and Solaris 11 (Express)" as a system recommendation for napp-it. I wonder about the future

Re: [zfs-discuss] Solaris derivate with the best long-term future

2012-07-11 Thread Eugen Leitl
On Wed, Jul 11, 2012 at 08:48:54AM -0400, Hung-Sheng Tsao Ph.D. wrote: hi if U have not check this page please do http://en.wikipedia.org/wiki/ZFS interesting info about the status of ZFS in various OS regards Thanks for the pointer. It doesn't answer my question though -- where the most

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread David Magda
On Wed, July 11, 2012 04:50, Ferenc-Levente Juhos wrote: Actually, although as you pointed out the chance of a sha256 collision is minimal, it can still happen, and that would mean that the dedup algorithm discards a block that it thinks is a duplicate. It's probably better anyway

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread David Magda
On Tue, July 10, 2012 19:56, Sašo Kiselkov wrote: However, before I start out on a pointless endeavor, I wanted to probe the field of ZFS users, especially those using dedup, on whether their workloads would benefit from a faster hash algorithm (and hence, lower CPU utilization). Developments

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 03:39 PM, David Magda wrote: On Tue, July 10, 2012 19:56, Sašo Kiselkov wrote: However, before I start out on a pointless endeavor, I wanted to probe the field of ZFS users, especially those using dedup, on whether their workloads would benefit from a faster hash algorithm (and

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Gregg Wonderly
Since there is a finite number of bit patterns per block, have you tried to just calculate the SHA-256 or SHA-512 for every possible bit pattern to see if there is ever a collision? If you found an algorithm that produced no collisions for any possible block bit pattern, wouldn't that be the

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Sašo Kiselkov On 07/11/2012 11:53 AM, Tomas Forsman wrote: On 11 July, 2012 - Sašo Kiselkov sent me these 1,4K bytes: Oh jeez, I can't remember how many times this flame war has been

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 03:57 PM, Gregg Wonderly wrote: Since there is a finite number of bit patterns per block, have you tried to just calculate the SHA-256 or SHA-512 for every possible bit pattern to see if there is ever a collision? If you found an algorithm that produced no collisions for any

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Sašo Kiselkov As your dedup ratio grows, so does the performance hit from dedup=verify. At, say, dedupratio=10.0x, on average, every write results in 10 reads. Why? If you intend to write

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Bob Friesenhahn
On Tue, 10 Jul 2012, Edward Ned Harvey wrote: CPU's are not getting much faster. But IO is definitely getting faster. It's best to keep ahead of that curve. It seems that per-socket CPU performance is doubling every year. That seems like faster to me. If server CPU chipsets offer

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Gregg Wonderly Since there is a finite number of bit patterns per block, have you tried to just calculate the SHA-256 or SHA-512 for every possible bit pattern to see if there is ever a

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 03:58 PM, Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Sašo Kiselkov I really mean no disrespect, but this comment is so dumb I could swear my IQ dropped by a few tenths of a point just by reading.

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Joerg Schilling
Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: On Tue, 10 Jul 2012, Edward Ned Harvey wrote: CPU's are not getting much faster. But IO is definitely getting faster. It's best to keep ahead of that curve. It seems that per-socket CPU performance is doubling every year. That

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Gregg Wonderly
But this is precisely the kind of observation that some people seem to miss out on the importance of. As Tomas suggested in his post, if this was true, then we could have a huge compression ratio as well. And even if there was 10% of the bit patterns that created non-unique hashes, you could

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Bob Friesenhahn
On Wed, 11 Jul 2012, Sašo Kiselkov wrote: the hash isn't used for security purposes. We only need something that's fast and has a good pseudo-random output distribution. That's why I looked toward Edon-R. Even though it might have security problems in itself, it's by far the fastest algorithm in

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Ferenc-Levente Juhos
You don't need to reproduce all possible blocks. 1. SHA256 produces a 256 bit hash 2. That means it produces a value on 256 bits, in other words a value between 0..2^256 - 1 3. If you start counting from 0 to 2^256 and for each number calculate the SHA256 you will get at least one hash collision
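The counting argument in this post is the pigeonhole principle: hashing more inputs than there are output values must produce a repeat. A small sketch can demonstrate it on a deliberately shrunken hash. The 16-bit truncation of SHA-256 and the helper names are assumptions chosen so the search finishes instantly. Running the same loop against the full 256 bits is not feasible, which is the point other posters make.

```python
import hashlib


def truncated_sha256(value: int, bits: int) -> int:
    """Top `bits` bits of SHA-256 over an 8-byte big-endian counter."""
    digest = hashlib.sha256(value.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def first_collision(bits: int):
    """Count upward from 0 until two counters share a truncated hash.

    With 2**bits possible outputs, 2**bits + 1 inputs guarantee a repeat.
    """
    seen = {}  # truncated hash -> first counter that produced it
    for counter in range(2 ** bits + 1):
        value = truncated_sha256(counter, bits)
        if value in seen:
            return seen[value], counter
        seen[value] = counter
    raise AssertionError("unreachable: pigeonhole guarantees a repeat")


first, second = first_collision(16)
print(f"counters {first} and {second} collide in the top 16 bits")
```

Note that this version remembers every hash it has seen, so both time and memory grow with the size of the hash space. The guarantee says a collision exists, not that it can be found within any practical budget.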

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Casper . Dik
On Tue, 10 Jul 2012, Edward Ned Harvey wrote: CPU's are not getting much faster. But IO is definitely getting faster. It's best to keep ahead of that curve. It seems that per-socket CPU performance is doubling every year. That seems like faster to me. I think that I/O isn't getting as

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Edward Ned Harvey
From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us] Sent: Wednesday, July 11, 2012 10:06 AM On Tue, 10 Jul 2012, Edward Ned Harvey wrote: CPU's are not getting much faster. But IO is definitely getting faster. It's best to keep ahead of that curve. It seems that per-socket

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Gregg Wonderly
Unfortunately, the government imagines that people are using their home computers to compute hashes and try and decrypt stuff. Look at what is happening with GPUs these days. People are hooking up 4 GPUs in their computers and getting huge performance gains. 5-6 char password space covered

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 04:19 PM, Gregg Wonderly wrote: But this is precisely the kind of observation that some people seem to miss out on the importance of. As Tomas suggested in his post, if this was true, then we could have a huge compression ratio as well. And even if there was 10% of the bit

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 04:22 PM, Bob Friesenhahn wrote: On Wed, 11 Jul 2012, Sašo Kiselkov wrote: the hash isn't used for security purposes. We only need something that's fast and has a good pseudo-random output distribution. That's why I looked toward Edon-R. Even though it might have security

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Gregg Wonderly
This is exactly the issue for me. It's vital to always have verify on. If you don't have the data to prove that every possible block combination hashes uniquely for the small bit space we are talking about, then how in the world can you say that verify is not necessary? That just

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Bob Friesenhahn
On Wed, 11 Jul 2012, Joerg Schilling wrote: Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: On Tue, 10 Jul 2012, Edward Ned Harvey wrote: CPU's are not getting much faster. But IO is definitely getting faster. It's best to keep ahead of that curve. It seems that per-socket CPU

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 04:23 PM, casper@oracle.com wrote: On Tue, 10 Jul 2012, Edward Ned Harvey wrote: CPU's are not getting much faster. But IO is definitely getting faster. It's best to keep ahead of that curve. It seems that per-socket CPU performance is doubling every year. That

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Justin Stringfellow
Since there is a finite number of bit patterns per block, have you tried to just calculate the SHA-256 or SHA-512 for every possible bit pattern to see if there is ever a collision?  If you found an algorithm that produced no collisions for any possible block bit pattern, wouldn't that be

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Ferenc-Levente Juhos
As I said several times before, to produce hash collisions. Or to calculate rainbow tables (as a previous user theorized it) you only need the following. You don't need to reproduce all possible blocks. 1. SHA256 produces a 256 bit hash 2. That means it produces a value on 256 bits, in other

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 04:27 PM, Gregg Wonderly wrote: Unfortunately, the government imagines that people are using their home computers to compute hashes and try and decrypt stuff. Look at what is happening with GPUs these days. People are hooking up 4 GPUs in their computers and getting huge

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 04:30 PM, Gregg Wonderly wrote: This is exactly the issue for me. It's vital to always have verify on. If you don't have the data to prove that every possible block combination hashes uniquely for the small bit space we are talking about, then how in the world can

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 04:36 PM, Justin Stringfellow wrote: Since there is a finite number of bit patterns per block, have you tried to just calculate the SHA-256 or SHA-512 for every possible bit pattern to see if there is ever a collision? If you found an algorithm that produced no collisions

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Gregg Wonderly But this is precisely the kind of observation that some people seem to miss out on the importance of. As Tomas suggested in his post, if this was true, then we could have a

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 04:39 PM, Ferenc-Levente Juhos wrote: As I said several times before, to produce hash collisions. Or to calculate rainbow tables (as a previous user theorized it) you only need the following. You don't need to reproduce all possible blocks. 1. SHA256 produces a 256 bit hash

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Casper . Dik
Unfortunately, the government imagines that people are using their home computers to compute hashes and try and decrypt stuff. Look at what is happening with GPUs these days. People are hooking up 4 GPUs in their computers and getting huge performance gains. 5-6 char password space

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Gregg Wonderly
Yes, but from the other angle, the number of unique 128K blocks that you can store on your ZFS pool, is actually finitely small, compared to the total space. So the patterns you need to actually consider is not more than the physical limits of the universe. Gregg Wonderly On Jul 11, 2012, at

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Edward Ned Harvey
From: Gregg Wonderly [mailto:gr...@wonderly.org] Sent: Wednesday, July 11, 2012 10:28 AM Unfortunately, the government imagines that people are using their home computers to compute hashes and try and decrypt stuff. Look at what is happening with GPUs these days. heheheh. I guess the

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Casper . Dik
Do you need assurances that in the next 5 seconds a meteorite won't fall to Earth and crush you? No. And yet, the Earth puts on thousands of tons of weight each year from meteoric bombardment and people have been hit and killed by them (not to speak of mass extinction events). Nobody has ever

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Ferenc-Levente Juhos
You don't have to store all hash values: a. Just memorize the first one SHA256(0) b. start counting c. bang: by the time you get to 2^256 you get at least a collision. (do this using BOINC, you don't have to wait for the last hash to be calculated, I'm pretty sure a collision will occur sooner)

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Gregg Wonderly
So, if I had a block collision on my ZFS pool that used dedup, and it had my bank balance of $3,212.20 on it, and you tried to write your bank balance of $3,292,218.84 and got the same hash, no verify, and thus you got my block/balance and now your bank balance was reduced by 3 orders of

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 04:54 PM, Ferenc-Levente Juhos wrote: You don't have to store all hash values: a. Just memorize the first one SHA256(0) b. start counting c. bang: by the time you get to 2^256 you get at least a collision. Just one question: how long do you expect this to take on average?

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 04:56 PM, Gregg Wonderly wrote: So, if I had a block collision on my ZFS pool that used dedup, and it had my bank balance of $3,212.20 on it, and you tried to write your bank balance of $3,292,218.84 and got the same hash, no verify, and thus you got my block/balance and now

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Gregg Wonderly
I'm just suggesting that the time frame of when 256-bits or 512-bits is less safe, is closing faster than one might actually think, because social elements of the internet allow a lot more effort to be focused on a single problem than one might consider. Gregg Wonderly On Jul 11, 2012, at

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread David Magda
On Wed, July 11, 2012 09:45, Sašo Kiselkov wrote: I'm not convinced waiting makes much sense. The SHA-3 standardization process' goals are different from ours. SHA-3 can choose to go with something that's slower, but has a higher security margin. I think that absolute super-tight security

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 05:10 PM, David Magda wrote: On Wed, July 11, 2012 09:45, Sašo Kiselkov wrote: I'm not convinced waiting makes much sense. The SHA-3 standardization process' goals are different from ours. SHA-3 can choose to go with something that's slower, but has a higher security margin. I

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread David Magda
On Wed, July 11, 2012 10:23, casper@oracle.com wrote: I think that I/O isn't getting as fast as CPU is; memory capacity and bandwidth and CPUs are getting faster. I/O, not so much. (Apart from the one single step from harddisk to SSD; but note that I/O is limited to standard interfaces

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Bob Friesenhahn
On Wed, 11 Jul 2012, Sašo Kiselkov wrote: The reason why I don't think this can be used to implement a practical attack is that in order to generate a collision, you first have to know the disk block that you want to create a collision on (or at least the checksum), i.e. the original block is

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Nico Williams
On Wed, Jul 11, 2012 at 9:48 AM, casper@oracle.com wrote: Huge space, but still finite… Dan Brown seems to think so in Digital Fortress but it just means he has no grasp on big numbers. I couldn't get past that. I had to put the book down. I'm guessing it was as awful as it threatened

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Casper . Dik
On Wed, Jul 11, 2012 at 9:48 AM, casper@oracle.com wrote: Huge space, but still finite… Dan Brown seems to think so in Digital Fortress but it just means he has no grasp on big numbers. I couldn't get past that. I had to put the book down. I'm guessing it was as awful as it

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 05:33 PM, Bob Friesenhahn wrote: On Wed, 11 Jul 2012, Sašo Kiselkov wrote: The reason why I don't think this can be used to implement a practical attack is that in order to generate a collision, you first have to know the disk block that you want to create a collision on (or at

Re: [zfs-discuss] Solaris derivate with the best long-term future

2012-07-11 Thread Bob Friesenhahn
On Wed, 11 Jul 2012, Eugen Leitl wrote: It would be interesting to see when zpool versions 28 will be available in the open forks. Particularly encryption is a very useful functionality. Illumos advanced to zpool version 5000 and this is available in the latest OpenIndiana development

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Gregg Wonderly
You're entirely sure that there could never be two different blocks that can hash to the same value and have different content? Wow, can you just send me the cash now and we'll call it even? Gregg On Jul 11, 2012, at 9:59 AM, Sašo Kiselkov wrote: On 07/11/2012 04:56 PM, Gregg Wonderly wrote:

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 05:58 PM, Gregg Wonderly wrote: You're entirely sure that there could never be two different blocks that can hash to the same value and have different content? Wow, can you just send me the cash now and we'll call it even? You're the one making the positive claim and I'm

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Gregg Wonderly
What I'm saying is that I am getting conflicting information from your rebuttals here. I (and others) say there will be collisions that will cause data loss if verify is off. You say it would be so rare as to be impossible from your perspective. Tomas says, well then let's just use the hash

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread David Magda
On Wed, July 11, 2012 11:58, Gregg Wonderly wrote: You're entirely sure that there could never be two different blocks that can hash to the same value and have different content? [...] The odds of being hit by lightning (at least in the US) are about 1 in 700,000. I don't worry about that

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Bob Friesenhahn
On Wed, 11 Jul 2012, Sašo Kiselkov wrote: For example, the well-known block might be part of a Windows anti-virus package, or a Windows firewall configuration, and corrupting it might leave a Windows VM open to malware attack. True, but that may not be enough to produce a practical collision

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Richard Elling
Thanks Sašo! Comments below... On Jul 10, 2012, at 4:56 PM, Sašo Kiselkov wrote: Hi guys, I'm contemplating implementing a new fast hash algorithm in Illumos' ZFS implementation to supplant the currently utilized sha256. No need to supplant, there are 8 bits for enumerating hash

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 06:23 PM, Gregg Wonderly wrote: What I'm saying is that I am getting conflicting information from your rebuttals here. Well, let's address that then: I (and others) say there will be collisions that will cause data loss if verify is off. Saying that there will be without any

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Bob Friesenhahn
On Wed, 11 Jul 2012, Richard Elling wrote: The last studio release suitable for building OpenSolaris is available in the repo. See the instructions at http://wiki.illumos.org/display/illumos/How+To+Build+illumos Not correct as far as I can tell. You should re-read the page you referenced.

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Richard Elling
On Jul 11, 2012, at 10:11 AM, Bob Friesenhahn wrote: On Wed, 11 Jul 2012, Richard Elling wrote: The last studio release suitable for building OpenSolaris is available in the repo. See the instructions at http://wiki.illumos.org/display/illumos/How+To+Build+illumos Not correct as far as I

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Richard Elling
On Jul 11, 2012, at 10:23 AM, Sašo Kiselkov wrote: Hi Richard, On 07/11/2012 06:58 PM, Richard Elling wrote: Thanks Sašo! Comments below... On Jul 10, 2012, at 4:56 PM, Sašo Kiselkov wrote: Hi guys, I'm contemplating implementing a new fast hash algorithm in Illumos' ZFS

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Hung-Sheng Tsao (LaoTsao) Ph.D
Sent from my iPad On Jul 11, 2012, at 13:11, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: On Wed, 11 Jul 2012, Richard Elling wrote: The last studio release suitable for building OpenSolaris is available in the repo. See the instructions at

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Nico Williams
On Wed, Jul 11, 2012 at 3:45 AM, Sašo Kiselkov skiselkov...@gmail.com wrote: It's also possible to set dedup=verify with checksum=sha256, however, that makes little sense (as the chances of getting a random hash collision are essentially nil). IMO dedup should always verify. Nico --

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Bob Friesenhahn
On Wed, 11 Jul 2012, Hung-Sheng Tsao (LaoTsao) Ph.D wrote: Not correct as far as I can tell. You should re-read the page you referenced. Oracle rescinded (or lost) the special Studio releases needed to build the OpenSolaris kernel. you can still download 12 12.1 12.2, AFAIK through OTN

Re: [zfs-discuss] Scenario sanity check

2012-07-11 Thread Brian Wilson
On 07/ 9/12 04:36 PM, Ian Collins wrote: On 07/10/12 05:26 AM, Brian Wilson wrote: Yep, thanks, and to answer Ian with more detail on what TruCopy does. TruCopy mirrors between the two storage arrays, with software running on the arrays, and keeps a list of dirty/changed 'tracks' while the

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Hung-Sheng Tsao Ph.D.
On 7/11/2012 3:16 PM, Bob Friesenhahn wrote: On Wed, 11 Jul 2012, Hung-Sheng Tsao (LaoTsao) Ph.D wrote: Not correct as far as I can tell. You should re-read the page you referenced. Oracle rescinded (or lost) the special Studio releases needed to build the OpenSolaris kernel. you can

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Bill Sommerfeld
On 07/11/12 02:10, Sašo Kiselkov wrote: Oh jeez, I can't remember how many times this flame war has been going on on this list. Here's the gist: SHA-256 (or any good hash) produces a near uniform random distribution of output. Thus, the chances of getting a random hash collision are around
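
For readers following the collision argument in this thread, here is a rough sketch of the usual birthday-bound estimate. It assumes the hash behaves like a uniform random function, which is the premise quoted above. The function name collision_probability and the example pool size (2**35 blocks, roughly 4 PiB at 128 KiB per block) are illustrative assumptions, not figures taken from any message, and the sketch does not try to reproduce the number cut off in the preview.

```python
import math


def collision_probability(n_blocks: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n_blocks distinct blocks
    share a hash value, assuming a uniformly random hash_bits-bit hash.

    Uses the standard birthday approximation
        p ~= 1 - exp(-n * (n - 1) / (2 * 2**hash_bits)).
    """
    # Python integers are arbitrary precision, so the ratio is computed
    # exactly before being converted to a float.
    exponent = n_blocks * (n_blocks - 1) / (2 * 2 ** hash_bits)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    # Hypothetical pool: 2**35 unique blocks (about 4 PiB of 128 KiB blocks).
    n = 2 ** 35
    for bits in (64, 128, 256):
        print(f"{bits:3d}-bit hash: p ~= {collision_probability(n, bits):.3e}")
```

This only models accidental collisions between random blocks. It says nothing about deliberately constructed collisions, which is the separate concern raised elsewhere in the thread.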

Re: [zfs-discuss] Solaris derivate with the best long-term future

2012-07-11 Thread Fabian Keil
Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: On Wed, 11 Jul 2012, Eugen Leitl wrote: It would be interesting to see when zpool versions 28 will be available in the open forks. Particularly encryption is a very useful functionality. Illumos advanced to zpool version 5000 and

Re: [zfs-discuss] New fast hash algorithm - is it needed?

2012-07-11 Thread Sašo Kiselkov
On 07/11/2012 10:06 PM, Bill Sommerfeld wrote: On 07/11/12 02:10, Sašo Kiselkov wrote: Oh jeez, I can't remember how many times this flame war has been going on on this list. Here's the gist: SHA-256 (or any good hash) produces a near uniform random distribution of output. Thus, the chances of
