On 04/22/2011 12:00 PM, [email protected] wrote:
> Message: 15
> Date: Fri, 22 Apr 2011 11:53:23 -0400
> From: David Rosenstrauch <[email protected]>
> Subject: Re: ZFS and block deduplication
> To: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> On 04/22/2011 11:41 AM, Mark Woodward wrote:
>> I have been trying to convince myself that the SHA2/256 hash is
>> sufficient to identify blocks on a file system. Is anyone familiar with
>> this?
>>
>> The theory is that you take a hash value of a block on a disk, and the
>> hash, which is smaller than the actual block, is unique enough that the
>> probability of any two blocks creating the same hash, is actually less
>> than the probability of hardware failure.
>> Given a small enough block size with a small enough set size, I can
>> almost see it as safe enough for backups, but I certainly wouldn't put
>> mission critical data on it. Would you? Tell me how I'm flat out wrong.
>> I need to hear it.
>
> If you read up on the rsync algorithm
> (http://cs.anu.edu.au/techreports/1996/TR-CS-96-05.html), he uses a
> combination of 2 different checksums to determine block uniqueness.
> And, IIRC, even then he still does an additional final check to make
> sure that the copied data is correct (and copies again if not).

That's rsync, and I tend to agree with their level of paranoia. Take a
look at this link:
http://blogs.sun.com/bonwick/entry/zfs_dedup
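
For a rough feel for the "probability of any two blocks creating the same
hash" part, here is a back-of-the-envelope sketch using the standard
birthday-bound approximation. It assumes the hash output behaves like a
uniformly random 256-bit value, which is an idealization and not something
this thread establishes. The function name and the pool and block sizes
are made up for illustration.

```python
import math

def collision_probability(num_blocks: int, hash_bits: int = 256) -> float:
    """Birthday-bound estimate of the chance that at least two of
    num_blocks distinct blocks share a hash value.

    Assumes hash outputs are uniformly random (an idealization).
    Uses p ~= 1 - exp(-n(n-1)/2 / 2**hash_bits).
    """
    pairs = num_blocks * (num_blocks - 1) / 2
    # expm1 keeps precision when the exponent is extremely close to zero
    return -math.expm1(-pairs / 2.0 ** hash_bits)

if __name__ == "__main__":
    # Illustrative assumption: 4 PiB of unique data in 128 KiB blocks
    blocks = (4 * 2 ** 50) // (128 * 2 ** 10)
    print(f"{blocks} blocks, 256-bit hash: {collision_probability(blocks):.3e}")
    # Same formula with a 32-bit checksum, to show how fast it degrades
    print(f"65536 blocks, 32-bit hash: {collision_probability(2 ** 16, 32):.3f}")
```

This only estimates accidental collisions under that idealized model. It
says nothing about deliberately constructed collisions or about whether
a final byte-for-byte check (as rsync does) is still worth the cost.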
_______________________________________________
Discuss mailing list
[email protected]
http://lists.blu.org/mailman/listinfo/discuss
