RE: ZFS and block deduplication

2011-04-27 Thread Edward Ned Harvey
From: Tom Metro [mailto:tmetro-...@vl.com] I think the attack vector would be along the lines of an attacker identifying one or more blocks of a privileged executable, creating replacement blocks that have both malicious code and cause a hash collision. They write the blocks to disk, and

Re: ZFS and block deduplication

2011-04-27 Thread Richard Pieri
On Apr 27, 2011, at 5:00 PM, Edward Ned Harvey wrote: It's even more difficult than that ... Yes, many files span multiple blocks, and therefore begin at the beginning of one block and end in the middle of another block, but the hashes are calculated on a per-block basis up to 128k. So any

Re: ZFS and block deduplication

2011-04-25 Thread Mark Woodward
On 04/24/2011 10:52 PM, Edward Ned Harvey wrote: From: Mark Woodward [mailto:ma...@mohawksoft.com] You know, I've read the same math and I've worked it out myself. I agree it sounds so astronomical as to be unrealistic to even imagine it, but no matter how astronomical the odds, someone

Re: ZFS and block deduplication

2011-04-25 Thread Mark Woodward
On 04/25/2011 09:32 AM, Daniel Feenberg wrote: On Mon, 25 Apr 2011, Mark Woodward wrote: This is one of those things that make my brain hurt. If I am representing more data with a fixed size number, i.e. a 4K block vs a 16K block, that does, in fact, increase the probability of collision
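
A rough sketch of the arithmetic behind this question, under assumptions that are not stated in the thread: the hash behaves like a uniformly random 256-bit function and the data is not adversarial. Under those assumptions the birthday bound depends on how many blocks are hashed, not on how many bytes each block holds, so smaller blocks raise the bound only because the same data is split into more blocks. The function names and the 1 PiB example size are hypothetical.

```python
from fractions import Fraction

HASH_BITS = 256  # SHA-256 digest size in bits


def blocks_for(data_bytes: int, block_size: int) -> int:
    """Number of blocks needed to hold data_bytes (ceiling division)."""
    return -(-data_bytes // block_size)


def collision_bound(num_blocks: int, hash_bits: int = HASH_BITS) -> Fraction:
    """Birthday upper bound on the chance that any two distinct blocks
    share a digest: n * (n - 1) / 2 pairs, each colliding with
    probability 2**-hash_bits (idealized hash, assumed)."""
    return Fraction(num_blocks * (num_blocks - 1), 2 ** (hash_bits + 1))


if __name__ == "__main__":
    one_pib = 2 ** 50  # example pool size, chosen only for illustration
    for block_size in (4 * 1024, 16 * 1024):
        n = blocks_for(one_pib, block_size)
        print(block_size, n, float(collision_bound(n)))
```

Exact fractions are used so the very small probabilities are not lost to floating-point rounding until the final print.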

Re: ZFS and block deduplication

2011-04-25 Thread Tom Metro
Edward Ned Harvey wrote: (2) We're assuming the data in question is not being maliciously formed for the purposes of causing a hash collision. I think this is a safe assumption, because in the event of a collision, you would have two different pieces of data that are assumed to be identical

RE: ZFS and block deduplication

2011-04-25 Thread Edward Ned Harvey
From: Tom Metro [mailto:tmetro-...@vl.com] (Doesn't ZFS also employ overall file hashing to ensure the integrity of a file? (Or is that the verification option you referred to?) If so, then that would likely thwart this attack vector.) The data integrity hash is something much faster and

RE: ZFS and block deduplication

2011-04-24 Thread Edward Ned Harvey
From: Mark Woodward [mailto:ma...@mohawksoft.com] You know, I've read the same math and I've worked it out myself. I agree it sounds so astronomical as to be unrealistic to even imagine it, but no matter how astronomical the odds, someone usually wins the lottery. I'm just trying to assure

RE: ZFS and block deduplication

2011-04-23 Thread Edward Ned Harvey
From: discuss-boun...@blu.org [mailto:discuss-boun...@blu.org] On Behalf Of Mark Woodward I have been trying to convince myself that the SHA2/256 hash is sufficient to identify blocks on a file system. Is anyone familiar with this? I am intimately familiar with this. And on planet Earth,

Re: ZFS and block deduplication

2011-04-23 Thread Mark Woodward
On 04/23/2011 08:31 AM, Edward Ned Harvey wrote: From: discuss-boun...@blu.org [mailto:discuss-boun...@blu.org] On Behalf Of Mark Woodward I have been trying to convince myself that the SHA2/256 hash is sufficient to identify blocks on a file system. Is anyone familiar with this? I am

ZFS and block deduplication

2011-04-22 Thread Mark Woodward
I have been trying to convince myself that the SHA2/256 hash is sufficient to identify blocks on a file system. Is anyone familiar with this? The theory is that you take a hash value of a block on a disk, and the hash, which is smaller than the actual block, is unique enough that the
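
The theory described in this post can be sketched as a toy in-memory block store. This is an illustration only, not how ZFS implements deduplication: the `DedupStore` class, the `store_file` helper, the 4 KiB block size, and the `verify` flag (loosely modeled on the verification option mentioned elsewhere in the thread) are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


class DedupStore:
    """Toy block store that keeps one copy of each block per SHA-256 digest."""

    def __init__(self, verify: bool = False):
        self.blocks = {}      # digest -> block contents
        self.verify = verify  # if True, compare bytes when digests match

    def write(self, block: bytes) -> bytes:
        digest = hashlib.sha256(block).digest()
        existing = self.blocks.get(digest)
        if existing is None:
            # First time this digest is seen: store the block.
            self.blocks[digest] = block
        elif self.verify and existing != block:
            # Same digest, different bytes: a genuine hash collision.
            raise ValueError("SHA-256 collision between different blocks")
        # Otherwise the block is treated as a duplicate and not stored again.
        return digest


def store_file(store: DedupStore, data: bytes) -> list:
    """Split data into fixed-size blocks and return the list of digests."""
    return [
        store.write(data[i:i + BLOCK_SIZE])
        for i in range(0, len(data), BLOCK_SIZE)
    ]


if __name__ == "__main__":
    store = DedupStore(verify=True)
    refs = store_file(store, b"a" * 8192 + b"b" * 4096)
    print(len(refs), "block references,", len(store.blocks), "blocks stored")
```

Without `verify`, two different blocks that happened to share a digest would silently be treated as the same block, which is the risk the thread is weighing.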

Re: ZFS and block deduplication

2011-04-22 Thread Mark Woodward
On 04/22/2011 12:00 PM, discuss-requ...@blu.org wrote: Message: 15 Date: Fri, 22 Apr 2011 11:53:23 -0400 From: David Rosenstrauch dar...@darose.net Subject: Re: ZFS and block deduplication To: discuss@blu.org Message-ID: 4db1a473.1090...@darose.net Content-Type: text/plain; charset=ISO