> -----Original Message-----
> From: Philipp Marek [mailto:philipp.ma...@emerion.com]
> Sent: Wednesday, February 25, 2009 1:54 PM
>
> On Wednesday, 25 February 2009, Christensen Stefan wrote:
> My point is - either there *is* verification (then the hash function
> itself doesn't matter that much), or there is *none*.
> In the latter case you risk trashing your data.
>
> As the amount of data stored will only grow, there's an increasing
> risk of collisions.
>
> And, if you use a 512 bit hash for 4096*8 bits of data, you have
> 1/64th of your storage wasted for the data index alone.

That is a bit too much waste.

> But if you're getting 1MB of data, and have to tell some hardware to
> do 256 individual SHA2 calculations of 4kB each, you'll have some
> latency.

I'm not quite sure how fast SHA-2 runs on a current CPU, but I don't
think it would be slower than the transfer speed of disks (~70MiB/s).

> If that's a simple calculation in the CPU, then you can already ask
> the SSD for the first (expected) data block after hashing the first
> 4kB.
>
> Maybe it's better via extra hardware - I don't know.
> I just think that
> - a *big* hash, for collision-resistance, takes too much space; and
> - a smaller hash has probably collisions in our lifetime.
> So take some ASIC or GPU, and use that for a *simple* hash
> calculation; but *verify* the block, to make sure that nothing bad
> happens.

After thinking about it a bit, I think you are right. Use a cheap hash
only to reduce the number of times you have to read the actual on-disk
contents of a block to see whether it is the same. But you would then
need a list of the disk blocks that have the same hash value but
different contents.

Maybe a good size for the hash value would be 64 bits then. By the
birthday bound you should expect the first collision once roughly
2**32 blocks are stored (about 16 TiB at 4kB per block), but the hash
wouldn't take up too much space: 64 bits per 4096*8 bits of data is
1/512th of the storage.

The criterion for choosing the hash would then be that it takes few
instructions, so you would not need to offload it to an ASIC/GPU.

--
Stefan
_______________________________________________
Tux3 mailing list
Tux3@tux3.org
http://mailman.tux3.org/cgi-bin/mailman/listinfo/tux3
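
The scheme discussed in this message (a cheap 64-bit hash used only as
a filter, a byte-for-byte verify on every hash hit, and a per-hash list
of blocks that share a hash value but differ in content) could be
sketched roughly as below. This is an illustration only, not Tux3 code:
DedupStore, cheap_hash64 and index_overhead are made-up names, the
"disk" is a Python list, and truncated BLAKE2b merely stands in for
whatever few-instruction hash would really be chosen.

```python
import hashlib
from collections import defaultdict
from fractions import Fraction

BLOCK_SIZE = 4096  # bytes per block, as in the thread


def index_overhead(hash_bits: int) -> Fraction:
    """Fraction of storage spent on the hash index alone."""
    return Fraction(hash_bits, BLOCK_SIZE * 8)


def cheap_hash64(block: bytes) -> int:
    """Stand-in 64-bit hash (assumption): BLAKE2b truncated to 8 bytes."""
    digest = hashlib.blake2b(block, digest_size=8).digest()
    return int.from_bytes(digest, "little")


class DedupStore:
    """Hypothetical in-memory model of hash-filtered, verified dedup."""

    def __init__(self, hash_fn=cheap_hash64):
        self.hash_fn = hash_fn
        self.blocks = []                # simulated disk: block number -> bytes
        self.index = defaultdict(list)  # hash -> block numbers with that hash

    def write(self, data: bytes) -> int:
        """Store a block, returning the block number that holds its data."""
        h = self.hash_fn(data)
        # The hash only narrows the candidates; the compare is the verify.
        for blkno in self.index[h]:
            if self.blocks[blkno] == data:
                return blkno            # true duplicate, share the block
        # New content (possibly a hash collision): store it and add it
        # to the list of blocks that carry this hash value.
        self.blocks.append(data)
        blkno = len(self.blocks) - 1
        self.index[h].append(blkno)
        return blkno


if __name__ == "__main__":
    store = DedupStore()
    first = store.write(b"a" * BLOCK_SIZE)
    again = store.write(b"a" * BLOCK_SIZE)
    other = store.write(b"b" * BLOCK_SIZE)
    print(first == again, first == other, len(store.blocks))
    print(index_overhead(512), index_overhead(64))
```

Passing a deliberately bad hash, e.g. DedupStore(hash_fn=lambda b: 0),
forces every block onto the same hash value; distinct blocks still get
distinct block numbers because of the verify step, which is the point
of not relying on the hash alone.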