Matija Grabnar wrote:
> Paul J Stevens wrote:
>> and you can select
>> the hash(es) you trust not to generate collisions. 
> Any hash you choose (as long as it is shorter than the messages) WILL
> generate collisions. It is a mathematical fact. You can not represent
> all possible attachments with a short hash value.
> (If you could, it wouldn't be a hash algorithm, it would be a
> compression algorithm ;-)

I think I understand the math. Adding support for stronger hashes and
compound hashes was done mostly to take the pressure off.
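
To illustrate what a compound hash could look like, here is a rough
sketch. It assumes "compound" simply means concatenating the digests of
several algorithms; the function name and the default algorithm list are
made up for this example and are not DBmail's actual code.

```python
import hashlib

def compound_digest(data: bytes, algorithms=("sha1", "sha256")) -> str:
    """Hypothetical helper: join the hex digests of several algorithms."""
    # Each algorithm hashes the same input; the results are concatenated
    # in the order given, so the key only matches if every digest matches.
    return "".join(hashlib.new(name, data).hexdigest() for name in algorithms)
```

Two different blobs would have to collide under every algorithm at once
to share a key. That makes a collision less likely, but the key is still
shorter than the data, so it does not make one impossible.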

> I re-iterate: regardless of which digest algorithm is chosen, the code
> MUST be able to detect and correctly handle collisions. Collisions WILL
> occur, regardless of the algorithm chosen. It is a mathematically
> provable fact.

The basic concept of single-instance storage relies on being able to
identify unique blobs. Using a hash is one way of quickly generating
identifiers for them. Detecting and dealing with collisions can be done
without too much loss in performance, but it will take some real care,
and I won't rush it if you don't mind.
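
One possible shape for collision handling, as a minimal in-memory sketch
rather than the planned implementation: use the digest only to find
candidates, then compare the bytes before treating two blobs as the same.
The class BlobStore, its methods, and the (digest, index) key are all
hypothetical names for this example.

```python
import hashlib

class BlobStore:
    """Toy single-instance store: digest lookup plus byte comparison."""

    def __init__(self, hash_func=None):
        # Default to SHA-256; a custom function can be passed in for testing.
        self._hash = hash_func or (lambda data: hashlib.sha256(data).hexdigest())
        # digest -> list of distinct blobs that share that digest
        self._buckets = {}

    def put(self, data: bytes):
        """Store data once and return a (digest, index) key for it."""
        digest = self._hash(data)
        bucket = self._buckets.setdefault(digest, [])
        for index, existing in enumerate(bucket):
            if existing == data:
                # Same digest and same bytes: a true duplicate, reuse it.
                return (digest, index)
        # Same digest but different bytes (or a new digest): keep it separately.
        bucket.append(data)
        return (digest, len(bucket) - 1)

    def get(self, key):
        """Return the blob stored under a key returned by put()."""
        digest, index = key
        return self._buckets[digest][index]
```

The byte comparison only runs when the digest already exists, so the
extra cost is paid on duplicates and collisions, not on every insert of
new data.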



-- 
  ________________________________________________________________
  Paul Stevens                                      paul at nfg.nl
  NET FACILITIES GROUP                     GPG/PGP: 1024D/11F8CD31
  The Netherlands________________________________http://www.nfg.nl
_______________________________________________
DBmail mailing list
[email protected]
https://mailman.fastxs.nl/mailman/listinfo/dbmail
