From: firebird-support@yahoogroups.com 
[mailto:firebird-support@yahoogroups.com] 
Sent: Thursday, February 9, 2017 11:33 AM
To: firebird-support@yahoogroups.com
Subject: RE: [firebird-support] Re: detect duplicate blobs, how to?

 

  



> You are aware of course that you can't use any hashing function on its own to 
> detect duplicates? - the best you can do is detect *probable* duplicates, 

Actually, if you choose the right hash function you can detect duplicates. 

If you create a UDF based on/using SHA256, the result would be unique (with a 
2^256 certainty) -- there is no known collision of a SHA256 hash 
(https://en.wikipedia.org/wiki/Hash_function_security_summary). 


Sean 

Even SHA256 can’t eliminate all possibility of a collision. Once files can be 
longer than 256 bits, the pigeonhole principle says there WILL be distinct 
files with the same hash within the universe of all possible files. There HAS 
to be. The probability of hitting one is very low as long as you keep your set 
of files well below 2^128 members (the birthday bound for a 256-bit hash), but 
it is NOT 0. The key property of a hash like SHA256 is that, given a hash 
value, you cannot create a file (other than by brute force) that will yield 
that hash value. When using a hash, you need to decide whether the chance of a 
false positive on a match is acceptable. With a good large hash, that 
probability gets very small, so maybe you can assume it is perfect. I would 
likely still compare the files to be sure, since you will likely only incur 
that cost if you do have a duplicate.
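
To make the "hash first, then compare to be sure" idea concrete, here is a 
rough sketch in Python rather than as a Firebird UDF. It assumes the blobs 
have already been fetched into a dict of id -> bytes; the function name and 
that layout are made up for illustration and are not anything Firebird 
provides. The full comparison only runs for blobs whose SHA256 values match, 
so the extra cost is paid only for probable duplicates.

```python
import hashlib
from collections import defaultdict


def find_duplicates(blobs):
    """Return lists of ids whose contents are byte-for-byte identical.

    `blobs` is a hypothetical dict mapping a blob id to its bytes.
    """
    # Pass 1: bucket ids by SHA256 digest. Equal digests mean
    # *probable* duplicates only.
    by_hash = defaultdict(list)
    for blob_id, data in blobs.items():
        by_hash[hashlib.sha256(data).digest()].append(blob_id)

    # Pass 2: confirm each candidate bucket with a full comparison.
    duplicates = []
    for ids in by_hash.values():
        if len(ids) < 2:
            continue  # unique digest, no comparison needed
        confirmed = []  # groups of ids with truly identical content
        for blob_id in ids:
            for group in confirmed:
                if blobs[group[0]] == blobs[blob_id]:
                    group.append(blob_id)
                    break
            else:
                # Same digest but different bytes: a real collision,
                # so this id starts its own group.
                confirmed.append([blob_id])
        duplicates.extend(g for g in confirmed if len(g) > 1)
    return duplicates
```

In a database you would more likely store the hash in an indexed column and 
run the byte comparison only on rows where that column matches, but the two 
passes are the same idea.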
