Claude Warren created COLLECTIONS-824:
-----------------------------------------

             Summary: BloomFilter: change name of SimpleHasher
                 Key: COLLECTIONS-824
                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-824
             Project: Commons Collections
          Issue Type: Improvement
          Components: Collection
    Affects Versions: 4.5
            Reporter: Claude Warren


The wikipedia article on [Double Hashing 
(variants)|https://en.wikipedia.org/wiki/Double_hashing#Variants] notes that 
this type of hasher when used in Bloom filters will result in a collision rate 
that is twice as likely as intended.

An alternative is to provide enhanced double hashing which changes the 
increment each loop iteration. This would require some additional work on 
wrapping the modulus of the increment.

It may be better to change the name to KMHasher and more explicitly state that 
the hash functions are:
 {{g_i(x) = (h1(x) + i h2(x)) mod m; i in [1, k]}}
 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

Reply via email to