aherbert commented on issue #140: Bloomfilter updates2
URL: 
https://github.com/apache/commons-collections/pull/140#issuecomment-599025222
 
 
   @Claudenw I have updated the check on the number of bits to be `< 1`.
   
   But it still seems to me you can create a shape that you would never want:
   ```java
   // necessary
   numberOfItems < numberOfBits
   // sensible
   numberOfHashFunctions < numberOfBits
   ```
   
   I think the first check may make the test on the calculated probability 
redundant. Looking at the unit tests you have to construct a Shape with number 
of items > number of bits to create a bad probability.
   
   I think the limitation to only allow shape creation when the probability 
computation is in the range (0, 1) when the filter is filled to exactly the 
expected number of items is strange. IMO it would be better to throw an an 
exception if the filter cannot be created with an actual probability at 
saturation close to the desired probability (e.g. within 1e-3). But then you 
have to define what is close. So this check perhaps should be dropped, and may 
be redundant if number of items is less than number of bits.
   
   I am also not sold on the the `getProbability()` function. It is limited to 
`n` set to the expected number of items. But you may want to know the 
probability of your Bloom filter that you have just filled with x items. So 
perhaps we should either have a static method in Shape to compute each of the 
standard functions defined in the header (i.e. n, m, p, k) and/or an instance 
level `getProbability(n)` that uses the fixed `m` and `k`.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

Reply via email to