aherbert commented on code in PR #358:
URL: 
https://github.com/apache/commons-collections/pull/358#discussion_r1017051177


##########
src/main/java/org/apache/commons/collections4/bloomfilter/BloomFilter.java:
##########
@@ -214,26 +215,48 @@ default boolean isFull() {
      * <p>By default this is the rounding of the {@code 
Shape.estimateN(cardinality)} calculation for the
      * shape and cardinality of this filter.</p>
      *
-     * <p>This produces an estimate roughly equivalent to the number of 
Hashers that have been merged into the filter.</p>
+     * <p>This produces an estimate roughly equivalent to the number of 
Hashers that have been merged into the filter
+     * by rounding the value from the calculation described in the {@link 
Shape} class javadoc.</p>
      *
-     * @return an estimate of the number of items in the bloom filter.
+     * <p><em>Note:</em></p>
+     * <ul>
+     * <li>if cardinality == numberOfBits, then result is 
Integer.MAX_VALUE.</li>
+     * <li>if cardinality &gt; numberOfBits, then an IllegalArgumentException 
is thrown.</li>
+     * </ul>
+     *
+     * @return an estimate of the number of items in the bloom filter.  Will 
return Integer.MAX_VALUE if the
+     * estimate is larger than Integer.MAX_VALUE.
+     * @throws IllegalArgumentException if the cardinality is &gt; 
numberOfBits as defined in Shape.
      * @see Shape#estimateN(int)
+     * @see Shape
      */
     default int estimateN() {
-        return (int) Math.round(getShape().estimateN(cardinality()));
+        double d = getShape().estimateN(cardinality());
+        if (Double.isInfinite(d)) {
+            return Integer.MAX_VALUE;
+        }
+        if (Double.isNaN(d)) {
+            throw new IllegalArgumentException("Cardinality too large: " + 
cardinality());
+        }
+        long l = Math.round(d);
+        return l > Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) l;

Review Comment:
   Could be changed to:
   ```Java
   return (int) Math.min(Integer.MAX_VALUE, l);
   ```
   However that does obfuscate the fact there is no coverage of this edge case.



##########
src/main/java/org/apache/commons/collections4/bloomfilter/BloomFilter.java:
##########
@@ -245,16 +268,43 @@ default int estimateUnion(final BloomFilter other) {
     /**
      * Estimates the number of items in the intersection of this Bloom filter 
with the other bloom filter.
      *
-     * <p>By default this is the {@code estimateN() + other.estimateN() - 
estimateUnion(other)} </p>
+     * <p>This method produces estimate is roughly equivalent to the number of 
unique Hashers that have been merged into both
+     * of the filters by rounding the value from the calculation described in 
the {@link Shape} class javadoc.</p>
      *
-     * <p>This produces estimate is roughly equivalent to the number of unique 
Hashers that have been merged into both
-     * of the filters.</p>
+     * <p><em>{@code estimateIntersection} should only be called with Bloom 
filters of the same Shape.  If called on Bloom
+     * filters of differing shape this method is not symmetric. If {@code 
other} has more bits an {@code IllegalArgumentException}
+     * may be thrown.</em></p>
      *
      * @param other The other Bloom filter
-     * @return an estimate of the number of items in the intersection.
+     * @return an estimate of the number of items in the intersection. If the 
calculated estimate is larger than Integer.MAX_VALUE then MAX_VALUE is returned.
+     * @throws IllegalArgumentException if the estimated N for the union of 
the filters is infinite.
+     * @see #estimateN()
+     * @see Shape
      */
     default int estimateIntersection(final BloomFilter other) {
         Objects.requireNonNull(other, "other");
-        return estimateN() + other.estimateN() - estimateUnion(other);
+        double eThis = getShape().estimateN(cardinality());
+        double eOther = getShape().estimateN(other.cardinality());
+        long estimate = 0L;
+        if (Double.isInfinite(eThis) && Double.isInfinite(eOther)) {
+            // if both are infinite the union is infinite and we return 
Integer.MAX_VALUE
+            return Integer.MAX_VALUE;
+        }
+        // if one is infinite the intersection is the other.
+        if (Double.isInfinite(eThis)) {
+            estimate = Math.round(eOther);
+        } else if (Double.isInfinite(eOther)) {
+            estimate = Math.round(eThis);
+        } else {
+            BloomFilter union = this.copy();
+            union.merge(other);
+            double eUnion = getShape().estimateN(union.cardinality());
+            if (Double.isInfinite(eUnion)) {
+                throw new IllegalArgumentException("The estimated N for the 
union of the filters is infinite");
+            }
+            // all estimated values are small values greater than 0 but less 
that number of bits

Review Comment:
   `less than`



##########
src/main/java/org/apache/commons/collections4/bloomfilter/BloomFilter.java:
##########
@@ -214,26 +215,48 @@ default boolean isFull() {
      * <p>By default this is the rounding of the {@code 
Shape.estimateN(cardinality)} calculation for the
      * shape and cardinality of this filter.</p>
      *
-     * <p>This produces an estimate roughly equivalent to the number of 
Hashers that have been merged into the filter.</p>
+     * <p>This produces an estimate roughly equivalent to the number of 
Hashers that have been merged into the filter
+     * by rounding the value from the calculation described in the {@code 
Shape} class javadoc.</p>
      *
-     * @return an estimate of the number of items in the bloom filter.
+     * <p><em>Note:</em></p>
+     * <ul>
+     * <li> if cardinality == numberOfBits, then result is 
Integer.MAX_VALUE.</li>
+     * <li> if cardinality &gt; numberOfBits, then an IllegalArgumentException 
is thrown.</li>
+     * </ul>
+     *
+     * @return an estimate of the number of items in the bloom filter.  Will 
return Integer.MAX_VALUE if the
+     * estimate is larger than Integer.MAX_VALUE.
+     * @throws IllegalArgumentException if the cardinality is &gt; 
numberOfBits as defined in Shape.
      * @see Shape#estimateN(int)
+     * @see Shape
      */
     default int estimateN() {
-        return (int) Math.round(getShape().estimateN(cardinality()));
+        double d = getShape().estimateN(cardinality());
+        if (Double.isInfinite(d)) {
+            return Integer.MAX_VALUE;
+        }
+        if (Double.isNaN(d)) {

Review Comment:
   Is this possible without a broken filter `cardinality` or `Shape.estimateN`?



##########
src/main/java/org/apache/commons/collections4/bloomfilter/BloomFilter.java:
##########
@@ -245,16 +268,43 @@ default int estimateUnion(final BloomFilter other) {
     /**
      * Estimates the number of items in the intersection of this Bloom filter 
with the other bloom filter.
      *
-     * <p>By default this is the {@code estimateN() + other.estimateN() - 
estimateUnion(other)} </p>
+     * <p>This method produces estimate is roughly equivalent to the number of 
unique Hashers that have been merged into both
+     * of the filters by rounding the value from the calculation described in 
the {@link Shape} class javadoc.</p>
      *
-     * <p>This produces estimate is roughly equivalent to the number of unique 
Hashers that have been merged into both
-     * of the filters.</p>
+     * <p><em>{@code estimateIntersection} should only be called with Bloom 
filters of the same Shape.  If called on Bloom
+     * filters of differing shape this method is not symmetric. If {@code 
other} has more bits an {@code IllegalArgumentException}
+     * may be thrown.</em></p>
      *
      * @param other The other Bloom filter
-     * @return an estimate of the number of items in the intersection.
+     * @return an estimate of the number of items in the intersection. If the 
calculated estimate is larger than Integer.MAX_VALUE then MAX_VALUE is returned.
+     * @throws IllegalArgumentException if the estimated N for the union of 
the filters is infinite.
+     * @see #estimateN()
+     * @see Shape
      */
     default int estimateIntersection(final BloomFilter other) {
         Objects.requireNonNull(other, "other");
-        return estimateN() + other.estimateN() - estimateUnion(other);
+        double eThis = getShape().estimateN(cardinality());
+        double eOther = getShape().estimateN(other.cardinality());
+        long estimate = 0L;
+        if (Double.isInfinite(eThis) && Double.isInfinite(eOther)) {
+            // if both are infinite the union is infinite and we return 
Integer.MAX_VALUE
+            return Integer.MAX_VALUE;
+        }
+        // if one is infinite the intersection is the other.
+        if (Double.isInfinite(eThis)) {
+            estimate = Math.round(eOther);
+        } else if (Double.isInfinite(eOther)) {
+            estimate = Math.round(eThis);
+        } else {
+            BloomFilter union = this.copy();
+            union.merge(other);
+            double eUnion = getShape().estimateN(union.cardinality());
+            if (Double.isInfinite(eUnion)) {
+                throw new IllegalArgumentException("The estimated N for the 
union of the filters is infinite");
+            }
+            // all estimated values are small values greater than 0 but less 
that number of bits
+            estimate = Math.round(eThis + eOther - eUnion);
+        }
+        return estimate>Integer.MAX_VALUE?Integer.MAX_VALUE:(int) estimate;

Review Comment:
   Formatting: `estimate > Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) 
estimate;`
   
   Could be changed to:
   ```Java
   return (int) Math.min(Integer.MAX_VALUE, estimate);
   ```
   However that does obfuscate the fact there is no coverage of this edge case.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to