Claudenw commented on PR #402: URL: https://github.com/apache/commons-collections/pull/402#issuecomment-1645018264
Recent changes: # Main code ## BloomFilterProducer * Cleaned up documentation to specify that implementations should specify if filters used in `forEachBloomFilterPair()` and `asBloomFilterArray` are copies or references. * Layer and depth order discussion moved to `LayerManager`. * `asBloomFilterArray` coding change accepted, now uses List internally and returns array. * `forEachBloomFilterPair` documentation corrected. * added `flatten()` ## CountingPredicate * documentation updated ## LayerManager * documentation updated. * `ExtendCheck` and `Cleanup` classes changed to have only static methods, no public static variables. * added parameter checks for `ExtendCheck.advanceOnCount#breakAt`, `ExtendCheck.advanceOnSaturation#maxN`, and `Cleanup.onMaxSize#maxSize` * removed the ``ExtendCheck.advanceOnSaturation(int)` method * LinkedList is required as the Bloom filter storage implementation (and `cleanup` argument) as we need rapid access to the intermediate layers in order to return the specific layers. * added method `addFilter()` to verify that the `filterSupplier` did not return `null`. * optimised `copy()` so that no filters are created during the construction * moved the check for empty filter in `next()` to be part of the cleanup consumer. * the `get()` method needs to be public so that access to the specific layer is permitted as required by the `LayeredBloomFilter`. * `target()` renamed to `getTarget()` * `clear()` now calls `addFilter()` to ensure a non-null filter is added. * `Builder.withX` pattern change to `Builder.setX` * Builder method documentation added. * changed Builder exceptions from IllegalArgument to NPE * The quesiton of whether to use a Builder here is still open. ## BloomFilter * added `isEmpty()` ## LayeredBloomFilter * Changed the hyphens in the cite for the article but left the "V1"s in the page numbers as that is how the publisher asks for it to be cited. * updated documentation * Access to layers is provided because there can be additional info associated with the Bloom filter. In the paper and in many implementations the find returns the deepest node that matches. Our finds returns the indexes that match so that the application may select the deepest, the shallowest, the one from the middle, or perhaps all of them and do some subsequent processing. * removed `clear(int)` and `target()` * added `contains(BloomFilterProducer)` to handle the special case. * Continued to return 0 (zero) for the `characteristics()` as the layered Bloom filter makes some assumptions that work best when it is considered non-sparse. As we don't know what other characteristics may be added in the future, it seems that defining the characteristics here is correct. In addition there is no requirement that all the layers be of the same characteristic. * notes on `forEachIndex()` in the general case notes for the method indicate that index ordering and uniqueness are not guaranteed. * `forEachBitMap()` may only return the nbumber of BitMaps specified by the Shape so it must use `flatten()`. * simplified `forEachBitMap()`, `estimateN()`, and `estimateUnion()` to explicitly use `flatten()` * removed redundant `estimateUnion()` * the use case for `next()` is using `ExtendCheck.neverAdvance()` and then triggering the advance on some outside state change/trigger. ## Shape * clarified `estimageMaxN()` documentation * fixed formatting ## WrappedBloomFilter * argument for not moving code to test directory: Let's consider this for a moment. The example in the test code is precisely what the Kafka KIP-936 is looking for. Now, I don't want to implement a "strange" bloom filter in this class, but I do want an easy way to extend the classes we have. Since they are all final decorating them with a wrapper is the only reasonable solution. So let's provide the solution. If we move it out, then I will have to ask to move it back when the KIP-936 begins development using this library. I believe they are starting that process now. * Updated documentation # Testing code ## DefaultBloomFilterTest * Provided access to the two (2) BloomFilter test classes that implement the Defaults (Sparse and non-Sparse). ## LayerManagerTest * Significantly modified `testOnMaxSize()` and `testAdvanceOnCount` to test wider range of values, including breaking out parameterized tests. * Adjusted to match the methods now provided in LayerManager ## LayeredBloomFilterTest * Removed checks of zero depth as it can not be returned. * Reduced time in `testExpiration()` to approx 600ms ## TestingHashers * converted to `ThreadLocalRandom.current()` * left `randomHasher()` as public so that it can be utilized in 3rd party Bloom filter testing. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
