Akanksha-kedia opened a new pull request, #18898: URL: https://github.com/apache/pinot/pull/18898
## Description

When a user changes the `fpp` (false positive probability) config for a bloom filter index, Pinot previously did NOT detect the change and would not rebuild the index. Users had to:

1. Remove the bloom filter config
2. Reload table
3. Re-add bloom filter config with new fpp
4. Reload table again

This PR adds fpp change detection to `BloomFilterHandler`, following the same pattern used for H3 index resolution detection (PR #16953). The detection works by comparing the number of hash functions stored in the existing bloom filter with what the new fpp config would produce (given the column's cardinality). If they differ, the bloom filter is removed and recreated with the updated config.

### Changes Made

- Modified `BloomFilterHandler.needUpdateIndices()` to check for fpp config changes on existing bloom filter columns
- Modified `BloomFilterHandler.updateIndices()` to remove and rebuild bloom filters when fpp config has changed
- Added `isFppChanged()` helper that reads `numHashFunctions` from the existing bloom filter data buffer
- Added `computeExpectedNumHashFunctions()` that mirrors Guava's BloomFilter formula to compute the expected number of hash functions from fpp and cardinality

## Related Issue

Fixes #17137

## Upgrade Notes

None. This is a purely additive behavior change: bloom filter indexes will now be automatically rebuilt when fpp config changes, instead of silently keeping the old index.

## Testing Done

- [x] Unit tests added: `SegmentPreProcessorTest#testBloomFilterFppUpdate` (tests both v1 and v3 segment formats)
- [x] Test verifies: creating bloom filter with fpp=0.1, confirming no processing needed, changing fpp to 0.01, confirming processing IS needed, rebuilding, and confirming no further processing needed
- [x] All existing bloom filter tests pass
- [x] Checkstyle, spotless, and license checks pass
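
### Illustrative sketch of the detection

The hash-function comparison described under Description can be sketched roughly as follows. This is not the code from the PR: the class `BloomFppSketch` and the signatures below are hypothetical, and the sketch assumes the standard Bloom filter sizing formulas (`m = -n * ln(p) / (ln 2)^2` bits and `k = max(1, round(m / n * ln 2))` hash functions). Whether those match the Guava version Pinot builds against is an assumption, as is treating an empty column as one expected insertion.

```java
/**
 * Illustrative sketch only; not the code from this PR.
 * All names and signatures here are hypothetical.
 */
final class BloomFppSketch {
  private BloomFppSketch() {
  }

  /** Bits for n expected insertions at false positive probability p: m = -n * ln(p) / (ln 2)^2. */
  static long optimalNumOfBits(long n, double p) {
    if (p == 0) {
      p = Double.MIN_VALUE; // avoid ln(0)
    }
    return (long) (-n * Math.log(p) / (Math.log(2) * Math.log(2)));
  }

  /** Hash functions for n insertions and m bits: k = max(1, round(m / n * ln 2)). */
  static int optimalNumOfHashFunctions(long n, long m) {
    return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
  }

  /** Number of hash functions a filter built with this cardinality and fpp would use. */
  static int expectedNumHashFunctions(int cardinality, double fpp) {
    long n = Math.max(1, cardinality); // assumption: an empty column counts as one insertion
    return optimalNumOfHashFunctions(n, optimalNumOfBits(n, fpp));
  }

  /** True when the stored hash-function count differs from what the new fpp would produce. */
  static boolean isFppChanged(int storedNumHashFunctions, int cardinality, double newFpp) {
    return storedNumHashFunctions != expectedNumHashFunctions(cardinality, newFpp);
  }

  public static void main(String[] args) {
    int cardinality = 1000;
    // Pretend the existing index was built with fpp=0.1.
    int stored = expectedNumHashFunctions(cardinality, 0.1);
    System.out.println("stored numHashFunctions: " + stored);
    System.out.println("rebuild for fpp=0.1?  " + isFppChanged(stored, cardinality, 0.1));
    System.out.println("rebuild for fpp=0.01? " + isFppChanged(stored, cardinality, 0.01));
  }
}
```

Under these assumptions, the comparison only sees fpp changes that alter the rounded hash-function count. For example, at cardinality 1000 both fpp=0.05 and fpp=0.06 yield 4 hash functions in this sketch, so such a change would not trigger a rebuild. Whether the actual implementation behaves the same way would need to be checked against the PR's code.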
