I would suggest you consider an alternative data structure: a Cuckoo
Filter or a Golomb Compressed Sequence (GCS).

The GCS data structure was introduced in "Cache-, Hash- and Space-Efficient
Bloom Filters"
<http://algo2.iti.kit.edu/documents/cacheefficientbloomfilters-jea.pdf> by
F. Putze, P. Sanders, and J. Singler. See section 4.
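
To make the GCS idea a bit more concrete, here is a minimal sketch in
Python. It is not taken from the paper or from any Flink code: the names
gcs_build, gcs_contains and p_bits are made up for illustration, SHA-256 is
just a convenient hash, and I use Golomb-Rice coding (the power-of-two
special case of Golomb coding) with a plain string of '0'/'1' characters
instead of packed bits. The idea is to hash each key into the range
[0, n * 2^p_bits), sort the hashes, and store only the gaps between
neighbours, which gives a false-positive rate of roughly 2^-p_bits.

```python
import hashlib


def _hash_to_range(item: str, modulus: int) -> int:
    """Map an item to an integer in [0, modulus) (SHA-256 is an arbitrary choice)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % modulus


def gcs_build(items, p_bits: int):
    """Build a toy GCS. Returns (n, bits); requires p_bits >= 1."""
    distinct = set(items)
    n = len(distinct)
    # Hash into a range of size n * 2^p_bits, then sort and de-duplicate.
    values = sorted({_hash_to_range(x, n << p_bits) for x in distinct})
    mask = (1 << p_bits) - 1
    out = []
    prev = 0
    for v in values:
        delta = v - prev
        prev = v
        out.append("1" * (delta >> p_bits) + "0")       # quotient in unary
        out.append(format(delta & mask, f"0{p_bits}b"))  # remainder, fixed width
    return n, "".join(out)


def gcs_contains(gcs, item: str, p_bits: int) -> bool:
    """Membership test: decode the gaps sequentially until we reach or pass the target."""
    n, bits = gcs
    if n == 0:
        return False
    target = _hash_to_range(item, n << p_bits)
    pos = 0
    value = 0
    while pos < len(bits):
        quotient = 0
        while bits[pos] == "1":
            quotient += 1
            pos += 1
        pos += 1  # skip the '0' that terminates the unary part
        remainder = int(bits[pos:pos + p_bits], 2)
        pos += p_bits
        value += (quotient << p_bits) | remainder
        if value == target:
            return True
        if value > target:
            return False
    return False


keys = ["flink", "bloom", "cuckoo", "golomb"]
gcs = gcs_build(keys, p_bits=10)
print(all(gcs_contains(gcs, k, 10) for k in keys))
```

Every inserted key is always found (there are no false negatives). Lookups
in this sketch decode the whole sequence from the start, so a real
implementation would pack the bits and would probably need some kind of
index to avoid a full scan.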



> We should discuss which exact implementation of Bloom filters is the best
> fit.
> @Fabian: There are also implementations of Bloom filters that use counting
> and therefore support deletes, but obviously this comes at the cost of
> potentially higher space consumption.
>
> On 23.05.2018 at 11:29, Fabian Hueske <fhue...@gmail.com> wrote:
>> IMO, such a feature would be very interesting. However, my concern with
>> Bloom filters is that they are insert-only data structures, i.e., it is
>> not possible to remove keys once they have been added. This might render
>> the filter useless over time.
>> In a different thread (see discussion in FLINK-8918 [1]), you mentioned
>> that the Bloom filters would be growing.
>> If we keep them in memory, how can we prevent them from exceeding memory
>> boundaries over time?
>
>
