While the Deque makes clear the idea of enqueueing and dequeueing  the
layers it does not have the method to natively traverse and extract entries
from the middle of the queue.  Nor would I expect it to.  So I think the
Deque does not accurately reflect how the collection of Bloom filters is
utilized.

On Wed, Apr 17, 2024 at 2:17 PM Alex Herbert <alex.d.herb...@gmail.com>
wrote:

> Looks good to me.
>
> Any opinions on changing the LayerManager to keep the layers in a Deque
> rather than a LinkedList. I think it would only require a change to the
> following method:
>
> public final BloomFilter get(int depth)
>
> Performance will be the same as the Deque can be a LinkedList. This is more
> about how any custom downstream code is currently using the collection of
> layers.
>
> Alex
>
> On Wed, 17 Apr 2024 at 10:00, Claude Warren <cla...@xenei.com> wrote:
>
> > I have an open pull request to fix this problem.  I could use another
> > review: https://github.com/apache/commons-collections/pull/476
> >
> >
> > On Tue, Apr 9, 2024 at 11:29 AM Claude Warren <cla...@xenei.com> wrote:
> >
> > > Alex,
> > >
> > > I like your solution.  To answer your question. We create a BloomFilter
> > > that has a timestamp associated with it.  When the timestamp is greater
> > > than System.currentTimeMillis() the filter is removed.  The custom
> > cleanup
> > > calls Cleanup.removeEmptyTarget().andThen(<timestampCleanup>)
> > >
> > > I think that creating a cleanup() or clean() method on the
> > > LayeredBloomFilter is the appropriate solution and that it should call
> > > cleanup() on the LayerManager. (so 2 new methods, one exposed).
> > >
> > > The next() method is used when external circumstances dictate that a
> new
> > > layer should be created.  I think a StableBloomFilter I implemented
> > > required it,  but I do not have the code to hand at the moment.
> > >
> > > Claude
> > >
> > >
> > > On Tue, Apr 9, 2024 at 10:38 AM Alex Herbert <alex.d.herb...@gmail.com
> >
> > > wrote:
> > >
> > >> Hi Claude,
> > >>
> > >> Q. What is your current clean-up filter, i.e.
> > >> the Consumer<LinkedList<BloomFilter>>? I assume you are using a custom
> > >> one.
> > >>
> > >> The current collections code only has 2 functional implementations.
> One
> > >> will remove the newest filter if it is empty, one will remove the
> oldest
> > >> filters until the size is below a limit. Since neither of those will
> > >> iterate the list and purge stale objects then I assume you are using a
> > >> custom clean-up filter. So you had to have created the layer manager
> > with
> > >> your custom filter. Assuming this then there are at least two
> solutions
> > >> for
> > >> the current code:
> > >>
> > >> 1. The current implementation always calls the clean-up filter with
> the
> > >> same LinkedList since it is final. So you can capture the list and do
> > what
> > >> you want with it:
> > >>
> > >>         @SuppressWarnings("rawtypes")
> > >>         LinkedList[] captured = new LinkedList[1];
> > >>         Consumer<LinkedList<BloomFilter>> cleanup = list -> {
> > >>             captured[0] = list;
> > >>             // ... do clean-up
> > >>         };
> > >>
> > >>         // Once you have captured the list, you can clean it when you
> > >> want:
> > >>         // unchecked conversion
> > >>         cleanup.accept(captured[0]);
> > >>
> > >> Obviously this is not ideal as you have to manage the captured list to
> > >> call
> > >> cleanup. But it delivers exactly what you require in terms of being
> able
> > >> to
> > >> call cleanup at any time.
> > >>
> > >> 2. The call to next() will clean the layers but also add a new layer.
> So
> > >> your custom clean method could clean stale objects and also any empty
> > >> filters not at the end of the list. This will avoid building up lots
> of
> > >> empty filters when you frequently trigger next() to purge stale
> filters.
> > >> You can call next() directly on the LayeredBloomFilter. I do not know
> > what
> > >> extend check you are using so there is some management to be done with
> > the
> > >> other settings of the LayerManager to avoid removing any functional
> > layers
> > >> which are currently empty.
> > >>
> > >> --
> > >>
> > >> As to exposing the LayerManager and adding a clean() method to the
> > >> LayerManager, I think this is not in keeping with the current design.
> > The
> > >> LayerManager is used during construction and then never used again. So
> > >> functionality to act on the layers is public through the
> > >> LayeredBloomFilter
> > >> (e.g. calling next()). So perhaps the change to the API should be to
> > add a
> > >> cleanup() method to LayeredBloomFilter. This does the same as next(),
> > but
> > >> does not add a new layer.
> > >>
> > >> I cannot recall the use case for next() in the LayeredBloomFilter.
> Would
> > >> the addition of cleanup() make the next() method redundant?
> > >>
> > >> --
> > >>
> > >> Note: The typing against LinkedList could be updated to
> java.util.Deque.
> > >> The only issue with this is the method:
> > >> public final BloomFilter get(int depth)
> > >>
> > >> This is not supported by the Deque interface. However the LinkedList
> > >> implementation of get(int) will use the iterator from the start or end
> > of
> > >> the list (whichever is closer) to find the element. This can use the
> > >> iterator/descendingIterator method of Deque for the same performance
> > (but
> > >> the code to do this has to be written).
> > >>
> > >> Alex
> > >>
> > >>
> > >> On Tue, 9 Apr 2024 at 08:45, Claude Warren <cla...@xenei.com> wrote:
> > >>
> > >> > Greetings,
> > >> >
> > >> > I am attempting to use the Bloomfilter code in Kafka to manage PID
> > >> > generation.  The requirement is to remove pid tracking after a
> period
> > of
> > >> > time.  This is possible with the LayeredBloomFilter but it has an
> edge
> > >> case
> > >> > problem.
> > >> >
> > >> > The LayeredBloomFilter uses the LayerManager to manage the filters
> > that
> > >> > comprise the layers of the LayerdBloomFilter.
> > >> > The LayerManager uses a Consumer<LinkedList<BloomFilter>> called
> > >> > filterCleanup to remove old layers.
> > >> > The filterCleanup is only called when a new layer is added to the
> > >> layered
> > >> > filter.
> > >> >
> > >> > This solution works well in the general case where data is flowing
> > >> through
> > >> > the layered filter.  However if nothing is added to the filter,
> > >> > filterCleanup is not called.
> > >> >
> > >> > In the Kafka case we have a LayeredBloomFilter for PIDs for each
> > >> producer.
> > >> > As long as a producer is producing PIDs the filter gets updated.
> > >> >
> > >> > However, if a producer drops from the system or goes offline for a
> > >> period
> > >> > of time, then they will no longer be producing PIDs and their old
> > >> expired
> > >> > data will remain.
> > >> >
> > >> > We want to remove the producer from the collection when there are no
> > >> more
> > >> > PIDs being tracked.
> > >> >
> > >> > I think this can be solved by adding a clean() method to the
> > >> LayerManager
> > >> > that simply calls the existing filterCleanup.
> > >> > It would be easier to access this method if the LayeredBloomFilter
> > had a
> > >> > method to return the LayerManager that was passed in the
> constructor.
> > >> >
> > >> > Does anyone see any issues with this approach?  Are there other
> > >> solutions
> > >> > to be had?
> > >> >
> > >> > Questions and comments welcomed.
> > >> > --
> > >> > LinkedIn: http://www.linkedin.com/in/claudewarren
> > >> >
> > >>
> > >
> > >
> > > --
> > > LinkedIn: http://www.linkedin.com/in/claudewarren
> > >
> >
> >
> > --
> > LinkedIn: http://www.linkedin.com/in/claudewarren
> >
>


-- 
LinkedIn: http://www.linkedin.com/in/claudewarren

Reply via email to