A long time ago there was some discussion about large listener lists, and
how poorly they perform when cleaning up (removing existing listeners).

A listener list basically needs to support the following operations:

- append at end, allowing duplicates
- remove by value, removing the first (oldest) match only
- iteration

It has no other needs.
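
For concreteness, the contract could be sketched as something like this
(the name ListenerBag is just a placeholder I'm using here, not an
actual API):

    // Illustrative contract only; names are made up for this sketch.
    interface ListenerBag<T> extends Iterable<T> {
        void add(T listener);        // append at end, duplicates allowed
        boolean remove(T listener);  // remove first (oldest) match only
        // iteration comes from Iterable<T>
    }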

There was some discussion about whether this could be replaced with a
LinkedHashMap, but it handles duplicates subtly differently and is not
very fast at iteration (lots of random memory accesses).
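
The duplicate difference is easy to demonstrate; a LinkedHashMap simply
cannot hold duplicates at all, and re-inserting a key keeps its old
position rather than moving it to the end:

    import java.util.ArrayList;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    public class DuplicateSemantics {
        public static void main(String[] args) {
            List<String> list = new ArrayList<>();
            list.add("a"); list.add("b"); list.add("a");
            list.remove("a");                  // removes the first "a"
            System.out.println(list);          // [b, a]

            Map<String, Boolean> map = new LinkedHashMap<>();
            map.put("a", true); map.put("b", true);
            map.put("a", true);                // collapsed; "a" keeps its old position
            System.out.println(map.keySet());  // [a, b], not [b, a]
        }
    }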

So I created what is best described as an "ordered" bag.  It allows
addition and removal while maintaining order, and because it is backed
by (amongst others) an ordered array, iteration is about as fast as
ArrayList's.  It has O(1) insertion, O(1) removal and O(n) iteration,
but about 3x the memory requirements (not including the cost of the
listeners themselves), unless the list is small (for small lists the
overhead is only slightly higher than ArrayList's).

Insertion is about 5x slower than ArrayList; removal is far faster (150x
faster in a 10,000 listener case); iteration is almost equally fast.

Because it has exactly the same semantics as an (Array)List with regard
to duplicates and their removal order, it is a drop-in replacement.
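
In other words, usage would look just like a List (OrderedBag is again
a placeholder name for the structure):

    // Illustrative only; "OrderedBag" is a placeholder name.
    OrderedBag<Runnable> listeners = new OrderedBag<>();
    Runnable a = () -> System.out.println("a");
    Runnable b = () -> System.out.println("b");

    listeners.add(a);
    listeners.add(b);
    listeners.add(a);     // duplicates allowed, as with ArrayList
    listeners.remove(a);  // removes the oldest "a", as List.remove(Object) would

    for (Runnable r : listeners) {
        r.run();          // runs b, then a; same order ArrayList would give
    }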

Internally it works by maintaining an ordered array (basically what
ArrayList has) which is allowed to have removal gaps that are skipped
during iteration.  When the array needs to grow, it first checks whether
it can consolidate the gaps before increasing the size (and it also
shrinks on demand, unlike ArrayList).  Beyond that, for lists above a
certain size, it keeps three additional arrays; these form hashed linked
lists of similar elements (basically a singly linked list per bucket,
but with fast appends at the end thanks to a tail pointer).
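
To make that layout concrete, the fields and the gap-skipping iteration
could look roughly like this (field names are my own, error handling
and growth/consolidation are omitted):

    import java.util.Iterator;

    public class OrderedBag<T> implements Iterable<T> {
        private Object[] elements = new Object[8];  // ordered storage; removals leave null gaps
        private int size;                           // next append position (gaps included)

        // Allocated only above a certain size; null while the bag is small:
        private int[] bucketHead;  // index of first element in each hash bucket, or -1
        private int[] bucketTail;  // index of last element, for O(1) appends to a chain
        private int[] next;        // next index within the same bucket's chain, or -1

        @Override
        public Iterator<T> iterator() {
            return new Iterator<T>() {
                private int i = skipGaps(0);

                private int skipGaps(int from) {
                    while (from < size && elements[from] == null) {
                        from++;  // gaps left by removals are simply skipped
                    }
                    return from;
                }

                @Override
                public boolean hasNext() {
                    return i < size;
                }

                @Override
                @SuppressWarnings("unchecked")
                public T next() {
                    T element = (T) elements[i];
                    i = skipGaps(i + 1);
                    return element;
                }
            };
        }
    }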

When the number of listeners is low, the implementation falls back on
the simple array alone (and the other three arrays are nulled), which
means that for small lists the overhead is minimal.  It's a
best-of-both-worlds approach (although there is always some overhead):
it uses minimal memory for small lists and simply scans the entire list
for removals, while for large lists it initializes the additional
structures to allow for quick removals at any size.
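
Continuing the sketch above, the removal path would branch on whether
the auxiliary arrays exist; the bucket math and unlinking here are my
guess at one reasonable implementation, not the actual code:

    public boolean remove(Object listener) {
        if (bucketHead == null) {
            // Small bag: a plain linear scan finds the oldest match.
            for (int i = 0; i < size; i++) {
                if (listener.equals(elements[i])) {
                    elements[i] = null;  // leave a gap; iteration skips it
                    return true;
                }
            }
            return false;
        }

        // Large bag: walk only the chain for this listener's hash bucket.
        // Chains are appended at the tail, so walking from the head finds
        // the oldest match first, matching List.remove(Object) semantics.
        int bucket = (listener.hashCode() & 0x7fffffff) % bucketHead.length;

        for (int i = bucketHead[bucket], prev = -1; i != -1; prev = i, i = next[i]) {
            if (listener.equals(elements[i])) {
                if (prev == -1) {
                    bucketHead[bucket] = next[i];  // unlink the chain head
                } else {
                    next[prev] = next[i];          // unlink a later node
                }
                if (bucketTail[bucket] == i) {
                    bucketTail[bucket] = prev;     // removed node was the tail
                }
                elements[i] = null;                // gap out the slot itself
                return true;
            }
        }
        return false;
    }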

Thoughts?

--John
