>>>>> "Roman" == Roman Kennke <[EMAIL PROTECTED]> writes:

Roman> We have a possible memory leak in our GapContent implementation. The
Roman> problem is that the GapContent stores all the Position objects it
Roman> creates in an ArrayList, but since there is no way (except
Roman> WeakReferences) to detect when they are no longer used, they are never
Roman> removed from that array. I fixed this by storing the Position objects in
Roman> WeakReferences and adjusting the class to handle this. This involved
Roman> implementing a special Comparator that compares weakly referenced
Roman> objects and a clearing mechanism that clears up the ArrayList before new
Roman> Positions are created.

Ugh, I really don't like this API!  Since Positions are implicitly
managed, I think we need to use WeakReferences... but these are really
hard to use correctly.  All this ugliness could be avoided with a
simple 'invalidate' API for Positions :-(

Roman> +  private class WeakPositionComparator
Roman> +    implements Comparator

This class should have an explicit constructor to avoid an accessor
constructor.  (This is very minor.)

Roman> +      if (p1 == null || p2 == null)
Roman> +        retVal = -1;

This doesn't seem right, since it means that the comparison is not
symmetric.  But see below.

Roman> +    clearPositionReferences();

Because a WeakReference can be cleared at any time, there's no point
to doing this since all the code that walks the ArrayList will have to
deal with the null case anyway... so you might as well remove elements
lazily.

Roman> +    int index = Collections.binarySearch(positions, r,
Roman> +                                         new WeakPositionComparator());

Because WeakPositionComparator does not have any state, you only need
a single instance, which can be reused everywhere.

But... I think any binary search of this array is inherently racy,
because elements may disappear as you are searching.  It isn't clear
what compareTo ought to return when it sees a null element, since the
correct result will depend on what is going on in the search right
now, and this is unknowable.


I think there are different approaches that avoid the races and let us
keep the nice binary searchability property of the current
implementation.

There are a few ways to do this but the basic idea is that the
GapPosition we give to the user doesn't hold the marker itself, but
rather some kind of reference to the marker.  When we notice that a
GapPosition is destroyed, we remove the corresponding marker.  This is
less racy because we explicitly control when the contents of the
marker data structure are updated -- it doesn't matter if we update a
few "extra" markers on occasion.


One way to do this is to reference count the markers (you could of
course have a unique GapContentPosition for each marker position, but
I think that ends up being about the same amount of work and
overhead).

So instead of the current ArrayList we would have:

    // This is a sorted list of positions where a marker appears.
    int[] markers;
    // This is a parallel array of reference counts.
    int[] refCounts;

Then we'd have a way to map into the markers array:

    // This maps sequential integers to indices in the markers array.
    int[] contentIndices;

A GapContentPosition would hold an 'int index' which would be used to
find the actual marker.  So for instance GapContentPosition.getOffset
would do something like:

    int location = markers[contentIndices[index]];

When creating a new GapContentPosition we would grow contentIndices.
Then there are 2 cases:

1. The marker already exists.  The new value for contentIndices is the
   index of the marker.  We increment the marker's reference count.
2. The marker does not exist.  We grow markers and refCounts and add
   the marker.  Then we update the contentIndices array to reflect
   the addition.

For noticing the death of a GapContentPosition, we would use a
ReferenceQueue and do the clearing in clearPositionReferences:

  ReferenceQueue deathQueue = new ReferenceQueue();

  private void clearPositionReferences()
  {
    Reference r;
    while ((r = deathQueue.poll()) != null)
      {
        GapContentReference gcr = (GapContentReference) r;
        int contentIndex = gcr.getIndex();
        // Now update contentIndices, reference counts, etc.
      }
  }

(Really you'd want to be smarter about doing this to avoid excess
copying, but you get the idea.)

One last part is to have a new GapContentReference which extends
WeakReference (or PhantomReference, it doesn't really matter) and
which holds the index into the contentIndices array.  When creating a
new GapContentPosition you would also register the reference:

    new GapContentReference(blah, deathQueue);


I keep thinking that there must be a simpler way, but offhand I can't
think of it.  Hopefully you can :-).. I'm probably missing something
obvious.  Anyway the point is that we can preserve the nice (required,
really) features of the current approach while avoiding the raciness
of using WeakReferences directly in the model.

Tom

Reply via email to