Hi Gil,
I totally agree with your assessment. We should not introduce another
way of reviving the almost collectable objects and I fully support
tightening the specification so that soft and weak references to the
same referent and to other referents from which this referent is
reachable are required to be cleared together atomically.
I modified the prototype to (hopefully) adhere to this new Ephemeron
specification that Gil and I agreed upon. Anyone interested in
experimenting can find it here:
http://cr.openjdk.java.net/~plevart/misc/Ephemeron/webrev.jdk.02/
http://cr.openjdk.java.net/~plevart/misc/Ephemeron/webrev.hotspot.02/
It is rebased to current tip of jdk9-dev repositories (after the bulk of
merges for jdk-9+102), but still contains the change to remove the
Cleaner reference type as it has not yet managed to get in...
I have also added a test that is a start for verifying the functionality.
Regards, Peter
On 01/23/2016 07:25 PM, Gil Tene wrote:
On Jan 23, 2016, at 5:14 AM, Peter Levart <peter.lev...@gmail.com> wrote:
Hi Gil, it's good to have this discussion. See comments inline...
On 01/23/2016 05:13 AM, Gil Tene wrote:
....
On Jan 22, 2016, at 2:49 PM, Peter Levart <peter.lev...@gmail.com>
wrote:
Ephemeron always touches definitions of at least two consecutive
strengths of reachabilities. The prototype says:
* <li> An object is <em>weakly reachable</em> if it is neither
* strongly nor softly reachable but can be reached by traversing a
* weak reference or by traversing an ephemeron through its value while
* the ephemeron's key is at least weakly reachable.
* <li> An object is <em>ephemerally reachable</em> if it is neither
* strongly, softly nor weakly reachable but can be reached by traversing an
* ephemeron through its key or by traversing an ephemeron through its value
* while its key is at most ephemerally reachable. When the ephemerons that
* refer to an ephemerally reachable key object are cleared, the key object becomes
* eligible for finalization.
Looking into this a bit more, I don't think the above is quite
right. Specifically, if an ephemeron's key is either strongly or
softly reachable, you want the value to remain appropriately
strongly/softly reachable. Without this quality, Ephemeron value
referents can (and will) be prematurely collected and finalized
while the keys are not. This (IMO) needed quality is not provided by
the behavior you specify…
This is not quite true. While an ephemeron's value is weakly or even
ephemerally-reachable, it is not finalizable, because
ephemerally-reachable is stronger than finally-reachable. After an
ephemeron's key becomes ephemerally-reachable, the ephemeron is
cleared by GC, which sets its key *and* value to null atomically. The
life of key and value at that moment becomes untangled. Either of
them can have a finalizer or not and both of them will eventually be
collected if not revived by their finalize() methods. But it can
never happen that ephemeron's value is finalized or collected while
its key is still reachable through the ephemeron (while the
ephemeron is not cleared yet).
But I agree that it would be desirable for ephemeron's value to
follow the reachability of its key. In the above specification, if the
key is strongly reachable, the value is weakly reachable, so any
WeakReferences or SoftReferences pointing at the Ephemeron's value
can already be cleared while the key is still strongly reachable.
This is arguably no different than current specification of Soft vs.
Weak references. A SoftReference can already be cleared while its
referent is still reachable through a WeakReference,
We seem to agree about the cleaner behavior specification (in both of
our texts below), so these next paragraphs are really about
arguing for why this is an important design choice if/when adding
Ephemerons to Java:
It is true the [current] spec allows for soft references to an object
to be cleared while weak references to the same object are not: the
"determines" in "Suppose that the garbage collector determines at a
certain point in time that an object is RRRR reachable..." part
[for RRRR = {soft, weak}] does not have to happen at the same "certain
point in time".
However, to my knowledge all current implementations present as if
this determination is happening at the same "point in time" for all
weakly and softly reachable objects combined. Specifically [in
implementations]: if soft reachability is determined for an object at
some point in time, then weak reachability for that object is
determined at the same point in time. And the weak reachability
determination for an object depends on whether the collector chose to
clear existing soft references to that object at that same point in
time, with the appearance of the choice to clear (or not to clear)
soft references to a given object atomically affecting the
determination of its weak reachability. Since the collector is
*required* to act on a weak determination when it is made, while it
*may* act on a soft determination when it is made, making the combined
determination at the same "point in time" eliminates an obviously
confusing situation that is not prohibited by the spec: if the
determination for weak and soft reachability was not done at the same
point in time, then an object that was softly reachable and had its
soft references cleared and queued could later become strongly
reachable, and even softly reachable again. When reference processing
is done as a STW thing, this "combined determination" effect is a
trivial side-effect of STW. When it is done concurrently (or
incrementally?), implementations still work to maintain the appearance
of combined atomic determination of soft and weak reachability. I know
ours does. In our case, we do it because we had no desire to be the
ones to argue "I know that all implementations did this atomically
because they were STW, but the spec allows us to add this bug to your
program…".
So in actual implementations (to my knowledge), finalization is
currently the only mechanism that can create this "strange situation"
where an object was no longer strongly reachable, had actions
triggered as a result from loss of strong reachability (i.e. actually
observed by the program as "known to not be strongly reachable"), and
later became strongly reachable again. E.g. a finalizer can propagate
a strong reference to a previously non-strongly reachable object
('this' in the finalizer, or anything that 'this' transitively refers
to and that was not otherwise reachable when the finalizer was called). This
is one of those "undesired" things that the introduction of Reference
types was meant to deal with (Reference types were introduced in 1.2,
after finalization was unfortunately already included and spec'ed. And
phantom refs were meant to allow for a cleaner form that could replace
finalization). And while the specifications of SoftReference and
WeakReference do not prohibit it, implementations are not required to
allow it, and in practice none of them do (I think), as doing so would
most likely expose some "interesting"
spec-allowed-but-extremely-surprising things/bugs that none of us want
to have to defend...
In this context, it would be a "highly undesirable" design choice to
introduce Ephemerons in a way that would allow them to return a strong
reference to an object that has previously been determined to no
longer be strongly reachable. Structuring the spec to prohibit this is
a better design choice.
To highlight the design choice here, let me describe a specific
problem scenario for which the previous (above) spec would cause
"re-strengthening" behavior that would break assumptions that are
allowed under the current spec: in the above/previously specified
behavior an object V that is known to have no finalizers, but has e.g.
3 WeakReference objects that refer to it, can become weakly reachable
while both a key referent object K and some ephemeron E (whose key referent is K and
whose value referent is V) remain strongly reachable. At such a point (V is weakly
reachable, K and E are strongly reachable), the collector may
determine weak reachability for V, [atomically] clear all weak
references to V, and enqueue those weak reference objects on their
respective queues. While V is still ephemerally reachable under your
previous definition, there are no references to it anywhere other than
in ephemeron value referent fields, and weak references that did refer
to it have been cleared and queued. Since the ephemeron is still
there, and the key is still there, and the ephemeron has not been
cleared, an Ephemeron.getValue() call would create a strong reference
to an object that was previously determined to be only weakly
reachable. Re-creating a strong reference to V after the point where
weak references to V were cleared and the weak refs to it were
enqueued would be "surprising" to current weak reference based code
(the only thing that could cause this under the current spec would be
a finalizer), so allowing that (in the spec) is likely to break all
kinds of logic that depends on currently spec'ed weak reference behaviors.
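
The kind of weak-reference logic that this scenario would break can be sketched with today's java.lang.ref API. This is only an illustration under stated assumptions: the class and method names (RestrengtheningSketch, Resource, Tracker, drain, demonstrate) are hypothetical, the explicit clear()/enqueue() calls stand in for what the collector would do, and the local variable v stands in for what an Ephemeron.getValue() call could hand back under the earlier draft wording.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

final class RestrengtheningSketch {
    /** Hypothetical side resource (e.g. off-heap memory) that belongs to a heap object V. */
    static final class Resource {
        boolean released;
    }

    /** Weak reference that remembers which resource to free once V is gone. */
    static final class Tracker extends WeakReference<Object> {
        final Resource resource;

        Tracker(Object v, Resource resource, ReferenceQueue<Object> queue) {
            super(v, queue);
            this.resource = resource;
        }
    }

    /** Typical cleanup loop: a dequeued tracker is read as "V is gone, free its resource". */
    static int drain(ReferenceQueue<Object> queue) {
        int released = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            ((Tracker) ref).resource.released = true;
            released++;
        }
        return released;
    }

    /** Returns true if V is usable again after its resource was already released. */
    static boolean demonstrate() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object v = new Object(); // stand-in for the ephemeron's value referent V
        Resource resource = new Resource();
        Tracker tracker = new Tracker(v, resource, queue);

        // Stand-in for the collector: V was determined to be weakly reachable,
        // so its weak references are cleared and enqueued.
        tracker.clear();
        tracker.enqueue();
        drain(queue);

        // Stand-in for a later Ephemeron.getValue() call returning V to the program.
        Object revived = v;
        return revived != null && tracker.get() == null && resource.released;
    }

    public static void main(String[] args) {
        System.out.println("V usable after cleanup ran: " + demonstrate());
    }
}
```

Under the tightened wording, V would be strongly reachable for as long as K and E are, so the collector would never clear and enqueue the tracker in the first place.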
The spec'ed behavior we seem to be agreeing on (below) would prohibit
this loophole and would [I think] maintain any reachability-based
expectations that current weak-ref based logic can make under the
current spec. Maintaining this continuity is an important design
choice for adding Ephemerons into the current set of Reference behaviors.
And since I suspect that all implementations will continue to choose
to do the "determination" of soft and weak reachability at the same
"point in time", this will fit well with how people would build this
stuff anyway.
Separate note: It would be separately interesting to consider
narrowing the SoftRef spec to require JVM implementations to
atomically clear all soft *and* weak references to an object at the
same time. I.e. if the garbage collector chooses to clear a soft
reference to an object that would become weakly reachable as a result,
then all weak references to that object must be [atomically] cleared
at the same time. Since I suspect that all current JVM implementations
actually adhere to this stronger requirement already, this would not
"hurt" anything or require extra work to comply with. [Anyone from
Metronome or some other non-STW reference processing implementations
want to chime in?].
but for Ephemeron's value this might be confusing. The easier to
understand conceptual model for Ephemerons might be a pair of
(WeakReference<K>, WeakReference<V>) where the key has a virtual
strong reference to the value. And this is what we get if we say that
reachability of the value follows reachability of the key.
For a correctly specified behavior, I think all strengths (from
strong down) need to be affected by key/value Ephemeron
relationships, but without adding an "ephemerally reachable"
strength. E.g. I think you fundamentally need something like this:
- "An object is <em>strongly reachable</em> if it can be reached by
(a) some thread without traversing any reference objects, or by (b)
traversing the value of an Ephemeron whose key is strongly
reachable. A newly-created object is strongly reachable by the
thread that created it"
- "An object is <em>softly reachable</em> if it is not
strongly reachable but can be reached by (a) traversing a soft
reference or by (b) traversing the value of an Ephemeron whose key
is softly reachable.
- "An object is <em>weakly reachable</em> if it is neither strongly
nor softly reachable but can be reached by (a) traversing a weak
reference or by (b) traversing the value of an ephemeron whose key
is weakly reachable.
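
The three definitions above can be checked on small examples with a toy model. The following is a minimal sketch, not collector code: it computes strengths by fixpoint iteration over a hand-built graph of named nodes, and every name in it (ReachabilityModel, root, strongField, softRef, weakRef, ephemeron, solve, of) is made up for illustration. It assumes that a value reached through an ephemeron gets the weaker of the ephemeron's own strength and its key's strength, and that the ephemeron refers to its key like a weak reference.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ReachabilityModel {
    /** Ordered from strongest to weakest. */
    enum Strength { STRONG, SOFT, WEAK, UNREACHABLE }

    private static final class Edge {
        final String from, to;
        final Strength cap; // strongest strength this edge can pass on
        Edge(String from, String to, Strength cap) { this.from = from; this.to = to; this.cap = cap; }
    }

    private static final class Eph {
        final String ephemeron, key, value;
        Eph(String ephemeron, String key, String value) { this.ephemeron = ephemeron; this.key = key; this.value = value; }
    }

    private final Map<String, Strength> strength = new HashMap<>();
    private final List<Edge> edges = new ArrayList<>();
    private final List<Eph> ephemerons = new ArrayList<>();

    void root(String node) { strength.put(node, Strength.STRONG); }
    void strongField(String from, String to) { edges.add(new Edge(from, to, Strength.STRONG)); }
    void softRef(String from, String to) { edges.add(new Edge(from, to, Strength.SOFT)); }
    void weakRef(String from, String to) { edges.add(new Edge(from, to, Strength.WEAK)); }
    void ephemeron(String ephemeron, String key, String value) { ephemerons.add(new Eph(ephemeron, key, value)); }

    Strength of(String node) { return strength.getOrDefault(node, Strength.UNREACHABLE); }

    private static Strength weaker(Strength a, Strength b) {
        return a.ordinal() >= b.ordinal() ? a : b;
    }

    /** Records the candidate only if it is stronger than what is known so far. */
    private boolean improve(String node, Strength candidate) {
        if (candidate.ordinal() < of(node).ordinal()) {
            strength.put(node, candidate);
            return true;
        }
        return false;
    }

    /** Iterates until no node's strength can be improved any further. */
    void solve() {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Edge e : edges) {
                changed |= improve(e.to, weaker(of(e.from), e.cap));
            }
            for (Eph p : ephemerons) {
                Strength viaEphemeron = of(p.ephemeron);
                // The key is referenced like a weak referent.
                changed |= improve(p.key, weaker(viaEphemeron, Strength.WEAK));
                // The value follows the reachability of the key.
                changed |= improve(p.value, weaker(viaEphemeron, of(p.key)));
            }
        }
    }

    public static void main(String[] args) {
        ReachabilityModel m = new ReachabilityModel();
        m.root("root");
        m.strongField("root", "E");
        m.strongField("root", "K");
        m.ephemeron("E", "K", "V");
        m.solve();
        System.out.println("K=" + m.of("K") + " V=" + m.of("V"));
    }
}
```

In this model a value that strongly refers to its own key does not keep the key alive: if K is reachable only through E and through V, both K and V come out as weakly reachable.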
...and that's where we stop, because when we make Ephemeron just a
special kind of WeakReference, the next thing that happens is:
* <p> Suppose that the garbage collector determines at a certain point in time
* that an object is <a href="package-summary.html#reachability">weakly
* reachable</a>. At that time it will atomically clear all weak references to
* that object and all weak references to any other weakly-reachable objects
* from which that object is reachable through a chain of strong and soft
* references. At the same time it will declare all of the formerly
* weakly-reachable objects to be finalizable. At the same time or at some
* later time it will enqueue those newly-cleared weak references that are
* registered with reference queues.
...where "clearing of the WeakReference" means resetting the key *and*
value to null in case it is an Ephemeron; and
"all weak references to some object" means Ephemerons that have that
object as a key (but not those that only have it as a value!) in case
of ephemerons
...
I still think that Ephemeron<K, V> should extend WeakReference<K>,
since that places already established rules and expectation on (a)
when it will be enqueued, (b) when the collector will clear it (when
the collector encounters the <K> key being weakly reachable),
and (c) that clearing of all Ephemeron *and* WeakReference instances
that share an identical key value is done atomically, along with (d)
all weak references to any other weakly-reachable objects from
which that object is reachable through a chain of strong and soft
references. These last (c, d) parts are critically captured since an
Ephemeron *is a* WeakReference, and the statement in WeakReference
that says that "… it will atomically clear all weak references to
that object and all weak references to any other weakly-reachable
objects from which that object is reachable through a chain of
strong and soft references." has a clear application.
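
The API shape this implies can be sketched as a plain subclass. This is a hypothetical illustration only, and EphemeronSketch is a made-up name, not the prototype's class. Without collector support the value field below is an ordinary strong reference, so this class does not give ephemeron reachability semantics, and the collector clears references without calling clear(). It shows only the inheritance from WeakReference<K>, the getKey()/getValue() accessors, and the key and value being cleared together when clear() is called by the program.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

final class EphemeronSketch<K, V> extends WeakReference<K> {
    // Assumption for illustration: a real ephemeron needs GC support so that this
    // field is only as strong as the key; here it is an ordinary strong field.
    private V value;

    EphemeronSketch(K key, V value) {
        super(key);
        this.value = value;
    }

    EphemeronSketch(K key, V value, ReferenceQueue<? super K> queue) {
        super(key, queue);
        this.value = value;
    }

    /** The ephemeron's referent is also called the key. */
    K getKey() {
        return get();
    }

    /** Returns the value only while the key has not been cleared. */
    synchronized V getValue() {
        return get() == null ? null : value;
    }

    /** Program-initiated clear: key and value are reset to null together. */
    @Override
    public synchronized void clear() {
        super.clear();
        value = null;
    }

    public static void main(String[] args) {
        Object key = new Object();
        EphemeronSketch<Object, String> e = new EphemeronSketch<>(key, "payload");
        System.out.println("value while key is set: " + e.getValue());
        e.clear();
        System.out.println("key cleared: " + (e.getKey() == null)
                + ", value cleared: " + (e.getValue() == null));
    }
}
```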
Here are some suggested edits to the JavaDoc to go with this
suggested spec'ed behavior:
/**
* Ephemeron<K, V> objects are a special kind of WeakReference<K> objects, which
* hold two referents (a key referent and a value referent) and do not prevent their
* referents from being made finalizable, finalized, and then reclaimed.
* In addition to the key referent, which adheres to the referent behavior of a
* WeakReference<K>, an ephemeron also holds a value referent whose reachability
* strength is affected by the reachability strength of the key referent:
* The value referent of an Ephemeron instance is considered:
* (a) strongly reachable if the key referent of the same Ephemeron
* object is strongly reachable, or if the value referent is otherwise strongly reachable.
* (b) softly reachable if it is not strongly reachable, and (i) the key referent of
* the same Ephemeron object is softly reachable, or (ii) if the value referent is otherwise
* softly reachable.
* (c) weakly reachable if it is not strongly or softly reachable, and (i) the key referent of
* the same Ephemeron object is weakly reachable, or (ii) if the value referent is otherwise
* weakly reachable.
* <p> When the collector clears an Ephemeron object instance (according to the rules
* expressed for clearing WeakReference object instances), the Ephemeron instance's
* key referent and value referent are simultaneously and atomically cleared.
* <p> By convenience, the Ephemeron's referent is also called the key, and can be
* obtained either by invoking {@link #get} or {@link #getKey} while the value
* can be obtained by invoking the {@link #getValue} method.
*...
Thanks, this is very nice. I do like this behavior more.
Let me see what it takes to implement this strategy...
Regards, Peter