David,
On 17/11/2015 02:00, David Holmes wrote:
Another issue is that writing a rule for javac that is not
overpessimistic is also an intractable problem. The pattern
"atomic operation followed by method invocation to complete
status update" is too general to be the trigger of the annotation.
1) It would lead to an excessive number of methods being annotated,
and by consequence an over-sizing of the reserved area.
2) Another condition to hit the bug is that the stack space of
the callee method must be bigger than the caller method with
the atomic operation. This information depends heavily on runtime
information like HW, compilation policies (inlining) and execution
profile (to know which methods are going to be compiled first).
javac won't have access to these information to annotated methods.
Right - there is no way to know that the atomic operation followed by
the call should really be a single atomic operation that we can't
implement that way. Given many atomic operations have subsequent code
actions, the pattern would degenerate to simply involve the atomic
operation, and that wouldn't work well at all.
Further, it isn't always that pattern. Consider for example
ReentrantReadWriteLock.sync.tryRelease():
int nextc = getState() - releases;
boolean free = exclusiveCount(nextc) == 0;
if (free)
setExclusiveOwnerThread(null);
setState(nextc);
here we clear the owner first, but then may fail to set the new state.
Actually detecting where a stackoverflow can lead to inconsistent
state requires detailed scrutiny of the code. I would like to be able
to reason that setState requires no more stack than
setExclusiveOwnerThread, but I can't do that simply by code inspection.
Exactly. The inspection of method attribute in the class file doesn't
give reliable
information. It only provides a hint about the stack space required
when the method
is interpreted. The stack space required once the method has been JIT
compiled
is fully platform dependent, no way to reason with it at the Java source
level.
Perhaps we should simply start with j.u.c.locks as the initial
candidate
sets of classes and work out from there. The idiomatic Lock usage:
l.lock();
try {
// ...
} finally {
l.unlock();
}
epitomizes the critical-finally-block scenario (though the lock()
problem is much more subtle). AQS is too low-level to flag any specific
function I think; and the other synchronizers perhaps too high-level,
with fewer idiomatic use-cases that obviously benefit from this.
In that regard I don't agree with where Fred has currently placed the
annotation in ReentrantLock and AQS. The placements appear arbitrary at
first glance - though no doubt they are the result of a deep and
careful
analysis. But it is far from obvious why the annotation is placed on
NonfairSync.lock but FairSync.tryAcquire(), for example.
The annotation is used on methods with the problematic pattern
"atomic operation followed by method invocation".
Their execution could lead to data corruption if atomic operation
is executed successfully but the following method invocation
fails because of SOE.
In the NonFairSync class, this pattern is in the lock() method,
while the tryAcquire() is only a method invocation. So lock()
is annotated and tryAcquire() is not. Note that the method
invoked by try acquire is nonfairTryAcquire() which has the
problematic pattern and is annotated.
In the FairSync class, the situation is reversed: lock() is
just an method invocation and it is not annotated, and
tryAcquire() has the problematic pattern and is annotated.
I tried to put the annotation as close as possible to the
critical sections in order to minimize the size requirement
for the reserved area.
Okay I now understand the rule you were applying. But it isn't obvious
from the code. Further, there seem to be other places which could also
suffer serious problems. In AQS doReleaseShared() we have:
720 if (!h.compareAndSetWaitStatus(Node.SIGNAL, 0))
721 continue; // loop to recheck
cases
722 unparkSuccessor(h);
which could seemingly throw SOE without unparking the successor.
Any compound action that logically forms a "transaction" would need to
be immune to SOE.
I agree but describing all patterns that logically forms a"transaction"
is far
above the goal of this JEP. The reserved stack area is a mechanism that
can be used to mitigate effects of stack overflows when their occur in these
"transaction". It is obvious that there's to automatic way to annotate all
critical sections or "transactions". This is why I only annotated the code
relative to ReentranLocks (which was the initial bug) and delayed the
annotation of JDK (or j.u.c) critical sections to a future work. As you
noticed above, it is important to have a deep understanding of the code
in order to identify which sections of code should be annotated.
Note that the annotation doesn't make the code "immune" to SOE.
The SOE still happens, it just delayed until the end of the annotated
section in order to mitigate the damages.
I would be tempted to place them on all the public lock/unlock methods
of the Lock implementations, rather than on the internal AQS-derived
functions, and AQS itself.
I've tried that :-) The amount of code being executed with
the ReservedStackAccess privilege tends to increase very
quickly and I was concerned about keeping the size of the
reserved area small.
How much space does each level of calling need? I know that is hard to
answer but are we talking a few bytes, few kb, many kb?
The few sections I've investigated and dis-assembled were small, far
less than a few kb. However, some sections I've tried to annotate
because they looked critical were accessed through different paths,
or with different entry points. My expertise on the java.util.concurrent
code is limited, and I was not confident I could put the annotation on
the more appropriate places.
If you feel confident you could make this analysis and annotation work,
I think it will be a very valuable addition to this JEP work.
Thanks,
Fred
I'd be looking at expanding the current application of the annotation
to cover all affected areas of AQS, AQLS, ReentrantLock and
ReeentrantReadWriteLock. StampedLock is a bit more problematic - there
it seems we do need to annotate the public API - but if we do it then
j.u.c.locks is at least covered.
Thanks,
David
--
Frederic Parain - Oracle
Grenoble Engineering Center - France
Phone: +33 4 76 18 81 17
Email: frederic.par...@oracle.com