On Thu, 15 Aug 2024 18:22:54 GMT, Viktor Klang <[email protected]> wrote:
> My suspicion is that Condition::await() throws before having successfully
> reacquired the lock, and this exception is swallowed because Lock::unlock()
> then throws when invoke with an IllegalMonitorStateException as the current
> thread was not reestablished as the owner.
>
> So this PR is intended to harden the reacquisition of await-ing methods in
> AQS and AQLS so that instead of throwing they spin-loop trying to reacquire
> the lockāat least a thread dump would show where the Thread is stuck trying
> to reacquire.
>
> There's also the option of attempting to log a JFR event on the first
> reacquisition failure, but that might need to be done with a pre-created
> event as the cause of the failing acquisition may be an OOME.
>
> I also needed to harden the TestDisposerRace itself, as it currently tries to
> provoke OOMEs but then proceeds to want to allocate objects used for the test
> itself. I'm not super-happy about the changes there, but they should be safer
> than before.
>
> The first RuntimeException/Error encountered will be rethrown upon success to
> facilitate debugging.
@DougLea @AlanBateman I'm currently trying to provoke the test to fail with
this new implementation.
@DougLea @AlanBateman This seems to fix TestDisposerRace.java
Ready for review
Closing this PR and opening a new one, to see if that helps.
Gah, so it had nothing to do with the commit message. There was, for some
reason, a tab in the JBS title which was copy-pasted in as the PR title.
src/java.base/share/classes/java/util/concurrent/locks/AbstractQueuedLongSynchronizer.java
line 286:
> 284: */
> 285: private final void reacquire(Node node, long arg) {
> 286: for (long nanos = 1L;;) {
@AlanBateman @DougLea If we're concerned about SOE due to adding another
stackframe by invoking reacquire we might consider ForceInline, but I was weary
about doing so.
src/java.base/share/classes/java/util/concurrent/locks/AbstractQueuedLongSynchronizer.java
line 290:
> 288: acquire(node, arg, false, false, false, 0L);
> 289: break;
> 290: } catch (Error | RuntimeException ex) {
@DougLea @AlanBateman The first question is if this set of exception are the
way to go, or we should go with Throwable
src/java.base/share/classes/java/util/concurrent/locks/AbstractQueuedLongSynchronizer.java
line 294:
> 292: if (first) {
> 293: first = false;
> 294: // Log JFR event?
@DougLea @AlanBateman The second question is whether we should declare a JFR
event for this use-case or not.
src/java.base/share/classes/java/util/concurrent/locks/AbstractQueuedLongSynchronizer.java
line 312:
> 310: }
> 311: }
> 312: }
@DougLea @AlanBateman So I've added such that the first exception/error is
rethrown after subsequent successful reacquisition. See above.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/20602#issuecomment-2292008244
PR Comment: https://git.openjdk.org/jdk/pull/20602#issuecomment-2293131396
PR Comment: https://git.openjdk.org/jdk/pull/20602#issuecomment-2296564889
PR Comment: https://git.openjdk.org/jdk/pull/20602#issuecomment-2325963172
PR Comment: https://git.openjdk.org/jdk/pull/20602#issuecomment-2326008569
PR Review Comment: https://git.openjdk.org/jdk/pull/20602#discussion_r1719777654
PR Review Comment: https://git.openjdk.org/jdk/pull/20602#discussion_r1718867228
PR Review Comment: https://git.openjdk.org/jdk/pull/20602#discussion_r1718867694
PR Review Comment: https://git.openjdk.org/jdk/pull/20602#discussion_r1731486424