On Wed, 12 Nov 2025 13:29:07 GMT, Anton Artemov <[email protected]> wrote:
>> Hi, please consider the following changes:
>>
>> If suspension is allowed when a thread is re-entering an object monitor
>> (OM), then a deadlock is possible. There are two places where it can happen:
>>
>> 1) The waiting thread is made to be a successor and is unparked. Upon a
>> suspension request, the thread will suspend itself whilst clearing the
>> successor. The OM will be left unlocked (not grabbed by any thread), while
>> the other threads are parked until a thread grabs the OM and then exits it.
>> The suspended thread is on the entry-list and can be selected as a successor
>> again. None of the other threads can be woken up to grab the OM until the
>> suspended thread has been resumed and successfully releases the OM.
>>
>> 2) The race between suspension and retry: the thread could reacquire the OM
>> and complete the wait() code in full, but then on return to Java it will be
>> suspended while holding the OM.
>>
>> The issues are addressed by not allowing suspension in case 1, and by
>> handling the suspension request at a later stage, after the thread has
>> grabbed the OM in `reenter_internal()` in case 2. In case of a suspension
>> request, the thread exits the OM and enters it again once resumed.
>>
>> The JVMTI `waited` event posting (2nd one) is postponed until the suspended
>> thread is resumed and has entered the OM again. The `enter` to the OM (in
>> case `ExitOnSuspend` did exit) is done without posting any events.
>>
>> Tests are added for both scenarios.
>>
>> Tested in tiers 1 - 7.
>
> Anton Artemov has updated the pull request with a new target base due to a
> merge or a rebase. The pull request now contains 20 commits:
>
> - Merge remote-tracking branch 'origin/master' into
>   JDK-8366659-OM-wait-suspend-deadlock
> - 8366659: Fixed lines in tests.
> - Merge remote-tracking branch 'origin/master' into
>   JDK-8366659-OM-wait-suspend-deadlock
> - 8366659: Added a comment to a boolean arg for enter()
> - Merge remote-tracking branch 'origin/master' into
>   JDK-8366659-OM-wait-suspend-deadlock
> - Merge remote-tracking branch 'origin/master' into
>   JDK-8366659-OM-wait-suspend-deadlock
> - 8366659: Fixed new lines.
> - Merge remote-tracking branch 'origin/master' into
>   JDK-8366659-OM-wait-suspend-deadlock
> - 8366659: Removed incorrect assert,
> - 8366659: Fixed merge conflict
> - ... and 10 more: https://git.openjdk.org/jdk/compare/400a83da...702880c6

The transaction diagram in SuspendWithObjectMonitorWait.java on L56 -> L77
is for the `doWork1` test, so the comment should be modified to make that
clear by adding this above L56:

    //
    // doWork1 algorithm:

I've created a transaction diagram for doWork2:

    //
    // doWork2 algorithm:
    //
    // main               waiter              resumer
    // =================  ==================  ===================
    // launch waiter
    // <launch returns>   waiter running
    // launch resumer     enter threadLock
    // <launch returns>   threadLock.wait()   resumer running
    // enter threadLock   :                   wait for notify
    // threadLock.notify  wait finishes       :
    // :                  reenter blocks      :
    // suspend waiter     <suspended>         :
    // <ready to test>    :                   :
    // :                  :                   :
    // notify resumer     :                   wait finishes
    // delay 1-second     :                   :
    // exit threadLock    :                   :
    // join resumer       :                   enter threadLock
    // :                  <resumed>           resume waiter
    // :                  :                   exit threadLock
    // :                  reenter threadLock  :
    // <join returns>     :                   resumer exits
    // join waiter        :
    // <join returns>     waiter exits
    //
    // Note: The sleep(1-second) in main along with the delayed exit
    //       of threadLock in main forces the resumer thread to reach
    //       "enter threadLock" and block. This difference from doWork1
    //       forces the resumer thread to be contending for threadLock
    //       while the waiter thread is in threadLock.wait() increasing
    //       stress on the monitor sub-system.
    //
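For readers without the test in front of them, the doWork2 handshake can be approximated in plain Java. This is only a rough sketch, not the code in SuspendWithObjectMonitorWait.java: the class name `DoWork2Sketch`, the latches, and the event strings are made up for illustration, the 1-second delay is shortened to 100 ms, and the suspend/resume of the waiter (done through a native agent in the real test, as far as I understand) is emulated here with a boolean flag guarded by threadLock. A waiter that wins the monitor while "suspended" simply waits again, which loosely mirrors "exits the OM and enters it again once resumed".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch of the doWork2 ordering; not the actual test code.
public class DoWork2Sketch {
    private final Object threadLock = new Object();
    private final List<String> events = new ArrayList<>(); // guarded by threadLock
    private final CountDownLatch waiterInLock = new CountDownLatch(1);
    private final CountDownLatch resumerGo = new CountDownLatch(1);
    private boolean notified;   // guarded by threadLock
    private boolean suspended;  // guarded by threadLock; stand-in for a real suspend

    public List<String> run() throws InterruptedException {
        Thread waiter = new Thread(() -> {
            synchronized (threadLock) {
                waiterInLock.countDown();
                try {
                    // Also covers spurious wakeups. While "suspended", give the
                    // monitor back instead of proceeding.
                    while (!notified || suspended) {
                        threadLock.wait();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                events.add("waiter: reentered threadLock");
            }
        }, "waiter");

        Thread resumer = new Thread(() -> {
            try {
                resumerGo.await(); // "wait for notify" in the diagram
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (threadLock) { // contends while main still holds the lock
                suspended = false;
                events.add("resumer: resume waiter");
                threadLock.notifyAll();
            }
        }, "resumer");

        waiter.start();
        resumer.start();
        waiterInLock.await();
        // Main can only get in here once the waiter has released the monitor
        // by calling wait().
        synchronized (threadLock) {
            notified = true;
            threadLock.notify();
            events.add("main: notify");
            suspended = true; // emulated suspend of the waiter
            events.add("main: suspend waiter");
            resumerGo.countDown(); // "notify resumer"
            Thread.sleep(100);     // delayed exit so the resumer blocks on enter
        }
        resumer.join();
        waiter.join();
        return events;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(new DoWork2Sketch().run());
    }
}
```

Because the flag is only cleared by the resumer while it holds threadLock, the waiter's re-entry is always recorded after the resume in this sketch. The real test gets that ordering from an actual thread suspension instead.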
I've created a transaction diagram for doWork3:

    //
    // doWork3 algorithm:
    //
    // main                 waiter                  resumer
    // ===================  ======================  ===================
    // launch waiter
    // <launch returns>     waiter running
    // launch resumer       enter threadLock
    // <launch returns>     while !READY_TO_NOTIFY  resumer running
    // delay 1-second       threadLock.wait(1)      wait for notify
    // enter threadLock     :                       :
    // set READY_TO_NOTIFY  :
    // threadLock.notify    wait finishes           :
    // :                    reenter blocks          :
    // suspend waiter       <suspended>             :
    // <ready to test>      :                       :
    // :                    :                       :
    // notify resumer       :                       wait finishes
    // delay 1-second       :                       :
    // exit threadLock      :                       :
    // join resumer         :                       enter threadLock
    // :                    <resumed>               resume waiter
    // :                    :                       exit threadLock
    // :                    reenter threadLock      :
    // <join returns>       :                       resumer exits
    // join waiter          :
    // <join returns>       waiter exits
    //
    // Note: The sleep(1-second) in main along with the delayed exit
    //       of threadLock in main forces the resumer thread to reach
    //       "enter threadLock" and block. This difference from doWork1
    //       forces the resumer thread to be contending for threadLock
    //       while the waiter thread is in the threadLock.wait(1) tight
    //       loop increasing stress on the monitor sub-system.
    //
    // Note: The first sleep(1-second) in main and the wait(1) in the waiter
    //       thread allow the waiter thread to loop tightly here:
    //           while !READY_TO_NOTIFY
    //               threadLock.wait(1)
    //

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27040#issuecomment-3529595262
PR Comment: https://git.openjdk.org/jdk/pull/27040#issuecomment-3529599987
PR Comment: https://git.openjdk.org/jdk/pull/27040#issuecomment-3529602353
