On Thu, 18 Sep 2025 18:01:48 GMT, Chris Plummer <[email protected]> wrote:

> Fixed an issue with a race with two events coming in close to the same time, 
> the first of which does not suspend any debuggee threads. More details in the 
> first comment.
> 
> Tested by running all vmTestbase/nsk/jdi tests 25x times on all platforms 
> both with and w/o virtual threads. Also ran all tier5 svc tests.

The test is dealing with two events. The first event is a MethodExitEvent. The 
request for it uses SUSPEND_NONE, so the debuggee is not suspended when this 
event is generated. The second is a BreakpointEvent that is part of the 
"breakpoint for communication" support. It uses SUSPEND_ALL.

For the MethodExitEvent, the test uses EventHandler.waitForRequestedEvent(), 
which relies mostly on waitForRequestedEventCommon(). This is where the bug is. 
It sets up an EventHandler listener, and waits for the listener to be called 
for the MethodExitEvent.

                while (!isDisconnected() && en.set == null && timeLeft > 0) {
                    EventHandler.this.wait(timeLeft);
                    timeLeft = timeToFinish - System.currentTimeMillis();
                }

The listener will store the EventSet in en.set and the Event in en.event. The 
event comes in as expected and the listener does a notifyAll() to wake up the 
wait(). The problem is that before the wait() actually wakes up, the 
BreakpointEvent comes in. This is because the MethodExitEvent was delivered 
with SUSPEND_NONE, 
so the debuggee has continued on to the breakpoint. This means the listener 
gets called again, even though the MethodExitEvent was already delivered. The 
listener clears out the en.set field, and then sees that the BreakpointEvent is 
not the one that was requested, so it returns but leaves en.set set to null. At 
this point the wait() above returns. It does the "en.set == null" check, and 
falls back into another wait() call. This one never wakes up with a 
notifyAll(), but does time out after 5 minutes. There is no error reported when 
it times out even though en.set is still null. en.event is still properly set, 
and this is what waitForRequestedEvent() returns, so in the end the test 
passes, but only after the extra 5 minute delay.

The fix is pretty simple. In the EventHandler listener, if we already got the 
event we are looking for, then ignore any others that come in. 
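
To illustrate the difference, here is a minimal sketch of the listener logic 
before and after the change. It is a simplified model, not the actual 
EventHandler code: the class ListenerGuardSketch, the Holder type, and the use 
of strings in place of EventSet and Event are all assumptions made for the 
example, and the exact shape of the guard in the real fix may differ.

```java
// Hypothetical, simplified model of the listener; not the real nsk EventHandler.
class ListenerGuardSketch {
    /** Mirrors the "en" object in the text: the stored set and event. */
    static final class Holder {
        String set;    // stands in for the EventSet
        String event;  // stands in for the Event
    }

    /** Old behavior: every callback clears en.set before checking for a match. */
    static void buggyListener(Holder en, String requested, String incoming) {
        en.set = null;
        if (incoming.equals(requested)) {
            en.set = "set(" + incoming + ")";
            en.event = incoming;
        }
    }

    /** New behavior: once the requested event is stored, ignore later events. */
    static void fixedListener(Holder en, String requested, String incoming) {
        if (en.set != null) {
            return; // already got the event we were waiting for
        }
        if (incoming.equals(requested)) {
            en.set = "set(" + incoming + ")";
            en.event = incoming;
        }
    }

    public static void main(String[] args) {
        // The MethodExitEvent arrives, then the BreakpointEvent arrives
        // before the waiting thread has had a chance to look at en.set.
        Holder buggy = new Holder();
        buggyListener(buggy, "MethodExit", "MethodExit");
        buggyListener(buggy, "MethodExit", "Breakpoint");
        System.out.println("buggy: set=" + buggy.set + ", event=" + buggy.event);

        Holder fixed = new Holder();
        fixedListener(fixed, "MethodExit", "MethodExit");
        fixedListener(fixed, "MethodExit", "Breakpoint");
        System.out.println("fixed: set=" + fixed.set + ", event=" + fixed.event);
    }
}
```

In the buggy variant en.set ends up null after the second callback, which is 
what sends the waiting thread back into wait(). In the fixed variant en.set 
survives, so the "en.set == null" check in the wait loop fails and the loop 
exits.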

Note I also removed the synchronized(EventHandler.this) block from the listener. 
EventHandler.run() already synchronizes on the same object before calling the 
listener. I did the same in the listener being used for the "breakpoint for 
communication". I checked all other eventReceived() callbacks, and didn't find 
any others using this synchronization.
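
As a small illustration of why the inner lock adds nothing, the sketch below 
shows a callback that is invoked while its caller already holds the monitor. 
The class NestedLockSketch and its methods are hypothetical stand-ins for 
EventHandler.run() and the listener, not the real code.

```java
// Hypothetical model: a dispatcher that locks itself before invoking a callback.
class NestedLockSketch {
    boolean heldOnEntry;

    /** Stand-in for EventHandler.run(): takes the monitor, then calls the listener. */
    void dispatch() {
        synchronized (this) {
            listener();
        }
    }

    /** Stand-in for the listener: records whether the monitor is already held. */
    void listener() {
        heldOnEntry = Thread.holdsLock(this);
    }

    public static void main(String[] args) {
        NestedLockSketch handler = new NestedLockSketch();
        handler.dispatch();
        System.out.println("monitor held on entry: " + handler.heldOnEntry);
    }
}
```

Java monitors are reentrant, so a second synchronized block on the same object 
inside the listener would succeed immediately without providing any extra 
exclusion.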

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27370#issuecomment-3308873649
