On Tue, 12 Oct 2021 08:03:22 GMT, Richard Reingruber <rr...@openjdk.org> wrote:

>> This change fixes deadlocks described in the JBS-bug by:
>> 
>> * Releasing `handlerLock` before waiting on `threadLock` in 
>> `blockOnDebuggerSuspend()`
>> 
>> * Notifying on `threadLock` in `threadControl_reset()`
>> 
>> Also the actions in handleAppResumeBreakpoint() are moved/deferred until
>> doPendingTasks() runs. This is necessary because an event handler must not
>> release handlerLock first and foremost because handlers are called while
>> iterating the handler chain for an event type which is protected by 
>> handlerLock
>> (see https://github.com/openjdk/jdk/pull/5805)
>> 
>> The first commit delays the cleanup actions after leaving the loop in
>> `debugLoop_run()`. It allows to reproduce the deadlock running the dispose003
>> test with the command
>> 
>> 
>> make run-test 
>> TEST=test/hotspot/jtreg/vmTestbase/nsk/jdi/VirtualMachine/dispose/dispose003
>> 
>> 
>> The second commit adds a new test that reproduces the deadlock when calling
>> threadControl_resumeThread() while a thread is waiting in
>> blockOnDebuggerSuspend().
>> 
>> The third commit contains the fix described above. With it the deadlocks
>> cannot be reproduced anymore.
>> 
>> The forth commit removes the diagnostic code introduced with the first 
>> commit again.
>> 
>> The fix passed
>> 
>> test/hotspot/jtreg/serviceability/jdwp
>> test/jdk/com/sun/jdi
>> test/hotspot/jtreg/vmTestbase/nsk/jdwp
>> test/hotspot/jtreg/vmTestbase/nsk/jdi
>
> Richard Reingruber has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Improve @summary section of test.

src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c line 2200:

> 2198:         /* trackAppResume() needs handlerLock */
> 2199:         debugMonitorExit(threadLock);
> 2200:         eventHandler_lock(); /* for proper lock order */

Is it still necessary for the handlerLock to be held when calling 
`blockOnDebuggerSuspend()` (presuming you don't also add the new handlerLock 
related code in it)? It seems that only the threadLock is needed so it can then 
wait on it.

The main thing you've done to fix this issue is defer the 
`blockOnDebuggerSuspend()` to be done after `event_callback()` has released the 
handlerLock, and to make it so `blockOnDebuggerSuspend()` does not block while 
holding the handlerLock, so is there really any reason to be grabbing it at all?

src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c line 2220:

> 2218:              * suspends a thread it will remain suspended.
> 2219:              */
> 2220:             trackAppResume(resumer);

`trackAppResume()` doesn't really need the handlerLock. However, it will grab 
it when calling `eventHandler_createInternalThreadOnly()`. Since you want to 
make sure it is grabbed before threadLock in order to preserve lock ordering, 
that complicates things if we decided not to grab the handlerLock in the code 
above. What I'm now thinking is all that is really needed is to hold the 
threadLock around the call to `blockOnDebuggerSuspend()`, or better yet just 
grab it from within `blockOnDebuggerSuspend()`. You probably don't need to do 
anything with handlerLock or threadLock inside of `doPendingTasks()`.

test/jdk/com/sun/jdi/ResumeAfterThreadResumeCallTest.java line 81:

> 79:         System.out.println();
> 80:         System.out.println("###(Target,"+ threadName +") " + m);
> 81:         System.out.println();

I'm not sure why you have the two extra `println()` calls. Seems to just add 
unneeded blank lines in the log file.

test/jdk/com/sun/jdi/ResumeAfterThreadResumeCallTest.java line 100:

> 98:         // "resumee" is suspended now because of the breakpoint
> 99:         // Calling Thread.resume() will block this thread.
> 100: 

no need for empty line here

test/jdk/com/sun/jdi/ResumeAfterThreadResumeCallTest.java line 159:

> 157:         ThreadReference resumee = bpe.thread();
> 158:         ObjectReference resumeeThreadObj = resumee.frame(1).thisObject();
> 159:         printStack(resumee);

As long as you're printing stacks, I think the stack of the main thread would 
be useful here, but you need to suspend it first. I don't think that interferes 
with the test.
``` 
        mainThread.suspend();
        printStack(mainThread);
        mainThread.resume();

test/jdk/com/sun/jdi/ResumeAfterThreadResumeCallTest.java line 165:

> 163:         setField(resumeeThreadObj, "reachedBreakpoint", 
> vm().mirrorOf(true));
> 164: 
> 165:         log("Sleeping 500ms shows that the main thread is blocked 
> calling Thread.resume() on \"resumee\" Thread.");

"shows" -> "so"

test/jdk/com/sun/jdi/ResumeAfterThreadResumeCallTest.java line 167:

> 165:         log("Sleeping 500ms shows that the main thread is blocked 
> calling Thread.resume() on \"resumee\" Thread.");
> 166:         Thread.sleep(500);
> 167:         log("After sleep.");

And after line 167 is also a good place to print the main thread's stack.

test/jdk/com/sun/jdi/ResumeAfterThreadResumeCallTest.java line 176:

> 174:             mainThreadReturnedFromResumeCall = ((PrimitiveValue) 
> v).booleanValue();
> 175:             if (!resumedResumee) {
> 176:                 // main thread should be still blocked.

"...should still be blocked"

test/jdk/com/sun/jdi/ResumeAfterThreadResumeCallTest.java line 222:

> 220:         System.out.println();
> 221:         System.out.println("###(Debugger) " + m);
> 222:         System.out.println();

Same here with the extra `printlin()` calls

-------------

PR: https://git.openjdk.java.net/jdk/pull/5849

Reply via email to