On Fri, 30 Aug 2024 17:09:58 GMT, Leonid Mesnik <lmes...@openjdk.org> wrote:

>> Only allocate the cmdQueueLock once rather than allocate each time there is 
>> a new connection (and destroy when the connection is terminated).
>> 
>> Currently each time there is a new debug agent connection established, the 
>> cmdQueueLock is allocated. It is then destroyed when the connection is 
>> dropped, only to be recreated on the next connection. I looked closely at 
>> the use of this lock, and what happens when the connection is dropped, and I 
>> can't see any reason to create and then dispose of cmdQueueLock for each 
>> connection. Of the 18 debug agent raw monitors, this is the only one that 
>> gets destroyed and recreated. The other 17 stick around for the next 
>> connection. I'd like to get rid of this reallocation in order to support 
>> static initialization of all the debug agent raw monitors on startup. This 
>> will be done as part of the ranked monitor support 
>> ([JDK-8328866](https://bugs.openjdk.org/browse/JDK-8328866)).
>> 
>> As part of the testing for this change, I put an assert in place when we try 
>> to create cmdQueueLock for the 2nd time. I ran all our debugger tests, and 
>> unfortunately none of them failed. This means we don't currently test 
>> attaching to the debug agent more than once in any of our tests, so I did 
>> two things to better test out this change. The first was to write a test 
>> that attaches numerous times in sequence, and the 2nd was to manually attach 
>> and reattach using jdb. At the same time I stepped through the debug agent 
>> code in gdb just to help better understand the setup and use of 
>> cmdQueueLock. I didn't see anything that gave me concern about making this 
>> change.
>> 
>> Tested by running all jdi, jdwp, and jdb tests on all supported platforms.
>
> src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c line 90:
> 
>> 88:     /* We may be starting a new connection after an error */
>> 89:     cmdQueue = NULL;
>> 90:     if (cmdQueueLock == NULL) {
> 
> Doesn't it makes sense to check that lock is not locked here?
> Is there possibility that lock is left locked for disconnected process or any 
> other similar timeout. Probably it is not covered by our tests.

The normal status of the cmdQueueLock is for the Listener thread to be waiting 
on it. When the Reader thread gets a packet and queues it up, it then does a 
notify. Any holding of the lock is short lived. I don't see how it could still 
be locked at this point, and if it was, it should be quickly released. I typed 
up the following notes for myself a while back when figuring this all out. They 
may be of some help to you in understanding whe this should not be a problem.

=========================================

The debug agent has two threads involved in dealing with the connection to the 
debugger. The first is the JDWP Transport Listener thread. It is initially 
created to listen for the debugger's connection request. Once the request comes 
in the connection is setup and the Listener thread spawns off the JDWP Command 
Reader thread. The Reader thread is responsible for reading in command packets 
and queuing them up for the Listener thread to process.

The Reader thread is usually waiting for the next JDWP command, which it will 
enqueue when it arrives:

    while (shouldListen) {
        jint rc;

        rc = transport_receivePacket(&packet);
        ...
        enqueue(&packet);  <-- does a notify on cmdQueueLock
        ...
    }


The Listener thread sits in a loop dequeuing commands until it gets a 
VM.Dispose command, at which point it will exit the loop, destroy the 
cmdQueueLock, and then exit the thread. Usually while dequeuing it is blocked 
here waiting for the next command:


    debugMonitorEnter(cmdQueueLock);
    while (!transportError && (cmdQueue == NULL)) {
        debugMonitorWait(cmdQueueLock);
    }
    debugMonitorExit(cmdQueueLock);


It will process enqueued packets in this manner until it gets a VM.Dispose 
command.

The cmdQueueLock is created by the Listener thread just before spawning the 
Reader thread. It is destroyed after it exits as described above. If VM.Dispose 
is being used, things are pretty orderly, but we have to also consider abnormal 
connection termination where no VM.Dispose is done. In that case the the above 
transport_receivePacket() call will fail and the Reader thread calls the 
following code:


    debugMonitorEnter(cmdQueueLock);
    transportError = JNI_TRUE;
    debugMonitorNotify(cmdQueueLock);
    debugMonitorExit(cmdQueueLock);


The notify wakes up the Listener thread, which is in the debugMonitorWait() 
trying to dequeue the next command. The dequeue fails, which results in the 
Listener thread shutting down the connection. It will destroy the cmdQueueLock 
just like it does in the orderly shutdown.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20755#discussion_r1739396992

Reply via email to