Re: 3-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)

[email protected] Mon, 03 Nov 2014 11:37:15 -0800

On 11/3/14 10:41 AM, Daniel D. Daugherty wrote:

On 10/31/14 3:07 PM, [email protected] wrote:


It is 3-rd round of review for:
https://bugs.openjdk.java.net/browse/JDK-6988950

New webrev:
http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/

Thumbs up on the code.


Thanks, Dan!

I have comment suggestions below...

src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c
    line 149: debugMonitorEnter(vmDeathLock);
        Perhaps this comment would help:
        /*
         * We grab the vmDeathLock here to prevent the cbVMDeath()
         * event handler from tearing things down while we're
         * asynchronously processing a command.
         */


Done


src/jdk.jdwp.agent/share/native/libjdwp/eventHandler.c
    line 71: jrawMonitorID vmDeathLock;
        A nice blurb about the new global lock's protocol would be
        good here. Something like:

        /*
         * Coordinates the cbVMDeath() event handler and the
         * debugLoop_run() thread.
         */


Done

    line 1236: debugMonitorEnter(vmDeathLock);
        Perhaps this comment would help:
        /*
         * We grab the vmDeathLock here to prevent the debugLoop_run()
         * thread from asynchronously dispatching another command.
         */


Done
I also think, it'd be enough to narrow the scope of synchronization
around the event_callback() call:

        /*
         * Coordinates the cbVMDeath() event handler and the
         * debugLoop_run() thread.
         */
        debugMonitorEnter(vmDeathLock);

        /* Only now should we actually process the VM death event */
        (void)memset(&info,0,sizeof(info));
        info.ei                 = EI_VM_DEATH;
        event_callback(env, &info);

        debugMonitorExit(vmDeathLock);

    This block caught my eye:

    1295     /*

1296 * The VM will die soon after the completion of thiscallback - we1297 * may need to do a final synchronization with thecommand loop to1298 * avoid the VM terminating with replying to the final(resume)

    1299      * command.
    1300      */
    1301     debugLoop_sync();

    The above comment implies that debugLoop_sync() does something to
    coordinate this code (cbVMDeath()) with the debugLoop code, but
    clearly something is incomplete. It's entirely possible that this
    debugLoop_sync() is solving a different problem that happens after
    the debugLoop has left its command processing loop and realizes
    that the VM is shutting down.


This block caught my eye too.
I agree, it looks incomplete.

The cbVMDeath() callback waits until a resume command finishes if it hasbeen started.

Not sure how useful it is.


Thanks!
Serguei

    Don't know for sure. I haven't been in this code for quite a while...

Dan
Summary

  For failing scenario, please, refer to the 1-st round RFR below.
I've found what is missed in the jdwp agent shutdown and decided toswitch from a workaround to a real fix.
  The agent VM_DEATH callback sets the gdata field: gdata->vmDead = 1.
  The agent debugLoop_run() has a guard against the VM shutdown:
  165             } else if (gdata->vmDead &&
  166              ((cmd->cmdSet) != JDWP_COMMAND_SET(VirtualMachine))) {
  167                 /* Protect the VM from calls while dead.
  168                  * VirtualMachine cmdSet quietly ignores some cmds
  169                  * after VM death, so, it sends it's own errors.
  170                  */
  171                 outStream_setError(&out, JDWP_ERROR(VM_DEAD));
However, the guard above does not help much if the VM_DEATH eventhappens in the middle of a command execution.
  There is a lack of synchronization here.
The fix introduces new lock (vmDeathLock) which does not allow toexecute the commands
  and the VM_DEATH event callback concurrently.
It should work well for any function that is used in implementationof the JDWP_COMMAND_SET(VirtualMachine) .
Testing:
  Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests


Thanks,
Serguei


On 10/29/14 6:05 PM, [email protected] wrote:
The updated webrev:
http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/
The changes are:
  - added a comment recommended by Staffan
- removed the ignore_wrong_phase() call from functionclassSignature()
The classSignature() function is called in 16 places.
Most of them do not tolerate the NULL in place of returned signatureand will crash.I'm not comfortable to fix all the occurrences now and suggest toreturn to thisissue after gaining experience with more failure cases that arestill expected.The failure with the classSignature() involved was observed onlyonce in the nightly
and should be extremely rare reproducible.
I'll file a placeholder bug if necessary.

Thanks,
Serguei

On 10/28/14 6:11 PM, [email protected] wrote:
Please, review the fix for:
https://bugs.openjdk.java.net/browse/JDK-6988950


Open webrev:
http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/
Summary:

   The failing scenario:
The debugger and the debuggee are well aware a VM shutdown hasbeen started in the target process.The debugger at this point is not expected to send anycommands to the JDWP agent.However, the JDI layer (debugger side) and the jdwp agent(debuggee side)
     are not in sync with the consumer layers.
One reason is because the test debugger does not invoke theJDI method VirtualMachine.dispose().Another reason is that the Debugger and the debuggee processesare uneasy to sync in general.
     As a result the following steps are possible:
       - The test debugger sends a 'quit' command to the test debuggee
       - The debuggee is normally exiting
- The jdwp backend reports (over the jdwp protocol) ananonymous class unload event- The JDI InternalEventHandler thread handles theClassUnloadEvent event- The InternalEventHandler wants to uncache the matchingreference type.If there is more than one class with the same host classsignature, it can't distinguish them,and so, deletes all references and re-retrieves them again(see tracing below):MY_TRACE: JDI:VirtualMachineImpl.retrieveClassesBySignature:sig=Ljava/lang/invoke/LambdaForm$DMH;- The jdwp backend debugLoop_run() gets the command from JDIand calls the functions
         classesForSignature() and classStatus() recursively.
- The classStatus() makes a call to the JVMTIGetClassStatus() and gets the JVMTI_ERROR_WRONG_PHASE- As a result the jdwp backend reports the JVMTI error tothe JDI, and so, the test fails
For details, see the analysis in bug report closed as a dup ofthe bug 6988950:
https://bugs.openjdk.java.net/browse/JDK-8024865
Some similar cases can be found in the two bug reports(6988950 and 8024865) describing this issue.
The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE erroras it is normal at the VM shutdown.The original jdwp backend implementation had a similarapproach for the raw monitor functions.Threy use the ignore_vm_death() to workaround theJVMTI_ERROR_WRONG_PHASE errors.
     For reference, please, see the file: src/share/back/util.c


Testing:
  Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests


Thanks,
Serguei

Re: 3-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)

Reply via email to