Re: RFR: 8224252: [TESTBUG] hotspot/test/serviceability/sa/sadebugd/SADebugDTest.java is timing out again after fix for JDK-8163805
Thanks Serguei and Chris! Yasumasa 2019年5月25日(土) 5:35 Chris Plummer : > +1 > > On 5/24/19 8:35 AM, serguei.spit...@oracle.com wrote: > > Hi Yusamasa, > > Thank you for the clarifications! > It looks good to me. > > Thanks, > Serguei > > > On 5/24/19 03:00, Yasumasa Suenaga wrote: > > Hi Serguei, > > Other changes are included to improve reliability of testcase. > > DebugServer would show "Debugger attached ..." when attach operation is > completed. So I change its value. > Then the testcase should wait until debugd process is terminated normaly. > So I added waitFor() call. > > Null check of "app" would be done in stopApp(). So I remove it. > > > Thanks, > > Yasumasa > > 2019年5月24日(金) 18:44 serguei.spit...@oracle.com >: > >> Hi Yasumasa, >> >> I'm a little bit confused with this fix. >> If this test is failed on Windows only then the line: >> + * @requires os.family != "windows" >> >> prevents the test to be run on Windows (which looks Okay to me). >> >> Then why are there other fixes (excluding the comment with bug numbers)? >> Your summary below only tells about problem on Windows. >> What was motivation for these changes? >> >> Thanks, >> Serguei >> >> >> On 5/23/19 17:23, Yasumasa Suenaga wrote: >> >> Hi all, >> >> Please review this change. >> This webrev has passed all tests on submit repo >> (mach5-one-ysuenaga-JDK-8224252-3-20190523-0532-2647154). >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8224252 >> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8224252/webrev.00/ >> >> JDK-8163805 was suppose to fix a timeout issue with this test and also >> removed it off the problemlist. However, it still times out on >> windows-x64 about 1/3 of the time. It passes every time on all the >> other platforms. >> >> Root cause of this is `jhsdb debugd` could not be exited normally. >> >> `jhsdb debugd` (internally, DebugServer.java) would detach from >> debuggee at shutdown hook. So jhsdb process need to be exited >> normally. >> `jhsdb debugd` could not exit itself without external operation (e.g. >> CTRL+C, signals). Thus we use Process::destroy for it. >> In the case of Windows, Process::destroy calls TerminateProcess() >> Windows API to terminate process. However it would terminate target >> process immediately and it would not give a chance to kick shutdown >> hook. (It is documented on Javadoc of Runtime::addShutdownHook) >> >> >> Thanks, >> >> Yasumasa >> >> >> > >
Re: RFR (S) 8224673: Adjust permission for delayed starting of debugging
Hi Ralf, This will need a CSR request to change the permission. Thanks, David On 24/05/2019 10:05 pm, Schmelter, Ralf wrote: Please review this small change. It changes the permission needed to delayed start the debugging (when enabled with onjcmd=y) via a diagnostic mbean. webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8224673/webrev.0/ bugreport: https://bugs.openjdk.java.net/browse/JDK-8224673 Best regards, Ralf
Re: RFR: JDK-8218701: jdb tests failed with AGENT_ERROR_INVALID_THREAD
Hi Gary, On 5/24/19 5:17 AM, Gary Adams wrote: I have not tracked down the specific root cause of this failure, yet. It appears that the suspend is being attempted before the thread has been recorded in threadControl. I don't think this is possible. When the JVMTI event is received by the debug agent, that's when threadControl adds the thread. We are well beyond that point. The debug agent has processed the JVMTI event and created and queued a ReportEventCompositeCommand (recc) to pass the event on the the debugger (the queue is processed asynchronously by another thread). You are in the process of processing this recc, which involves suspending all threads before sending the event to the debugger. If the thread in the recc is invalid (no longer known to threadControl), then the only way I can see that happening is if the thread has terminated. ASFAIK, there is nothing preventing this from happening. Adding diagnostic messages is tricky because it changes the timing. Here's a minimal probe to track threadControl addNode and clearThread transactions. See attached log.txt. diff --git a/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c b/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c --- a/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c +++ b/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c @@ -491,7 +491,9 @@ static void suspendWithInvokeEnabled(jbyte policy, jthread thread) { + /* if (threadControl_getInvokeRequest(thread) != NULL) { */ invoker_enableInvokeRequests(thread); + /* } */ if (policy == JDWP_SUSPEND_POLICY(ALL)) { (void)threadControl_suspendAll(); diff --git a/src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c b/src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c --- a/src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c +++ b/src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c @@ -285,6 +285,7 @@ static void addNode(ThreadList *list, ThreadNode *node) { + printf ("addNode %p \n", node->thread); node->next = NULL; node->prev = NULL; node->list = NULL; @@ -362,6 +363,7 @@ static void clearThread(JNIEnv *env, ThreadNode *node) { + printf("clearThread %p\n", node->thread); if (node->pendingStop != NULL) { tossGlobalRef(env, &(node->pendingStop)); } @@ -1646,6 +1648,8 @@ node = findThread(, thread); if (node != NULL) { request = >currentInvoke; + } else { + printf ("threadControl_getInvokeRequest %p\n", thread); } debugMonitorExit(threadLock); The AGENT_ERROR_INVALID_THREAD is reported when invoker_enableInvokeRequest does not find the thread. I added the print out in threadControl_getInvokeRequest when the thread is not found. You are printing the oop* instead of the oop. That's fine for the node->thread references in addNode and clearNode, since they should match up, but the "thread" reference in threadControl_getInvokeRequest() is probably a localref, so will not match up with anything printed by addNode or clearNode. You probably need to print the oop and hope there is no intervening gc. The workaround bypasses the error and falls through to the threadControl CommonSuspend path where runningThreads is complimented with an otherThreads mechanism to ensure threads that are not between start and end events will be properly resumed later on. This fix is probably fine, but I need to think about it some more. Would be best to first confirm what's going on though. thanks, Chris On 5/23/19, 1:23 PM, Chris Plummer wrote: Hi Gary, So a JVMTI event came in on a valid thread, got processed by the Debug Agent and enqueued to be sent to the Debugger. However, before it was actually sent, the thread became invalid. Am I understanding this issue correctly? thanks, Chris On 5/23/19 2:59 AM, gary.ad...@oracle.com wrote: This proposed workaround ensures that the delay between a suspend request passing through the jdwp command queue will not fail due to a no longer running thread. Webrev: http://cr.openjdk.java.net/~gadams/8218701/webrev/ Issue:
Re: RFR: 8224252: [TESTBUG] hotspot/test/serviceability/sa/sadebugd/SADebugDTest.java is timing out again after fix for JDK-8163805
+1 On 5/24/19 8:35 AM, serguei.spit...@oracle.com wrote: Hi Yusamasa, Thank you for the clarifications! It looks good to me. Thanks, Serguei On 5/24/19 03:00, Yasumasa Suenaga wrote: Hi Serguei, Other changes are included to improve reliability of testcase. DebugServer would show "Debugger attached ..." when attach operation is completed. So I change its value. Then the testcase should wait until debugd process is terminated normaly. So I added waitFor() call. Null check of "app" would be done in stopApp(). So I remove it. Thanks, Yasumasa 2019年5月24日(金) 18:44 serguei.spit...@oracle.com: Hi Yasumasa, I'm a little bit confused with this fix. If this test is failed on Windows only then the line: + * @requires os.family != "windows" prevents the test to be run on Windows (which looks Okay to me). Then why are there other fixes (excluding the comment with bug numbers)? Your summary below only tells about problem on Windows. What was motivation for these changes? Thanks, Serguei On 5/23/19 17:23, Yasumasa Suenaga wrote: Hi all, Please review this change. This webrev has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8224252-3-20190523-0532-2647154). JBS: https://bugs.openjdk.java.net/browse/JDK-8224252 webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8224252/webrev.00/ JDK-8163805 was suppose to fix a timeout issue with this test and also removed it off the problemlist. However, it still times out on windows-x64 about 1/3 of the time. It passes every time on all the other platforms. Root cause of this is `jhsdb debugd` could not be exited normally. `jhsdb debugd` (internally, DebugServer.java) would detach from debuggee at shutdown hook. So jhsdb process need to be exited normally. `jhsdb debugd` could not exit itself without external operation (e.g. CTRL+C, signals). Thus we use Process::destroy for it. In the case of Windows, Process::destroy calls TerminateProcess() Windows API to terminate process. However it would terminate target process immediately and it would not give a chance to kick shutdown hook. (It is documented on Javadoc of Runtime::addShutdownHook) Thanks, Yasumasa
Re: [12u] RFR: 8214545: sun/management/jmxremote/bootstrap tests hang in revokeall.exe on Windows
+1 --alex On 05/24/2019 02:55, serguei.spit...@oracle.com wrote: Hi Daniil, The fix has been applied cleanly. LGTM++ Thanks, Serguei On 5/23/19 18:22, Daniil Titov wrote: Please review the backport of this fix to JDK 12. The JDK 12 changes applied mostly smoothly, but one hunk in make/test/JtregNativeJdk.gmk didn't apply because of changed context lines. That's the only difference. Webrev: http://cr.openjdk.java.net/~dtitov/backports/jdk12u/8214545/webrev.01/ Bug: https://bugs.openjdk.java.net/browse/JDK-8214545 Main webrev: http://cr.openjdk.java.net/~dtitov/8214545/webrev.03 Testing: sun/management/jmxremote/bootstrap, jdk_svc, tier1, tier2 and tier3 tests succeeded. Thanks! --Daniil
Re: RFR (S) 8224673: Adjust permission for delayed starting of debugging
Hi Ralf, Looks good. Thanks, Serguei On 5/24/19 06:25, Langer, Christoph wrote: Hi Ralf, looks good. I think this is the better permission to use here. Best regards Christoph From: serviceability-dev On Behalf Of Schmelter, Ralf Sent: Freitag, 24. Mai 2019 14:05 To: serviceability-dev@openjdk.java.net Subject: [CAUTION] RFR (S) 8224673: Adjust permission for delayed starting of debugging Please review this small change. It changes the permission needed to delayed start the debugging (when enabled with _onjcmd_=y) via a diagnostic mbean. webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8224673/webrev.0/ bugreport: https://bugs.openjdk.java.net/browse/JDK-8224673 Best regards, Ralf
Re: RFR: 8224252: [TESTBUG] hotspot/test/serviceability/sa/sadebugd/SADebugDTest.java is timing out again after fix for JDK-8163805
Hi Yusamasa, Thank you for the clarifications! It looks good to me. Thanks, Serguei On 5/24/19 03:00, Yasumasa Suenaga wrote: Hi Serguei, Other changes are included to improve reliability of testcase. DebugServer would show "Debugger attached ..." when attach operation is completed. So I change its value. Then the testcase should wait until debugd process is terminated normaly. So I added waitFor() call. Null check of "app" would be done in stopApp(). So I remove it. Thanks, Yasumasa 2019年5月24日(金) 18:44 serguei.spit...@oracle.com: Hi Yasumasa, I'm a little bit confused with this fix. If this test is failed on Windows only then the line: + * @requires os.family != "windows" prevents the test to be run on Windows (which looks Okay to me). Then why are there other fixes (excluding the comment with bug numbers)? Your summary below only tells about problem on Windows. What was motivation for these changes? Thanks, Serguei On 5/23/19 17:23, Yasumasa Suenaga wrote: Hi all, Please review this change. This webrev has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8224252-3-20190523-0532-2647154). JBS: https://bugs.openjdk.java.net/browse/JDK-8224252 webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8224252/webrev.00/ JDK-8163805 was suppose to fix a timeout issue with this test and also removed it off the problemlist. However, it still times out on windows-x64 about 1/3 of the time. It passes every time on all the other platforms. Root cause of this is `jhsdb debugd` could not be exited normally. `jhsdb debugd` (internally, DebugServer.java) would detach from debuggee at shutdown hook. So jhsdb process need to be exited normally. `jhsdb debugd` could not exit itself without external operation (e.g. CTRL+C, signals). Thus we use Process::destroy for it. In the case of Windows, Process::destroy calls TerminateProcess() Windows API to terminate process. However it would terminate target process immediately and it would not give a chance to kick shutdown hook. (It is documented on Javadoc of Runtime::addShutdownHook) Thanks, Yasumasa
RE: RFR (S) 8224673: Adjust permission for delayed starting of debugging
Hi Ralf, looks good. I think this is the better permission to use here. Best regards Christoph From: serviceability-dev On Behalf Of Schmelter, Ralf Sent: Freitag, 24. Mai 2019 14:05 To: serviceability-dev@openjdk.java.net Subject: [CAUTION] RFR (S) 8224673: Adjust permission for delayed starting of debugging Please review this small change. It changes the permission needed to delayed start the debugging (when enabled with onjcmd=y) via a diagnostic mbean. webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8224673/webrev.0/ bugreport: https://bugs.openjdk.java.net/browse/JDK-8224673 Best regards, Ralf
Re: RFR: JDK-8218701: jdb tests failed with AGENT_ERROR_INVALID_THREAD
I have not tracked down the specific root cause of this failure, yet. It appears that the suspend is being attempted *before* the thread has been recorded in *threadControl*. Adding diagnostic messages is tricky because it changes the timing. Here's a minimal probe to track threadControl addNode and clearThread transactions. See attached log.txt. diff --git a/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c b/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c --- a/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c +++ b/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c @@ -491,7 +491,9 @@ static void suspendWithInvokeEnabled(jbyte policy, jthread thread) { + /*if (threadControl_getInvokeRequest(thread) != NULL) { */ invoker_enableInvokeRequests(thread); + /*} */ if (policy == JDWP_SUSPEND_POLICY(ALL)) { (void)threadControl_suspendAll(); diff --git a/src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c b/src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c --- a/src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c +++ b/src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c @@ -285,6 +285,7 @@ static void addNode(ThreadList *list, ThreadNode *node) { + printf ("addNode %p \n", node->thread); node->next = NULL; node->prev = NULL; node->list = NULL; @@ -362,6 +363,7 @@ static void clearThread(JNIEnv *env, ThreadNode *node) { + printf("clearThread %p\n", node->thread); if (node->pendingStop != NULL) { tossGlobalRef(env, &(node->pendingStop)); } @@ -1646,6 +1648,8 @@ node = findThread(, thread); if (node != NULL) { request = >currentInvoke; +} else { + printf ("threadControl_getInvokeRequest %p\n", thread); } debugMonitorExit(threadLock); The AGENT_ERROR_INVALID_THREAD is reported when invoker_enableInvokeRequest does not find the thread. I added the print out in threadControl_getInvokeRequest when the thread is not found. The workaround bypasses the error and falls through to the threadControl CommonSuspend path where runningThreads is complimented with an otherThreads mechanism to ensure threads that are not between start and end events will be properly resumed later on. On 5/23/19, 1:23 PM, Chris Plummer wrote: Hi Gary, So a JVMTI event came in on a valid thread, got processed by the Debug Agent and enqueued to be sent to the Debugger. However, before it was actually sent, the thread became invalid. Am I understanding this issue correctly? thanks, Chris On 5/23/19 2:59 AM, gary.ad...@oracle.com wrote: This proposed workaround ensures that the delay between a suspend request passing through the jdwp command queue will not fail due to a no longer running thread. Webrev: http://cr.openjdk.java.net/~gadams/8218701/webrev/ Issue: https://bugs.openjdk.java.net/browse/JDK-8218701 launcher > Starting jdb launching local debuggee Creating file for jdb stdout stream: .\\jdb.stdout Creating file for jdb session: .\\jdb.session Creating file for jdb stderr stream: .\\jdb.stderr Setting first breakpoint Sending command: stop in nsk.jdb.dump.dump002.dump002a.main reply[0]: Deferring breakpoint nsk.jdb.dump.dump002.dump002a.main. It will be set after the class is loaded. reply[1]: > Starting debuggee class Sending command: run reply[0]: run nsk.jdb.dump.dump002.dump002a reply[1]: Set uncaught java.lang.Throwable reply[2]: Set deferred uncaught java.lang.Throwable reply[3]: > reply[4]: VM Started: addNode 00D4466E3270 addNode 00D4466E3278 addNode 00D4466E3280 addNode 00D4466E3288 addNode 00D4466E3290 addNode 00D4466E3298 addNode 00D4466E32A0 addNode 00D4466E32A8 addNode 00D4466E32C8 addNode 00D4466E32D0 addNode 00D4466E32D0 Set deferred breakpoint nsk.jdb.dump.dump002.dump002a.main reply[5]: reply[6]: Breakpoint hit: "thread=main", nsk.jdb.dump.dump002.dump002a.main(), line=37 bci=0 reply[7]: 37 System.exit(dump002.JCK_STATUS_BASE + _dump002a.runIt(args, System.out)); reply[8]: reply[9]: main[1] Test cases starts. Sending command: stop in nsk.jdb.dump.dump002.dump002a.lastBreak reply[0]: Set breakpoint nsk.jdb.dump.dump002.dump002a.lastBreak reply[1]: main[1] Sending command: cont reply[0]: > reply[1]: Breakpoint hit: "thread=main", nsk.jdb.dump.dump002.dump002a.lastBreak(), line=40 bci=0 reply[2]: 40static void lastBreak () {} reply[3]: reply[4]: main[1] Sending command: dump nsk.jdb.dump.dump002.dump002a._dump002a reply[0]: nsk.jdb.dump.dump002.dump002a._dump002a = { reply[1]: _dump002a: instance of nsk.jdb.dump.dump002.dump002a(id=1382) reply[2]: iStatic: 0 reply[3]: iPrivate: 1 reply[4]: iProtect: 2 reply[5]: iPublic: 3 reply[6]: iFinal: 4 reply[7]: iTransient: 5 reply[8]: iVolatile: 6 reply[9]: iArray: instance of int[1] (id=1383) reply[10]: sStatic: "zero" reply[11]: sPrivate: "one" reply[12]: sProtected:
RFR (S) 8224673: Adjust permission for delayed starting of debugging
Please review this small change. It changes the permission needed to delayed start the debugging (when enabled with onjcmd=y) via a diagnostic mbean. webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8224673/webrev.0/ bugreport: https://bugs.openjdk.java.net/browse/JDK-8224673 Best regards, Ralf
Re: [12u] RFR: 8214545: sun/management/jmxremote/bootstrap tests hang in revokeall.exe on Windows
Hi Daniil, The fix has been applied cleanly. LGTM++ Thanks, Serguei On 5/23/19 18:22, Daniil Titov wrote: Please review the backport of this fix to JDK 12. The JDK 12 changes applied mostly smoothly, but one hunk in make/test/JtregNativeJdk.gmk didn't apply because of changed context lines. That's the only difference. Webrev: http://cr.openjdk.java.net/~dtitov/backports/jdk12u/8214545/webrev.01/ Bug: https://bugs.openjdk.java.net/browse/JDK-8214545 Main webrev: http://cr.openjdk.java.net/~dtitov/8214545/webrev.03 Testing: sun/management/jmxremote/bootstrap, jdk_svc, tier1, tier2 and tier3 tests succeeded. Thanks! --Daniil
Re: RFR: 8224252: [TESTBUG] hotspot/test/serviceability/sa/sadebugd/SADebugDTest.java is timing out again after fix for JDK-8163805
Hi Yasumasa, I'm a little bit confused with this fix. If this test is failed on Windows only then the line: + * @requires os.family != "windows" prevents the test to be run on Windows (which looks Okay to me). Then why are there other fixes (excluding the comment with bug numbers)? Your summary below only tells about problem on Windows. What was motivation for these changes? Thanks, Serguei On 5/23/19 17:23, Yasumasa Suenaga wrote: Hi all, Please review this change. This webrev has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8224252-3-20190523-0532-2647154). JBS: https://bugs.openjdk.java.net/browse/JDK-8224252 webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8224252/webrev.00/ JDK-8163805 was suppose to fix a timeout issue with this test and also removed it off the problemlist. However, it still times out on windows-x64 about 1/3 of the time. It passes every time on all the other platforms. Root cause of this is `jhsdb debugd` could not be exited normally. `jhsdb debugd` (internally, DebugServer.java) would detach from debuggee at shutdown hook. So jhsdb process need to be exited normally. `jhsdb debugd` could not exit itself without external operation (e.g. CTRL+C, signals). Thus we use Process::destroy for it. In the case of Windows, Process::destroy calls TerminateProcess() Windows API to terminate process. However it would terminate target process immediately and it would not give a chance to kick shutdown hook. (It is documented on Javadoc of Runtime::addShutdownHook) Thanks, Yasumasa