|
Hi Gary,
On 5/24/19 5:17 AM, Gary Adams wrote:
I have not tracked down the specific root cause of this failure,
yet.
It appears that the suspend is being attempted before the
thread has been
recorded in threadControl.
I don't think this is possible. When the JVMTI event is received by
the debug agent, that's when threadControl adds the thread. We are
well beyond that point. The debug agent has processed the JVMTI
event and created and queued a ReportEventCompositeCommand (recc) to
pass the event on the the debugger (the queue is processed
asynchronously by another thread). You are in the process of
processing this recc, which involves suspending all threads before
sending the event to the debugger. If the thread in the recc is
invalid (no longer known to threadControl), then the only way I can
see that happening is if the thread has terminated. ASFAIK, there is
nothing preventing this from happening.
Adding diagnostic messages is tricky because it changes the
timing.
Here's a minimal probe to track threadControl addNode and
clearThread
transactions. See attached log.txt.
diff --git
a/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c
b/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c
--- a/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c
+++ b/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c
@@ -491,7 +491,9 @@
static void
suspendWithInvokeEnabled(jbyte policy, jthread thread)
{
+ /* if (threadControl_getInvokeRequest(thread) != NULL) {
*/
invoker_enableInvokeRequests(thread);
+ /* } */
if (policy == JDWP_SUSPEND_POLICY(ALL)) {
(void)threadControl_suspendAll();
diff --git
a/src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c
b/src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c
--- a/src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c
+++ b/src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c
@@ -285,6 +285,7 @@
static void
addNode(ThreadList *list, ThreadNode *node)
{
+ printf ("addNode %p \n", node->thread);
node->next = NULL;
node->prev = NULL;
node->list = NULL;
@@ -362,6 +363,7 @@
static void
clearThread(JNIEnv *env, ThreadNode *node)
{
+ printf("clearThread %p\n", node->thread);
if (node->pendingStop != NULL) {
tossGlobalRef(env, &(node->pendingStop));
}
@@ -1646,6 +1648,8 @@
node = findThread(&runningThreads, thread);
if (node != NULL) {
request = &node->currentInvoke;
+ } else {
+ printf ("threadControl_getInvokeRequest %p\n", thread);
}
debugMonitorExit(threadLock);
The
AGENT_ERROR_INVALID_THREAD is reported when
invoker_enableInvokeRequest
does not find the thread. I added the print out in
threadControl_getInvokeRequest
when the thread is not found.
You are printing the oop* instead of the oop. That's fine for the
node->thread references in addNode and clearNode, since they
should match up, but the "thread" reference in
threadControl_getInvokeRequest() is probably a localref, so will not
match up with anything printed by addNode or clearNode. You probably
need to print the oop and hope there is no intervening gc.
The workaround bypasses the error and falls through to the
threadControl CommonSuspend
path where runningThreads is complimented with an otherThreads
mechanism to ensure
threads that are not between start and end events will be properly
resumed later on.
This fix is probably fine, but I need to think about it some more.
Would be best to first confirm what's going on though.
thanks,
Chris
On 5/23/19, 1:23 PM, Chris Plummer wrote:
Hi Gary,
So a JVMTI event came in on a valid thread, got processed by the
Debug Agent and enqueued to be sent to the Debugger. However,
before it was actually sent, the thread became invalid. Am I
understanding this issue correctly?
thanks,
Chris
On 5/23/19 2:59 AM, [email protected]
wrote:
This proposed workaround ensures that
the delay between a suspend request
passing through the jdwp command queue will not fail due to a
no longer
running thread.
Webrev: http://cr.openjdk.java.net/~gadams/8218701/webrev/
Issue: https://bugs.openjdk.java.net/browse/JDK-8218701
|