I will move the logging statement up.

Leaving the "docontinue" in the synchronized block
will guarantee the debugger suspends the debuggee
main thread without any chance of interference
with logging commands.

I'm sure this test was left with verbose logging enabled in the
hopes of catching more details about the failing sequence of
events. In this case it was the excessive logging that was causing
the test to hang.

Now that the test can run reliably, I think we should turn off
the verbose logging.

On 2/28/19, 4:44 PM, Alex Menkov wrote:
Hi Gary,

Now "mainThread is out of: synchronized (lockingObject) {" record is logged at the end of the scenario and that makes it useless.

I think it would be better to call
pipe.println("docontinue") after logging of this message:

  log1("mainThread is out of: synchronized (lockingObject) {");
+  pipe.println("docontinue");
+  // wait until the scenario is completed to avoid deadlocks on logging
+  synchronized (Threadissuspended002a.waitnotifyObj) {
+    Threadissuspended002a.waitnotifyObj.wait();
+  }

--alex

On 02/27/2019 02:29, gary.ad...@oracle.com wrote:
Revised webrev : http://cr.openjdk.java.net/~gadams/4903717/webrev.02/

To prevent the main thread from holding the logging lock when
the debugger suspends the debuggee, an explicit wait/notify is used
so the test thread can safely issue log messages while the main thread
remains suspended.

Testing is in progress.

On 2/26/19 3:49 PM, gary.ad...@oracle.com wrote:
New fix will be coming tomorrow...

I dumped the main and thread2 stacks at the
point of the timeout and found they were
both contending for the java.io.PrintStream
writeln() synchronization block.

While writing log messages main thread held the
lock and was suspended and thread2 was blocked
trying to report it's progress. So thread2 never reaches the
breakpoint.

java.nio.Buffer:292 in thread instance of java.lang.Thread(name='main', id=1) java.nio.CharBuffer:1260 in thread instance of java.lang.Thread(name='main', id=1) java.nio.CharBuffer:265 in thread instance of java.lang.Thread(name='main', id=1) java.nio.Buffer:223 in thread instance of java.lang.Thread(name='main', id=1) java.nio.CharBuffer:284 in thread instance of java.lang.Thread(name='main', id=1) java.nio.HeapCharBuffer:77 in thread instance of java.lang.Thread(name='main', id=1) java.nio.CharBuffer:396 in thread instance of java.lang.Thread(name='main', id=1) sun.nio.cs.StreamEncoder:280 in thread instance of java.lang.Thread(name='main', id=1) sun.nio.cs.StreamEncoder:125 in thread instance of java.lang.Thread(name='main', id=1) java.io.OutputStreamWriter:211 in thread instance of java.lang.Thread(name='main', id=1) java.io.BufferedWriter:120 in thread instance of java.lang.Thread(name='main', id=1) *java.io.PrintStream:653 in thread instance of java.lang.Thread(name='main', id=1)* java.io.PrintStream:958 in thread instance of java.lang.Thread(name='main', id=1) nsk.jdi.ThreadReference.isSuspended.issuspended002a:49 in thread instance of java.lang.Thread(name='main', id=1) nsk.jdi.ThreadReference.isSuspended.issuspended002a:124 in thread instance of java.lang.Thread(name='main', id=1)


*java.io.PrintStream:650 in thread instance of nsk.jdi.ThreadReference.isSuspended.Threadissuspended002a(name='testedThread', id=589)* java.io.PrintStream:958 in thread instance of nsk.jdi.ThreadReference.isSuspended.Threadissuspended002a(name='testedThread', id=589) nsk.jdi.ThreadReference.isSuspended.issuspended002a:53 in thread instance of nsk.jdi.ThreadReference.isSuspended.Threadissuspended002a(name='testedThread', id=589) nsk.jdi.ThreadReference.isSuspended.Threadissuspended002a:203 in thread instance of nsk.jdi.ThreadReference.isSuspended.Threadissuspended002a(name='testedThread', id=589)

On 2/22/19 11:21 PM, serguei.spit...@oracle.com wrote:
Hi Gary,

Thank you for cosmetic changes.

One thing to notice there is no guard against spurious wakeup for this:
   108 Threadissuspended002a.waitnotifyObj.wait();

But I doubt it causes these timeouts as they are very well reproducible.

This time I looked more to the debugger side.
The code is poorly structured, it is hard to understand what it is doing.

>   - resumed main thread after breakpoint enabled and thread2
     double resume

Could you, please, explain more how does this help and why timeouts are intermittent?

Thanks,
Serguei


On 2/22/19 11:52 AM, Gary Adams wrote:
After 1000 successful testruns, it seems like a reasonable solution is
to resume the main thread to resolve any locks holding back the
thread2 reaching the desired breakpoint.

Updated webrev: http://cr.openjdk.java.net/~gadams/4903717/webrev.01/

Changes in this iteration:
  - cosmetic changes
     - spaces around operators
     - removed extra spaces around parens
     - removed extra blank lines
     - fixed typo in ERRROR
     - fixed curl braces indenting
  - resumed main thread after breakpoint enabled and thread2
     double resume


On 2/22/19, 6:19 AM, gary.ad...@oracle.com wrote:
I'll make a pass over the code for cosmetic fixes after the test
is working correctly.

I think the purpose of the lockingObject is to ensure the debugger
has time to enable the breakpoint and perform the double resume
before debuggee thread2 is released to run to the expected breakpoint.

I'm assuming the timeout and break point location are correct, because
the test passes most of the time. I'll dump the values to make sure.

I'll also try to capture a thread dump at the time of the timeout,
to see if the debuggee is hung somewhere unexpected.

The test leaves the main thread suspended while thread2 is
is resumed. I'll see if resuming the main thread unblocks thread2.

...

On 2/21/19 10:18 PM, serguei.spit...@oracle.com wrote:
Hi Gary,

I do not see any cause of this timeout yet.
However, the synchronization on the lockingObject looks completely wasteful. Moreover, it adds unreasonable complexity and makes the test harder to understand.

So, I'd suggested to do a small cleanup that includes:
  - removing the lockingObject field and sync around it
  - fix wrong indents in the fragment:
  161         synchronized (waitnotifyObj)                    {
162 log("entered into block: synchronized (waitnotifyObj)");
  163             waitnotifyObj.notify();                     }

Thanks,
Serguei


On 2/21/19 10:57, gary.ad...@oracle.com wrote:
The commented out "docontinue" was done to match the style
used in the other tests. I'll remove it for this test.

In any event, the test still is failing 8 / 500 testruns on
macosx and windows debug builds.

The change is not sufficient.

Here's a recent failure. You can see in the log that the debugger did
wait until the new "docontinue" arrives outside the lockingObject
synchronized block.

It appears with the main thread suspended and the thread2 double resumed, it still does not continue executing up to the breakpoint in thread2.
After the eventQueue.remove(timeout), when the vm is resumed
you can see the debuggee messages expected leading up to the breakpoint.

...

--> debugger:       suspending VM with vm.suspend();
*debugee.stderr> **> mainThread: mainThread is out of: synchronized (lockingObject) {*
--> debugger:          thread2.suspendCount() == 2
--> debugger:      :   thread2.isSuspended()
--> debugger:          mainThread.suspendCount() == 1
--> debugger:      :   mainThread.isSuspended()
--> debugger:       double resuming thread2 with thread2.resume();
--> debugger:          thread2.suspendCount() == 1
--> debugger:      :   thread2.isSuspended()
--> debugger:      enabling breakpRequest1
--> debugger:          thread2.suspendCount() == 0
--> debugger: : !thread2.isAtBreakpoint() before call to breakpoint()
--> debugger:        waiting for BreakpointEvent
*--> debugger: new: eventSet = eventQueue.remove();*
# ERROR: ##> debugger: ERROR: timeout for waiting for a BreakpintEvent
The following stacktrace is for failure analysis.
nsk.share.TestFailure: ##> debugger: ERROR: timeout for waiting for a BreakpintEvent
    at nsk.share.Log.logExceptionForFailureAnalysis(Log.java:428)
    at nsk.share.Log.complain(Log.java:399)
at nsk.jdi.ThreadReference.isSuspended.issuspended002.log3(issuspended002.java:104) at nsk.jdi.ThreadReference.isSuspended.issuspended002.breakpoint(issuspended002.java:555) at nsk.jdi.ThreadReference.isSuspended.issuspended002.runThis(issuspended002.java:389) at nsk.jdi.ThreadReference.isSuspended.issuspended002.run(issuspended002.java:90) at nsk.jdi.ThreadReference.isSuspended.issuspended002.main(issuspended002.java:85) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at PropertyResolvingWrapper.main(PropertyResolvingWrapper.java:104) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
    at java.base/java.lang.Thread.run(Thread.java:835)
*--> debugger: resuming both second and main thread debugee.stderr> **> mainThread: waiting for an instruction from the debugger ... --> debugger: the end of testing debugee.stderr> **> thread2: entered into block: synchronized (lockingObject) debugee.stderr> **> thread2: exited from block: synchronized (lockingObject) debugee.stderr> **> thread2: call to the method 'runt1' debugee.stderr> **> thread2: method 'runt1' enter*
--> debugger:      : returned string is 'checkend'
debugee.stderr> **> mainThread: waiting for an instruction from the debugger ...


On 2/20/19 5:15 PM, Alex Menkov wrote:
+1

But could you please remove old "pipe.println("docontinue")" statement instead of commenting it out (no new webrev is required)

--alex

On 02/20/2019 09:58, Chris Plummer wrote:
Looks good.

Chris

On 2/20/19 7:57 AM, Gary Adams wrote:
The issuspended002 has been on the ProblemList since the tests were moved to the open repos. The proposed changeset applies the same fix that was used in issuspended001.

The current test fails when the debuggee replies with the "docontinue" response, while it is still holding a lock in a syncrhonized block. When the debugger then suspends the vm the debuggee test thread fails to proceed to the expected breakpoint.
Testing in progress.

  Issue: https://bugs.openjdk.java.net/browse/JDK-4903717
  Webrev: http://cr.openjdk.java.net/~gadams/4903717/webrev.00/











Reply via email to