Hi Severin,

The fix looks good in general.
I'm testing both fixes together at the moment.

A couple of questions...

It seems, there are more places where an invokerLock critical section is missed.
The following functions:
  - invoker_enableInvokeRequests
  - invokeConstructor
  - invokeStatic
  - invokeNonvirtual
  - invokeVirtual
  - saveGlobalRef

The first function is easy to fix.
The last 5 functions are called from the invoker_doInvoke() that we already had a problem with. I'm puzzled with the question: How to synchronize and avoid deadlocks at the same time?

I'm inclined to let your fix go (if the testing is Ok) and file one more bug on the remaining sync issues.

Thanks,
Serguei


On 4/28/16 02:00, Severin Gehwolf wrote:
On Tue, 2016-04-19 at 19:32 -0700, serguei.spit...@oracle.com wrote:
Hi Severin,

I postpone a push for this fix.

There are two nsk.jdi test failures (they look like deadlocks):
   nsk/jdi/ObjectReference/invokeMethod/invokemethod012 FAIL(TIMEOUT)
   nsk/jdi/Scenarios/invokeMethod/popframes001 FAIL(TIMEOUT)

and one jtreg sun/com/jdi failure (it looks like a deadlock too):
   com/sun/jdi/InvokeHangTes.java


I'll double check if these failures are regressions caused by your fix
or not.
I confirm, the failures above are new regressions introduced by the fix.
The tests fail consistently with the fix and do not fail without the fix.
OK this was caused by the locking done in invoker_doInvoke(). Note that
holding either of them invoker lock or event handler lock causes this.

Here is a new webrev with those hunks removed. It's sufficient to grab
the locks again in invoke_completeInvokeRequest() when clearing the
global references in order to not get those failures we've seen when
the fix for JDK-4858370 was in place.

http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8153711/webrev.02/

Testing done:
  - com/sun/jdi/InvokeTest.java com/sun/jdi/InvokeHangTest.java and
    sun/jdi/InterfaceMethodsTest.java does not fail in 1500 runs.
  - regular com/sun/jdi test set: no regressions

Note that there are some rare cases where OomDebugTest times out which
seems to be caused by the GC, though. See the bug for details. Since
this problem is rare, it's still worthwhile having that test included.
If it turns out a problem in practice we could consider disabling the
test.

Thoughts?

Cheers,
Severin

Reply via email to