Hi Daniil,
On 11/8/18 3:12 PM, Daniil Titov wrote:
Hi Chris,
You are right. There are different cases here. Please find them listed below.
1. Test vmTestbase/nsk/jdi/stress/MonitorEvents/MonitorEvents002
This test was actually fixed with JDK-8206086 changes. As far as I
remember it was not removed from ProblemList-graal.txt due to some sporadic
failures (not related to ObjectCollectedException), however with the latest
builds these failures are no longer reproduced .
OK.
2. "NewInstance" tests
(test/hotspot/jtreg/vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance001.java,
test/hotspot/jtreg/vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance002.java,
test/hotspot/jtreg/vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance003.java)
The typical failure for these tests Is that the test creates a new array with
arrayType.newInstance(arraylength) but this object becomes immediately GC'ed.
Calling disableCollection() on the newly created instance doesn't help since by
that time the object might be already GC'ed.
com.sun.jdi.ObjectCollectedException at
jdk.jdi/com.sun.tools.jdi.JDWPException.toJDIException(JDWPException.java:55)
at
jdk.jdi/com.sun.tools.jdi.ArrayReferenceImpl.getValues(ArrayReferenceImpl.java:126)
at
jdk.jdi/com.sun.tools.jdi.ArrayReferenceImpl.getValue(ArrayReferenceImpl.java:77)
at
nsk.jdi.ArrayType.newInstance.newinstance001.runThis(newinstance001.java:316)
at
nsk.jdi.ArrayType.newInstance.newinstance001.run(newinstance001.java:84)
at
nsk.jdi.ArrayType.newInstance.newinstance001.main(newinstance001.java:79)
at
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at PropertyResolvingWrapper.main(PropertyResolvingWrapper.java:104)
at
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at
com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
at java.base/java.lang.Thread.run(Thread.java:835)
The proposed fix for this case is to suspend VM while the body of the "for"
loop (lines 150 - 356) is executed while ensuring that for line 151 ( line =
pipe.printlln()) VM is temporary resumed ( line numbers are given for an original version
of test/hotspot/jtreg/vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance001.java)
147 //------------------------------------------------------ testing section
148 log1(" TESTING BEGINS");
149
150 for (int i = 0; ; i++) {
151 pipe.println("newcheck");
152 line = pipe.readln();
153
<skipped>
356 }
357 log1(" TESTING ENDS");
358
Ok. I get it now. It was confusing because there is a resume before the
suspend, but I see now why it was done that way. I assume you
originally tried just doing the suspend outside of the loop, but found
that caused problems for pipe.readln(), so you moved it to inside the
loop and after the pipe.readln().
3. Test
test/hotspot/jtreg/vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java
There were 2 issues here:
a) The same issue as above. On line 125 calling
disableCollection () on newly created array object (arrayType.newInstance())
resulted in ObjectCollectedException was thrown
The fix here is to suspend VM while the testing code is
executed.
b) IndexOutOfBoundException on line 130
(checkDebugeeAnswer_instances(className, createInstanceCount + baseInstances)
since some of base instances that were created before VM was suspended become
collected if GC was triggered before VM was suspended.
The fix here is to call disableCollection() an all base
instances and filter out ones that were already collected.
Ok.
4. Test
test/hotspot/jtreg/vmTestbase/nsk/jdi/ReferenceType/instances/instances003/instances003.java
The problem here was ObjectCollectedException thrown on line 141 while
iterating over newly created instances since these instances also include ones
created by other java threads and some of them become GC'ed before
disableCollection() is called on them. In this test the new instances are
created by debuggee itself (the test sends the commands to debuggee to create
or delete instances using pipe) so here we cannot use suspend VM approach.
The fix is to filter out already collected instances ( these are the
instances created outside of the test) and ensure that instances created by the
test are not GC'ed before we call disableCollection() on them. For the latter
the fix just extends the approach used by the test for WEAK references ( to ask
debuggee to create a temporary strong reference and then delete them after we
call disableCollection() ) to other reference types.
Ok.
5. Stress tests
vmTestbase/nsk/jdi/stress/serial/heapwalking001/TestDescription.java,
vmTestbase/nsk/jdi/stress/serial/heapwalking002/TestDescription.java,
vmTestbase/nsk/jdi/stress/serial/mixed002/TestDescription.java
Two problems here:
1) These tests use primitive arrays as test classes and these classes
might be created by other java threads outside the test . Changes in
useStrictCheck() fix this.
2) These tests include
test/hotspot/jtreg/vmTestbase/nsk/jdi/ReferenceType/instances/instances003/instances003.java
test that is fixed at 4 above
Ok.
Thanks for describing the separate issues. Can you make sure the CR is
updated with this info?
thanks,
Chris
Thanks,
Daniil
On 11/8/18, 1:10 PM, "Chris Plummer" <chris.plum...@oracle.com> wrote:
Hi Daniil,
Hi Daniil,
I was wondering why instance003 did not include the same fix as
instance002. In fact I was wondering why all the tests did not include
this fix. It looks like you have now fixed instance003 the same way as
instance002, but I'm unsure of how all the other tests are being fixed.
What is the fix now for MonitorEvents002, heapwalking001,
heapwalking002, and mixed002. Are these all covered by the
useStrictCheck() fix?
For the newinstance tests, I understand what you are saying about
needing to resume the VM to read the response, but I'm not sure how this
manifested itself as a problem before the fix and how it relates to the
reported failures, since as far as I can tell you did not add a suspend,
so it must have been in place already.
Maybe what's going on here is that there are actually 3 separate issues,
which is making it hard to figure out what the failure mode is for each
test, and how the changes are addressing it.
thanks,
Chris
On 11/7/18 5:11 PM, Daniil Titov wrote:
> Hi Chris,
>
> Thank you for your comments.
>
> Excluding of java.lang.String is not really required. It seems as it is
sufficient just to expect that the object created by other threads and included in
the result of referenceType.instances() and might be collected before we are able
to call disableCollection() on them. The new version of the patch includes such
change in
vmTestbase/nsk/jdi/ReferenceType/instances/instances003/instances003.java .
>
> The changes in useStrictCheck() are required since some tests use array
of primitives aa a test classes ( e.g. boolean[] ) and objects of these classes
might be created by other (e.g. JVMC compiler) threads.
> The useStrictCheck() defines how the number of instances the test
created should be compared to the actual number of instance of this class in the
target VM. If useStrictCheck() returns true these numbers should be equal,
otherwise the test passes if the number of the instances the test created is no
greater than the number of instances found.
>
> The changes below are required since we need to suspend the target VM
while the test is executed (to ensure no GC is triggered). At the same time we
need to ensure the target VM is temporary resumed when we read its output from the
pipe. These tests
(test/hotspot/jtreg/vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance00{1,2,3}.java)
are written in a such way that the test code is executed as a body of the loop
and on each iteration it reads a response from the target VM and then runs the
tests, so on the first iteration vm.resume() is not required at the beginning of
the loop. I added comments through the code and also updated existing comments per
your suggestions.
>
> 153 if (i > 0) {
> 154 debuggee.resume();
> 155 }
> 156
> 157 line = pipe.readln();
> 158 debuggee.suspend();
>
>
> Please review a new version of the fix with the changes mentioned above.
>
> Webrev: http://cr.openjdk.java.net/~dtitov/8203174/webrev.03/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8203174
>
> Thanks,
> Daniil
>
> On 11/7/18, 11:33 AM, "Chris Plummer" <chris.plum...@oracle.com> wrote:
>
> Hi Danil,
>
> So this is the crux of the issue:
>
> 112 debuggee.suspend();
> 113
> 114 List<ObjectReference> baseReferences = new
LinkedList<>();
> 115 // We need to ensure that the base references count
is not
> changed during the test.
> 116 // The instances of the class could be already
created by
> other (e.g. JVMCI) threads
> 117 // and if GC was scheduled before VM was suspended
some of
> these instances might
> 118 // become collected.
> 119 for (ObjectReference objRef :
referenceType.instances(0)) {
> 120 try {
> 121 objRef.disableCollection();
> 122 baseReferences.add(objRef);
> 123 } catch (ObjectCollectedException e) {
> 124 // skip this reference
> 125 }
> 126 }
> 127 baseInstances = baseReferences.size();
>
> First it is possible for a GC to be triggered even if the debuggee
is
> not executing any code because of JVMCI threads. So this is why
> debuggee.suspend() is needed. Second, even after calling
> debuggee.suspend(), it is still possible for a GC to happen since
it may
> have been triggered before the debuggee.suspend() call, but not yet
> completed, and debuggee.suspend() does not prevent GC from
completing in
> that case. So that is why you need to disableCollecion() on each
object,
> and accept that some of them may have already been collected.
Therefore
> baseInstances needs to be updated to only reflect the count of
instances
> that have not been collected, and will not be collected because
> disableCollection() has been called on them. This all makes sense
and
> seems fine.
>
> I do think the comments could use some updating. The
debuggee.suspend()
> call should be explained as being needed because there are
potentially
> other non-test java threads allocating objects and triggering GC's,
> JVMCI being the main culprit here. The comment before the loop
should
> focus on dealing with a GC that was triggered before the suspend,
> requiring disableCollection() be called each object returned by
> referenceType.instances() since it can potentially be collected
> otherwise. This code is not really related to the JVMCI threads.
>
> I don't understand the reason for excluding java.lang.String from
testing.
>
> I don't understand the reason for the useStrictCheck() changes.
>
> I don't understand what the following is fixing, and the impact on
test
> execution that the resume() might have:
>
> 153 if (i > 0) {
> 154 debuggee.resume();
> 155 }
> 156
> 157 line = pipe.readln();
> 158 debuggee.suspend();
>
>
> thanks,
>
> Chris
>
> On 11/7/18 8:58 AM, Daniil Titov wrote:
> > Hi Chris and Serguei,
> >
> > With recent builds I can no longer see any GC triggered by C1
compiler thread due to running out of metaspace and JDK-8193126 seems as solved this
particular problem. Currently all observed GCs are triggered by JVMCI Compiler
threads.
> >
> > Please review a new version of the fix that no longer keeps the
main thread in the target VM running while other threads are suspended (since as Dean
mentioned it might be not safe). Instead, the target VM is fully suspended when
required and resumed afterwards. The new webrev also excludes java.lang.String class
from the list of classes used by some of these tests.
> >
> > Webrev:
http://cr.openjdk.java.net/~dtitov/8203174/webrev.02/index.html
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8203174
> >
> > Thanks,
> > Daniil
> >
> > On 11/5/18, 10:46 AM, "Chris Plummer" <chris.plum...@oracle.com>
wrote:
> >
> > On 11/4/18 11:45 PM, serguei.spit...@oracle.com wrote:
> > > On 11/4/18 23:40, serguei.spit...@oracle.com wrote:
> > >> Hi Daniil,
> > >>
> > >> I agree with Chris below.
> > >> Will tell more on reply to your reply.
> > >>
> > >> Thanks,
> > >> Serguei
> > >>
> > >> On 11/2/18 20:59, Chris Plummer wrote:
> > >>> Hi Daniil,
> > >>>
> > >>> I thought the issue was that C2 was doing metadata
allocations, and
> > >>> when it ran out of metaspace it did a GC (one that was
not triggered
> > >>> by a failed object
> > >
> > > Forgot to comment on the below.
> > > It is probably a typo.
> > > Should it be about the Graal, not the C2?
> > > As I understand we have no issue with the C2.
> > Actually it was the C1 Compiler Thread that was the problem,
although it
> > only turned up during Graal testing.
> >
> > Chris
> > >
> > > Thanks,
> > > Serguei
> > >
> > >
> > >>> allocation). As a results we were getting objects GCs
before the
> > >>> test ever got the chance to disable collection on them.
I thought we
> > >>> also concluded other than this metaspace issue, if the
test is
> > >>> properly disabling collection on the objects it cared
about, it
> > >>> shouldn't matter if there are GC's triggered by
excessive object
> > >>> allocations.
> > >>>
> > >>> I don't think the following check will always be valid.
It may be on
> > >>> by default at some point:
> > >>>
> > >>> 651 public boolean isJVMCICompilerOn() {
> > >>> 652 String opts =
argumentHandler.getLaunchOptions();
> > >>> 653 return opts.indexOf("-XX:+UseJVMCICompiler")
>= 0;
> > >>> 654 }
> > >>>
> > >>> thanks,
> > >>>
> > >>> Chris
> > >>>
> > >>> On 11/2/18 4:29 PM, Daniil Titov wrote:
> > >>>> Please review the change that fixes several tests
failing with
> > >>>> com.sun.jdi.ObjectCollectedException when running with
Graal.
> > >>>>
> > >>>> There main problem here is that with Graal on JVMCI
Compiler
> > >>>> threads in the target VM create a lot of objects and
sporadically
> > >>>> trigger GC that results in the objects created in the
target VM in
> > >>>> the tests being GCed before the tests complete. The
other problem
> > >>>> is that for some classes the tests use, e.g.
"java.lang.String",
> > >>>> there is already a huge number of instances created by
JVMCI threads.
> > >>>>
> > >>>> The fix suspends target VM to prevent JVMCI compiler
threads from
> > >>>> creating new objects during the tests execution. In
cases when
> > >>>> IOPipe is used for communication between the test and
the debuggee
> > >>>> the fix suspends all threads but the main rather than
suspending
> > >>>> the VM. The fix also filters out the checks for some
test classes
> > >>>> (e.g. "java.lang.String") in cases when the target VM
is run with
> > >>>> JVMCI compiler options on.
> > >>>>
> > >>>> Webrev:
http://cr.openjdk.java.net/~dtitov/8203174/webrev.01/
> > >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203174
> > >>>>
> > >>>> Thanks,
> > >>>> Daniil
> > >>>>
> > >>>>
> > >>>>
> > >>>
> > >>>
> > >>
> > >
> >
> >
> >
> >
> >
>
>
>
>
>