Hi Daniil,

On 11/8/18 3:12 PM, Daniil Titov wrote:
Hi Chris,

You are right. There are different cases here. Please find them listed below.

1. Test vmTestbase/nsk/jdi/stress/MonitorEvents/MonitorEvents002
        This test was actually fixed with JDK-8206086 changes. As far as I 
remember it was not removed from ProblemList-graal.txt due to some sporadic 
failures (not related to ObjectCollectedException), however with the latest  
builds these failures are no longer reproduced .
OK.

2. "NewInstance" tests 
(test/hotspot/jtreg/vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance001.java, 
test/hotspot/jtreg/vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance002.java, 
test/hotspot/jtreg/vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance003.java)

The typical failure for these tests Is  that the test creates a new array with 
arrayType.newInstance(arraylength) but this object becomes immediately GC'ed. 
Calling disableCollection() on the newly created instance doesn't help since by 
that time the object might be already GC'ed.


               com.sun.jdi.ObjectCollectedException at 
jdk.jdi/com.sun.tools.jdi.JDWPException.toJDIException(JDWPException.java:55)
        at 
jdk.jdi/com.sun.tools.jdi.ArrayReferenceImpl.getValues(ArrayReferenceImpl.java:126)
        at 
jdk.jdi/com.sun.tools.jdi.ArrayReferenceImpl.getValue(ArrayReferenceImpl.java:77)
        at 
nsk.jdi.ArrayType.newInstance.newinstance001.runThis(newinstance001.java:316)
        at 
nsk.jdi.ArrayType.newInstance.newinstance001.run(newinstance001.java:84)
        at 
nsk.jdi.ArrayType.newInstance.newinstance001.main(newinstance001.java:79)
        at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:567)
        at PropertyResolvingWrapper.main(PropertyResolvingWrapper.java:104)
        at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:567)
        at 
com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
        at java.base/java.lang.Thread.run(Thread.java:835)

The proposed fix for this case is to suspend VM while the body of the "for" 
loop (lines 150 - 356) is executed while ensuring that for line 151 ( line = 
pipe.printlln()) VM is temporary resumed ( line numbers are given for an original version 
of  test/hotspot/jtreg/vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance001.java)
147 //------------------------------------------------------ testing section
  148         log1("      TESTING BEGINS");
  149
  150         for (int i = 0; ; i++) {
  151         pipe.println("newcheck");
  152             line = pipe.readln();
  153
        <skipped>         
356 }
  357         log1("      TESTING ENDS");
  358
Ok. I get it now. It was confusing because there is a resume before the suspend, but I see now why it was done that way.  I assume you originally tried just doing the suspend outside of the loop, but found that caused problems for pipe.readln(), so you moved it to inside the loop and after the pipe.readln().

3. Test 
test/hotspot/jtreg/vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java
         There were 2 issues here:
                   a) The same issue as above.  On line 125 calling 
disableCollection ()  on newly created array object (arrayType.newInstance()) 
resulted in ObjectCollectedException was thrown
                        The fix here is to suspend VM while the testing code is 
executed.
           b) IndexOutOfBoundException  on line 130 
(checkDebugeeAnswer_instances(className, createInstanceCount + baseInstances)  
since some of base instances that were created before VM was suspended  become 
collected if GC was triggered before VM was suspended.
                The fix here is to call disableCollection() an all base 
instances and filter out ones that were already collected.
Ok.

4. Test 
test/hotspot/jtreg/vmTestbase/nsk/jdi/ReferenceType/instances/instances003/instances003.java
        The problem here was ObjectCollectedException thrown on line 141 while 
iterating over newly created instances  since these instances also include ones 
created by other java threads and some of them become GC'ed before 
disableCollection() is called on them. In this test the new instances are 
created by debuggee itself (the test sends the commands to debuggee to create 
or delete instances using pipe)  so here we cannot use suspend VM approach.
        The fix is to filter out already collected instances  ( these are the 
instances created outside of the test) and ensure that instances created by the 
test are not GC'ed before we call disableCollection() on them. For the latter 
the fix just extends the approach used by the test for WEAK references ( to ask 
debuggee to create a temporary strong reference and then delete them after we 
call disableCollection() ) to other reference types.
Ok.
        
5. Stress tests 
vmTestbase/nsk/jdi/stress/serial/heapwalking001/TestDescription.java, 
vmTestbase/nsk/jdi/stress/serial/heapwalking002/TestDescription.java, 
vmTestbase/nsk/jdi/stress/serial/mixed002/TestDescription.java
    Two problems here:
    1)  These tests use primitive arrays as test classes and these classes 
might be created by other java threads outside the test . Changes in 
useStrictCheck() fix this.
    2)  These tests include 
test/hotspot/jtreg/vmTestbase/nsk/jdi/ReferenceType/instances/instances003/instances003.java
 test that is fixed at 4 above
Ok.

Thanks for describing the separate issues. Can you make sure the CR is updated with this info?

thanks,

Chris


Thanks,
Daniil

On 11/8/18, 1:10 PM, "Chris Plummer" <chris.plum...@oracle.com> wrote:

     Hi Daniil,
Hi Daniil, I was wondering why instance003 did not include the same fix as
     instance002. In fact I was wondering why all the tests did not include
     this fix. It looks like you have now fixed instance003 the same way as
     instance002, but I'm unsure of how all the other tests are being fixed.
What is the fix now for MonitorEvents002, heapwalking001,
     heapwalking002, and mixed002. Are these all covered by the
     useStrictCheck() fix?
For the newinstance tests, I understand what you are saying about
     needing to resume the VM to read the response, but I'm not sure how this
     manifested itself as a problem before the fix and how it relates to the
     reported failures, since as far as I can tell you did not add a suspend,
     so it must have been in place already.
Maybe what's going on here is that there are actually 3 separate issues,
     which is making it hard to figure out what the failure mode is for each
     test, and how the changes are addressing it.
thanks, Chris On 11/7/18 5:11 PM, Daniil Titov wrote:
     > Hi Chris,
     >
     > Thank you for your comments.
     >
     > Excluding of  java.lang.String is not really required. It seems as it is 
sufficient just to expect that the object created by other threads and included in 
the result of referenceType.instances() and might be collected before we are able 
to call disableCollection() on them. The new version of the patch includes such 
change in  
vmTestbase/nsk/jdi/ReferenceType/instances/instances003/instances003.java .
>
     > The changes in useStrictCheck() are required since some tests use array 
of primitives aa a test classes ( e.g. boolean[] ) and objects of these classes 
might be created by other (e.g. JVMC compiler) threads.
     > The useStrictCheck()   defines how the number of instances the test 
created should be compared to the actual number of instance of this class in the 
target VM. If useStrictCheck() returns true these numbers should be equal, 
otherwise the test passes if the number of the instances the test created is no 
greater than the number of  instances found.
     >
     > The changes below are required since  we need to suspend the target VM 
while the test is executed (to ensure no GC is triggered). At the same time we 
need to ensure the target VM is temporary resumed when we read its output from the 
pipe. These tests 
(test/hotspot/jtreg/vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance00{1,2,3}.java)
 are written in a such way that the test code is executed as a body of the loop 
and on each iteration it reads a response from the target VM and then runs the 
tests, so on the first iteration vm.resume() is not required at the beginning of 
the loop. I added comments through the code and also updated existing comments per 
your suggestions.
     >
     >        153             if (i > 0) {
     >        154                 debuggee.resume();
     >        155             }
     >        156
     >        157             line = pipe.readln();
     >        158             debuggee.suspend();
     >
     >
     > Please review a new version of the fix with the changes mentioned above.
     >
     > Webrev: http://cr.openjdk.java.net/~dtitov/8203174/webrev.03/
     > Bug: https://bugs.openjdk.java.net/browse/JDK-8203174
     >
     > Thanks,
     > Daniil
     >
     > On 11/7/18, 11:33 AM, "Chris Plummer" <chris.plum...@oracle.com> wrote:
     >
     >      Hi Danil,
     >
     >      So this is the crux of the issue:
     >
     >        112         debuggee.suspend();
     >        113
     >        114         List<ObjectReference> baseReferences = new 
LinkedList<>();
     >        115         // We need to ensure that the base references count 
is not
     >      changed during the test.
     >        116         // The instances of the class could be already 
created by
     >      other (e.g. JVMCI) threads
     >        117         // and if GC was scheduled before VM was suspended 
some of
     >      these instances might
     >        118         // become collected.
     >        119         for (ObjectReference objRef : 
referenceType.instances(0)) {
     >        120             try {
     >        121                 objRef.disableCollection();
     >        122                 baseReferences.add(objRef);
     >        123             } catch (ObjectCollectedException e) {
     >        124                 // skip this reference
     >        125             }
     >        126         }
     >        127         baseInstances = baseReferences.size();
     >
     >      First it is possible for a GC to be triggered even if the debuggee 
is
     >      not executing any code because of JVMCI threads. So this is why
     >      debuggee.suspend() is needed. Second, even after calling
     >      debuggee.suspend(), it is still possible for a GC to happen since 
it may
     >      have been triggered before the debuggee.suspend() call, but not yet
     >      completed, and debuggee.suspend() does not prevent GC from 
completing in
     >      that case. So that is why you need to disableCollecion() on each 
object,
     >      and accept that some of them may have already been collected. 
Therefore
     >      baseInstances needs to be updated to only reflect the count of 
instances
     >      that have not been collected, and will not be collected because
     >      disableCollection() has been called on them. This all makes sense 
and
     >      seems fine.
     >
     >      I do think the comments could use some updating. The 
debuggee.suspend()
     >      call should be explained as being needed because there are 
potentially
     >      other non-test java threads allocating objects and triggering GC's,
     >      JVMCI being the main culprit here. The comment before the loop 
should
     >      focus on dealing with a GC that was triggered before the suspend,
     >      requiring disableCollection() be called each object returned by
     >      referenceType.instances() since it can potentially be collected
     >      otherwise. This code is not really related to the JVMCI threads.
     >
     >      I don't understand the reason for excluding java.lang.String from 
testing.
     >
     >      I don't understand the reason for the useStrictCheck() changes.
     >
     >      I don't understand what the following is fixing, and the impact on 
test
     >      execution that the resume() might have:
     >
     >        153             if (i > 0) {
     >        154                 debuggee.resume();
     >        155             }
     >        156
     >        157             line = pipe.readln();
     >        158             debuggee.suspend();
     >
     >
     >      thanks,
     >
     >      Chris
     >
     >      On 11/7/18 8:58 AM, Daniil Titov wrote:
     >      > Hi Chris and Serguei,
     >      >
     >      > With recent builds I can no longer see any GC triggered by C1 
compiler thread due to running out of metaspace and JDK-8193126 seems as solved this 
particular problem. Currently all observed GCs are triggered by JVMCI Compiler 
threads.
     >      >
     >      > Please review a new version of the fix that no longer keeps the 
main thread in the target VM running while other threads are suspended (since as Dean 
mentioned it might be not safe). Instead, the target VM is fully suspended when 
required and resumed afterwards. The new webrev also excludes java.lang.String class 
from the list of classes used by some of these tests.
     >      >
     >      > Webrev: 
http://cr.openjdk.java.net/~dtitov/8203174/webrev.02/index.html
     >      > Bug: https://bugs.openjdk.java.net/browse/JDK-8203174
     >      >
     >      > Thanks,
     >      > Daniil
     >      >
     >      > On 11/5/18, 10:46 AM, "Chris Plummer" <chris.plum...@oracle.com> 
wrote:
     >      >
     >      >      On 11/4/18 11:45 PM, serguei.spit...@oracle.com wrote:
     >      >      > On 11/4/18 23:40, serguei.spit...@oracle.com wrote:
     >      >      >> Hi Daniil,
     >      >      >>
     >      >      >> I agree with Chris below.
     >      >      >> Will tell more on reply to your reply.
     >      >      >>
     >      >      >> Thanks,
     >      >      >> Serguei
     >      >      >>
     >      >      >> On 11/2/18 20:59, Chris Plummer wrote:
     >      >      >>> Hi Daniil,
     >      >      >>>
     >      >      >>> I thought the issue was that C2 was doing metadata 
allocations, and
     >      >      >>> when it ran out of metaspace it did a GC (one that was 
not triggered
     >      >      >>> by a failed object
     >      >      >
     >      >      > Forgot to comment on the below.
     >      >      > It is probably a typo.
     >      >      > Should it be about the Graal, not the C2?
     >      >      > As I understand we have no issue with the C2.
     >      >      Actually it was the C1 Compiler Thread that was the problem, 
although it
     >      >      only turned up during Graal testing.
     >      >
     >      >      Chris
     >      >      >
     >      >      > Thanks,
     >      >      > Serguei
     >      >      >
     >      >      >
     >      >      >>> allocation). As a results we were getting objects GCs 
before the
     >      >      >>> test ever got the chance to disable collection on them. 
I thought we
     >      >      >>> also concluded other than this metaspace issue, if the 
test is
     >      >      >>> properly disabling collection on the objects it cared 
about, it
     >      >      >>> shouldn't matter if there are GC's triggered by 
excessive object
     >      >      >>> allocations.
     >      >      >>>
     >      >      >>> I don't think the following check will always be valid. 
It may be on
     >      >      >>> by default at some point:
     >      >      >>>
     >      >      >>>  651     public boolean isJVMCICompilerOn() {
     >      >      >>>  652         String opts = 
argumentHandler.getLaunchOptions();
     >      >      >>>  653         return opts.indexOf("-XX:+UseJVMCICompiler") 
>= 0;
     >      >      >>>  654     }
     >      >      >>>
     >      >      >>> thanks,
     >      >      >>>
     >      >      >>> Chris
     >      >      >>>
     >      >      >>> On 11/2/18 4:29 PM, Daniil Titov wrote:
     >      >      >>>> Please review the change that fixes several tests 
failing with
     >      >      >>>> com.sun.jdi.ObjectCollectedException when running with 
Graal.
     >      >      >>>>
     >      >      >>>> There main problem here is that with Graal on JVMCI 
Compiler
     >      >      >>>> threads in the target VM create a lot of objects and 
sporadically
     >      >      >>>> trigger GC that results in the objects created in the 
target VM in
     >      >      >>>> the tests being GCed before the tests complete. The 
other problem
     >      >      >>>> is that for some classes the tests use, e.g. 
"java.lang.String",
     >      >      >>>> there is already a huge number of instances created by 
JVMCI threads.
     >      >      >>>>
     >      >      >>>> The fix suspends target VM to prevent JVMCI compiler 
threads from
     >      >      >>>> creating new objects during the tests execution. In 
cases when
     >      >      >>>> IOPipe is used for communication between the test and 
the debuggee
     >      >      >>>> the fix suspends all threads but the main rather than 
suspending
     >      >      >>>> the VM. The fix also filters out the checks for some 
test classes
     >      >      >>>> (e.g. "java.lang.String") in cases when the target VM 
is run with
     >      >      >>>> JVMCI compiler options on.
     >      >      >>>>
     >      >      >>>> Webrev: 
http://cr.openjdk.java.net/~dtitov/8203174/webrev.01/
     >      >      >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203174
     >      >      >>>>
     >      >      >>>> Thanks,
     >      >      >>>> Daniil
     >      >      >>>>
     >      >      >>>>
     >      >      >>>>
     >      >      >>>
     >      >      >>>
     >      >      >>
     >      >      >
     >      >
     >      >
     >      >
     >      >
     >      >
     >
     >
     >
     >
     >



Reply via email to