Hi Chris,

I updated the bug with the info about all these separate issues. 

Webrev: http://cr.openjdk.java.net/~dtitov/8203174/webrev.03/
Bug: https://bugs.openjdk.java.net/browse/JDK-8203174

Thanks,
Daniil

On 11/8/18, 5:11 PM, "Chris Plummer" <chris.plum...@oracle.com> wrote:

    Hi Daniil,
    
    On 11/8/18 3:12 PM, Daniil Titov wrote:
    > Hi Chris,
    >
    > You are right. There are different cases here. Please find them listed 
below.
    >
    > 1. Test vmTestbase/nsk/jdi/stress/MonitorEvents/MonitorEvents002
    >   This test was actually fixed with JDK-8206086 changes. As far as I 
remember it was not removed from ProblemList-graal.txt due to some sporadic 
failures (not related to ObjectCollectedException), however with the latest  
builds these failures are no longer reproduced .
    OK.
    >
    > 2. "NewInstance" tests 
(test/hotspot/jtreg/vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance001.java,
 
test/hotspot/jtreg/vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance002.java,
 
test/hotspot/jtreg/vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance003.java)
    >
    > The typical failure for these tests Is  that the test creates a new array 
with arrayType.newInstance(arraylength) but this object becomes immediately 
GC'ed. Calling disableCollection() on the newly created instance doesn't help 
since by that time the object might be already GC'ed.
    >
    >
    >                com.sun.jdi.ObjectCollectedException at 
jdk.jdi/com.sun.tools.jdi.JDWPException.toJDIException(JDWPException.java:55)
    >   at 
jdk.jdi/com.sun.tools.jdi.ArrayReferenceImpl.getValues(ArrayReferenceImpl.java:126)
    >   at 
jdk.jdi/com.sun.tools.jdi.ArrayReferenceImpl.getValue(ArrayReferenceImpl.java:77)
    >   at 
nsk.jdi.ArrayType.newInstance.newinstance001.runThis(newinstance001.java:316)
    >   at 
nsk.jdi.ArrayType.newInstance.newinstance001.run(newinstance001.java:84)
    >   at 
nsk.jdi.ArrayType.newInstance.newinstance001.main(newinstance001.java:79)
    >   at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    >   at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    >   at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    >   at java.base/java.lang.reflect.Method.invoke(Method.java:567)
    >   at PropertyResolvingWrapper.main(PropertyResolvingWrapper.java:104)
    >   at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    >   at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    >   at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    >   at java.base/java.lang.reflect.Method.invoke(Method.java:567)
    >   at 
com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
    >   at java.base/java.lang.Thread.run(Thread.java:835)
    >
    > The proposed fix for this case is to suspend VM while the body of the 
"for" loop (lines 150 - 356) is executed while ensuring that for line 151 ( 
line = pipe.printlln()) VM is temporary resumed ( line numbers are given for an 
original version of  
test/hotspot/jtreg/vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance001.java)
    >   
    >   147     //------------------------------------------------------  
testing section
    >   148         log1("      TESTING BEGINS");
    >   149
    >   150         for (int i = 0; ; i++) {
    >   151         pipe.println("newcheck");
    >   152             line = pipe.readln();
    >   153
    >           <skipped>
    >   
    > 356         }
    >   357         log1("      TESTING ENDS");
    >   358
    Ok. I get it now. It was confusing because there is a resume before the 
    suspend, but I see now why it was done that way.  I assume you 
    originally tried just doing the suspend outside of the loop, but found 
    that caused problems for pipe.readln(), so you moved it to inside the 
    loop and after the pipe.readln().
    >
    > 3. Test 
test/hotspot/jtreg/vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java
    >    There were 2 issues here:
    >                    a) The same issue as above.  On line 125 calling 
disableCollection ()  on newly created array object (arrayType.newInstance()) 
resulted in ObjectCollectedException was thrown
    >                   The fix here is to suspend VM while the testing code is 
executed.
    >      b) IndexOutOfBoundException  on line 130 
(checkDebugeeAnswer_instances(className, createInstanceCount + baseInstances)  
since some of base instances that were created before VM was suspended  become 
collected if GC was triggered before VM was suspended.
    >           The fix here is to call disableCollection() an all base 
instances and filter out ones that were already collected.
    Ok.
    >
    > 4. Test 
test/hotspot/jtreg/vmTestbase/nsk/jdi/ReferenceType/instances/instances003/instances003.java
    >   The problem here was ObjectCollectedException thrown on line 141 while 
iterating over newly created instances  since these instances also include ones 
created by other java threads and some of them become GC'ed before 
disableCollection() is called on them. In this test the new instances are 
created by debuggee itself (the test sends the commands to debuggee to create 
or delete instances using pipe)  so here we cannot use suspend VM approach.
    >   The fix is to filter out already collected instances  ( these are the 
instances created outside of the test) and ensure that instances created by the 
test are not GC'ed before we call disableCollection() on them. For the latter 
the fix just extends the approach used by the test for WEAK references ( to ask 
debuggee to create a temporary strong reference and then delete them after we 
call disableCollection() ) to other reference types.
    Ok.
    >
    > 5. Stress tests 
vmTestbase/nsk/jdi/stress/serial/heapwalking001/TestDescription.java, 
vmTestbase/nsk/jdi/stress/serial/heapwalking002/TestDescription.java, 
vmTestbase/nsk/jdi/stress/serial/mixed002/TestDescription.java
    >     Two problems here:
    >     1)  These tests use primitive arrays as test classes and these 
classes might be created by other java threads outside the test . Changes in 
useStrictCheck() fix this.
    >     2)  These tests include 
test/hotspot/jtreg/vmTestbase/nsk/jdi/ReferenceType/instances/instances003/instances003.java
 test that is fixed at 4 above
    Ok.
    
    Thanks for describing the separate issues. Can you make sure the CR is 
    updated with this info?
    
    thanks,
    
    Chris
    >
    >
    > Thanks,
    > Daniil
    >
    > On 11/8/18, 1:10 PM, "Chris Plummer" <chris.plum...@oracle.com> wrote:
    >
    >      Hi Daniil,
    >      
    >      Hi Daniil,
    >      
    >      I was wondering why instance003 did not include the same fix as
    >      instance002. In fact I was wondering why all the tests did not 
include
    >      this fix. It looks like you have now fixed instance003 the same way 
as
    >      instance002, but I'm unsure of how all the other tests are being 
fixed.
    >      
    >      What is the fix now for MonitorEvents002, heapwalking001,
    >      heapwalking002, and mixed002. Are these all covered by the
    >      useStrictCheck() fix?
    >      
    >      For the newinstance tests, I understand what you are saying about
    >      needing to resume the VM to read the response, but I'm not sure how 
this
    >      manifested itself as a problem before the fix and how it relates to 
the
    >      reported failures, since as far as I can tell you did not add a 
suspend,
    >      so it must have been in place already.
    >      
    >      Maybe what's going on here is that there are actually 3 separate 
issues,
    >      which is making it hard to figure out what the failure mode is for 
each
    >      test, and how the changes are addressing it.
    >      
    >      thanks,
    >      
    >      Chris
    >      
    >      On 11/7/18 5:11 PM, Daniil Titov wrote:
    >      > Hi Chris,
    >      >
    >      > Thank you for your comments.
    >      >
    >      > Excluding of  java.lang.String is not really required. It seems as 
it is sufficient just to expect that the object created by other threads and 
included in the result of referenceType.instances() and might be collected 
before we are able to call disableCollection() on them. The new version of the 
patch includes such change in  
vmTestbase/nsk/jdi/ReferenceType/instances/instances003/instances003.java .
    >      
    >      
    >      
    >      
    >      >
    >      > The changes in useStrictCheck() are required since some tests use 
array of primitives aa a test classes ( e.g. boolean[] ) and objects of these 
classes might be created by other (e.g. JVMC compiler) threads.
    >      > The useStrictCheck()   defines how the number of instances the 
test created should be compared to the actual number of instance of this class 
in the target VM. If useStrictCheck() returns true these numbers should be 
equal, otherwise the test passes if the number of the instances the test 
created is no greater than the number of  instances found.
    >      >
    >      > The changes below are required since  we need to suspend the 
target VM while the test is executed (to ensure no GC is triggered). At the 
same time we need to ensure the target VM is temporary resumed when we read its 
output from the pipe. These tests 
(test/hotspot/jtreg/vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance00{1,2,3}.java)
 are written in a such way that the test code is executed as a body of the loop 
and on each iteration it reads a response from the target VM and then runs the 
tests, so on the first iteration vm.resume() is not required at the beginning 
of the loop. I added comments through the code and also updated existing 
comments per your suggestions.
    >      >
    >      >        153             if (i > 0) {
    >      >        154                 debuggee.resume();
    >      >        155             }
    >      >        156
    >      >        157             line = pipe.readln();
    >      >        158             debuggee.suspend();
    >      >
    >      >
    >      > Please review a new version of the fix with the changes mentioned 
above.
    >      >
    >      > Webrev: http://cr.openjdk.java.net/~dtitov/8203174/webrev.03/
    >      > Bug: https://bugs.openjdk.java.net/browse/JDK-8203174
    >      >
    >      > Thanks,
    >      > Daniil
    >      >
    >      > On 11/7/18, 11:33 AM, "Chris Plummer" <chris.plum...@oracle.com> 
wrote:
    >      >
    >      >      Hi Danil,
    >      >
    >      >      So this is the crux of the issue:
    >      >
    >      >        112         debuggee.suspend();
    >      >        113
    >      >        114         List<ObjectReference> baseReferences = new 
LinkedList<>();
    >      >        115         // We need to ensure that the base references 
count is not
    >      >      changed during the test.
    >      >        116         // The instances of the class could be already 
created by
    >      >      other (e.g. JVMCI) threads
    >      >        117         // and if GC was scheduled before VM was 
suspended some of
    >      >      these instances might
    >      >        118         // become collected.
    >      >        119         for (ObjectReference objRef : 
referenceType.instances(0)) {
    >      >        120             try {
    >      >        121                 objRef.disableCollection();
    >      >        122                 baseReferences.add(objRef);
    >      >        123             } catch (ObjectCollectedException e) {
    >      >        124                 // skip this reference
    >      >        125             }
    >      >        126         }
    >      >        127         baseInstances = baseReferences.size();
    >      >
    >      >      First it is possible for a GC to be triggered even if the 
debuggee is
    >      >      not executing any code because of JVMCI threads. So this is 
why
    >      >      debuggee.suspend() is needed. Second, even after calling
    >      >      debuggee.suspend(), it is still possible for a GC to happen 
since it may
    >      >      have been triggered before the debuggee.suspend() call, but 
not yet
    >      >      completed, and debuggee.suspend() does not prevent GC from 
completing in
    >      >      that case. So that is why you need to disableCollecion() on 
each object,
    >      >      and accept that some of them may have already been collected. 
Therefore
    >      >      baseInstances needs to be updated to only reflect the count 
of instances
    >      >      that have not been collected, and will not be collected 
because
    >      >      disableCollection() has been called on them. This all makes 
sense and
    >      >      seems fine.
    >      >
    >      >      I do think the comments could use some updating. The 
debuggee.suspend()
    >      >      call should be explained as being needed because there are 
potentially
    >      >      other non-test java threads allocating objects and triggering 
GC's,
    >      >      JVMCI being the main culprit here. The comment before the 
loop should
    >      >      focus on dealing with a GC that was triggered before the 
suspend,
    >      >      requiring disableCollection() be called each object returned 
by
    >      >      referenceType.instances() since it can potentially be 
collected
    >      >      otherwise. This code is not really related to the JVMCI 
threads.
    >      >
    >      >      I don't understand the reason for excluding java.lang.String 
from testing.
    >      >
    >      >      I don't understand the reason for the useStrictCheck() 
changes.
    >      >
    >      >      I don't understand what the following is fixing, and the 
impact on test
    >      >      execution that the resume() might have:
    >      >
    >      >        153             if (i > 0) {
    >      >        154                 debuggee.resume();
    >      >        155             }
    >      >        156
    >      >        157             line = pipe.readln();
    >      >        158             debuggee.suspend();
    >      >
    >      >
    >      >      thanks,
    >      >
    >      >      Chris
    >      >
    >      >      On 11/7/18 8:58 AM, Daniil Titov wrote:
    >      >      > Hi Chris and Serguei,
    >      >      >
    >      >      > With recent builds I can no longer see any GC triggered by 
C1 compiler thread due to running out of metaspace and JDK-8193126 seems as 
solved this particular problem. Currently all observed GCs are triggered by 
JVMCI Compiler threads.
    >      >      >
    >      >      > Please review a new version of the fix that no longer keeps 
the main thread in the target VM running while other threads are suspended 
(since as Dean mentioned it might be not safe). Instead, the target VM is fully 
suspended when required and resumed afterwards. The new webrev also excludes 
java.lang.String class from the list of classes used by some of these tests.
    >      >      >
    >      >      > Webrev: 
http://cr.openjdk.java.net/~dtitov/8203174/webrev.02/index.html
    >      >      > Bug: https://bugs.openjdk.java.net/browse/JDK-8203174
    >      >      >
    >      >      > Thanks,
    >      >      > Daniil
    >      >      >
    >      >      > On 11/5/18, 10:46 AM, "Chris Plummer" 
<chris.plum...@oracle.com> wrote:
    >      >      >
    >      >      >      On 11/4/18 11:45 PM, serguei.spit...@oracle.com wrote:
    >      >      >      > On 11/4/18 23:40, serguei.spit...@oracle.com wrote:
    >      >      >      >> Hi Daniil,
    >      >      >      >>
    >      >      >      >> I agree with Chris below.
    >      >      >      >> Will tell more on reply to your reply.
    >      >      >      >>
    >      >      >      >> Thanks,
    >      >      >      >> Serguei
    >      >      >      >>
    >      >      >      >> On 11/2/18 20:59, Chris Plummer wrote:
    >      >      >      >>> Hi Daniil,
    >      >      >      >>>
    >      >      >      >>> I thought the issue was that C2 was doing metadata 
allocations, and
    >      >      >      >>> when it ran out of metaspace it did a GC (one that 
was not triggered
    >      >      >      >>> by a failed object
    >      >      >      >
    >      >      >      > Forgot to comment on the below.
    >      >      >      > It is probably a typo.
    >      >      >      > Should it be about the Graal, not the C2?
    >      >      >      > As I understand we have no issue with the C2.
    >      >      >      Actually it was the C1 Compiler Thread that was the 
problem, although it
    >      >      >      only turned up during Graal testing.
    >      >      >
    >      >      >      Chris
    >      >      >      >
    >      >      >      > Thanks,
    >      >      >      > Serguei
    >      >      >      >
    >      >      >      >
    >      >      >      >>> allocation). As a results we were getting objects 
GCs before the
    >      >      >      >>> test ever got the chance to disable collection on 
them. I thought we
    >      >      >      >>> also concluded other than this metaspace issue, if 
the test is
    >      >      >      >>> properly disabling collection on the objects it 
cared about, it
    >      >      >      >>> shouldn't matter if there are GC's triggered by 
excessive object
    >      >      >      >>> allocations.
    >      >      >      >>>
    >      >      >      >>> I don't think the following check will always be 
valid. It may be on
    >      >      >      >>> by default at some point:
    >      >      >      >>>
    >      >      >      >>>  651     public boolean isJVMCICompilerOn() {
    >      >      >      >>>  652         String opts = 
argumentHandler.getLaunchOptions();
    >      >      >      >>>  653         return 
opts.indexOf("-XX:+UseJVMCICompiler") >= 0;
    >      >      >      >>>  654     }
    >      >      >      >>>
    >      >      >      >>> thanks,
    >      >      >      >>>
    >      >      >      >>> Chris
    >      >      >      >>>
    >      >      >      >>> On 11/2/18 4:29 PM, Daniil Titov wrote:
    >      >      >      >>>> Please review the change that fixes several tests 
failing with
    >      >      >      >>>> com.sun.jdi.ObjectCollectedException when running 
with Graal.
    >      >      >      >>>>
    >      >      >      >>>> There main problem here is that with Graal on 
JVMCI Compiler
    >      >      >      >>>> threads in the target VM create a lot of objects 
and sporadically
    >      >      >      >>>> trigger GC that results in the objects created in 
the target VM in
    >      >      >      >>>> the tests being GCed before the tests complete. 
The other problem
    >      >      >      >>>> is that for some classes the tests use, e.g. 
"java.lang.String",
    >      >      >      >>>> there is already a huge number of instances 
created by JVMCI threads.
    >      >      >      >>>>
    >      >      >      >>>> The fix suspends target VM to prevent JVMCI 
compiler threads from
    >      >      >      >>>> creating new objects during the tests execution. 
In cases when
    >      >      >      >>>> IOPipe is used for communication between the test 
and the debuggee
    >      >      >      >>>> the fix suspends all threads but the main rather 
than suspending
    >      >      >      >>>> the VM. The fix also filters out the checks for 
some test classes
    >      >      >      >>>> (e.g. "java.lang.String") in cases when the 
target VM is run with
    >      >      >      >>>> JVMCI compiler options on.
    >      >      >      >>>>
    >      >      >      >>>> Webrev: 
http://cr.openjdk.java.net/~dtitov/8203174/webrev.01/
    >      >      >      >>>> Bug: 
https://bugs.openjdk.java.net/browse/JDK-8203174
    >      >      >      >>>>
    >      >      >      >>>> Thanks,
    >      >      >      >>>> Daniil
    >      >      >      >>>>
    >      >      >      >>>>
    >      >      >      >>>>
    >      >      >      >>>
    >      >      >      >>>
    >      >      >      >>
    >      >      >      >
    >      >      >
    >      >      >
    >      >      >
    >      >      >
    >      >      >
    >      >
    >      >
    >      >
    >      >
    >      >
    >      
    >      
    >      
    >
    >
    
    
    


Reply via email to