On 2/27/19 2:33 PM, Daniil Titov wrote:
Hi Chris,

The change in while loop  in exclfilter001.java between the first and the last 
revisions was to ensure that all events are drained before the method returns 
and the next check starts. It used to wait for 5 minutes and now it keeps 
receiving events in portions waiting for 1 seconds at every iteration.
It look like the exclfilter001.java loop will always end up looping for eventWaitTime seconds:

  92         eventWaitTime = Utils.adjustTimeout(argHandler.getWaitTime() * 10000);

 125             receiveEvents(eventWaitTime, patterns[i]);

I think you said that argHandler.getWaitTime() is 5 and Utils.adjustTimeout() is another 4x for mach5 jobs, so this would mean a total of 200 seconds, and this is how much time the loop would always take since it loops for eventWaitTime seconds.
  You mentioned in one of previous emails that due to network issues  it might 
be the case that for some iteration we fail to receive an event within this 
smaller timeout. But later, if I understood you right, you suggested that 4 
seconds ( Utils.adjustTimeout(1000) adjusts timeout for Mach5 builds to 4 
seconds) should be sufficient.
Actually I was going more by the argHandler.getWaitTime() adjustment, which I thought was 5. Mach5 will multiply that by 4. I think 1s is too short but 5s is probably ok, and 20s for mach5 jobs should prevent any noise from rare network issues, except for serious ones that we can't expect to recover from.
   I agree that it's better to make the test return earlier and I will revert 
changes in exclfilter001.java and filter001.java to the first revision.
Ok.

For consistency I will change line 153 to take into account a wait time 
specified in argHandler.getWaitTime()
153         long waitTime = Utils.adjustTimeout(1000);
Ok.
I am sorry I missed you comments abut Utils.java. You are right we don’t need a 
new method and Utils.waitForCondition() should be sufficient.
There is no need for passing logger into since if the thread is interrupted the 
error is thrown and the test fails.  Will send a new webrev as soon as testing 
completes.
thanks,

Chris
Thanks!
Daniil


On 2/27/19, 12:08 PM, "Chris Plummer" <[email protected]> wrote:

     Hi Daniil,
In exclfilter001.java, since the first revision you changed the while
     loop to be "while (true)". I'm not sure of the reasoning. It used to
     exit the first time remove() didn't return an EventSet. Now it retries
     until the total waittime is exceeded.
When waiting for an event, sometimes you use: 91 eventWaitTime =
     Utils.adjustTimeout(argHandler.getWaitTime() * 10000);
and sometimes: 153 long waitTime = Utils.adjustTimeout(1000); Why the discrepancy. In one case you are willing to wait 50 seconds for
     an event and in the other only 5.
I think you missed my initial comment on Utils.java. See below: > In Utils.java, I think wait() should be moved right after
     > waitForCondition() and maybe given a more descriptive name. It seems
     > to basically the same as waitForCondition(), except you added a "log"
     > parameter and slightly changed the behavior. Are these behavior
     > differences necessary? Could you share code with the existing
     > waitForCondition()?
thanks, Chris On 2/27/19 9:03 AM, Daniil Titov wrote:
     > Hi Serguei and Chris,
     >
     > Thank you for reviewing this change. Please review a new version of the 
fix that addresses these findings.
     >
     > Bug: https://bugs.openjdk.java.net/browse/JDK-8207367
     > Webrev: http://cr.openjdk.java.net/~dtitov/8207367/webrev.03
     >
     > Best regards,
     > Daniil
     >
     > From: <[email protected]>
     > Organization: Oracle Corporation
     > Date: Tuesday, February 26, 2019 at 6:50 PM
     > To: Daniil Titov <[email protected]>, Chris Plummer 
<[email protected]>, OpenJDK Serviceability <[email protected]>
     > Subject: Re: RFR 8207367: 10 vmTestbase/nsk/jdi tests timed out when 
running with jtreg
     >
     > Hi Daniil,
     >
     > It looks good to me.
     > I have some minor comments though.
     > 
http://cr.openjdk.java.net/~dtitov/8207367/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jdi/ClassUnloadRequest/addClassFilter/filter001.java.frames.html
     >
     >   163                 if(eventSet != null) {
     >   Space is missed after if
     >
     >
     > 
http://cr.openjdk.java.net/~dtitov/8207367/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventQueue/remove/remove004.java.frames.html
     >    70  * In first one, second thread waits for any incoming event from the    
<BR>
     >    71  * debugger which is sleeping for some time; hence, <BR>
     >    72  * no events are expected to be received at the debugger end.           
<BR>
     >   <BR> is not aligned at 71
     >
     >
     >   While you are at this code, could you, please,
     >   also fix unneeded spaces at the lines? :
     >   516             } catch ( InterruptedException e1) {
     >   ...
     >   526             } catch ( Exception e ) {
     >
     > 
http://cr.openjdk.java.net/~dtitov/8207367/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventQueue/remove/remove004a.java.frames.html
     >   111                                         
Utils.adjustTimeout(argHandler.getWaitTime() * 10000),
     >   112                                        100,
     >   113                                         remove004a::log1);
     >    Line 112 is not properly aligned
     >
     >
     > 
http://cr.openjdk.java.net/~dtitov/8207367/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventQueue/remove_l/remove_l004.java.frames.html
     >    65  * In the first one the first assertion is checked up on as follows:    
<BR>
     >    66  * the debugger sleeps for some time;                                  
<BR>
     >    67  * hence, no event is expected in the debugger within WAITTIME, and     
<BR>
     >    <BR> at 66 is not aligned
     >    Thank you for catching and fixing the typo at 67!
     >
     > 
http://cr.openjdk.java.net/~dtitov/8207367/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jdi/VirtualMachine/dispose/dispose002.java.frames.html
     >
     >   295                         }, Utils.adjustTimeout(waitTime * 60000), 
1000, dispose002::log3);
     >   A separate line is needed for next wait() argument
     >
     >
     > 
http://cr.openjdk.java.net/~dtitov/8207367/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jdi/VirtualMachine/dispose/dispose002a.java.frames.html
     >   128                          while(true) {
     >   ...
     >   130                              if(instruction.equals("check_done")){
     >   131                                  if (test_thread.isAlive()) {
     >   132                                      exitCode = FAILED;
     >   133                                  }
     >
     >    Space is missed after while.
     >    Some logErr("...") for failed case would be useful.
     >
     >
     > 
http://cr.openjdk.java.net/~dtitov/8207367/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jdi/VirtualMachine/dispose/dispose003.java.frames.html
     >   260                         }, Utils.adjustTimeout(waitTime * 60000), 
1000, dispose003::log3);
     >   New line is needed for next wait() argument
     >
     >
     > 
http://cr.openjdk.java.net/~dtitov/8207367/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jdi/VirtualMachine/dispose/dispose003a.java.frames.html
     >   129                          while(true) {
     >   130                              instruction = pipe.readln();
     >   131                              if(instruction.equals("check_done")) {
     >   132                                  if (test_thread.isAlive()) {
     >   133                                      exitCode = FAILED;
     >   134                                  }
     >    Space is missed after while and if.
     >    Some logErr("...") for failed case would be useful.
     >
     > 
http://cr.openjdk.java.net/~dtitov/8207367/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jdi/VirtualMachine/dispose/dispose004.java.frames.html
     >   284                         }, Utils.adjustTimeout(waitTime * 60000), 
1000, dispose004::log3);
     >   New line is needed for next wait() argument
     >
     >
     > 
http://cr.openjdk.java.net/~dtitov/8207367/webrev.02/test/hotspot/jtreg/vmTestbase/nsk/jdi/VirtualMachine/dispose/dispose004a.java.frames.html
     >   130                          while(true) {
     >   131                              instruction = pipe.readln();
     >   132                              if(instruction.equals("check_done")) {
     >   133                                  if (test_thread.isAlive()) {
     >   134                                      exitCode = FAILED;
     >   135                                  }
     >    Space is missed after while and if.
     >    Some logErr("...") for failed case would be useful.
     >
     >
     > Thanks,
     > Serguei
     >
     > On 2/26/19 6:06 PM, Daniil Titov wrote:
     > Hi Chris,
     >
     > Please review a new version of the webrev that slightly changes method 
receiveEvents(long,pattern)  in the following tests:
     >   
-vmTestbase/nsk/jdi/ClassUnloadRequest/addClassExclusionFilter/exclfilter001.java
     >   -vmTestbase/nsk/jdi/ClassUnloadRequest/addClassFilter/filter001.java)
     >
     >
     > The new changes ensure that receiveEvents(long,pattern)  method keeps 
receiving events in a while loop even if eventSet returned by  
debugee.VM().eventQueue().remove(waitTime) is a null due to timeout.
     >
     >
     > Bug: https://bugs.openjdk.java.net/browse/JDK-8207367
     > Webrev: http://cr.openjdk.java.net/~dtitov/8207367/webrev.02
     >
     > Thanks!
     > -Daniil
     >
     > On 2/26/19, 11:41 AM, "Chris Plummer" mailto:[email protected] 
wrote:
     >
     >      Ok. I think you mentioned below that default wait time will be 5
     >      seconds. That seems sufficient to avoid a timeout if there are some
     >      minor network issues and some packets are lost.
     >
     >      Changes look good.
     >
     >      thanks,
     >
     >      Chris
     >
     >      On 2/26/19 10:01 AM, Daniil Titov wrote:
     >      > Hi Chris,
     >      >
     >      > Yes , it is correct. For example in this particular test the 
timeout is expected (line 283 expects that breakpoint() returns returnCode3 that is 
set on line 460 when eventSet on line 456 is null due to a timeout in 
eventQueue.remove()) and shortening it makes the whole test fit in jtreg time limits.
     >      >
     >      >    281                      log2("       checking up that the thread2 
is not at breakpoint1 because of suspension");
     >      >     282                     expresult = breakpoint();
     >      >     283                     if (expresult != returnCode3) {
     >      >     284                         log3("ERROR: no timeout for waiting 
for BreakpointEvent when the thread2 is suspended");
     >      >     285                         expresult = returnCode1;
     >      >     286                         break label1;
     >      >     287                     } else
     >      >
     >      >
     >      >    445          private int breakpoint () {
     >      >     446     
     >      >     447             int returnCode = returnCode0;
     >      >     448     
     >      >     449             log2("       waiting for BreakpointEvent");
     >      >     450     
     >      >     451             labelBP:
     >      >     452                 for (;;) {
     >      >     453     
     >      >     454                     log2("       new:  eventSet = 
eventQueue.remove();");
     >      >     455                     try {
     >      >     456                         eventSet = eventQueue.remove 
(Utils.adjustTimeout(waitTime*1000));
     >      >     457                         if (eventSet == null) {
     >      >     458                             log2("::::::  timeout when waiting 
for a BreakpintEvent");
     >      >     459     //                        log3("ERROR:  timeout for 
waiting for a BreakpintEvent");
     >      >     460                             returnCode = returnCode3;
     >      >     461                             break labelBP;
     >      >     462                         }
     >      >
     >      >>     And I just noticed the space right after "remove". Can you 
remove it?
     >      > Sure. Will do.
     >      >
     >      > Thanks!
     >      > --Daniil
     >      >
     >      > On 2/25/19, 7:26 PM, "Chris Plummer" 
mailto:[email protected] wrote:
     >      >
     >      >      Hi Daniil,
     >      >
     >      >      For suspend001, are you saying the following is expected to 
timeout
     >      >      sometimes, so you need a shorter waittime to avoid making 
the whole test
     >      >      time out?
     >      >
     >      >        456                     eventSet = eventQueue.remove
     >      >      (Utils.adjustTimeout(waitTime*1000));
     >      >
     >      >      And I just noticed the space right after "remove". Can you 
remove it?
     >      >
     >      >      thanks,
     >      >
     >      >      Chris
     >      >
     >      >      On 2/25/19 6:57 PM, Daniil Titov wrote:
     >      >      > Hi Chris,
     >      >      >
     >      >      > The timeout issue mentioned in the bug is about jtreg 
aborting the tests since they are running longer than the maximum allowed time. That 
happens since these tests  use extreme long internal delays,  e.g. a sleep for 5 minutes 
 or a wait for 5 minutes for a case when no events ( and a notify()) are expected. 
Reducing these internal delays makes the test passing within the default jtreg timeout ( 
2 minutes).
     >      >      >
     >      >      > Best regards,
     >      >      > Daniil
     >      >      >
     >      >      > On 2/25/19, 6:15 PM, "Chris Plummer" 
mailto:[email protected] wrote:
     >      >      >
     >      >      >      Ok. So how is the timeout issue mentioned in the bug 
addressed when
     >      >      >      there is now a shorter wait time?
     >      >      >
     >      >      >      Chris
     >      >      >
     >      >      >      On 2/25/19 5:04 PM, Daniil Titov wrote:
     >      >      >      > Hi Chris,
     >      >      >      >
     >      >      >      > Forgot to answer to your another question:
     >      >      >      >   >      > For these 3 tests the event wait timeout 
was reduced and adjusted for test.timeout.factor:
     >      >      >      >      >      >    
-vmTestbase/nsk/jdi/Event/_itself_/event001.java
     >      >      >      >      >      >    
-vmTestbase/nsk/jdi/VirtualMachine/suspend/suspend001/TestDescription.java
     >      >      >      >      >      >    
-vmTestbase/nsk/jdi/ThreadReference/suspend/suspend001.java
     >      >      >      >      >      So overall is this a shorter or longer 
waittime now?
     >      >      >      >
     >      >      >      > Overall this is a shorter waitime now.  Instead of 
300 seconds it is now 20 seconds for Mach5 jobs (they are run with test.timeout.factor set 
to 4.0) and 5 seconds for regular jtreg runs.
     >      >      >      >
     >      >      >      > Best regards,
     >      >      >      > Daniil
     >      >      >      >
     >      >      >      >
     >      >      >      > On 2/25/19, 4:38 PM, "Chris Plummer" 
mailto:[email protected] wrote:
     >      >      >      >
     >      >      >      >      Hi Daniil,
     >      >      >      >
     >      >      >      >      Yes, my point was that the max time you wait 
for a single event is much
     >      >      >      >      smaller now. I can see a possibility that with 
a little bit of network
     >      >      >      >      instability  a packet gets lost and resend 
does not happen fast enough.
     >      >      >      >
     >      >      >      >      thanks,
     >      >      >      >
     >      >      >      >      Chris
     >      >      >      >
     >      >      >      >      On 2/25/19 4:32 PM, Daniil Titov wrote:
     >      >      >      >      > Hi Chris,
     >      >      >      >      >
     >      >      >      >      > The code still waits for the whole total 
wait time. There is a while loop at lines 163-186 that keeps receiving new events (line 183) 
till elapsed time is less than the waittime (line 178) or a timeout happens (so eventSet is 
null).
     >      >      >      >      >
     >      >      >      >      > 159               begin = 
System.currentTimeMillis();
     >      >      >      >      >     160                   eventSet = 
debugee.VM().eventQueue().remove(waitTime);
     >      >      >      >      >     161                   delta = 
System.currentTimeMillis() - begin;
     >      >      >      >      >     162                   totalWaitTime -= 
delta;
     >      >      >      >      >     163                   while (eventSet != 
null) {
     >      >      >      >      >     164                       EventIterator 
eventIterator = eventSet.eventIterator();
     >      >      >      >      >
     >      >      >      >      >     178                       if (totalWaitTime 
<= 0 || exit) {
     >      >      >      >      >     179                           break;
     >      >      >      >      >     180                       }
     >      >      >      >      >     181                       
debugee.resume();
     >      >      >      >      >     182                       begin = 
System.currentTimeMillis();
     >      >      >      >      >     183                       eventSet = 
debugee.VM().eventQueue().remove(waitTime);
     >      >      >      >      >     184                       delta = 
System.currentTimeMillis() - begin;
     >      >      >      >      >     185                       totalWaitTime 
-= delta;
     >      >      >      >      >     186                   }
     >      >      >      >      >
     >      >      >      >      >
     >      >      >      >      > However, as I see now in case if a timeout 
happens on line 160  (eventSet is null) the loop is not executed at all.  I haven't observed 
it in test runs but I think it makes sense to adjust this test to take this potential case 
into account. I will send an updated version of the patch soon.
     >      >      >      >      >
     >      >      >      >      > Thanks!
     >      >      >      >      >
     >      >      >      >      > Best regards,
     >      >      >      >      > Daniil
     >      >      >      >      >
     >      >      >      >      > On 2/25/19, 12:21 PM, "Chris Plummer" 
mailto:[email protected] wrote:
     >      >      >      >      >
     >      >      >      >      >      Hi Daniil,
     >      >      >      >      >
     >      >      >      >      >      On 2/23/19 1:02 PM, Daniil Titov wrote:
     >      >      >      >      >      > Please review the change that fixes 
timeout issues for the following 10 tests when running with jtreg and default timeout factor 
(1.0).
     >      >      >      >      >      In Utils.java, I think wait() should be 
moved right after
     >      >      >      >      >      waitForCondition() and maybe given a 
more descriptive name. It seems to
     >      >      >      >      >      basically the same as waitForCondition(), except 
you added a "log"
     >      >      >      >      >      parameter and slightly changed the 
behavior. Are these behavior
     >      >      >      >      >      differences necessary? Could you share 
code with the existing
     >      >      >      >      >      waitForCondition()?
     >      >      >      >      >      >
     >      >      >      >      >      > For the following 2 tests the event 
wait timeout was reduced and adjusted for test.timeout.factor.  Method 
receiveEvents(long,pattern) was fixed to ensure that it gracefully exits after the specified wait 
period elapsed:
     >      >      >      >      >      >    
-vmTestbase/nsk/jdi/ClassUnloadRequest/addClassExclusionFilter/exclfilter001.java
     >      >      >      >      >      >    
-vmTestbase/nsk/jdi/ClassUnloadRequest/addClassFilter/filter001.java
     >      >      >      >      >        183                 eventSet = 
debugee.VM().eventQueue().remove(waitTime);
     >      >      >      >      >
     >      >      >      >      >      This code used to wait for the total 
remaining waittime. Now it waits a
     >      >      >      >      >      fixed amount based on:
     >      >      >      >      >
     >      >      >      >      >        153         long waitTime = 
Utils.adjustTimeout(1000);
     >      >      >      >      >
     >      >      >      >      >      How did you come up with this wait 
amount, and is it long enough to deal
     >      >      >      >      >      with occasional hiccups?
     >      >      >      >      >      > For these 3 tests the event wait 
timeout was reduced and adjusted for test.timeout.factor:
     >      >      >      >      >      >    
-vmTestbase/nsk/jdi/Event/_itself_/event001.java
     >      >      >      >      >      >    
-vmTestbase/nsk/jdi/VirtualMachine/suspend/suspend001/TestDescription.java
     >      >      >      >      >      >    
-vmTestbase/nsk/jdi/ThreadReference/suspend/suspend001.java
     >      >      >      >      >      So overall is this a shorter or longer 
waittime now?
     >      >      >      >      >      >
     >      >      >      >      >      > For next 2 tests the event wait 
timeout and the thread sleep time were reduced and adjusted for test.timeout.factor. Additional 
synchronization between the debugger and the debuggee was added to ensure the debugee process 
continues as soon as the test finishes the timeout related checks and advances to the next steps:
     >      >      >      >      >      >    - 
vmTestbase/nsk/jdi/EventQueue/remove_l/remove_l004/TestDescription.java
     >      >      >      >      >      >    - 
vmTestbase/nsk/jdi/EventQueue/remove/remove004/TestDescription.java
     >      >      >      >      >      Ok.
     >      >      >      >      >      >
     >      >      >      >      >      > Instead of just sleeping for 5 
minutes while waiting for the debuggee test thread to complete  the tests now check whether the 
debuggee thread is alive in the loop. The total waiting timeout was adjusted for 
test.timeout.factor:
     >      >      >      >      >      >    
-vmTestbase/nsk/jdi/VirtualMachine/dispose/dispose004/TestDescription.java
     >      >      >      >      >      >    
-vmTestbase/nsk/jdi/VirtualMachine/dispose/dispose003/TestDescription.java
     >      >      >      >      >      >    
-vmTestbase/nsk/jdi/VirtualMachine/dispose/dispose002/TestDescription.java
     >      >      >      >      >      Ok.
     >      >      >      >      >
     >      >      >      >      >      thanks,
     >      >      >      >      >
     >      >      >      >      >      Chris
     >      >      >      >      >      >
     >      >      >      >      >      > Testing.
     >      >      >      >      >      > The following VM options were used  
in Mach5 jobs to  verify these changes:
     >      >      >      >      >      > 1. No VM args
     >      >      >      >      >      > 2. -XX:+UnlockExperimentalVMOptions 
-XX:+EnableJVMCI -XX:+TieredCompilation -XX:+UseJVMCICompiler -Djvmci.Compiler=grail
     >      >      >      >      >      > 3. -Xcomp
     >      >      >      >      >      >
     >      >      >      >      >      > Also tier1, tier2 and tier3 Mach5 
jobs succeeded.
     >      >      >      >      >      >
     >      >      >      >      >      > To verify that tests succeed with 
test.timeout.factor set to 1.0 the following patch was used before running Mach5 jobs.
     >      >      >      >      >      >
     >      >      >      >      >      > --- a/make/RunTests.gmk Thu Feb 21 
15:17:42 2019 -0800
     >      >      >      >      >      > +++ b/make/RunTests.gmk Thu Feb 21 
15:42:15 2019 -0800
     >      >      >      >      >      > @@ -826,6 +826,7 @@
     >      >      >      >      >      >     else
     >      >      >      >      >      >       JTREG_TIMEOUT_FACTOR ?= 4
     >      >      >      >      >      >     endif
     >      >      >      >      >      > +  JTREG_TIMEOUT_FACTOR = 1
     >      >      >      >      >      >     JTREG_VERBOSE ?= 
fail,error,summary
     >      >      >      >      >      >     JTREG_RETAIN ?= fail,error
     >      >      >      >      >      >
     >      >      >      >      >      > Bug: 
https://bugs.openjdk.java.net/browse/JDK-8207367
     >      >      >      >      >      > Webrev: 
http://cr.openjdk.java.net/~dtitov/8207367/webrev.01
     >      >      >      >      >      >
     >      >      >      >      >      > Thanks!
     >      >      >      >      >      > --Daniil
     >      >      >      >      >      >
     >      >      >      >      >      >
     >      >      >      >      >
     >      >      >      >      >
     >      >      >      >      >
     >      >      >      >      >
     >      >      >      >      >
     >      >      >      >
     >      >      >      >
     >      >      >      >
     >      >      >      >
     >      >      >      >
     >      >      >
     >      >      >
     >      >      >
     >      >      >
     >      >      >
     >      >
     >      >
     >      >
     >      >
     >      >
     >
     >
     >
     >
     >
     >
     >
     >
     >



Reply via email to