Looks ok. Thanks, /Staffan
On 2 apr 2014, at 10:09, Jaroslav Bachorik <jaroslav.bacho...@oracle.com> wrote: > Hi Shanliang, > > I'm fine with the proposed fix, although not a reviewer. > > On 2.4.2014 10:01, shanliang wrote: >> Hope to get reviewed and to push this fix: >> >> 1) this is a fix for a bug labeled with "svc-nightly" >> >> 2) The current test must be useful. Yes the test could not be 100% sure >> to test the bug JDK-6751643, but with its 2*10000 resume repeatings it >> would have big chance to hit the bug conditions, the failure the patch >> to fix happened exactly in the condition the bug JDK-6751643 could happen. >> >> 3) there is possibly someway to realize the synchronization logic >> between the thread invoking the operations and the thread resuming, I >> could see to add code into the method "resume" to do waiting for this >> test, but I could not see an easy and practical way to do that. > > Just a blind shot - using eg. phaser or cyclic barrier to orchestrate > stepping through the code in both threads could help to increase the chances > of hitting the problematic situation (but not making it certain, though) > > >> >> 4) we can create a new bug to fix this synchronization issue if necessary. > > Sounds like a very good idea. > > Thanks, > > -JB- > > >> >> Thanks, >> Shanliang >> >> shanliang wrote: >>> Jaroslav Bachorik wrote: >>>> Thanks Shanliang, it is clear now. >>>> >>>> The patch will get rid off the IOOBE but I have my doubts about what >>>> the test actually tests. It is supposed to make sure that certain >>>> operations will not throw NPE when the debugged thread is resumed >>>> (from a concurrent debugger thread) before the operation has managed >>>> to finish. However, there seems to be no synchronization logic >>>> between the thread invoking the operations and the thread resuming >>>> the paused debugged thread, relying only on hitting this condition by >>>> chance. >>>> >>>> This test seems to be a good candidate for a thorough revision/rewrite. >>> Not sure how to make the checking happen during a "resuming" window, >>> the test creates 2 threads and each repeats "resume"10000 times, and >>> one another thread repeats checking with 100ms sleeping time, just >>> hoping some checking would fail into a resuming window. >>> >>> Shanliang >>>> >>>> -JB- >>>> >>>> On 31.3.2014 11:26, shanliang wrote: >>>>> Erik Gahlin wrote: >>>>>> I also like to understand better. >>>>> Possibly my previous reply was not clear enough or I missed something >>>>> there. >>>>> >>>>> The test was to test JDK-6751643 as I cited in the last mail, here is >>>>> the info from JDK-6751643 to which this test was developed: >>>>> ------ >>>>> This bug can only occur if a debugger has multiple threads and calls >>>>> any >>>>> of the following methods in one thread while simultaneously resuming >>>>> the >>>>> same debuggee thread in a different debugger thread. Debuggers >>>>> shouldn't >>>>> do this because it is a race condition and the result returned by these >>>>> methods will vary depending upon just where in the processing of these >>>>> methods the resume takes effect. EG, the frameCount() method could >>>>> return 6 in a case where the debuggee has already been resumed and >>>>> there >>>>> are no frames. >>>>> ------ >>>>> >>>>> To reproduce the bug, test did mainly 2 things by different threads: >>>>> 1) received vm events and resumed vm, this was done by thread >>>>> "Thread-1" >>>>> in the class TestScaffold which registered a listener and called the >>>>> following method: >>>>> /** >>>>> * Events handled directly by scaffold always resume (well, almost >>>>> always) >>>>> */ >>>>> public void eventSetComplete(EventSet set) { >>>>> // The listener in connect(..) resumes after receiving our >>>>> // special VMDeathEvent. We can't also do the resume >>>>> // here or we will probably get a VMDisconnectedException >>>>> if (!containsOurVMDeathRequest(set)) { >>>>> traceln("TS: set.resume() called"); >>>>> set.resume(); >>>>> } >>>>> } >>>>> >>>>> 2) called the method "check" in the class SimulResumerTarg, to see >>>>> whether a NullPointerException was thrown, the thread name was "test >>>>> resumer" (better to named as "checking thread"?) >>>>> >>>>> So one thread was doing resume, another thread was doing check at same. >>>>> I added the code to see the different values of frames.size() at >>>>> line 185: >>>>> for (i=0; i<10:i++) { >>>>> System.out.println("---frames.size(): "+frames.size()); >>>>> Thhread.sleep(200); >>>>> } >>>>> >>>>> if printing out frames, sometime we could see one more frame: >>>>> ------------------ java.lang.Thread.yield()+-1 in thread >>>>> instance of >>>>> SimulResumerTarg(name='Thread 2', id=109) >>>>> >>>>> >>>>> Shanliang >>>>>> >>>>>> I looked at this failure before and I couldn't see what was wrong, not >>>>>> in the test or product. >>>>>> >>>>>> Erik >>>>>> >>>>>> Jaroslav Bachorik skrev 3/27/14 4:49 PM: >>>>>>> On 27.3.2014 15:49, shanliang wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> The call >>>>>>>> thr.frames(0, frames.size() - 1); >>>>>>>> suffers a synchronization issue, the size may be changed after >>>>>>>> frames.size() returns. >>>>>>> >>>>>>> Any idea why there is a synchronization issue? The code seems to be >>>>>>> intended to run only when a breakpoint is hit and the target thread >>>>>>> is suspended. >>>>>>> >>>>>>> -JB- >>>>>>> >>>>>>>> >>>>>>>> webrev: >>>>>>>> http://cr.openjdk.java.net/~sjiang/JDK-6815126/00/ >>>>>>>> >>>>>>>> bug: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6815126 >>>>>>>> >>>>>>>> Shanliang