On 12/18/2012 08:21 PM, shanliang wrote: > It is good to call > gotIOException((IOException)e); > > but no need to call in next > restart((IOException)e); > > the method "gotIOException" should call "restart" if necessary, > otherwise "restart" may be called 2 times and generated 2 times > "Reconnection notification".
Yes, it seems that the restart() will be called exactly twice or not at all. I will remove the restart() call at the line 199. Updated webrev to http://cr.openjdk.java.net/~jbachorik/JDK-7146162/webrev.05 -JB- > > Shanliang > > Jaroslav Bachorik wrote: >> I've updated the webrev - no additional synchronisation is required. >> Please, review: >> http://cr.openjdk.java.net/~jbachorik/JDK-7146162/webrev.04 >> >> -JB- >> >> On 10/31/2012 01:59 PM, Jaroslav Bachorik wrote: >> >>> On 10/30/2012 05:10 PM, Eamonn McManus wrote: >>> >>>> This area has historically caused a lot of problems and I am not >>>> surprised to see that there are more. While I don't know what the best >>>> way to fix the issue at hand is, I don't think this proposed change is >>>> it. The reason is that the checkConnection and gotIOException methods >>>> do blocking operations, and it is generally not a good idea to do >>>> blocking operations in a synchronized block. Is there a way to avoid >>>> the race condition without that? >>>> >>> The important part is calling the gotIOException() method even from the >>> heart-beat checker. I've tried to return the synchronization block back >>> to the original state and the test passes with the check period of 10ms >>> which pushes the probability of data races rather high. >>> >>> It seems that the worst that can happen would be one additional >>> checkConnection() call - in case when the state gets set to TERMINATED >>> by another thread right after it has been checked in the synchronized >>> block the loop condition might evaluate to true if the state value has >>> not been flushed yet. >>> >>> I could change the "state" variable to be volatile but I am not sure >>> whether it's worth the hassle. >>> >>> -JB- >>> >>> >>>> Éamonn >>>> >>>> >>>> 2012/10/29 Jaroslav Bachorik <[email protected]>: >>>> >>>>> I am looking for a sponsor and reviewers. >>>>> >>>>> The webrev is available at >>>>> http://cr.openjdk.java.net/~jbachorik/JDK-7146162/webrev.03 >>>>> >>>>> As explained in the issue the failure is caused by the RMI connection >>>>> heart-beat thread racing against the thread executing the MBean >>>>> operation and encountering the IOException. The heart beat thread sets >>>>> the the admin state to "terminated" but does not send the failure >>>>> notifications. On the other side the operation thread determines the >>>>> state to be already terminated and skips the notifications as well. >>>>> >>>>> The fix adds the call to handle an ioexception, including sending the >>>>> failure notifications, to the hear-beat connection failure handler. >>>>> Also >>>>> it widens the synchronized block since the whole code block >>>>> checking for >>>>> the connection failure and recovering must be run atomically, >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> -JB- >>>>> >> >> > >
