This area has historically caused a lot of problems, and I am not surprised to see that there are more. While I don't know the best way to fix the issue at hand, I don't think this proposed change is it. The checkConnection and gotIOException methods perform blocking operations, and it is generally not a good idea to do blocking operations inside a synchronized block: every other thread that needs the same lock stalls until the blocking call returns. Is there a way to avoid the race condition without that?
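
To make the question concrete, here is one possible direction, offered only as a rough sketch and not as a patch against the real code. It assumes the admin state can be modelled as a single atomic reference, so that whichever thread performs the transition to "terminated" is also the one that sends the failure notifications, and the blocking work happens with no lock held. The class name, the State enum, onConnectionFailure and sendFailureNotification are all hypothetical names invented for this illustration; none of them are taken from the webrev or from the JMX implementation.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: not the actual JMX client admin class.
final class ConnectionStateSketch {
    enum State { CONNECTED, TERMINATED }

    private final AtomicReference<State> state =
            new AtomicReference<>(State.CONNECTED);

    // Stand-in for the notifications delivered to listeners.
    final List<String> notifications = new CopyOnWriteArrayList<>();

    /**
     * Called by the heartbeat thread or by an operation thread that hit
     * an IOException. Returns true only for the caller that performed
     * the CONNECTED -> TERMINATED transition.
     */
    boolean onConnectionFailure(String source) {
        // Atomic transition: exactly one caller can win, and no
        // monitor is held while deciding.
        if (!state.compareAndSet(State.CONNECTED, State.TERMINATED)) {
            return false; // someone else already owns the failure handling
        }
        // The winner does the (possibly blocking) work outside any lock.
        sendFailureNotification(source);
        return true;
    }

    // Hypothetical stand-in for the real, potentially blocking, notification path.
    private void sendFailureNotification(String source) {
        notifications.add("connection failed, detected by " + source);
    }

    public static void main(String[] args) {
        ConnectionStateSketch admin = new ConnectionStateSketch();
        System.out.println(admin.onConnectionFailure("heartbeat"));
        System.out.println(admin.onConnectionFailure("operation"));
        System.out.println(admin.notifications);
    }
}
```

Whether something like this fits depends on how many other places read or write the state, which I have not checked. The point is only that the decision about who handles the failure can be made atomically without holding a lock across the blocking calls.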

Éamonn

2012/10/29 Jaroslav Bachorik <[email protected]>:
> I am looking for a sponsor and reviewers.
>
> The webrev is available at
> http://cr.openjdk.java.net/~jbachorik/JDK-7146162/webrev.03
>
> As explained in the issue the failure is caused by the RMI connection
> heart-beat thread racing against the thread executing the MBean
> operation and encountering the IOException. The heart-beat thread sets
> the admin state to "terminated" but does not send the failure
> notifications. On the other side the operation thread determines the
> state to be already terminated and skips the notifications as well.
>
> The fix adds the call to handle an IOException, including sending the
> failure notifications, to the heart-beat connection failure handler. Also
> it widens the synchronized block since the whole code block checking for
> the connection failure and recovering must be run atomically.
>
> Thanks,
>
> -JB-
