Hi Jaroslav,

I am not sure I understand how this solves the problem. The old code first checked the connection and, if that check failed, sent the FAILED notification, closed the connector, and rethrew the exception.
The new code throws the exception directly without checking the connection, and therefore without closing the connection or sending the FAILED notification. So is the fix a change of behavior by which the RMIConnector will, in some cases, not try to autoclose the connection but instead simply wait for the caller to explicitly call close()?

I'd be interested to hear what Shanliang has to say...

best regards,

-- daniel

On 8/28/14 5:57 PM, Jaroslav Bachorik wrote:
I have taken over this issue from Poonam since she will be unavailable for the next month or so. Could I have reviews for this change:

Bug: https://bugs.openjdk.java.net/browse/JDK-8049303
Webrev: http://cr.openjdk.java.net/~jbachorik/8049303/webrev.00

Problem and fix:

By default the JMX client side notification fetch timeout (jmx.remote.x.notification.fetch.timeout) is 1 minute and the default server connection timeout (jmx.remote.x.server.connection.timeout) is 2 minutes. If the client side connector thread makes a notification fetch request to the server, but a transient network problem prevents the server response from reaching the client, the client side connector will wait for a response until the timeout period (1 minute) has expired before throwing an IOException.

The client side RMIConnector implementation handles the IOException by re-checking the connection status to determine whether or not the connection is broken. If the connection is not available at that moment, the connector fails by re-throwing the initial IOException.

The problem is that this re-check of the connection passes because the server side of the connection doesn't time out until 2 minutes have passed (by default), so the NotifFetcher thread dies without posting a failed notification, and the client application does not get a chance to recover.

The fix is to forward the non connection-related exceptions on the JMX client side instead of checking the connection status. The connection-related exceptions will cause the session to be closed, as an unsuccessful connection check would have done.

Testing:

All the jdk_jmx and jdk_management regression tests passed. All the related JCK tests passed.

The fix applies cleanly to 8u and 7u repos.

Thanks,

-JB-
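
For readers following the thread, here is a minimal sketch of the client side pieces discussed above: setting the notification fetch timeout in the connector environment, and a listener that lets the application react to a FAILED connection notification. It is an illustration only, not code from the webrev. The class name, the helper names, and the connection id are hypothetical, and passing the timeout as a Long number of milliseconds is an assumption about how the property is read.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import javax.management.Notification;
import javax.management.NotificationListener;
import javax.management.remote.JMXConnectionNotification;

// Hypothetical sketch class; not part of the JDK or of the proposed fix.
public class FailedNotificationSketch {

    // Builds a client environment map that overrides the notification
    // fetch timeout (property name taken from the message above).
    // Assumption: the value is given in milliseconds.
    static Map<String, Object> clientEnv(long fetchTimeoutMillis) {
        Map<String, Object> env = new HashMap<>();
        env.put("jmx.remote.x.notification.fetch.timeout", fetchTimeoutMillis);
        return env;
    }

    // Returns a listener that sets the flag when the connector reports
    // that the connection has failed; other notification types are ignored.
    static NotificationListener failureListener(AtomicBoolean failed) {
        return (Notification n, Object handback) -> {
            if (JMXConnectionNotification.FAILED.equals(n.getType())) {
                failed.set(true); // the application would start recovery here
            }
        };
    }

    public static void main(String[] args) {
        AtomicBoolean failed = new AtomicBoolean(false);
        NotificationListener listener = failureListener(failed);

        // Simulate the notification a connector would post on failure.
        Notification simulated = new JMXConnectionNotification(
                JMXConnectionNotification.FAILED,
                "sketch-source",              // hypothetical source object
                "hypothetical-connection-id", // hypothetical connection id
                1L,
                "simulated connection failure",
                null);
        listener.handleNotification(simulated, null);

        System.out.println("env = " + clientEnv(60_000L));
        System.out.println("failure observed = " + failed.get());
    }
}
```

In a real client the map would be passed to JMXConnectorFactory.connect(url, env) and the listener registered with JMXConnector.addConnectionNotificationListener(listener, null, null). The scenario described above is the one in which that listener is never called, because the NotifFetcher thread dies without posting the FAILED notification.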
