On 01/09/2013 09:40 AM, shanliang wrote: > I still have no idea why the test failed, but I do not see why a longer > timeout can fix the test. Have you reproduced the problem and tested > your fix? if yes then possible the long timeout hided a real problem.
Yes, I can reproduce the problem (using the fastbuild bits and -Xcomp switch) and verify that the fix makes the test pass. The ClientNotifForwarder scans the notifications for MBeanServerNotification.UNREGISTRATION_NOTIFICATION and removes the appropriate notification listeners in a separate thread. Thus, calling "removeNotificationListener" on the main thread is prone to racing. > > The timeout you made longer was used to wait a notification which should > never arrive. Well, it can be used to allow more time to process the "unregister" notification, too. When I think more of this I am more inclined to fix the race condition. An updated webrev will follow. > > To remove a listener from a client side, we did: > 1) at client side, check whether it was added in the client side > 2) at server side, check whether the MBean in question was registered in > the MBeanServer (!!!) > 3) at server side, check whether the listener was added. > > So 2) tells that we did not rely on a "unregister" notification. Anyway, > if you use a SAME thread to call "unregister" operation to unregister an > mbean, then any following call (without any time break) to use the mbean > should fail, like "removeNotificationListener", "isRegistered" etc. > > I do see a bug here, if we remove a listener from a non-existing MBeam, > we get "ListenerNotFoundException" at a client side, but get > "InstanceNotFoundException" at server side, I think we should create a > bug, because both implemented the same interface MBeanServerConnection. Yes, it is rather inconsistent. -JB- > > Shanliang > > Jaroslav Bachorik wrote: >> Looking for review and a sponsor. >> >> Webrev at http://cr.openjdk.java.net/~jbachorik/7170447/webrev.00 >> >> In this issue the timing is the problem. MBeanServer.unregisterMBean() >> fires the "unregister" notification which is sent to the server >> asynchronously. Thus it may happen that the "unregister" notification >> has not been yet processed at the time of invoking >> removeNotificationListener() and the notification listeners hasn't been >> cleaned up leading to the test failure. >> >> There is no synchronization between the client and the server and such >> race condition can occur occasionally. Normally, the execution is fast >> enough to behave like the "unregister" notification is processed >> synchronously with the unregisterMBean() operation but it seems that >> using fastdebug Server VM bits with the -Xcomp option strains the CPU >> enough to make this problem appear. >> >> There is no proper fix for this - the only thing that work is waiting a >> bit longer in the main thread to give the notification processing thread >> some time to clean up the listeners. >> >> Regards, >> >> -JB- >> >
