Jaroslav Bachorik wrote:
On 01/09/2013 03:25 PM, shanliang wrote:
Jaroslav Bachorik wrote:
On 01/09/2013 02:44 PM, shanliang wrote:
Let's forget the JMX implementation at first. If an MBean is
unregistered, a user at client side calls "removeNotificationListener"
on the MBean, what should happen? if the user calls "isRegistered" on
the MBean, what should happen?

I have done 2 tests, I used only one thread:

1)
......
localServer.unregisterMBean(myMBean);
boolean isRegistered = remoteClientServer.isRegistered(myMBean));

I got isRegistered = false;

2)
......
localServer.unregisterMBean(myMBean);
System.out.println("isRegistered =
"+remoteClientServer.sRegistered(myMBean));
remoteClientServer.removeNotificationListener(myMBean, listener);

I did not get an exception.

The 1) told that the client could know the MBean was unregistered, then
the client should throw an exception for the call of
"removeNotificationListener" in 2).
Yes, but then it would not test the listener leakage as it was supposed
to test but rather the fact that the client throws the appropriate
exception. The fact that the mbean was unregistered does not necessarily
mean that the listeners were released. The main problem remains - the
listeners are being cleaned-up asynchronously and the clean-up process
might race against the other uses of the JMX API.
client.removeNotificationListener is not a right way here to test
listener leak, we could use some other ways, for example we keep the
listener in a weak reference, then after the mbean is removed, the weak
reference should be empty after some time. Another way is like
DeadListenerTest does to check whether clean has done at server side:
use reflection to get the "listenerMap" at server side and make sure it
is empty, but this need to add a private method to the class
ClientNotifForwarder.

There will still be problems with timing. You need either to wait for
the GC to kick in to clean up the weak ref. And the listenerMap will not
be purged of the unregistered MBean listeners until the notification is
generated, processed on the ClientNotificationForwarder and forwarded to
the server. So there goes the timing issue again.

The problem is that the "unregisterMBean" operation does not guarantee
that the listeners have been unregistered at the time it returns. So,
one way or the other we will need to wait an arbitrary amount of time
before checking for the memory leak.
Yes we need to wait, but you can use a cycle like:
       long maxWaitingTime = 3000;
       long startTime = System.currentTimeMillis();
       while ( weakReference.get != null
&& System.currentTimeMillis() < startTime + maxWaitingTime) {
           System.gc();
           Thread.sleep(100);
           System.gc();
       }

      if (weakReference.get != null) {
         // failed
      }

Shanliang
-JB-

I think we have 3 things to do here:
1) modify the test to not use removeNotificationListener for testing
listener leak
2) create a new bug about a client does not throw an exception after an
mbean is unregistered
3) create a bug about a client does not throw a same exception as at
server side.

I will do 2) and 3), if you like you can continue 1), it might need to
do fix also in the JMX implementation.

Shanliang
The test "DeadListenerTest" got passed in some machines because of the
timeout for waiting a notification. I think its failure just tells a new
bug.

To set a longer timeout just hides the real bug, and the test might fail
again one day if running condition is changed and you might need longer
timeout again.
Yes, I agree with you that extending the timeout just lessens the
likelihood of the race condition and does not prevent it.

Shanliang

Jaroslav Bachorik wrote:
On 01/09/2013 11:08 AM, shanliang wrote:
Jaroslav Bachorik wrote:
On 01/09/2013 09:40 AM, shanliang wrote:
I still have no idea why the test failed, but I do not see why a
longer
timeout can fix the test. Have you reproduced the problem and tested
your fix? if yes then possible the long timeout hided a real
problem.
Yes, I can reproduce the problem (using the fastbuild bits and -Xcomp
switch) and verify that the fix makes the test pass.

The ClientNotifForwarder scans the notifications for
MBeanServerNotification.UNREGISTRATION_NOTIFICATION and removes the
appropriate notification listeners in a separate thread. Thus,
calling
"removeNotificationListener" on the main thread is prone to racing.
It is true that ClientNotifForwarder scans the notifications for
MBeanServerNotification.UNREGISTRATION_NOTIFICATION and removes the
appropriate notification listeners in a separate thread. This is for a
client connection to do clean if a user never calls
removeNotificationListener.

But calling directly removeNotificationListener from a client should
still get exception if the clean has not been done. As I said, if the
client checked and found the listener was still there, then the client
sent a request to its server to remove the listener at server side,
the
server should find that the MBean in question was not registered,
so the
server should throw an exception. The bug might be here.
This won't work. The server side listeners are removed upon receiving
the "unregistered" notification which is delivered from the
ClientNotificationForwarder and it may have not run yet (since it runs
in a separate executor thread). The result is that the attempt to
remove
the notification listener on the server will succeed as well failing
the
test subsequently.

-JB-

Shanliang
The timeout you made longer was used to wait a notification which
should
never arrive.
Well, it can be used to allow more time to process the "unregister"
notification, too.

When I think more of this I am more inclined to fix the race
condition.
An updated webrev will follow.

To remove a listener from a client side, we did:
1) at client side, check whether it was added in the client side
2) at server side, check whether the MBean in question was
registered in
the MBeanServer (!!!)
3) at server side, check whether the listener was added.

So 2) tells that we did not rely on a "unregister" notification.
Anyway,
if you use a SAME thread to call "unregister" operation to
unregister an
mbean, then any following call (without any time break) to use the
mbean
should fail, like "removeNotificationListener", "isRegistered" etc.

I do see a bug here, if we remove a listener from a non-existing
MBeam,
we get "ListenerNotFoundException" at a client side, but get
"InstanceNotFoundException" at server side, I think we should
create a
bug, because both implemented the same interface
MBeanServerConnection.
Yes, it is rather inconsistent.

-JB-

Shanliang

Jaroslav Bachorik wrote:
Looking for review and a sponsor.

Webrev at http://cr.openjdk.java.net/~jbachorik/7170447/webrev.00

In this issue the timing is the problem.
MBeanServer.unregisterMBean()
fires the "unregister" notification which is sent to the server
asynchronously. Thus it may happen that the "unregister"
notification
has not been yet processed at the time of invoking
removeNotificationListener() and the notification listeners hasn't
been
cleaned up leading to the test failure.

There is no synchronization between the client and the server and
such
race condition can occur occasionally. Normally, the execution is
fast
enough to behave like the "unregister" notification is processed
synchronously with the unregisterMBean() operation but it seems
that
using fastdebug Server VM bits with the -Xcomp option strains
the CPU
enough to make this problem appear.

There is no proper fix for this - the only thing that work is
waiting a
bit longer in the main thread to give the notification processing
thread
some time to clean up the listeners.

Regards,

-JB-


Reply via email to