On 8/9/2011 10:29 AM, Christopher Dolan wrote:
> Has anyone else ever seen this exception?
>
> select loop throws
> java.lang.NullPointerException
>     at com.sun.jini.jeri.internal.runtime.SelectionManager.waitForReadyKey(419)
>     at com.sun.jini.jeri.internal.runtime.SelectionManager.access$600(80)
>     at com.sun.jini.jeri.internal.runtime.SelectionManager$SelectLoop.run(287)
>     at com.sun.jini.thread.ThreadPool$Worker.run(150)
>     at java.lang.Thread.run(619)
>
> We recently switched from blocking Jeri to async NIO to save some RAM via fewer
> threads. We had only one known occurrence of this problem so I have no
> debugging information, but when it happened it kept happening for hours and the
> service was effectively dead to the outside world.
>
> I studied the SelectionManager code pretty closely, and I can't see how a null
> can be attached to any selection key that matters. The only two places where
> keys are registered are in processRenewQueue(), where the attachment is
> guaranteed non-null, and in the SelectionManager constructor, where the wakeup
> key indeed has a null attachment. But the wakeup key is filtered out in
> waitForReadyKey line 391, so that can't be the cause of the NPE.

Looking at this code too, Chris, I don't see anything immediately obvious. But I
do wonder about the use of 'lock' for synchronization, given this comment in the
Selector Javadoc:
"* A selector's key and selected-key sets are not, in general, safe for use
* by multiple concurrent threads. If such a thread might modify one of these
* sets directly then access should be controlled by synchronizing on the set
* itself."
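
For reference, the pattern that Javadoc note describes looks roughly like the
sketch below. This is not the SelectionManager code: SelectedKeyDrain, drain()
and demo() are made-up names, and the Pipe is only there so that one key becomes
ready. The part that matters is that the selected-key set is modified while
holding the set's own monitor, and that a ready key with a null attachment is
reported instead of being dereferenced.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration; none of these names exist in River/Jini. */
public final class SelectedKeyDrain {
    private SelectedKeyDrain() {}

    /** Empties the selected-key set and returns the non-null attachments. */
    public static List<Object> drain(Selector selector) {
        List<Object> ready = new ArrayList<Object>();
        Set<SelectionKey> selected = selector.selectedKeys();
        // Per the Selector Javadoc: synchronize on the set itself when
        // another thread might modify it directly.
        synchronized (selected) {
            for (Iterator<SelectionKey> it = selected.iterator(); it.hasNext();) {
                SelectionKey key = it.next();
                it.remove();
                Object attachment = key.attachment();
                if (attachment == null) {
                    // Report and skip rather than fail later with an NPE.
                    System.err.println("ready key with null attachment: " + key);
                    continue;
                }
                ready.add(attachment);
            }
        }
        return ready;
    }

    /** Registers one pipe end with an attachment, makes it readable, drains. */
    public static List<Object> demo() {
        try {
            Pipe pipe = Pipe.open();
            Selector selector = Selector.open();
            try {
                pipe.source().configureBlocking(false);
                pipe.source().register(selector, SelectionKey.OP_READ, "handler");
                pipe.sink().write(ByteBuffer.wrap(new byte[] {1}));
                selector.select(1000);
                return drain(selector);
            } finally {
                selector.close();
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Whether the 'lock' object in SelectionManager already gives equivalent
protection depends on which threads touch the set, which is the open question
here.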
I see that "attachment" is volatile, so it can be modified from any thread and
the change should be visible to the others. But without more staring at the code
to work out which thread runs what, it will be difficult to tell whether a
"rogue" thread is involved. Unfortunately, there isn't a simple way to plug in a
SelectionKey implementation that would bark about SelectionKey.attach() being
called with a null value. You might, however, try a bootclasspath change to
substitute a custom version of the class, as a check to see when the attachment
is unexpectedly set to null. I've done this on other occasions: I write
new Throwable().printStackTrace( Writer ) into a string buffer, add the resulting
String to a set, and print it only when the number of elements in the set
changes. That keeps the output from being too spammy, but it still shows you
each place where attach() is being called with null.
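
A minimal sketch of that de-duplicating trace trick, in case it helps. The class
and method names (NullAttachTracer, check, distinctTraces) are made up for
illustration; the idea is that a patched attach() would call check() with its
argument, and each distinct call path gets printed only once.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: prints each distinct attach(null) call path once. */
public final class NullAttachTracer {
    // Stack traces already reported, stored as their printed text.
    private static final Set<String> SEEN =
        Collections.synchronizedSet(new HashSet<String>());

    private NullAttachTracer() {}

    /**
     * Intended to be called from a patched attach() with its argument.
     * Returns true only if this call printed a trace not seen before.
     */
    public static boolean check(Object attachment) {
        if (attachment != null) {
            return false;
        }
        StringWriter buffer = new StringWriter();
        PrintWriter writer = new PrintWriter(buffer);
        new Throwable("attach(null)").printStackTrace(writer);
        writer.flush();
        String trace = buffer.toString();
        // add() returns true only when the set grows, i.e. a new call path.
        if (SEEN.add(trace)) {
            System.err.print(trace);
            return true;
        }
        return false;
    }

    /** Number of distinct call paths recorded so far. */
    public static int distinctTraces() {
        return SEEN.size();
    }
}
```

Repeated calls from the same code path produce an identical trace string, so
they are swallowed after the first report.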
Gregg

> There were many other exceptions in the process logs, but I thought I'd start
> with this one because the others looked more normal (like failed unicast calls
> via LookupLocators).
>
> Any ideas?
> Thanks,
> Chris