Keith Turner created THRIFT-4847:
------------------------------------
Summary: CancelledKeyException causes TThreadedSelectorServer to
fail.
Key: THRIFT-4847
URL: https://issues.apache.org/jira/browse/THRIFT-4847
Project: Thrift
Issue Type: Bug
Components: Java - Library
Affects Versions: 0.12.0
Reporter: Keith Turner
When attempting to use TThreadedSelectorServer I see the following exception
and then the server becomes inoperable.
{noformat}
2019-04-03 11:50:37,638 [server.TThreadedSelectorServer] ERROR: run() on SelectorThread exiting due to uncaught error
java.nio.channels.CancelledKeyException
at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:82)
at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.changeSelectInterests(AbstractNonblockingServer.java:440)
at org.apache.thrift.server.AbstractNonblockingServer$AbstractSelectThread.processInterestChanges(AbstractNonblockingServer.java:191)
at org.apache.thrift.server.TThreadedSelectorServer$SelectorThread.run(TThreadedSelectorServer.java:548)
{noformat}
I tracked this down and I think it is caused by the following sequence of events:
# A frame buffer is created and given a selection key
[TThreadedSelectorServer.java line
691|https://github.com/apache/thrift/blob/v0.12.0/lib/java/src/org/apache/thrift/server/TThreadedSelectorServer.java#L691]
# The selector rebuild code introduced in THRIFT-4251 is triggered, and all of
the old selector's keys are cancelled when that selector is closed
[TThreadedSelectorServer.java line
668|https://github.com/apache/thrift/blob/v0.12.0/lib/java/src/org/apache/thrift/server/TThreadedSelectorServer.java#L668]
# The frame buffer attempts to modify its now-invalid selection key, causing
the exception [AbstractNonblockingServer.java line
440|https://github.com/apache/thrift/blob/v0.12.0/lib/java/src/org/apache/thrift/server/AbstractNonblockingServer.java#L440]
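The sequence above can be reproduced outside Thrift with plain java.nio. The following is only a minimal sketch, not Thrift code: the class and method names ({{CancelledKeyDemo}}, {{throwsAfterSelectorClose}}) are made up for illustration, and an unbound {{ServerSocketChannel}} stands in for a client connection. It relies on the documented behaviour that closing a {{Selector}} invalidates the keys still registered with it.

```java
import java.io.IOException;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Hypothetical demo class; it does not use any Thrift classes.
class CancelledKeyDemo {

    /**
     * Registers a channel with a selector, closes the selector (as a
     * selector rebuild would do with the old selector), then tries to
     * change the interest ops through the stale key.
     * Returns true if that attempt throws CancelledKeyException.
     */
    static boolean throwsAfterSelectorClose() throws IOException {
        try (ServerSocketChannel channel = ServerSocketChannel.open()) {
            channel.configureBlocking(false);

            Selector oldSelector = Selector.open();
            // Step 1: something holds on to a key issued by the old selector.
            SelectionKey staleKey = channel.register(oldSelector, SelectionKey.OP_ACCEPT);

            // Step 2: closing the selector invalidates all of its keys.
            oldSelector.close();

            // Step 3: modifying the stale key fails.
            try {
                staleKey.interestOps(SelectionKey.OP_ACCEPT);
                return false;
            } catch (CancelledKeyException e) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("CancelledKeyException thrown: " + throwsAfterSelectorClose());
    }
}
```

In the server, nothing catches this exception on the selector thread, which matches the "exiting due to uncaught error" message in the log above.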
I added some logging and found that {{selector.select()}} would return 0
hundreds of times, but not indefinitely. I changed
SELECTOR_AUTO_REBUILD_THRESHOLD from 512 to 1,000,000 and the bug no longer
occurred. I don't think this change is the fix; it's just what I did while
debugging. I'm not sure what the best fix is.
The situation that triggers this seems to be lots of connections in a very
short time period.
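For illustration only, and not as a proposed patch: one defensive pattern in plain java.nio is to check {{SelectionKey.isValid()}} before changing interest ops and to treat a {{CancelledKeyException}} as "key is gone" rather than letting it escape. The helper name {{trySetInterestOps}} and the class below are hypothetical. Note that this only avoids the crash; a frame buffer holding a stale key would presumably still need to be re-registered with the new selector or closed.

```java
import java.io.IOException;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Hypothetical helper class; not part of Thrift.
class SafeInterestOps {

    /**
     * Tries to set the interest ops on a key.
     * Returns false instead of throwing if the key is no longer valid.
     */
    static boolean trySetInterestOps(SelectionKey key, int ops) {
        if (!key.isValid()) {
            return false;
        }
        try {
            key.interestOps(ops);
            return true;
        } catch (CancelledKeyException e) {
            // The key was cancelled between the isValid() check and the call.
            return false;
        }
    }

    /** Returns { result while the selector is open, result after it is closed }. */
    static boolean[] demo() throws IOException {
        try (ServerSocketChannel channel = ServerSocketChannel.open()) {
            channel.configureBlocking(false);
            Selector selector = Selector.open();
            SelectionKey key = channel.register(selector, SelectionKey.OP_ACCEPT);

            boolean whileOpen = trySetInterestOps(key, SelectionKey.OP_ACCEPT);
            selector.close(); // invalidates the key
            boolean afterClose = trySetInterestOps(key, SelectionKey.OP_ACCEPT);

            return new boolean[] { whileOpen, afterClose };
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] results = demo();
        System.out.println("while open: " + results[0] + ", after close: " + results[1]);
    }
}
```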