[
https://issues.apache.org/jira/browse/ZOOKEEPER-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840707#comment-15840707
]
ASF GitHub Bot commented on ZOOKEEPER-2044:
-------------------------------------------
Github user hanm commented on a diff in the pull request:
https://github.com/apache/zookeeper/pull/156#discussion_r98120457
--- Diff: src/java/test/org/apache/zookeeper/server/NIOServerCnxnTest.java ---
@@ -68,4 +69,38 @@ public void testOperationsAfterCnxnClose() throws IOException,
zk.close();
}
}
+
+ /**
+ * Mock extension of NIOServerCnxn to test for
+ * CancelledKeyException (ZOOKEEPER-2044).
+ */
+ private static class MockNIOServerCnxn extends NIOServerCnxn {
+ public MockNIOServerCnxn(NIOServerCnxn cnxn)
+ throws IOException {
+ super(cnxn.zkServer, cnxn.sock, cnxn.sk, cnxn.factory);
+ }
+
+ public void mockSendBuffer(ByteBuffer bb) throws Exception {
+ super.internalSendBuffer(bb);
+ }
+ }
+
+ @Test(timeout = 30000)
+ public void testValidSelectionKey() throws Exception {
+ final ZooKeeper zk = createClient();
+ try {
+ Iterable<ServerCnxn> connections = serverFactory.getConnections();
+ for (ServerCnxn serverCnxn : connections) {
+ MockNIOServerCnxn mock = new MockNIOServerCnxn((NIOServerCnxn) serverCnxn);
+ // Cancel key
+ ((NIOServerCnxn) serverCnxn).sock.keyFor(((NIOServerCnxnFactory) serverFactory).selector).cancel();;
+ mock.mockSendBuffer(ByteBuffer.allocate(8));
+ }
+ } catch (CancelledKeyException e) {
+ LOG.error("Exception while sending bytes!", e);
+ Assert.fail(e.toString());
+ } finally {
+ zk.close();
--- End diff --
@rakeshadr Good observation on the long running time of the test. This is
definitely something we should fix. The delay actually happens at client
close, and the root cause is session timeout: when a client closes itself, it
sends a request to the server, and in our case this request packet gets stuck
forever because the server has canceled the selection key. The client session
therefore expires eventually. By default, the timeout value between client and
server is set to 30 sec, and 2/3 of it, which is 20 sec, is exactly how long it
takes for a heartbeat to fail. I fixed this by adjusting the timeout value to
3 sec, just for this single test. PTAL.
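To make the timeout arithmetic above concrete, here is a minimal sketch. It
assumes the heartbeat (read) timeout is simply 2/3 of the session timeout, as
described in this comment; the class and method names are hypothetical and this
is not code from the patch.

```java
public class SessionTimeoutSketch {

    /**
     * Hypothetical helper: time (in ms) after which a missed heartbeat is
     * treated as a failure, assuming the 2/3 rule described above.
     */
    static int heartbeatFailureMs(int sessionTimeoutMs) {
        return sessionTimeoutMs * 2 / 3;
    }

    public static void main(String[] args) {
        // Default 30 sec session timeout: the test stalls for about 20 sec on close.
        System.out.println(heartbeatFailureMs(30000)); // prints 20000
        // Reduced 3 sec session timeout: the stall shrinks to about 2 sec.
        System.out.println(heartbeatFailureMs(3000));  // prints 2000
    }
}
```

With the 3 sec value, the wait at client close should drop from roughly 20 sec
to roughly 2 sec, which is why the change is limited to this one test.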
> CancelledKeyException in zookeeper 3.4.5
> ----------------------------------------
>
> Key: ZOOKEEPER-2044
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2044
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.6
> Environment: Red Hat Enterprise Linux Server release 6.2
> Reporter: shamjith antholi
> Assignee: Flavio Junqueira
> Priority: Minor
> Fix For: 3.4.10
>
> Attachments: ZOOKEEPER-2044.patch, ZOOKEEPER-2044.patch
>
>
> I am getting a CancelledKeyException in ZooKeeper (version 3.4.5). Please see
> the log below. When this error is thrown, the connected Solr shard goes
> down with the error "Failed to index metadata in
> Solr,StackTrace=SolrError: HTTP status 503.Reason:
> {"responseHeader":{"status":503,"QTime":204},"error":{"msg":"ClusterState
> says we are the leader, but locally we don't think so","code":503" and
> ultimately the current activity goes down. Could you please suggest a
> solution for this?
> ZooKeeper log
> ----------------------------------------------------------
> 2014-09-16 02:58:47,799 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@832] - Client attempting to renew session 0x24868e7ca980003 at /172.22.0.5:58587
> 2014-09-16 02:58:47,800 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:Learner@107] - Revalidating client: 0x24868e7ca980003
> 2014-09-16 02:58:47,802 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@588] - Invalid session 0x24868e7ca980003 for client /172.22.0.5:58587, probably expired
> 2014-09-16 02:58:47,803 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /172.22.0.5:58587 which had sessionid 0x24868e7ca980003
> 2014-09-16 02:58:47,810 [myid:1] - ERROR [CommitProcessor:1:NIOServerCnxn@180] - Unexpected Exception:
> java.nio.channels.CancelledKeyException
> at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
> at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
> at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:153)
> at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1076)
> at org.apache.zookeeper.server.NIOServerCnxn.process(NIOServerCnxn.java:1113)
> at org.apache.zookeeper.server.DataTree.setWatches(DataTree.java:1327)
> at org.apache.zookeeper.server.ZKDatabase.setWatches(ZKDatabase.java:384)
> at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:304)
> at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:74)
>
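The stack trace above shows the exception coming from
SelectionKeyImpl.interestOps on a key that had already been cancelled. The
following is a minimal sketch in plain java.nio, independent of ZooKeeper, that
reproduces this behaviour and shows one possible guard. The class and method
names are hypothetical, and this is not the actual ZOOKEEPER-2044 patch.

```java
import java.io.IOException;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

public class CancelledKeySketch {

    /** Returns true if an unguarded interestOps call throws CancelledKeyException. */
    static boolean unguardedThrows(SelectionKey key, int ops) {
        try {
            key.interestOps(ops);
            return false;
        } catch (CancelledKeyException e) {
            return true;
        }
    }

    /** Hypothetical guard: returns true only if the interest set was updated. */
    static boolean guardedInterestOps(SelectionKey key, int ops) {
        if (!key.isValid()) {
            return false; // key already cancelled; skip the update
        }
        try {
            key.interestOps(ops);
            return true;
        } catch (CancelledKeyException e) {
            return false; // key was cancelled between the check and the call
        }
    }

    /** Registers a channel, cancels its key, then tries both variants. */
    static boolean[] demo() throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel channel = ServerSocketChannel.open()) {
            channel.configureBlocking(false);
            SelectionKey key = channel.register(selector, SelectionKey.OP_ACCEPT);
            key.cancel(); // same kind of step as "Cancel key" in the test
            boolean threw = unguardedThrows(key, SelectionKey.OP_ACCEPT);
            boolean updated = guardedInterestOps(key, SelectionKey.OP_ACCEPT);
            return new boolean[] {threw, updated};
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] result = demo();
        System.out.println("unguarded call threw: " + result[0]);
        System.out.println("guarded call updated interest ops: " + result[1]);
    }
}
```

The isValid() check alone is not sufficient when another thread can cancel the
key concurrently, which is why the sketch also catches the exception.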