Paul Millar created ZOOKEEPER-2813:
--------------------------------------

             Summary: Failure tight loop in acceptor
                 Key: ZOOKEEPER-2813
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2813
             Project: ZooKeeper
          Issue Type: Bug
          Components: server
    Affects Versions: 3.4.8
            Reporter: Paul Millar
            Priority: Minor
             Fix For: 3.5.0


A failure while accepting an incoming connection leaves the acceptor 
thread stuck in a tight loop.  For example:

{noformat}
13 Jun 2017 15:31:39 (zookeeper) [] Ignoring unexpected runtime exception
java.lang.NullPointerException: null
        at org.apache.zookeeper.server.ZooKeeperServer.processConnectRequest(ZooKeeperServer.java:864) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at org.apache.zookeeper.server.NIOServerCnxn.readConnectRequest(NIOServerCnxn.java:418) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at org.apache.zookeeper.server.NIOServerCnxn.readPayload(NIOServerCnxn.java:198) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:244) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]
13 Jun 2017 15:31:39 (zookeeper) [] Ignoring unexpected runtime exception
java.lang.NullPointerException: null
        at org.apache.zookeeper.server.ZooKeeperServer.createSession(ZooKeeperServer.java:569) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at org.apache.zookeeper.server.ZooKeeperServer.processConnectRequest(ZooKeeperServer.java:902) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at org.apache.zookeeper.server.NIOServerCnxn.readConnectRequest(NIOServerCnxn.java:418) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at org.apache.zookeeper.server.NIOServerCnxn.readPayload(NIOServerCnxn.java:198) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:244) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]
13 Jun 2017 15:31:40 (zookeeper) [] Ignoring unexpected runtime exception
java.lang.NullPointerException: null
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:185) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]
13 Jun 2017 15:31:40 (zookeeper) [] Ignoring unexpected runtime exception
java.lang.NullPointerException: null
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:185) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]
13 Jun 2017 15:31:40 (zookeeper) [] Ignoring unexpected runtime exception
java.lang.NullPointerException: null
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:185) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]
{noformat}

The first stack-trace is due to ZOOKEEPER-2810; the second is due to 
ZOOKEEPER-2812. 

The remaining stack-traces (NPE from NIOServerCnxnFactory.java:185) repeat 
without end, because the service is caught in a tight loop.

The reason is that the NIOServerCnxnFactory class fails to guarantee that the 
`selected` variable is cleared, so the SelectionKey that triggered the bugs 
remains "live".  Since there are no further incoming connections, the 
subsequent call to `accept()` returns null, triggering the NPE.
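
To illustrate the last step: a non-blocking ServerSocketChannel returns null from `accept()` when no connection is pending, so any code that dereferences the result without a check will throw an NPE. The sketch below is a standalone demonstration, not ZooKeeper code; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo class; not part of ZooKeeper.
class AcceptWithoutPendingConnection {

    /** Returns true if accept() yields null on an idle, non-blocking listener. */
    static boolean acceptReturnsNull() {
        try (ServerSocketChannel listener = ServerSocketChannel.open()) {
            // Bind to an ephemeral port on loopback; nobody connects to it.
            listener.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            listener.configureBlocking(false);

            // In non-blocking mode accept() returns immediately.
            SocketChannel sc = listener.accept();
            if (sc != null) {
                sc.close();
                return false;
            }
            // Calling sc.socket() here would throw a NullPointerException.
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```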

It appears this problem is fixed in 3.5.0 (by commit 6302d7a7).  If 
back-porting this patch is too invasive, another solution might be to place the 
`selected.clear()` statement inside the finally-clause of the try-statement.
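
A minimal sketch of that alternative, assuming a simplified model of the selector loop: the `SelectedKeyDrain` class, its `drain` method and the generic key type are all hypothetical stand-ins, not the actual NIOServerCnxnFactory code. It only shows that clearing in a finally-clause empties the set even when a handler throws.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical model of one pass over the selected keys; not ZooKeeper code.
class SelectedKeyDrain {

    /**
     * Handles every key in 'selected', then clears the set.
     * Returns true if all handlers completed, false if one threw.
     */
    static <K> boolean drain(Set<K> selected, Consumer<K> handler) {
        // Work on a snapshot so the set itself is not touched while iterating.
        List<K> snapshot = new ArrayList<>(selected);
        try {
            for (K key : snapshot) {
                handler.accept(key);
            }
            return true;
        } catch (RuntimeException e) {
            // Mirrors the "Ignoring unexpected runtime exception" log line.
            System.err.println("Ignoring unexpected runtime exception: " + e);
            return false;
        } finally {
            // Runs on both the normal and the exceptional path, so no stale
            // key survives into the next pass.
            selected.clear();
        }
    }
}
```

With a handler that throws for one key, `drain` returns false and the set is still empty afterwards, which is the property the suggested change is meant to provide.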



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
