Paul Millar created ZOOKEEPER-2810:
--------------------------------------

             Summary: Racy implicit ZKDatabase creation
                 Key: ZOOKEEPER-2810
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2810
             Project: ZooKeeper
          Issue Type: Bug
          Components: server
    Affects Versions: 3.4.8
            Reporter: Paul Millar
            Priority: Minor


The NIOServerCnxnFactory#startup method first starts the acceptor thread and 
only then initialises the ZooKeeperServer instance.  In particular, the call to 
ZooKeeperServer#startdata, which creates the ZKDatabase if it does not already 
exist, happens after the acceptor thread is already running.

This creates a race condition: if the acceptor thread accepts an incoming 
connection before the ZKDatabase has been created, processing the connect 
request throws a NullPointerException:

{noformat}
java.lang.NullPointerException: null
        at org.apache.zookeeper.server.ZooKeeperServer.processConnectRequest(ZooKeeperServer.java:864) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at org.apache.zookeeper.server.NIOServerCnxn.readConnectRequest(NIOServerCnxn.java:418) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at org.apache.zookeeper.server.NIOServerCnxn.readPayload(NIOServerCnxn.java:198) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:244) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203) ~[zookeeper-3.4.8.jar:3.4.8--1]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]
{noformat}

The same problem appears to be present in the release-3.5 and master branches.

The naive fix would be to start the acceptor thread last in 
NIOServerCnxnFactory#startup, but I can't say whether this would cause any 
other problems.
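
To illustrate the ordering issue, here is a minimal, self-contained sketch. It 
is not the ZooKeeper code: ToyServer, its db field, and the two helper methods 
are hypothetical stand-ins, and only the names startdata and 
processConnectRequest are borrowed from the real classes. The first helper 
reproduces the failure deterministically by handling a connect request before 
the database exists; the second shows the reordered startup, relying on the 
Java guarantee that actions before Thread.start() are visible to the started 
thread.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class StartupOrderSketch {

    /** Hypothetical stand-in for ZooKeeperServer; not the real class. */
    static class ToyServer {
        // Stands in for the ZKDatabase reference; null until startdata() runs.
        private volatile Object db;

        void startdata() {
            if (db == null) {
                db = new Object();
            }
        }

        void processConnectRequest() {
            // Dereferences db, so this throws NullPointerException
            // when called before startdata().
            db.hashCode();
        }
    }

    /** Models the unlucky interleaving: a request is handled before startdata(). */
    static boolean connectBeforeStartdata() {
        ToyServer server = new ToyServer();
        try {
            server.processConnectRequest();
            return true;
        } catch (NullPointerException e) {
            return false;
        } finally {
            server.startdata(); // initialisation arrives too late for that request
        }
    }

    /** Models the reordered startup: initialise first, start the acceptor last. */
    static boolean acceptorStartedLast() {
        ToyServer server = new ToyServer();
        AtomicBoolean handled = new AtomicBoolean(false);
        Thread acceptor = new Thread(() -> {
            try {
                server.processConnectRequest();
                handled.set(true);
            } catch (NullPointerException e) {
                handled.set(false);
            }
        });

        server.startdata(); // the database exists before any connection is accepted
        acceptor.start();   // start() happens-before every action in the new thread
        try {
            acceptor.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return handled.get();
    }
}
```

In this toy model connectBeforeStartdata() reports the failed request and 
acceptorStartedLast() reports a handled one. Whether the same reordering is 
safe in the real NIOServerCnxnFactory#startup is exactly the open question 
above.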



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
