symat opened a new pull request #1681: URL: https://github.com/apache/zookeeper/pull/1681
When a ZooKeeper server realizes that another quorum peer was shut down (e.g. during a rolling upgrade or rolling restart), the ServerCnxn.zkServer variable is set to null by QuorumPeer.close(). This is why the code usually checks the zkServer variable before using it. This check was missing in one place, which caused an NPE in NettyServerCnxn.receiveMessage:

```
2021-02-08T12:42:08.229+0000 [myid:] - ERROR [nioEventLoopGroup-4-1:NettyServerCnxnFactory$CnxnChannelHandler@329]- Unexpected exception in receive
java.lang.NullPointerException: null ~[zookeeper-3.6.2.jar:3.6.2]
	at org.apache.zookeeper.server.NettyServerCnxn.receiveMessage(NettyServerCnxn.java:518)
	at org.apache.zookeeper.server.NettyServerCnxn.processMessage(NettyServerCnxn.java:368)
	at org.apache.zookeeper.server.NettyServerCnxnFactory$CnxnChannelHandler.channelRead(NettyServerCnxnFactory.java:326)
	...
```

This commit adds the missing check. When the remote ZooKeeper server is already down, an IOException is thrown and processing of the received message is effectively skipped.
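
For illustration only, the guard described above could look roughly like the sketch below. It is not the actual patch: `NullGuardSketch`, `Server`, `Cnxn`, and `handle` are hypothetical stand-ins, not ZooKeeper classes, and the exception message is made up. The sketch assumes the server reference can be cleared by another thread, so it reads the field once into a local before checking and using it.

```java
import java.io.IOException;

final class NullGuardSketch {

    /** Hypothetical stand-in for the server object (not the real ZooKeeperServer). */
    static class Server {
        int processed = 0;

        void processPacket(byte[] payload) {
            processed++;
        }
    }

    /** Hypothetical stand-in for a connection whose server reference may be cleared. */
    static class Cnxn {
        // May be set to null by another thread while the peer is shutting down.
        private volatile Server zkServer;

        Cnxn(Server zkServer) {
            this.zkServer = zkServer;
        }

        /** Simulates the shutdown path that clears the server reference. */
        void close() {
            zkServer = null;
        }

        void receiveMessage(byte[] payload) throws IOException {
            // Read the field once so the null check and the use see the same value.
            Server server = this.zkServer;
            if (server == null) {
                // Fail with a checked exception instead of a NullPointerException.
                throw new IOException("server is shut down");
            }
            server.processPacket(payload);
        }
    }

    /** Caller that drops the message when the server is gone; returns true if processed. */
    static boolean handle(Cnxn cnxn, byte[] payload) {
        try {
            cnxn.receiveMessage(payload);
            return true;
        } catch (IOException e) {
            System.out.println("Ignoring message: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        Cnxn cnxn = new Cnxn(new Server());
        System.out.println(handle(cnxn, new byte[] {1}));
        cnxn.close();
        System.out.println(handle(cnxn, new byte[] {2}));
    }
}
```

Copying the field into a local variable matters in this sketch because a second read of `zkServer` after the null check could still observe null if the shutdown happens in between.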