[ https://issues.apache.org/jira/browse/ZOOKEEPER-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089067#comment-15089067 ]
Markus Aalto commented on ZOOKEEPER-2186:
-----------------------------------------

I think this was one of the reasons I did my implementation directly in the QuorumCnxManager, with the SendWorker and RecvWorker threads writing to and reading from the stream directly. This made it totally invisible to the FastLeaderElection algorithm: Notification was just one of the message types at the connection level once the initial handshake was completed, and keep-alive was managed inside the QuorumCnxManager (as I think it should be).

Unfortunately, because the handshake does not support upgrades, I got stuck on finding a good way to get the change in.

> QuorumCnxManager#receiveConnection may crash with random input
> --------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2186
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2186
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.6, 3.5.0
>            Reporter: Raul Gutierrez Segales
>            Assignee: Raul Gutierrez Segales
>             Fix For: 3.4.7, 3.5.1, 3.6.0
>
>         Attachments: ZOOKEEPER-2186-v3.4.patch, ZOOKEEPER-2186.patch, ZOOKEEPER-2186.patch, ZOOKEEPER-2186.patch
>
>
> This will allocate an arbitrarily large byte buffer (and try to read it!):
> {code}
> public boolean receiveConnection(Socket sock) {
>     Long sid = null;
>     ...
>     sid = din.readLong();
>     // next comes the #bytes in the remainder of the message
>
>     int num_remaining_bytes = din.readInt();
>     byte[] b = new byte[num_remaining_bytes];
>     // remove the remainder of the message from din
>
>     int num_read = din.read(b);
> {code}
> This will crash the QuorumCnxManager thread, so the cluster will keep going
> but future elections might fail to converge (ditto for leaving/joining
> members).
> Patch coming up in a bit.
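
For reference, the unbounded allocation in the quoted snippet could be guarded roughly as follows. This is only a minimal sketch of the general idea (validate the length prefix before allocating, then read exactly that many bytes), not the attached patch: the class name, the `readRemainder` helper, and the `MAX_REMAINING_BYTES` cap are hypothetical and chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

final class BoundedHandshakeRead {
    // Hypothetical cap; the limit used by the actual fix is not stated in this thread.
    static final int MAX_REMAINING_BYTES = 2048;

    // Reads a length-prefixed payload, rejecting negative or oversized
    // lengths before any buffer is allocated.
    static byte[] readRemainder(DataInputStream din) throws IOException {
        int numRemainingBytes = din.readInt();
        if (numRemainingBytes < 0 || numRemainingBytes > MAX_REMAINING_BYTES) {
            throw new IOException("Unreasonable buffer length: " + numRemainingBytes);
        }
        byte[] b = new byte[numRemainingBytes];
        // readFully either fills the whole buffer or throws EOFException,
        // unlike read(byte[]), which may return after a partial read.
        din.readFully(b);
        return b;
    }

    public static void main(String[] args) throws IOException {
        // Build a well-formed frame: 4-byte length followed by the payload.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream dout = new DataOutputStream(bos);
        byte[] payload = "127.0.0.1:3888".getBytes("UTF-8");
        dout.writeInt(payload.length);
        dout.write(payload);

        DataInputStream good = new DataInputStream(new ByteArrayInputStream(bos.toByteArray()));
        System.out.println("read " + readRemainder(good).length + " bytes");

        // A frame whose length prefix is Integer.MAX_VALUE is rejected up front.
        byte[] bogus = {0x7f, (byte) 0xff, (byte) 0xff, (byte) 0xff};
        try {
            readRemainder(new DataInputStream(new ByteArrayInputStream(bogus)));
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a check like this, random input on the election port would surface as an IOException that the caller can handle by closing the socket, rather than as an allocation failure in the receiving thread.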