[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089120#comment-15089120
 ] 

Markus Aalto commented on ZOOKEEPER-2186:
-----------------------------------------

I think this might work, but it would still require a change so that a 
message with a different protocol version is accepted and skipped properly. 
With that change, a member running the newer protocol version could easily 
adapt to an older version. 
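
As a rough sketch of what "accepted and skipped" could mean: assuming every 
protocol version frames its handshake as a version marker (long), a payload 
length (int), and then the payload, a receiver can consume a frame it does not 
understand and leave the stream positioned at the next message. The class 
name, both constants, and the framing itself are assumptions made for 
illustration, not ZooKeeper's actual wire format or API.

```java
import java.io.DataInputStream;
import java.io.IOException;

final class VersionTolerantReader {
    // Hypothetical marker for the one version this receiver understands.
    static final long KNOWN_PROTOCOL_VERSION = -65536L;
    // Hypothetical upper bound on the payload size, for illustration only.
    static final int MAX_PAYLOAD_BYTES = 2048;

    /**
     * Reads one frame of the assumed form [version:long][length:int][payload].
     * Returns the payload when the version is known, or null after consuming
     * (skipping) the payload of an unknown version.
     */
    public static byte[] readOrSkip(DataInputStream din) throws IOException {
        long version = din.readLong();
        int length = din.readInt();
        if (length < 0 || length > MAX_PAYLOAD_BYTES) {
            throw new IOException("Unreasonable payload length: " + length);
        }
        byte[] payload = new byte[length];
        // Consume the payload either way so the stream stays in sync.
        din.readFully(payload);
        return version == KNOWN_PROTOCOL_VERSION ? payload : null;
    }
}
```

A null return tells the caller that the peer speaks another version, so it 
can fall back to an older handshake instead of dropping the connection.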

Regarding https://issues.apache.org/jira/browse/ZOOKEEPER-901, I don't 
immediately see how it would fix the issue unless proper keep-alive is 
implemented for both directions. We have hit cases in our production 
environment where one direction of the TCP/IP connection works but the other 
does not. This causes the whole ZK cluster to fail when leader election 
starts, so the keep-alive would need to be monitored in both directions.
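
To make the "both directions" point concrete, here is a minimal sketch of the 
bookkeeping such a monitor might need. It assumes an application-level 
ping/ack exchange that does not exist in the code discussed here; the class 
and method names are hypothetical. Pings received from the peer show that the 
inbound path works, and acks for our own pings show that our outbound traffic 
reaches the peer.

```java
final class LinkMonitor {
    private final long timeoutMillis;
    private long lastPeerPingMillis; // last ping received from the peer
    private long lastAckMillis;      // last ack received for one of our pings

    public LinkMonitor(long timeoutMillis, long nowMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastPeerPingMillis = nowMillis;
        this.lastAckMillis = nowMillis;
    }

    public synchronized void onPeerPing(long nowMillis) {
        lastPeerPingMillis = nowMillis;
    }

    public synchronized void onAckForOurPing(long nowMillis) {
        lastAckMillis = nowMillis;
    }

    /** True while the peer's pings keep arriving within the timeout. */
    public synchronized boolean inboundAlive(long nowMillis) {
        return nowMillis - lastPeerPingMillis <= timeoutMillis;
    }

    /** True while our pings keep being acknowledged within the timeout. */
    public synchronized boolean outboundAlive(long nowMillis) {
        return nowMillis - lastAckMillis <= timeoutMillis;
    }

    /** The link counts as healthy only if both checks pass. */
    public synchronized boolean healthy(long nowMillis) {
        return inboundAlive(nowMillis) && outboundAlive(nowMillis);
    }
}
```

In a half-open connection like the one described above, one of the two checks 
would time out while the other keeps passing, so the connection could be 
closed and re-established before the next election.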


> QuorumCnxManager#receiveConnection may crash with random input
> --------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2186
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2186
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.6, 3.5.0
>            Reporter: Raul Gutierrez Segales
>            Assignee: Raul Gutierrez Segales
>             Fix For: 3.4.7, 3.5.1, 3.6.0
>
>         Attachments: ZOOKEEPER-2186-v3.4.patch, ZOOKEEPER-2186.patch, 
> ZOOKEEPER-2186.patch, ZOOKEEPER-2186.patch
>
>
> This will allocate an arbitrarily large byte buffer (and try to read it!):
> {code}
>     public boolean receiveConnection(Socket sock) {
>         Long sid = null;
> ...
>                 sid = din.readLong();
>                 // next comes the #bytes in the remainder of the message
>                 int num_remaining_bytes = din.readInt();
>                 byte[] b = new byte[num_remaining_bytes];
>                 // remove the remainder of the message from din
>                 int num_read = din.read(b);
> {code}
> This will crash the QuorumCnxManager thread, so the cluster will keep going 
> but future elections might fail to converge (ditto for leaving/joining 
> members). 
> Patch coming up in a bit.
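
For illustration, the unbounded allocation in the quoted receiveConnection 
snippet could be guarded roughly as follows. This is only a sketch, not the 
attached patch: the class name and the MAX_REMAINING_BYTES cap are 
hypothetical, and a suitable limit would depend on the largest legitimate 
handshake message.

```java
import java.io.DataInputStream;
import java.io.IOException;

final class BoundedHandshakeReader {
    // Hypothetical cap chosen for illustration; not a ZooKeeper constant.
    static final int MAX_REMAINING_BYTES = 2048;

    /**
     * Reads the length-prefixed remainder of a message, rejecting negative
     * or oversized lengths before any buffer is allocated.
     */
    public static byte[] readRemainder(DataInputStream din) throws IOException {
        int numRemainingBytes = din.readInt();
        if (numRemainingBytes < 0 || numRemainingBytes > MAX_REMAINING_BYTES) {
            throw new IOException("Unreasonable buffer length: " + numRemainingBytes);
        }
        byte[] b = new byte[numRemainingBytes];
        // Unlike read(b), readFully either fills the buffer or throws EOFException.
        din.readFully(b);
        return b;
    }
}
```

Throwing an IOException lets the caller close the socket and carry on, 
instead of letting an allocation failure take down the receiving thread.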



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
