[jira] [Updated] (ZOOKEEPER-2202) Cluster crashes when reconfig adds an unreachable observer

2017-03-13 Thread Michael Han (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Han updated ZOOKEEPER-2202:
---
Fix Version/s: 3.5.4 (was: 3.5.3)

> Cluster crashes when reconfig adds an unreachable observer
> --
>
> Key: ZOOKEEPER-2202
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2202
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.5.0, 3.6.0
> Reporter: Raul Gutierrez Segales
> Assignee: Raul Gutierrez Segales
> Fix For: 3.5.4, 3.6.0
>
> Attachments: ZOOKEEPER-2202.patch
>
>
> While adding support for reconfig() in Kazoo 
> (https://github.com/python-zk/kazoo/pull/333) I found that the cluster can be 
> crashed by adding an observer whose election port isn't reachable (i.e., 
> packets for that destination are dropped, not rejected). The connection 
> attempt raises a SocketTimeoutException, which brings down the 
> PrepRequestProcessor:
> {code}
> 2015-06-02 14:37:16,473 [myid:3] - WARN  [ProcessThread(sid:3 cport:-1)::QuorumCnxManager@384] - Cannot open channel to 100 at election address /8.8.8.8:38703
> java.net.SocketTimeoutException: connect timed out
>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>     at java.net.Socket.connect(Socket.java:589)
>     at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:369)
>     at org.apache.zookeeper.server.quorum.QuorumPeer.connectNewPeers(QuorumPeer.java:1288)
>     at org.apache.zookeeper.server.quorum.QuorumPeer.setLastSeenQuorumVerifier(QuorumPeer.java:1315)
>     at org.apache.zookeeper.server.quorum.Leader.propose(Leader.java:1056)
>     at org.apache.zookeeper.server.quorum.ProposalRequestProcessor.processRequest(ProposalRequestProcessor.java:78)
>     at org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:877)
>     at org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:143)
> {code}
> A simple repro is to run the code in the pull request referenced above with 
> 8.8.8.8:3888 (for example) instead of a free (but closed) port on the 
> loopback interface. 
> I think that adding an Observer (or a Participant) that isn't currently 
> reachable is a valid use case (e.g., you are provisioning the machine and 
> it's not needed yet), so we could perhaps handle this with lower connect 
> timeouts, though I'm not sure. 
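
For illustration only: the stack trace shows the connect attempt running on the ProcessThread itself, so the request pipeline waits out the whole connect timeout. Besides lowering the timeout, one possible direction is to hand the connect off to another thread so that a failure is logged instead of propagating. The sketch below is not the attached ZOOKEEPER-2202.patch and uses no ZooKeeper classes; AsyncPeerConnect, Connector and connectAsync are hypothetical names, and the lambda in main merely simulates the timeout.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class AsyncPeerConnect {
    /** Hypothetical stand-in for a blocking connect to a peer's election address. */
    interface Connector {
        void connect(long sid) throws IOException;
    }

    // Single daemon worker, so connect attempts never run on the caller's thread.
    private final ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "peer-connect");
        t.setDaemon(true);
        return t;
    });
    private final Connector connector;

    AsyncPeerConnect(Connector connector) {
        this.connector = connector;
    }

    /** Returns immediately; the future yields true on success, false on failure. */
    Future<Boolean> connectAsync(long sid) {
        return pool.submit(() -> {
            try {
                connector.connect(sid);
                return true;
            } catch (IOException e) {
                // Log and carry on: an unreachable peer must not kill the caller.
                System.err.println("Cannot open channel to " + sid + ": " + e.getMessage());
                return false;
            }
        });
    }

    public static void main(String[] args) throws Exception {
        // Simulated unreachable peer: every attempt times out.
        AsyncPeerConnect peers = new AsyncPeerConnect(sid -> {
            throw new SocketTimeoutException("connect timed out");
        });
        Future<Boolean> result = peers.connectAsync(100L);
        System.out.println("connected=" + result.get());
    }
}
```

With the simulated connector the future completes with false and the calling thread never sees the exception. Whether this suits the reconfig path depends on ordering requirements that this sketch does not model.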



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (ZOOKEEPER-2202) Cluster crashes when reconfig adds an unreachable observer

2016-06-21 Thread Chris Nauroth (JIRA)


Chris Nauroth updated ZOOKEEPER-2202:
-
Fix Version/s: 3.5.3 (was: 3.5.2)



[jira] [Updated] (ZOOKEEPER-2202) Cluster crashes when reconfig adds an unreachable observer

2016-03-02 Thread Patrick Hunt (JIRA)


Patrick Hunt updated ZOOKEEPER-2202:

Assignee: Raul Gutierrez Segales



[jira] [Updated] (ZOOKEEPER-2202) Cluster crashes when reconfig adds an unreachable observer

2015-09-28 Thread Raul Gutierrez Segales (JIRA)


Raul Gutierrez Segales updated ZOOKEEPER-2202:
--
Attachment: ZOOKEEPER-2202.patch

[~shralex]: does this make sense to you? I'll add a test a bit later today. 
Thanks!

cc: [~cnauroth], [~hdeng]



[jira] [Updated] (ZOOKEEPER-2202) Cluster crashes when reconfig adds an unreachable observer

2015-06-08 Thread Michi Mutsuzaki (JIRA)


Michi Mutsuzaki updated ZOOKEEPER-2202:
---
Fix Version/s: 3.5.2 (was: 3.5.1)



[jira] [Updated] (ZOOKEEPER-2202) Cluster crashes when reconfig adds an unreachable observer

2015-06-02 Thread Raul Gutierrez Segales (JIRA)


Raul Gutierrez Segales updated ZOOKEEPER-2202:
--
Summary: Cluster crashes when reconfig adds an unreachable observer (was: Cluster crashes when reconfig adds an unreaachable observer)
