[jira] Commented: (ZOOKEEPER-498) Unending Leader Elections : WAN configuration

2009-08-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742287#action_12742287
 ] 

Hudson commented on ZOOKEEPER-498:
--

Integrated in ZooKeeper-trunk #413 (See 
[http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/413/])
. Unending Leader Elections : WAN configuration (flavio via  mahadev)


 Unending Leader Elections : WAN configuration
 -

 Key: ZOOKEEPER-498
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-498
 Project: Zookeeper
  Issue Type: Bug
  Components: leaderElection
Affects Versions: 3.2.0
 Environment: Each machine:
 CentOS 5.2 64-bit
 2GB ram
 java version 1.6.0_13
 Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, mixed 
 Network Topology:
 DC : central data center
 POD(N): remote data center
 Zookeeper Topology:
 Leaders may be elected only in DC (weight = 1)
 Only followers are elected in PODS (weight = 0)
Reporter: Todd Greenwood-Geer
Assignee: Flavio Paiva Junqueira
Priority: Critical
 Fix For: 3.2.1, 3.3.0

 Attachments: dc-zook-logs-01.tar.gz, pod-zook-logs-01.tar.gz, 
 zk498-test.tar.gz, zoo.cfg, ZOOKEEPER-498-3.2.patch, ZOOKEEPER-498.patch, 
 ZOOKEEPER-498.patch, ZOOKEEPER-498.patch, ZOOKEEPER-498.patch


 In a WAN configuration, ZooKeeper is endlessly electing, terminating, and 
 re-electing a ZooKeeper leader. The WAN configuration involves two groups, a 
 central DC group of ZK servers that have a voting weight = 1, and a group of 
 servers in remote pods with a voting weight of 0.
 What we expect to see is leaders elected only in the DC, and the pods to 
 contain only followers. What we are seeing is a continuous cycling of 
 leaders. We have seen this consistently with 3.2.0, 3.2.0 + recommended 
 patches (473, 479, 481, 491), and now release 3.2.1.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (ZOOKEEPER-483) ZK fataled on me, and ugly

2009-08-12 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-483:


Attachment: QuorumTest.log.gz

attaching the log file with trace turned on...

 ZK fataled on me, and ugly
 --

 Key: ZOOKEEPER-483
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-483
 Project: Zookeeper
  Issue Type: Bug
Affects Versions: 3.1.1
Reporter: ryan rawson
Assignee: Benjamin Reed
 Fix For: 3.2.1, 3.3.0

 Attachments: QuorumTest.log, QuorumTest.log.gz, zklogs.tar.gz, 
 ZOOKEEPER-483.patch, ZOOKEEPER-483.patch, ZOOKEEPER-483.patch


 here are the part of the log whereby my zookeeper instance crashed, taking 3 
 out of 5 down, and thus ruining the quorum for all clients:
 2009-07-23 12:29:06,769 WARN org.apache.zookeeper.server.NIOServerCnxn: 
 Exception causing close of session 0x52276d1d5161350 due to 
 java.io.IOException: Read error
 2009-07-23 12:29:00,756 WARN org.apache.zookeeper.server.quorum.Follower: 
 Exception when following the leader
 java.io.EOFException
 at java.io.DataInputStream.readInt(DataInputStream.java:375)
 at 
 org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
 at 
 org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:65)
 at 
 org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
 at 
 org.apache.zookeeper.server.quorum.Follower.readPacket(Follower.java:114)
 at 
 org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:243)
 at 
 org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:494)
 2009-07-23 12:29:06,770 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x52276d1d5161350 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.168:39489]
 2009-07-23 12:29:06,770 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x12276d15dfb0578 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.159:46797]
 2009-07-23 12:29:06,771 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x42276d1d3fa013e NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.153:33998]
 2009-07-23 12:29:06,771 WARN org.apache.zookeeper.server.NIOServerCnxn: 
 Exception causing close of session 0x52276d1d5160593 due to 
 java.io.IOException: Read error
 2009-07-23 12:29:06,808 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x32276d15d2e02bb NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.158:53758]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x42276d1d3fa13e4 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.154:58681]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x22276d15e691382 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.162:59967]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x12276d15dfb1354 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.163:49957]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x42276d1d3fa13cd NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.150:34212]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x22276d15e691383 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.159:46813]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x12276d15dfb0350 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.162:59956]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x32276d15d2e139b NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.156:55138]
 2009-07-23 12:29:06,809 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x32276d15d2e1398 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.167:41257]
 2009-07-23 12:29:06,810 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing session:0x52276d1d5161355 NIOServerCnxn: 
 java.nio.channels.SocketChannel[connected local=/10.20.20.151:2181 
 remote=/10.20.20.153:34032]
 2009-07-23 12:29:06,810 INFO org.apache.zookeeper.server.NIOServerCnxn: 
 closing 

[jira] Commented: (ZOOKEEPER-368) Observers

2009-08-12 Thread Henry Robinson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742532#action_12742532
 ] 

Henry Robinson commented on ZOOKEEPER-368:
--

Anyone had a chance to try this out yet?

 Observers
 -

 Key: ZOOKEEPER-368
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368
 Project: Zookeeper
  Issue Type: New Feature
  Components: quorum
Reporter: Flavio Paiva Junqueira
Assignee: Henry Robinson
 Attachments: obs-refactor.patch, observer-refactor.patch, 
 observers.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
 ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
 ZOOKEEPER-368.patch


 Currently, all servers of an ensemble participate actively in reaching 
 agreement on the order of ZooKeeper transactions. That is, all followers 
 receive proposals, acknowledge them, and receive commit messages from the 
 leader. A leader issues commit messages once it receives acknowledgments from 
 a quorum of followers. For cross-colo operation, it would be useful to have a 
 third role: observer. Using Paxos terminology, observers are similar to 
 learners. An observer does not participate actively in the agreement step of 
 the atomic broadcast protocol. Instead, it only commits proposals that have 
 been accepted by some quorum of followers.
 One simple solution to implement observers is to have the leader forwarding 
 commit messages not only to followers but also to observers, and have 
 observers applying transactions according to the order followers agreed upon. 
 In the current implementation of the protocol, however, commit messages do 
 not carry their corresponding transaction payload because all servers 
 different from the leader are followers and followers receive such a payload 
 first through a proposal message. Just forwarding commit messages as they 
 currently are to an observer consequently is not sufficient. We have a couple 
 of options:
 1- Include the transaction payload along in commit messages to observers;
 2- Send proposals to observers as well.
 Number 2 is simpler to implement because it doesn't require changing the 
 protocol implementation, but it increases traffic slightly. The performance 
 impact due to such an increase might be insignificant, though.
 For scalability purposes, we may consider having followers also forwarding 
 commit messages to observers. With this option, observers can connect to 
 followers, and receive messages from followers. This choice is important to 
 avoid increasing the load on the leader with the number of observers. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-368) Observers

2009-08-12 Thread Flavio Paiva Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742550#action_12742550
 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-368:
--

Henry, we have been concentrating on the jiras for the 3.2.1 release. We'll get 
back to observers as soon as we are done with this release. Is this ok?

 Observers
 -

 Key: ZOOKEEPER-368
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368
 Project: Zookeeper
  Issue Type: New Feature
  Components: quorum
Reporter: Flavio Paiva Junqueira
Assignee: Henry Robinson
 Attachments: obs-refactor.patch, observer-refactor.patch, 
 observers.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
 ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
 ZOOKEEPER-368.patch


 Currently, all servers of an ensemble participate actively in reaching 
 agreement on the order of ZooKeeper transactions. That is, all followers 
 receive proposals, acknowledge them, and receive commit messages from the 
 leader. A leader issues commit messages once it receives acknowledgments from 
 a quorum of followers. For cross-colo operation, it would be useful to have a 
 third role: observer. Using Paxos terminology, observers are similar to 
 learners. An observer does not participate actively in the agreement step of 
 the atomic broadcast protocol. Instead, it only commits proposals that have 
 been accepted by some quorum of followers.
 One simple solution to implement observers is to have the leader forwarding 
 commit messages not only to followers but also to observers, and have 
 observers applying transactions according to the order followers agreed upon. 
 In the current implementation of the protocol, however, commit messages do 
 not carry their corresponding transaction payload because all servers 
 different from the leader are followers and followers receive such a payload 
 first through a proposal message. Just forwarding commit messages as they 
 currently are to an observer consequently is not sufficient. We have a couple 
 of options:
 1- Include the transaction payload along in commit messages to observers;
 2- Send proposals to observers as well.
 Number 2 is simpler to implement because it doesn't require changing the 
 protocol implementation, but it increases traffic slightly. The performance 
 impact due to such an increase might be insignificant, though.
 For scalability purposes, we may consider having followers also forwarding 
 commit messages to observers. With this option, observers can connect to 
 followers, and receive messages from followers. This choice is important to 
 avoid increasing the load on the leader with the number of observers. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-368) Observers

2009-08-12 Thread Henry Robinson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742553#action_12742553
 ] 

Henry Robinson commented on ZOOKEEPER-368:
--

Of course! That was just a keepalive message :)

 Observers
 -

 Key: ZOOKEEPER-368
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-368
 Project: Zookeeper
  Issue Type: New Feature
  Components: quorum
Reporter: Flavio Paiva Junqueira
Assignee: Henry Robinson
 Attachments: obs-refactor.patch, observer-refactor.patch, 
 observers.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
 ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, ZOOKEEPER-368.patch, 
 ZOOKEEPER-368.patch


 Currently, all servers of an ensemble participate actively in reaching 
 agreement on the order of ZooKeeper transactions. That is, all followers 
 receive proposals, acknowledge them, and receive commit messages from the 
 leader. A leader issues commit messages once it receives acknowledgments from 
 a quorum of followers. For cross-colo operation, it would be useful to have a 
 third role: observer. Using Paxos terminology, observers are similar to 
 learners. An observer does not participate actively in the agreement step of 
 the atomic broadcast protocol. Instead, it only commits proposals that have 
 been accepted by some quorum of followers.
 One simple solution to implement observers is to have the leader forwarding 
 commit messages not only to followers but also to observers, and have 
 observers applying transactions according to the order followers agreed upon. 
 In the current implementation of the protocol, however, commit messages do 
 not carry their corresponding transaction payload because all servers 
 different from the leader are followers and followers receive such a payload 
 first through a proposal message. Just forwarding commit messages as they 
 currently are to an observer consequently is not sufficient. We have a couple 
 of options:
 1- Include the transaction payload along in commit messages to observers;
 2- Send proposals to observers as well.
 Number 2 is simpler to implement because it doesn't require changing the 
 protocol implementation, but it increases traffic slightly. The performance 
 impact due to such an increase might be insignificant, though.
 For scalability purposes, we may consider having followers also forwarding 
 commit messages to observers. With this option, observers can connect to 
 followers, and receive messages from followers. This choice is important to 
 avoid increasing the load on the leader with the number of observers. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.