Re: Pathological ZK cluster: 1 server verbosely WARN'ing, other 2 servers pegging CPU

2010-04-30 Thread Patrick Hunt
On 04/30/2010 10:16 AM, Aaron Crow wrote: Hi Patrick, thanks for your time and detailed questions. No worries. When we hear about an issue we're very interested to follow up and resolve it, regardless of the source. We take the project goals of high reliability/availability _very_ seriously,

Re: Question on maintaining leader/membership status in zookeeper

2010-04-30 Thread Mahadev Konar
Hi Lei, In this case, the Leader will be disconnected from the ZK cluster and will give up its leadership. Since it's disconnected, the ZK cluster will realize that the Leader is dead! When the ZK cluster realizes that the Leader is dead (this is because the ZK cluster hasn't heard from the Leader for a

Re: Question on maintaining leader/membership status in zookeeper

2010-04-30 Thread Lei Gao
Hi Mahadev, Why would the leader be disconnected from ZK? ZK is fine communicating with the leader in this case. We are talking about asymmetric network failure. Yes, the leader could consider all the slaves to be down if it tracks the status of all slaves itself. But I guess if ZK is used for

Re: Question on maintaining leader/membership status in zookeeper

2010-04-30 Thread Mahadev Konar
Hi Lei, Sorry, I misinterpreted your question! The scenario you describe could be handled in such a way - You could have a status node in ZooKeeper which every slave will subscribe to and update! If one of the slave nodes sees that there have been too many connection refused to the Leader by the

Re: Question on maintaining leader/membership status in zookeeper

2010-04-30 Thread Ted Dunning
Lei, I think that Mahadev was correct that there is some confusion here. Leader election is normally a term used for an operation that is entirely internal to ZK. It is very robust and you probably don't need to worry about it. You can then use ZK in your application to pick a lead machine for
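A minimal sketch of how an application might pick a lead machine on top of ZK, assuming each candidate has registered a child znode whose name ends in a ZooKeeper-style zero-padded sequence number and that the lowest number wins. The function name `pick_lead` and the `n-` name prefix are hypothetical, and no ZooKeeper client is used here; the list of child names is passed in directly.

```python
def pick_lead(children):
    """Return the candidate with the lowest sequence suffix, or None if empty.

    Assumption: each name looks like "<prefix>-<zero-padded counter>",
    mirroring the suffix ZooKeeper appends to sequential znodes.
    """
    if not children:
        return None
    # Compare by the integer value of the suffix after the last "-".
    return min(children, key=lambda name: int(name.rsplit("-", 1)[1]))


if __name__ == "__main__":
    candidates = ["n-0000000012", "n-0000000003", "n-0000000007"]
    print(pick_lead(candidates))
```

In a real deployment the candidate list would come from the children of an election znode, and each candidate would re-run the selection when the membership changes.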

Re: Question on maintaining leader/membership status in zookeeper

2010-04-30 Thread Lei Gao
Hi Mahadev, First of all, I'd like to thank you for being patient with me - my questions seem unclear to many of you who try to help me. I guess clients have to be smart enough to trigger a new leader election by trying to delete the znode. But in this case, ZK should not allow any single or

Re: Question on maintaining leader/membership status in zookeeper

2010-04-30 Thread Patrick Hunt
I believe Lei's concern is that the leader and all slaves can talk to ZK, but the slaves cannot talk to the leader. As a result no work can be done. However, nothing will happen on the ZK side since everyone is heartbeating properly. Mahadev, I think you came up with a pretty good solution.
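The status-node idea discussed in this thread could be sketched roughly as follows. This is an in-memory stand-in, not a real ZooKeeper client: the class `StatusNode`, its `threshold` parameter, and the rule "step the leader down once enough distinct slaves report it unreachable" are all assumptions made for illustration, since the thread does not spell out the exact policy.

```python
class StatusNode:
    """Hypothetical in-memory model of a shared status znode."""

    def __init__(self, slaves, threshold):
        self.slaves = set(slaves)
        # Assumed policy: number of distinct slaves that must complain.
        self.threshold = threshold
        self.reports = set()
        # Models the existence of the leader's znode.
        self.leader_alive = True

    def report_unreachable(self, slave):
        """Record that `slave` cannot reach the leader.

        Returns True only on the report that triggers the step-down.
        """
        if slave not in self.slaves:
            raise ValueError(f"unknown slave: {slave}")
        self.reports.add(slave)  # repeated reports from one slave count once
        if self.leader_alive and len(self.reports) >= self.threshold:
            # Models deleting the leader's znode to force a new election.
            self.leader_alive = False
            return True
        return False


if __name__ == "__main__":
    node = StatusNode(["s1", "s2", "s3"], threshold=2)
    node.report_unreachable("s1")
    print(node.report_unreachable("s2"), node.leader_alive)
```

Counting distinct reporters rather than raw reports keeps a single slave with a bad link from deposing the leader on its own, which seems to be the concern raised earlier in the thread.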

Re: Question on maintaining leader/membership status in zookeeper

2010-04-30 Thread Mahadev Konar
Hi Lei, ZooKeeper provides a set of primitives which allows you to do all kinds of things! You might want to take a look at the API and some examples of ZooKeeper recipes to see how it works, and that will probably clear things up for you. Here are the links: