RE: Re: Leader election stalled

2008-09-16 Thread Benjamin Reed
@hadoop.apache.org Subject: Re: Leader election stalled Got it, thanks! I believe the problem still exists; please see my comment. Best, Austin On Sep 16, 2008, at 3:26 AM, Flavio Junqueira wrote: > Austin, Please check: > > https://issues.apache.org/jira/browse/ZOOKEEPER-140

Re: Leader election stalled

2008-09-16 Thread Austin Shoemaker
[mailto:[EMAIL PROTECTED] Sent: Tuesday, September 16, 2008 12:22 PM To: zookeeper-user@hadoop.apache.org Subject: Re: Leader election stalled Ben, Here is a proposed fix for the deadlock issue in QuorumCnxManager. The protocol starts by an initiator invoking handleConnection(socket_out) where socket

RE: Leader election stalled

2008-09-16 Thread Flavio Junqueira
Austin, Please check: https://issues.apache.org/jira/browse/ZOOKEEPER-140 Thanks, -Flavio > -Original Message- > From: Austin Shoemaker [mailto:[EMAIL PROTECTED] > Sent: Tuesday, September 16, 2008 12:22 PM > To: zookeeper-user@hadoop.apache.org > Subject: Re: Leader e

Re: Leader election stalled

2008-09-16 Thread Austin Shoemaker
u also need to set the election port (TCP) to be used. See http://zookeeper.wiki.sourceforge.net/ZooKeeperConfiguration for more details. ben -Original Message- From: Austin Shoemaker [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 02, 2008 9:57 AM To: zookeeper-user@hadoop.apache.

Re: Leader election stalled

2008-09-12 Thread Austin Shoemaker
to:[EMAIL PROTECTED] Sent: Tuesday, September 02, 2008 9:57 AM To: zookeeper-user@hadoop.apache.org Subject: Leader election stalled Hi, We have run into a situation where killing the leader results in followers perpetually trying to reelect that leader. We have 11 zookeeper (2.2.1 from SF.net) server

Re: Leader election stalled

2008-09-02 Thread Austin Shoemaker
eader election stalled Hi Austin, Did you kill the leader process? It looks like you didn't kill the server, since it's responding to ruok. Is that true? mahadev On 9/2/08 9:56 AM, "Austin Shoemaker" <[EMAIL PROTECTED]> wrote: Hi, We have run into a situation where kil

RE: Leader election stalled

2008-09-02 Thread Benjamin Reed
ailto:[EMAIL PROTECTED] Sent: Tuesday, September 02, 2008 10:06 AM To: zookeeper-user@hadoop.apache.org Subject: Re: Leader election stalled Hi Austin, Did you kill the leader process? It looks like you didn't kill the server, since it's responding to ruok. Is that true? mahadev On 9/2/

RE: Leader election stalled

2008-09-02 Thread Benjamin Reed
Shoemaker [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 02, 2008 9:57 AM To: zookeeper-user@hadoop.apache.org Subject: Leader election stalled Hi, We have run into a situation where killing the leader results in followers perpetually trying to reelect that leader. We have 11 zookeeper (2.2.1

Re: Leader election stalled

2008-09-02 Thread Austin Shoemaker
We killed the leader process, and it does not respond to ruok or election datagrams. Austin On Sep 2, 2008, at 10:05 AM, Mahadev Konar wrote: Hi Austin, Did you kill the leader process? It looks like you didn't kill the server, since it's responding to ruok. Is that true? mahadev On

Re: Leader election stalled

2008-09-02 Thread Mahadev Konar
Hi Austin, Did you kill the leader process? It looks like you didn't kill the server, since it's responding to ruok. Is that true? mahadev On 9/2/08 9:56 AM, "Austin Shoemaker" <[EMAIL PROTECTED]> wrote: > Hi, > > We have run into a situation where killing the leader results in followers >

Leader election stalled

2008-09-02 Thread Austin Shoemaker
Hi, We have run into a situation where killing the leader results in followers perpetually trying to reelect that leader. We have 11 zookeeper (2.2.1 from SF.net) servers and 256 clients connecting at random. We kill the leader and observe the impact, monitoring a script that repeatedly prints th