Re: Leader Elections

2009-07-20 Thread Flavio Junqueira
For the partial subscription, my take right now is that it should be part of the registration procedure. When an observer joins, it contacts some ensemble server (a follower or the current leader) and appends a path to the initial message it sends to this server. This path corresponds to t

Re: Leader Elections

2009-07-20 Thread Flavio Junqueira
Henry's observation sounds right. I'd like to point out, though, that for BCP it might be interesting in some cases to allow multiple groups to contain prospective leaders. To tolerate any group of a set of groups failing completely, you would need at least 3 groups, so it is probably not a

RE: Leader Elections

2009-07-20 Thread Todd Greenwood
Henry, cool. When your patch is ready for testing, I'll devote some time to take a test pass on it. -Original Message- From: Henry Robinson [mailto:he...@cloudera.com] Sent: Monday, July 20, 2009 2:54 PM To: zookeeper-user@hadoop.apache.org Subject: Re: Leader Elections On Mon, Jul 20,

Re: Leader Elections

2009-07-20 Thread Henry Robinson
On Mon, Jul 20, 2009 at 7:50 PM, Todd Greenwood wrote: > Flavio, Ted, Henry, Scott, this would work perfectly well for my use case > provided: > > SINGLE ENSEMBLE: >GROUP A : ZK Servers w/ read/write AND Leader Elections >GROUP B : ZK Servers w/ read/write W/O Leader Elections > > So, w

Re: Leader Elections

2009-07-20 Thread Henry Robinson
I think partial subscription for an Observer would be easy to do - I don't think it will make it into 368 which is big enough already, but it would not be an enormous amount of work. The main thing to do is to figure out the protocol for subscription; probably just a new message type. That said, it

Re: Leader Elections

2009-07-20 Thread Scott Carey
On 7/20/09 2:21 PM, "Scott Carey" wrote: > > Yes, a more general partial graph subscription / ownership framework would > allow for not just better WAN scalability but also (and more critically IMO) > higher reliability. Often, some large subset of application functionality is > local to one n

Re: Leader Elections

2009-07-20 Thread Scott Carey
Todd has put it much more eloquently. Comments below: On 7/20/09 11:50 AM, "Todd Greenwood" wrote: > Flavio, Ted, Henry, Scott, this would work perfectly well for my use case > provided: > > SINGLE ENSEMBLE: > GROUP A : ZK Servers w/ read/write AND Leader Elections > GROUP B : ZK Se

Re: Leader Elections

2009-07-20 Thread Mahadev Konar
Both of the options that Scott mentioned are quite interesting. Quite a few of our users are interested in these two features. I think for 2, we should be able to use observers with a subscription to the master cluster and an interest in a special subtree. That avoids too much cross talk. Henry

RE: Leader Elections

2009-07-20 Thread Todd Greenwood
Flavio, Ted, Henry, Scott, this would work perfectly well for my use case provided: SINGLE ENSEMBLE: GROUP A : ZK Servers w/ read/write AND Leader Elections GROUP B : ZK Servers w/ read/write W/O Leader Elections So, we can craft this via Observers and Hierarchical Quorum groups? Grea

Re: Leader Elections

2009-07-20 Thread Scott Carey
Observers would be awesome especially with a couple enhancements / extensions: An option for the observers to enter a special state if the WAN link goes down to the "master" cluster. A read-only option would be great. However, allowing certain types of writes to continue on a limited basis would