Re: Restarting leader zookeeper instance made quorum lost

Ted Dunning Wed, 09 Apr 2014 14:24:18 -0700

On Wed, Apr 9, 2014 at 12:56 PM, Bae, Jae Hyeon <[email protected]> wrote:


> Let me clarify. (a) is correct. 5 instances were running normally, with 3
> as quorum. I restarted the leader instance, and during leader re-election
> the ZooKeeper cluster lost quorum for a minute and a few ZooKeeper clients
> lost their connections. So, this is a form of losing quorum, correct?
>

Yes.


> Is there any way to avoid losing quorum during a rolling restart of a
> ZooKeeper cluster, specifically when restarting the leader instance?
>

No.

With a 5-node ensemble, you always have to have at least 3 ZK nodes live in
order to maintain continuous operation.
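
As a rough illustration of where the "3" comes from, here is a minimal
Python sketch of the majority arithmetic. The function names are made up
for this example and are not part of ZooKeeper; it assumes every server
in the ensemble is a voting member (no observers).

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of voting servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerable_failures(ensemble_size: int) -> int:
    """How many voting servers can be down while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (3, 5, 7):
        print(n, quorum_size(n), tolerable_failures(n))
```

For 5 servers this gives a quorum of 3 and room for 2 servers to be down,
so taking one node down at a time leaves a majority available.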

A rolling restart implies that you wait long enough after restarting each
node for it to rejoin the quorum.  If you do that, then restarting the
leader will result in a brief moment when writes are not accepted and may
require some ZK clients to transparently reconnect to a different ZK node,
but it should be hard to detect any outage.
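
The "wait long enough" step can be sketched roughly as below. This is only
an illustration under several assumptions: it uses the `srvr` four-letter
command on the client port (2181 is assumed here) and treats a node as
back once it reports `Mode: leader` or `Mode: follower`. The `restart`
callback and all function names are hypothetical; how a node is actually
restarted depends on your deployment.

```python
import socket
import time


def four_letter_word(host, port, cmd=b"srvr", timeout=5.0):
    """Send a four-letter command and return the full text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(srvr_output):
    """Return the value of the 'Mode:' line, or None if there is none."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def wait_until_serving(host, port=2181, deadline_s=120.0, poll_s=2.0,
                       probe=four_letter_word):
    """Poll one node until it reports itself as leader or follower."""
    end = time.monotonic() + deadline_s
    while time.monotonic() < end:
        try:
            mode = parse_mode(probe(host, port))
        except OSError:  # connection refused or timed out: not up yet
            mode = None
        if mode in ("leader", "follower"):
            return mode
        time.sleep(poll_s)
    raise TimeoutError(f"{host}:{port} did not rejoin within {deadline_s}s")


def rolling_restart(hosts, restart, port=2181, probe=four_letter_word):
    """Restart nodes one at a time, waiting for each before the next."""
    for host in hosts:
        restart(host)  # hypothetical callback supplied by the caller
        wait_until_serving(host, port, probe=probe)
```

The point of the sketch is only the ordering: never touch the next node
until the previous one is serving again, so that at most one of the 5
nodes is down at any time.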



>
> Thank you
> Best, Jae
>
>
> On Wed, Apr 9, 2014 at 12:06 PM, Ted Dunning <[email protected]>
> wrote:
>
> > Your email is a little ambiguous.
> >
> > a) "5 instances with 3 as quorum" could mean 5 instances configured and
> > running normally.
> >
> > Or
> >
> > b) it could mean 5 instances with 2 instances that are down.
> >
> > In (a) restarting the leader instance *should* cause the cluster to do a
> > leader election again and form a new quorum.  That is a form of losing
> > quorum.  If that is what you mean, this is normal.  A new quorum should be
> > formed and things should continue fairly soon.
> >
> > In (b), restarting the leader will result in only 2 instances running
> > which is not enough to maintain quorum and until you have at least 3
> > nodes running again, you can't proceed.
> >
> > On Wed, Apr 9, 2014 at 11:03 AM, Bae, Jae Hyeon <[email protected]>
> > wrote:
> >
> > > Hi zookeeper users
> > >
> > > While doing a rolling restart of a ZooKeeper cluster of 5 instances
> > > with 3 as quorum, restarting the leader instance caused quorum to be
> > > lost. Is this expected? If not, how can I restart the leader instance
> > > without interrupting the whole cluster? Or is this fixed in 3.4.6?
> > >
> > > Thank you
> > > Best, Jae
> > >
> >
>
