I'd certainly like to understand the underlying problem you're seeing:
why would a server be unable to join the quorum for any length of time
when it isn't partitioned, etc.? Is there a ticket open for this, or do
you think it's specific to your environment somehow?

As for the larger question, why not run the check at a regular interval
and fail if the server isn't in quorum for more than N consecutive checks?
We could add a 4lw, but it seems like you should be able to figure this
out in other ways.
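
Roughly what I have in mind for the periodic check, as an untested sketch
rather than anything official. It assumes 'mntr' prints a tab-separated
"zk_server_state" line while the server is serving, and that a missing
line means the server is not in quorum. The names four_letter_word,
in_quorum, QuorumWatch, monitor and the restart callback are all made up
for illustration, and counting "standalone" as healthy is my assumption.

```python
import socket
import time

# Assumption: these are the states 'mntr' reports for a serving node.
# "standalone" is included so a single-node setup never trips the check.
SERVING_STATES = {"leader", "follower", "observer", "standalone"}


def four_letter_word(host, port, cmd, timeout=5.0):
    """Send a four-letter-word command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def in_quorum(mntr_reply):
    """Return True if the reply has a zk_server_state in SERVING_STATES."""
    for line in mntr_reply.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == "zk_server_state":
            return parts[1].strip() in SERVING_STATES
    # No state line at all: treat the server as not serving.
    return False


class QuorumWatch:
    """Counts consecutive failed checks; any success resets the count."""

    def __init__(self, max_failures):
        self.max_failures = max_failures
        self.failures = 0

    def record(self, ok):
        """Record one check; return True once the limit is reached."""
        self.failures = 0 if ok else self.failures + 1
        return self.failures >= self.max_failures


def monitor(restart, host="localhost", port=2181,
            interval=10.0, max_failures=12):
    """Poll 'mntr' forever; call restart() after max_failures bad checks."""
    watch = QuorumWatch(max_failures)
    while True:
        try:
            ok = in_quorum(four_letter_word(host, port, "mntr"))
        except OSError:  # connection refused, timeout, etc.
            ok = False
        if watch.record(ok):
            restart()           # hypothetical callback supplied by the caller
            watch.failures = 0  # give the restarted server a fresh window
        time.sleep(interval)
```

With interval=10 and max_failures=12, the restart fires after about two
minutes out of quorum, and the counter lives in the monitor rather than
in the server, so no new 4lw is needed.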

C


On Tue, Feb 18, 2014 at 5:51 PM, Deepak Jagtap <[email protected]> wrote:

> Hi All,
>
> I came across a couple of instances where one ZooKeeper server fell out
> of the quorum because of some bug/issue with leader election not
> completing successfully.
>
> We are trying to mitigate this problem by monitoring the status of the
> ZooKeeper server to check whether it is part of the quorum.
> If it's not part of the quorum for a very long time, we restart the
> ZooKeeper server so that it can join the quorum again.
>
> Currently there is no good way to check whether a server is part of the
> quorum:
> 'ruok' returns 'imok' as long as the ZooKeeper server is running, even if
> it is not part of the quorum (i.e. it might be continuously running
> leader election).
> The 'mntr' command reports this information, but it doesn't report how
> long the server has been in that state.
>
> I want to restart the ZooKeeper server only if it has been out of the
> quorum for a certain amount of time (say, 2 minutes).
> Do I need to add a new four-letter-word command to report this info, or
> is there any other way I can achieve this?
>
> I would be more than happy to add this to ZooKeeper if it's helpful for
> other ZooKeeper users.
>
> Thanks & Regards,
> Deepak
>
