>> we should measure the total time more accurately

+1 - it would be good to have a new metric to measure reconfiguration time,
and leaving existing LE time metric dedicated to measure the conventional
FLE time. Mixing both (as of today) will provide some confusing insights on
how long the conventional FLE actually took.

On Mon, Jul 29, 2019 at 7:13 PM Alexander Shraer <shra...@gmail.com> wrote:

> Please see comments inline.
>
> Thanks,
> Alex
>
> On Mon, Jul 29, 2019 at 5:29 PM Karolos Antoniadis <karo...@gmail.com>
> wrote:
>
> > Hi ZooKeeper developers,
> >
> > ZooKeeper seems to be logging a "*LEADER ELECTION TOOK*" message even
> > though no leader election takes place during a reconfiguration.
> >
> > This can be reproduced by following these steps:
> > 1) start a ZooKeeper cluster (e.g., 3 participants)
> > 2) start a client that connects to some follower
> > 3) perform a *reconfig* operation that removes the leader from the
> cluster
> >
> > After the reconfiguration takes place, we can see that the log files of
> the
> > remaining participants contain a "*LEADER ELECTION TOOK*" message. For
> > example, a line that contains
> >
> > *2019-07-29 23:07:38,518 [myid:2] - INFO
> >  [QuorumPeer[myid=2](plain=0.0.0.0:2792)(secure=disabled):Follower@75] -
> > FOLLOWING - LEADER ELECTION TOOK - 57 MS*
> >
> > However, no leader election took place. With that, I mean that no server
> > went *LOOKING *and then started voting and sending notifications to other
> > participants as would be in a normal leader election.
> > It seems, that before the *reconfig *is committed, the participant that
> is
> > going to be the next leader is already decided (see here:
> >
> >
> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Leader.java#L865
> > ).
> >
> > I think the above issue raises the following questions:
> > - Should we avoid logging LEADER ELECTION messages altogether in such
> > cases?
> >
>
> In the specific scenario you described, the leader has changed, but our
> heuristic for choosing the leader apparently worked and a new leader could
> be elected without running the full election.
> Notice that we could be unlucky and the designated leader could be offline,
> and then we'll fall back on election. It's useful to know how much time it
> takes to start following the new leader.
>
>
> > - Or, should there be some logging for the time it took for the
> > reconfiguration (e.g., the time between a participant gets a *reconfig*
> > operation till the operation is committed)? Would such a time value be
> > useful?
> >
>
> IIRC the LEADER ELECTION message is used for this purpose. if you just look
> on the time to commit the reconfig operation, you won't
> account for the work that happens when the commit message is received, such
> as leader re-election, role-change (follower->observer conversion and such)
> etc which is what takes most of the time.
> Committing a reconfig operation is usually not much more expensive than
> committing a normal operation. But perhaps you're right that we should
> measure the total time more accurately. Would you
> like to open a Jira and perhaps take a stab at improving this ?
>
> >
> > Best,
> > Karolos
> >
>

Reply via email to