[
https://issues.apache.org/jira/browse/ZOOKEEPER-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955013#comment-13955013
]
Daniel Peon commented on ZOOKEEPER-1814:
----------------------------------------
Hi,
Reason for the modification:
----------------------------
When the FastLeaderElection backoff has grown to a large timeout, the process
waits for the full timeout before taking control again. If a shutdown is
requested during that wait, the process does not notice it until the timeout
expires, which can be very long. By reducing the maximum timeout to a
reasonable value (2-5 seconds), we still avoid generating heavy traffic, but
we can detect a shutdown within a few seconds instead of up to 1 minute in the
worst case.
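
As a rough illustration of the capped exponential backoff described above, the
following minimal sketch doubles a wait time until it reaches a configured
maximum. The class name, method names and the 200 ms starting value are
assumptions made for this example only; this is not the actual
FastLeaderElection code.

```java
import java.util.ArrayList;
import java.util.List;

public class BackoffSketch {
    // Starting wait of 200 ms is an assumption made for illustration.
    static final int INITIAL_WAIT_MS = 200;

    /** Doubles the current wait, never exceeding maxIntervalMs. */
    static int nextWait(int currentMs, int maxIntervalMs) {
        return (int) Math.min((long) currentMs * 2, maxIntervalMs);
    }

    /** Returns the successive waits until the cap is first reached. */
    static List<Integer> schedule(int maxIntervalMs) {
        List<Integer> waits = new ArrayList<>();
        int wait = INITIAL_WAIT_MS;
        waits.add(wait);
        while (wait < maxIntervalMs) {
            wait = nextWait(wait, maxIntervalMs);
            waits.add(wait);
        }
        return waits;
    }

    public static void main(String[] args) {
        // With a 60 s cap, a single wait can block for up to a minute.
        System.out.println("cap 60000 ms: " + schedule(60_000));
        // With a 5 s cap, no single wait blocks for longer than 5 s.
        System.out.println("cap 5000 ms:  " + schedule(5_000));
    }
}
```

A shutdown requested in the middle of one of these waits is only noticed when
that wait ends, so the cap bounds the worst-case detection delay.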
Reason for the configurable parameter:
--------------------------------------
We could have chosen a fixed value, but maxFleNotificationInterval was made a
configurable parameter for backward compatibility. If someone's deployment
already relies on the 60-second maximum for any reason, I did not want to
change the default behavior.
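
To illustrate the backward-compatibility argument, here is a minimal sketch of
a setting that keeps the 60-second default unless a value is supplied
explicitly. Reading the value from a JVM system property with this key is an
assumption made for the example; how the patch actually obtains
maxFleNotificationInterval is not shown here.

```java
public class FleIntervalConfig {
    // 60 seconds: the existing maximum mentioned in this issue.
    static final int DEFAULT_MAX_INTERVAL_MS = 60_000;
    // Hypothetical property key, used only in this sketch.
    static final String PROPERTY = "maxFleNotificationInterval";

    /** Returns the configured maximum, or the old default if unset or invalid. */
    static int maxNotificationInterval() {
        // Integer.getInteger falls back to the default when the property
        // is missing or cannot be parsed as an integer.
        int value = Integer.getInteger(PROPERTY, DEFAULT_MAX_INTERVAL_MS);
        return value > 0 ? value : DEFAULT_MAX_INTERVAL_MS;
    }

    public static void main(String[] args) {
        System.out.println("max interval (ms): " + maxNotificationInterval());
    }
}
```

Under these assumptions, running with -DmaxFleNotificationInterval=3000 would
lower the cap to 3 seconds, while an unconfigured deployment keeps the old
behavior.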
Regards.
> Reduction of waiting time during Fast Leader Election
> -----------------------------------------------------
>
> Key: ZOOKEEPER-1814
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1814
> Project: ZooKeeper
> Issue Type: Bug
> Components: leaderElection
> Affects Versions: 3.4.5, 3.5.0
> Reporter: Daniel Peon
> Assignee: Daniel Peon
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1814.patch, ZOOKEEPER-1814.patch,
> ZOOKEEPER-1814.patch, ZOOKEEPER-1814.patch, ZOOKEEPER-1814.patch,
> ZOOKEEPER-1814.patch, ZOOKEEPER-1814.patch
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> Fast leader election can take a long time because of the exponential backoff.
> Currently the maximum wait is 60 seconds.
> It would be useful to make this parameter configurable, for example for
> the case of a server shutdown.
> Otherwise, shutdown sometimes takes too long, and a test failure has been
> detected when executing: org.apache.zookeeper.server.quorum.QuorumPeerMainTest.
> This test case waits up to 30 seconds, which is less than the 60 seconds
> that leader election may be waiting at the moment of shutdown.
> Because of this test failure, the issue was considered a possible bug.