[
https://issues.apache.org/jira/browse/CURATOR-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17126352#comment-17126352
]
Viniti commented on CURATOR-573:
--------------------------------
[~randgalt] I am facing this issue intermittently (the last occurrence was 15
days ago) in my staging environment, but I could not reproduce it in my local
environment. I also see the logs below, in case they help:
2020-06-04 18:23:29 INFO CuratorFrameworkImpl:937 - backgroundOperationsLoop exiting
2020-06-04 18:23:29 ERROR LeaderSelector:454 - The leader threw an exception
java.lang.IllegalStateException: instance must be started before calling this method
    at org.apache.curator.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:444)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.delete(CuratorFrameworkImpl.java:424)
    at org.apache.curator.framework.recipes.locks.LockInternals.deleteOurPath(LockInternals.java:347)
    at org.apache.curator.framework.recipes.locks.LockInternals.releaseLock(LockInternals.java:124)
    at org.apache.curator.framework.recipes.locks.InterProcessMutex.release(InterProcessMutex.java:154)
    at org.apache.curator.framework.recipes.leader.LeaderSelector.doWork(LeaderSelector.java:449)
    at org.apache.curator.framework.recipes.leader.LeaderSelector.doWorkLoop(LeaderSelector.java:466)
    at org.apache.curator.framework.recipes.leader.LeaderSelector.access$100(LeaderSelector.java:65)
    at org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:246)
    at org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:240)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2020-06-04 18:23:29 INFO ClientCnxn:524 - EventThread shut down for session: 0x302f056e0960036
2020-06-04 18:23:29 INFO ZooKeeper:1422 - Session: 0x302f056e0960036 closed
2020-06-04 18:23:29 INFO CuratorFrameworkImpl:937 - backgroundOperationsLoop exiting
2020-06-04 18:23:29 INFO ZooKeeper:1422 - Session: 0x302f056e0960035 closed
2020-06-04 18:23:29 INFO ClientCnxn:524 - EventThread shut down for session: 0x302f056e0960035
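
In the trace, the lock release reaches CuratorFrameworkImpl.delete() right after a "backgroundOperationsLoop exiting" line, which looks consistent with the client being closed while the leader thread was still cleaning up. That reading is an assumption, not a confirmed diagnosis. The sketch below illustrates only the ordering idea (interrupt the worker, wait for its cleanup, then close the client) using the plain JDK. FakeClient, OrderedShutdownSketch and the lock path are hypothetical stand-ins, not Curator classes, and whether LeaderSelector.close() waits for the leader thread in 4.0.1 is not established here.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a client that rejects calls once it is closed.
class FakeClient {
    private final AtomicBoolean started = new AtomicBoolean(true);

    void delete(String path) {
        if (!started.get()) {
            throw new IllegalStateException("instance must be started before calling this method");
        }
    }

    void close() {
        started.set(false);
    }
}

class OrderedShutdownSketch {
    // Returns true if the worker's cleanup finished before the client was closed.
    static boolean runAndShutDown() throws InterruptedException {
        FakeClient client = new FakeClient();
        AtomicBoolean cleanupSucceeded = new AtomicBoolean(false);
        CountDownLatch workerStarted = new CountDownLatch(1);
        ExecutorService executor = Executors.newSingleThreadExecutor();

        executor.submit(() -> {
            workerStarted.countDown();
            try {
                Thread.sleep(Long.MAX_VALUE); // pretend to hold leadership
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag
            } finally {
                client.delete("/hypothetical/lock/path"); // cleanup needs a live client
                cleanupSucceeded.set(true);
            }
        });

        workerStarted.await();                                       // make sure the worker is running
        executor.shutdownNow();                                      // 1. interrupt the worker
        boolean done = executor.awaitTermination(10, TimeUnit.SECONDS); // 2. wait for its cleanup
        client.close();                                              // 3. only now close the client
        return done && cleanupSucceeded.get();
    }
}
```

Swapping steps 2 and 3 in this simulation would let the cleanup race against close() and could produce the same IllegalStateException message.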
> No leader is getting selected intermittently
> --------------------------------------------
>
> Key: CURATOR-573
> URL: https://issues.apache.org/jira/browse/CURATOR-573
> Project: Apache Curator
> Issue Type: Bug
> Components: Apache, Framework, Recipes
> Affects Versions: 4.0.1
> Reporter: Viniti
> Priority: Critical
>
> I am using the Apache Curator Leader Election recipe
> (https://curator.apache.org/curator-recipes/leader-election.html) in my
> application.
> ZooKeeper version: 3.5.7
> Curator version: 4.0.1
> Below is the sequence of steps:
> 1. Whenever my Tomcat server instance starts up, I create a single
> CuratorFramework instance (one per Tomcat server) and start it:
> ```
> CuratorFramework client = CuratorFrameworkFactory.newClient(connectionString, retryPolicy);
> client.start();
> if (!client.blockUntilConnected(10, TimeUnit.MINUTES)) {
>     LOGGER.error("Zookeeper connection could not establish!");
>     throw new RuntimeException("Zookeeper connection could not establish");
> }
> ```
> 2. Create an instance of LSAdapter and start it:
> ```
> LSAdapter adapter = new LSAdapter(client, <some_metadata>);
> adapter.start();
> ```
> Below is my LSAdapter class :
> ```
> public class LSAdapter extends LeaderSelectorListenerAdapter implements Closeable {
>     //<Class instance variables defined>
>
>     public LSAdapter(CuratorFramework client, <some_metadata>) {
>         leaderSelector = new LeaderSelector(client, <path_to_be_used_for_leader_election>, this);
>         leaderSelector.autoRequeue();
>     }
>
>     public void start() throws IOException {
>         leaderSelector.start();
>     }
>
>     @Override
>     public void close() throws IOException {
>         leaderSelector.close();
>     }
>
>     @Override
>     public void takeLeadership(CuratorFramework client) throws Exception {
>         final int waitSeconds = (int) (5 * Math.random()) + 1;
>         LOGGER.info(name + " is now the leader. Waiting " + waitSeconds + " seconds...");
>         LOGGER.debug(name + " has been leader " + leaderCount.getAndIncrement() + " time(s) before.");
>         while (true) {
>             try {
>                 Thread.sleep(TimeUnit.SECONDS.toMillis(waitSeconds));
>                 //do leader tasks
>             } catch (InterruptedException e) {
>                 LOGGER.error(name + " was interrupted.");
>                 //cleanup
>                 Thread.currentThread().interrupt();
>             } finally {
>             }
>         }
>     }
> }
> ```
> 4. When the server instance shuts down, I close the LSAdapter instance (the
> one the application is using) and then close the CuratorFramework client:
> ```
> CloseableUtils.closeQuietly(lsAdapter);
> curatorFrameworkClient.close();
> ```
> The issue I am facing is that sometimes, when a server is restarted, no
> leader gets elected. I verified this by tracing the log statement inside
> takeLeadership(). I have two Tomcat server instances running the above code
> and connecting to the same ZooKeeper quorum. Most of the time one of the
> instances becomes the leader, but when this issue occurs, both of them remain
> followers. Please suggest what I am doing wrong.
>
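
One thing that stands out in the takeLeadership() loop quoted above: after an InterruptedException it restores the interrupt flag but stays inside while (true), so the method never returns and each subsequent Thread.sleep() would throw again immediately. The Curator recipe documentation says leadership is held until takeLeadership() returns, so a loop that exits on interruption may be closer to what is intended. A minimal JDK-only sketch of that pattern follows. InterruptAwareLoopSketch and runUntilInterrupted are hypothetical names, and this is an illustration under those assumptions, not a confirmed fix for this issue.

```java
class InterruptAwareLoopSketch {
    // Runs leaderTask every periodMillis until the current thread is interrupted,
    // then returns the number of completed runs.
    static int runUntilInterrupted(Runnable leaderTask, long periodMillis) {
        int runs = 0;
        while (!Thread.currentThread().isInterrupted()) {
            try {
                Thread.sleep(periodMillis);
                leaderTask.run();
                runs++;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag for the caller
                break;                              // leave the loop so the method can return
            }
        }
        return runs;
    }
}
```

In a listener, the body of takeLeadership() would follow the same shape: do the periodic leader work inside the loop and return once the thread has been interrupted.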