[ 
https://issues.apache.org/jira/browse/CURATOR-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17126352#comment-17126352
 ] 

Viniti commented on CURATOR-573:
--------------------------------

[~randgalt] I am facing this issue intermittently (the last occurrence was 15 
days ago) in my staging environment, but I could not reproduce it in my local 
environment. I also see the logs below, in case they help:

2020-06-04 18:23:29 INFO  CuratorFrameworkImpl:937 - backgroundOperationsLoop exiting
2020-06-04 18:23:29 ERROR LeaderSelector:454 - The leader threw an exception
java.lang.IllegalStateException: instance must be started before calling this method
        at org.apache.curator.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:444)
        at org.apache.curator.framework.imps.CuratorFrameworkImpl.delete(CuratorFrameworkImpl.java:424)
        at org.apache.curator.framework.recipes.locks.LockInternals.deleteOurPath(LockInternals.java:347)
        at org.apache.curator.framework.recipes.locks.LockInternals.releaseLock(LockInternals.java:124)
        at org.apache.curator.framework.recipes.locks.InterProcessMutex.release(InterProcessMutex.java:154)
        at org.apache.curator.framework.recipes.leader.LeaderSelector.doWork(LeaderSelector.java:449)
        at org.apache.curator.framework.recipes.leader.LeaderSelector.doWorkLoop(LeaderSelector.java:466)
        at org.apache.curator.framework.recipes.leader.LeaderSelector.access$100(LeaderSelector.java:65)
        at org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:246)
        at org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:240)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
2020-06-04 18:23:29 INFO  ClientCnxn:524 - EventThread shut down for session: 0x302f056e0960036
2020-06-04 18:23:29 INFO  ZooKeeper:1422 - Session: 0x302f056e0960036 closed
2020-06-04 18:23:29 INFO  CuratorFrameworkImpl:937 - backgroundOperationsLoop exiting
2020-06-04 18:23:29 INFO  ZooKeeper:1422 - Session: 0x302f056e0960035 closed
2020-06-04 18:23:29 INFO  ClientCnxn:524 - EventThread shut down for session: 0x302f056e0960035

> No leader is getting selected intermittently
> --------------------------------------------
>
>                 Key: CURATOR-573
>                 URL: https://issues.apache.org/jira/browse/CURATOR-573
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Apache, Framework, Recipes
>    Affects Versions: 4.0.1
>            Reporter: Viniti
>            Priority: Critical
>
> I am using the Apache Curator Leader Election recipe 
> (https://curator.apache.org/curator-recipes/leader-election.html) in my 
> application.
> ZooKeeper version: 3.5.7
> Curator version: 4.0.1
> Below is the sequence of steps:
> 1. Whenever my Tomcat server instance starts up, I create a single 
> CuratorFramework instance (one per Tomcat server) and start it:
> ```
> CuratorFramework client = CuratorFrameworkFactory.newClient(connectionString, retryPolicy);
> client.start();
> if (!client.blockUntilConnected(10, TimeUnit.MINUTES)) {
>     LOGGER.error("Zookeeper connection could not establish!");
>     throw new RuntimeException("Zookeeper connection could not establish");
> }
> ```
> 2. Create an instance of LSAdapter and start it:
> ```
> LSAdapter adapter = new LSAdapter(client, <some_metadata>);
> adapter.start();
> ```
> Below is my LSAdapter class:
> ```
> public class LSAdapter extends LeaderSelectorListenerAdapter implements Closeable {
>     //<Class instance variables defined>
>
>     public LSAdapter(CuratorFramework client, <some_metadata>) {
>         leaderSelector = new LeaderSelector(client, <path_to_be_used_for_leader_election>, this);
>         leaderSelector.autoRequeue();
>     }
>
>     public void start() throws IOException {
>         leaderSelector.start();
>     }
>
>     @Override
>     public void close() throws IOException {
>         leaderSelector.close();
>     }
>
>     @Override
>     public void takeLeadership(CuratorFramework client) throws Exception {
>         final int waitSeconds = (int) (5 * Math.random()) + 1;
>         LOGGER.info(name + " is now the leader. Waiting " + waitSeconds + " seconds...");
>         LOGGER.debug(name + " has been leader " + leaderCount.getAndIncrement() + " time(s) before.");
>         while (true) {
>             try {
>                 Thread.sleep(TimeUnit.SECONDS.toMillis(waitSeconds));
>                 //do leader tasks
>             } catch (InterruptedException e) {
>                 LOGGER.error(name + " was interrupted.");
>                 //cleanup
>                 Thread.currentThread().interrupt();
>             } finally {
>             }
>         }
>     }
> }
> ```
> 4. When the server instance shuts down, I close the LSAdapter instance (which 
> the application is using) and then close the CuratorFramework client that 
> was created:
> ```
> CloseableUtils.closeQuietly(lsAdapter);
> curatorFrameworkClient.close();
> ```
> The issue I am facing is that at times, when the server is restarted, no 
> leader gets elected. I verified this by tracing the log inside 
> takeLeadership(). I have two Tomcat server instances running the above code, 
> connected to the same ZooKeeper quorum. Most of the time one of the 
> instances becomes the leader, but when this issue occurs, both of them 
> remain followers. Please suggest what I am doing wrong.
>  
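
One detail in the quoted takeLeadership() that may be worth checking: the 
InterruptedException is caught inside the while (true) loop and the interrupt 
flag is restored, but the loop is never exited, so as quoted the next 
Thread.sleep() throws again immediately and the method never returns. Whether 
this is related to the missing leader is not established here. The following 
is a minimal sketch of a loop that does return on interruption. It uses only 
the JDK, not Curator; the class LeadershipLoopSketch and the method 
leadUntilInterrupted are hypothetical stand-ins for the body of 
takeLeadership().

```java
public class LeadershipLoopSketch {

    /**
     * Hypothetical stand-in for the body of takeLeadership().
     * Returns the number of completed work rounds once the thread is interrupted.
     */
    static int leadUntilInterrupted(long pauseMillis) {
        int rounds = 0;
        try {
            // Stop looping as soon as the interrupt flag is observed.
            while (!Thread.currentThread().isInterrupted()) {
                Thread.sleep(pauseMillis);
                rounds++; // leader tasks would go here
            }
        } catch (InterruptedException e) {
            // The try block encloses the loop, so control leaves the loop here.
            // Restore the flag for callers, then fall through and return.
            Thread.currentThread().interrupt();
        }
        return rounds;
    }

    public static void main(String[] args) throws Exception {
        Thread leader = new Thread(() -> leadUntilInterrupted(50));
        leader.start();
        Thread.sleep(200);   // let it "lead" for a short while
        leader.interrupt();  // simulates the interrupt delivered on shutdown
        leader.join(2000);
        System.out.println("leader thread still running: " + leader.isAlive());
    }
}
```

The design choice is to place the try/catch around the loop rather than 
inside it, so that an interrupt ends the loop and the method returns, which 
is how a listener gives up leadership.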



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
