[ 
https://issues.apache.org/jira/browse/CURATOR-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127410#comment-17127410
 ] 

Viniti commented on CURATOR-573:
--------------------------------

Sure, I will change the code as suggested (the changes are below). Please keep 
this bug open for 2 days so that I can test whether this change fixes the 
leader election issue. I will close the bug after that. Thanks for your time 
and help.
{code:java|title=takeLeadership()|borderStyle=solid}
    @Override
    public void takeLeadership(CuratorFramework client) throws Exception {
        // We are now the leader. This method should not return until we want
        // to relinquish leadership.

        final int waitSeconds = (int) (5 * Math.random()) + 1;

        LOGGER.info(name + " is now the leader. Waiting " + waitSeconds + " seconds...");
        LOGGER.debug(name + " has been leader " + leaderCount.getAndIncrement() + " time(s) before.");
        // do leader task
        try {
            while (true) {
                Thread.sleep(TimeUnit.SECONDS.toMillis(waitSeconds));
            }
        } catch (InterruptedException e) {
            LOGGER.error(name + " was interrupted.");
            // application code cleanup for leader tasks
            Thread.currentThread().interrupt();
        }
    }
{code}
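
One way to see what the restructuring changes, sketched without Curator: in
the original version the catch block sits inside while (true), so once the
interrupt flag is restored the next sleep throws again and the loop keeps
running unless the cleanup call itself throws. With the try block around the
loop, a single interrupt ends the method. The class and method names below
(InterruptLoopSketch, tryOutsideLoop, tryInsideLoop, runInterrupted) are
hypothetical stand-ins, the 10 ms sleep replaces the leader task, and the
inner-catch loop is bounded only so that the demo terminates.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

// Hypothetical stand-in for the two takeLeadership() loop shapes; no Curator involved.
class InterruptLoopSketch {

    /** Revised shape: the try block wraps the loop, so one interrupt ends the method. */
    static int tryOutsideLoop() {
        int cleanups = 0;
        try {
            while (true) {
                Thread.sleep(10); // stands in for the leader task
            }
        } catch (InterruptedException e) {
            cleanups++;                         // cleanup runs exactly once
            Thread.currentThread().interrupt(); // restore the flag for the caller
        }
        return cleanups;
    }

    /** Original shape: the catch sits inside the loop, so the loop survives the interrupt. */
    static int tryInsideLoop(int maxIterations) {
        int cleanups = 0;
        // Bounded here so the demo ends; the original code used while (true).
        for (int i = 0; i < maxIterations; i++) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                cleanups++;
                // The flag is set again, so the next sleep throws immediately.
                Thread.currentThread().interrupt();
            }
        }
        return cleanups;
    }

    /** Runs the task on a fresh thread whose interrupt flag is already set. */
    static int runInterrupted(IntSupplier task) {
        AtomicInteger result = new AtomicInteger(-1);
        Thread worker = new Thread(() -> {
            // Set the flag up front so the outcome does not depend on timing.
            Thread.currentThread().interrupt();
            result.set(task.getAsInt());
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException("caller was interrupted while waiting", e);
        }
        return result.get();
    }

    public static void main(String[] args) {
        System.out.println(runInterrupted(InterruptLoopSketch::tryOutsideLoop)); // prints 1
        System.out.println(runInterrupted(() -> tryInsideLoop(5)));              // prints 5
    }
}
```

Whether this is the actual cause of the missing leader still needs to be
confirmed by the test run mentioned above.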

> No leader is getting selected intermittently
> --------------------------------------------
>
>                 Key: CURATOR-573
>                 URL: https://issues.apache.org/jira/browse/CURATOR-573
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Apache, Framework, Recipes
>    Affects Versions: 4.0.1
>            Reporter: Viniti
>            Priority: Critical
>
> I am using the Apache Curator Leader Election recipe 
> (https://curator.apache.org/curator-recipes/leader-election.html) in my 
> application.
> ZooKeeper version: 3.5.7
> Curator version: 4.0.1
> Below is the sequence of steps:
> 1. Whenever my Tomcat server instance starts up, I create a single 
> CuratorFramework instance (one per Tomcat server) and start it: 
> {code:title=StartUp Code|borderStyle=solid}
> CuratorFramework client = CuratorFrameworkFactory.newClient(connectionString, retryPolicy);
> client.start();
> if (!client.blockUntilConnected(10, TimeUnit.MINUTES)) {
>     LOGGER.error("Zookeeper connection could not establish!");
>     throw new RuntimeException("Zookeeper connection could not establish");
> }
> {code}
> 2. Create an instance of LSAdapter and start it:
> {code:title=LSAdapter initializing|borderStyle=solid}
> LSAdapter adapter = new LSAdapter(client, <some_metadata>);
> adapter.start();
> {code}
> Below is my LSAdapter class :
> {code:title=LSAdapter.java|borderStyle=solid}
> public class LSAdapter extends LeaderSelectorListenerAdapter implements Closeable {
>     //<Class instance variables defined>
>
>     public LSAdapter(CuratorFramework client, <some_metadata>) {
>         leaderSelector = new LeaderSelector(client, <path_to_be_used_for_leader_election>, this);
>         leaderSelector.autoRequeue();
>     }
>
>     public void start() throws IOException {
>         leaderSelector.start();
>     }
>
>     @Override
>     public void close() throws IOException {
>         leaderSelector.close();
>     }
>
>     @Override
>     public void takeLeadership(CuratorFramework client) throws Exception {
>         final int waitSeconds = (int) (5 * Math.random()) + 1;
>         LOGGER.info(name + " is now the leader. Waiting " + waitSeconds + " seconds...");
>         LOGGER.debug(name + " has been leader " + leaderCount.getAndIncrement() + " time(s) before.");
>         while (true) {
>             try {
>                 Thread.sleep(TimeUnit.SECONDS.toMillis(waitSeconds));
>                 //do leader tasks
>             } catch (InterruptedException e) {
>                 LOGGER.error(name + " was interrupted.");
>                 //cleanup
>                 /* The code creates a znode here. If the client's current state is CLOSED,
>                    this line throws an exception, so takeLeadership() exits. If the client
>                    state is STARTED, the znode should be created. When LSAdapter.close() is
>                    called, the client state will always be CLOSED at this line, and an
>                    exception is expected to be thrown. */
>                 // This line always throws an exception when the client state is CLOSED,
>                 // which is why takeLeadership() exits.
>                 ZookeeperUtil.createEphemeral(client, <some_path>);
>                 Thread.currentThread().interrupt();
>             } finally {
>             }
>         }
>     }
> }
> {code}
> 4. When the server instance shuts down, I close the LSAdapter instance (the 
> one the application is using) and then close the CuratorFramework client:
> {code:title=PreDestroy code|borderStyle=solid}
> CloseableUtils.closeQuietly(lsAdapter);
> curatorFrameworkClient.close();
> {code}
> The issue I am facing is that sometimes, when the server is restarted, no 
> leader gets elected. I verified this by tracing the log inside 
> takeLeadership(). I have two Tomcat server instances running the above code 
> and connecting to the same ZooKeeper quorum. Most of the time one of the 
> instances becomes the leader, but when this issue happens, both of them 
> become followers. Please suggest what I am doing wrong.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
