Re: [jira] [Commented] (CURATOR-466) LeaderSelector gets in an inconsistent state when releasing resources.

kenneth mcfarland Tue, 13 Nov 2018 12:57:02 -0800

Your error messages look a lot like ones I have been seeing for a year or
more. Are they related to the issue below?


https://issues.apache.org/jira/plugins/servlet/mobile#issue/CURATOR-468

We stopped using it and switched to another leader election class because
of the above issue; it was the only way to kill the spurious exceptions.

Once I can sit down and dig up the details, I'll tell you which selection
method we used.

Cheers!!



On Tue, Nov 13, 2018, 12:18 PM Mikhail Pryakhin (JIRA) <[email protected]
wrote:

>
>     [
> https://issues.apache.org/jira/browse/CURATOR-466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685697#comment-16685697
> ]
>
> Mikhail Pryakhin commented on CURATOR-466:
> ------------------------------------------
>
> [~randgalt] Thank you.
>
> Do I understand correctly that closing only the framework instance is the
> correct way for a client to give up participation in a leader election?
>
> > LeaderSelector gets in an inconsistent state when releasing resources.
> > ----------------------------------------------------------------------
> >
> >                 Key: CURATOR-466
> >                 URL: https://issues.apache.org/jira/browse/CURATOR-466
> >             Project: Apache Curator
> >          Issue Type: Bug
> >          Components: Recipes
> >    Affects Versions: 4.0.1
> >            Reporter: Mikhail Pryakhin
> >            Priority: Major
> >
> > I'm using the leader election recipe, which works well until the
> > application shuts down.
> > Here is my example:
> >
> > {code:java}
> > CuratorFramework framework = CuratorFrameworkFactory.builder()
> >     .connectString("localhost:2181")
> >     .retryPolicy(new RetryOneTime(100))
> >     .build();
> > LeaderSelector leaderSelector = new LeaderSelector(
> >     framework,
> >     "/path",
> >     new LeaderSelectorListener() {
> >         volatile boolean stopped;
> >         @Override
> >         public void takeLeadership(CuratorFramework client) throws Exception {
> >             System.out.println("I'm a new leader!");
> >             try {
> >                 while (!Thread.currentThread().isInterrupted() && !stopped) {
> >                     TimeUnit.SECONDS.sleep(1);
> >                 }
> >             } finally {
> >                 System.out.println("I'm not a leader anymore..");
> >             }
> >         }
> >         @Override
> >         public void stateChanged(CuratorFramework client, ConnectionState newState) {
> >             if (client.getConnectionStateErrorPolicy().isErrorState(newState)) {
> >                 stopped = true;
> >             }
> >         }
> >     }
> > );
> > framework.start();
> > leaderSelector.start();
> > TimeUnit.SECONDS.sleep(5);
> > leaderSelector.close();   //(1)
> > framework.close();        //(2){code}
> >
> > When I release resources by calling the close method first on the
> > LeaderSelector instance and then on the CuratorFramework instance (the
> > lines marked 1 and 2), I always get the following exception:
> >
> > {code:java}
> > java.lang.IllegalStateException: instance must be started before calling this method
> > at org.apache.curator.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:444) ~[curator-client-4.0.1.jar:?]
> > at org.apache.curator.framework.imps.CuratorFrameworkImpl.delete(CuratorFrameworkImpl.java:424) ~[curator-framework-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.recipes.locks.LockInternals.deleteOurPath(LockInternals.java:347) ~[curator-recipes-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.recipes.locks.LockInternals.releaseLock(LockInternals.java:124) ~[curator-recipes-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.recipes.locks.InterProcessMutex.release(InterProcessMutex.java:154) ~[curator-recipes-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.recipes.leader.LeaderSelector.doWork(LeaderSelector.java:449) [curator-recipes-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.recipes.leader.LeaderSelector.doWorkLoop(LeaderSelector.java:466) [curator-recipes-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.recipes.leader.LeaderSelector.access$100(LeaderSelector.java:65) [curator-recipes-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:246) [curator-recipes-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:240) [curator-recipes-4.0.1.jar:4.0.1]
> > at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_141]
> > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_141]
> > at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_141]
> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_141]
> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_141]
> > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
> > {code}
> >
> > The reason for the exception is that the non-blocking
> > LeaderSelector.close method delegates the call to the internal executor
> > service, which abruptly cancels the running futures with the
> > mayInterruptIfRunning flag set to true. Right after this, the
> > CuratorFramework close method is called. In the meantime, the future
> > being canceled executes its finally block, where it calls methods on the
> > already closed CuratorFramework instance, which throws the exception.
> > I thought I could wait a bit until the LeaderSelector instance is closed,
> > so I tried delaying for some time before closing the CuratorFramework
> > instance, but doing so leads to another exception:
> > {code:java}
> > java.lang.InterruptedException: null
> > at java.lang.Object.wait(Native Method) ~[?:1.8.0_141]
> > at java.lang.Object.wait(Object.java:502) ~[?:1.8.0_141]
> > at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1409) ~[zookeeper-3.4.12.jar:3.4.12--1]
> > at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:874) ~[zookeeper-3.4.12.jar:3.4.12--1]
> > at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:274) ~[curator-framework-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:268) ~[curator-framework-4.0.1.jar:4.0.1]
> > at org.apache.curator.connection.StandardConnectionHandlingPolicy.callWithRetry(StandardConnectionHandlingPolicy.java:64) ~[curator-client-4.0.1.jar:?]
> > at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:100) ~[curator-client-4.0.1.jar:?]
> > at org.apache.curator.framework.imps.DeleteBuilderImpl.pathInForeground(DeleteBuilderImpl.java:265) ~[curator-framework-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:249) ~[curator-framework-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:34) ~[curator-framework-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.recipes.locks.LockInternals.deleteOurPath(LockInternals.java:347) ~[curator-recipes-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.recipes.locks.LockInternals.releaseLock(LockInternals.java:124) ~[curator-recipes-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.recipes.locks.InterProcessMutex.release(InterProcessMutex.java:154) ~[curator-recipes-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.recipes.leader.LeaderSelector.doWork(LeaderSelector.java:449) [curator-recipes-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.recipes.leader.LeaderSelector.doWorkLoop(LeaderSelector.java:466) [curator-recipes-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.recipes.leader.LeaderSelector.access$100(LeaderSelector.java:65) [curator-recipes-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:246) [curator-recipes-4.0.1.jar:4.0.1]
> > at org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:240) [curator-recipes-4.0.1.jar:4.0.1]
> > at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_141]
> > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_141]
> > at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_141]
> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_141]
> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_141]
> > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
> > {code}
> > This time the exception is caused by the future being canceled with the
> > mayInterruptIfRunning flag set to true in the LeaderSelector close
> > method.
> > As the LeaderSelector implementation is based on InterProcessMutex,
> > which works with ephemeral nodes, do we really need to clean up manually
> > on shutdown? As far as I know, ephemeral nodes are deleted when the
> > client disconnects.
> >
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v7.6.3#76005)
>
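
For reference, the ordering problem described in the quoted report can be
shown without ZooKeeper. The following is only a minimal sketch and is not
Curator code: FakeClient, run, and the latch names are hypothetical
stand-ins invented for illustration, and the latches force the unlucky
interleaving that real timing only produces sometimes. It relies on the
documented behaviour of Future.cancel(true), which returns without waiting
for the task's finally block to finish.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class CancelThenCloseSketch {

    // Hypothetical stand-in for a client that rejects calls once closed.
    static final class FakeClient {
        private final AtomicBoolean open = new AtomicBoolean(true);

        void delete() {
            if (!open.get()) {
                throw new IllegalStateException("client already closed");
            }
        }

        void close() {
            open.set(false);
        }
    }

    // Returns true if the task's finally-block cleanup ran against an open client.
    static boolean run(boolean waitForCleanup) {
        FakeClient client = new FakeClient();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch running = new CountDownLatch(1);
        CountDownLatch clientClosed = new CountDownLatch(1);
        CountDownLatch cleanupDone = new CountDownLatch(1);
        AtomicBoolean cleanupOk = new AtomicBoolean(false);

        Callable<Void> task = () -> {
            try {
                running.countDown();
                TimeUnit.SECONDS.sleep(60); // pretend to hold leadership
            } catch (InterruptedException e) {
                // cancel(true) interrupted the sleep; fall through to cleanup
            } finally {
                try {
                    if (!waitForCleanup) {
                        clientClosed.await(); // force the unlucky interleaving
                    }
                    client.delete(); // analogous to releasing the lock node
                    cleanupOk.set(true);
                } catch (IllegalStateException e) {
                    // cleanup ran against an already closed client
                } finally {
                    cleanupDone.countDown();
                }
            }
            return null;
        };

        try {
            Future<Void> future = executor.submit(task);
            running.await();
            future.cancel(true); // returns immediately; cleanup is still pending
            if (waitForCleanup) {
                cleanupDone.await(); // close the client only after cleanup
            }
            client.close();
            clientClosed.countDown();
            cleanupDone.await();
            return cleanupOk.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while coordinating", e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("close after cleanup:  ok=" + run(true));
        System.out.println("close before cleanup: ok=" + run(false));
    }
}
```

With run(true) the client is closed only after the canceled task has
finished its cleanup, so the cleanup succeeds; with run(false) the client
is closed first and the cleanup hits the IllegalStateException. Whether
LeaderSelector could expose a similar "wait until cleanup is done" hook is
a separate question that this sketch does not answer.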
