[ https://issues.apache.org/jira/browse/CURATOR-595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Francesco Nigro updated CURATOR-595:
------------------------------------
    Description: 
I'm not sure this is the right place to raise this, but I've added this test to 
TestInterProcessSemaphore:

{code:java}
    @Test
    public void testAcquireAfterLostServerOnRestart() throws Exception {
        final int sessionTimeout = 4000;
        final int connectionTimeout = 2000;
        // First client: take the only lease, then lose the server (RetryNTimes(0, 1) means no retries).
        try (CuratorFramework client = CuratorFrameworkFactory.newClient(
                server.getConnectString(), sessionTimeout, connectionTimeout, new RetryNTimes(0, 1))) {
            client.start();
            client.blockUntilConnected();
            final InterProcessSemaphoreV2 semaphore = new InterProcessSemaphoreV2(client, "/1", 1);
            assertNotNull(semaphore.acquire());
            CountDownLatch lost = new CountDownLatch(1);
            client.getConnectionStateListenable().addListener((client1, newState) -> {
                if (newState == ConnectionState.LOST) {
                    lost.countDown();
                }
            });
            // Stop the server and wait until the client reports LOST before closing it.
            server.stop();
            lost.await();
        }
        server.restart();
        // Second client: the lease is expected to be free once the first session has expired.
        try (CuratorFramework client = CuratorFrameworkFactory.newClient(
                server.getConnectString(), sessionTimeout, connectionTimeout, new RetryNTimes(0, 1))) {
            client.start();
            client.blockUntilConnected();
            final InterProcessSemaphoreV2 semaphore = new InterProcessSemaphoreV2(client, "/1", 1);
            final int serverTick = ZooKeeperServer.DEFAULT_TICK_TIME;
            // Wait for the session timeout plus one server tick, then try a non-blocking acquire.
            Thread.sleep(sessionTimeout + serverTick);
            assertNotNull(semaphore.acquire(0, TimeUnit.SECONDS));
        }
    }
{code}
This test does not pass, even though the InterProcessSemaphoreV2 documentation states:
bq. "However, if the client session drops (crash, etc.), any leases held by the client are automatically closed and made available to other clients." 
Maybe I'm missing something obvious in the ZK server config instead.
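
The quoted sentence ties the lifetime of a lease to the lifetime of the session that holds it. As a rough mental model only, here is a toy in-memory sketch of that rule. It is not Curator or ZooKeeper code: the class {{ToyLeaseServer}} and all of its methods are hypothetical names made up for illustration, and time is passed in explicitly instead of using a real clock. Under these assumptions, a lease held by a session that went silent without being closed stays taken until the server side expires that session:

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Toy model (hypothetical, not Curator/ZooKeeper): leases are owned by sessions. */
final class ToyLeaseServer {
    private final int maxLeases;
    private final long sessionTimeoutMs;
    // sessionId -> time (ms) at which the server last heard from that session
    private final Map<Long, Long> lastHeard = new HashMap<>();
    // sessionId -> number of leases that session currently holds
    private final Map<Long, Integer> leasesBySession = new HashMap<>();
    private long nextSessionId = 1;

    ToyLeaseServer(int maxLeases, long sessionTimeoutMs) {
        this.maxLeases = maxLeases;
        this.sessionTimeoutMs = sessionTimeoutMs;
    }

    long openSession(long nowMs) {
        long id = nextSessionId++;
        lastHeard.put(id, nowMs);
        return id;
    }

    /** Explicit close: the session and its leases go away immediately. */
    void closeSession(long sessionId) {
        lastHeard.remove(sessionId);
        leasesBySession.remove(sessionId);
    }

    /** Drops every session that has been silent for longer than the timeout, with its leases. */
    void expireSessions(long nowMs) {
        Iterator<Map.Entry<Long, Long>> it = lastHeard.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Long> entry = it.next();
            if (nowMs - entry.getValue() > sessionTimeoutMs) {
                leasesBySession.remove(entry.getKey());
                it.remove();
            }
        }
    }

    /** Non-blocking acquire: returns true if a lease was granted to the session. */
    boolean tryAcquire(long sessionId, long nowMs) {
        expireSessions(nowMs);
        if (!lastHeard.containsKey(sessionId)) {
            return false; // the caller's own session has expired or was closed
        }
        lastHeard.put(sessionId, nowMs); // any request counts as contact
        int held = 0;
        for (int n : leasesBySession.values()) {
            held += n;
        }
        if (held >= maxLeases) {
            return false;
        }
        leasesBySession.merge(sessionId, 1, Integer::sum);
        return true;
    }
}
```

In this model, a second session that asks for the single lease before the first session's timeout has elapsed is refused, and the same request succeeds once the timeout has passed. Whether the real server behaves this way in the failing test is exactly what the report is asking.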

I just checked this by running the same test across separate processes:
# start the server on process A
# start a lease acquire on process B, which listens for LOST events and then kills itself
# restart the server on process A, which causes process B to kill itself (as expected)
# start a lease acquire on process C, which now succeeds

It seems that there is something going on in the intra-process case that's not 
working as expected (to me, at least).

NOTE: as written in newer comments, raising the timeout doesn't seem to work 
either, and different boxes are getting different outcomes (making this an 
intermittent failure).
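
Since the outcome varies between boxes, it may help to measure how long the lease takes to become available instead of sleeping for a fixed interval. The helper below is only a sketch of that idea: {{Probe}}, {{Check}} and {{millisUntil}} are hypothetical names, not part of Curator or of the existing test code.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical diagnostic helper: retries a check and reports how long it took to pass. */
final class Probe {
    /** A check that may throw, such as a non-blocking acquire attempt. */
    interface Check {
        boolean attempt() throws Exception;
    }

    /**
     * Retries the check until it returns true or the timeout elapses.
     * Returns the elapsed milliseconds on success, or -1 on timeout or interruption.
     */
    static long millisUntil(Check check, long timeout, TimeUnit unit, long pollMs) {
        final long start = System.nanoTime();
        final long limit = unit.toNanos(timeout);
        while (true) {
            boolean passed;
            try {
                passed = check.attempt();
            } catch (Exception e) {
                throw new IllegalStateException("check failed", e);
            }
            if (passed) {
                return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            }
            if (System.nanoTime() - start >= limit) {
                return -1;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return -1;
            }
        }
    }
}
```

In the test above, the fixed sleep and the final assertion could be replaced by something like {{Probe.millisUntil(() -> semaphore.acquire(0, TimeUnit.SECONDS) != null, 30, TimeUnit.SECONDS, 250)}}, and the returned value logged per machine. The 30 second limit and the 250 ms poll interval are arbitrary example values.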



  was:
I'm not sure this is the right place to raise this, but I've added this test on 
TestInterProcessSemaphore:

{code:java}
    @Test
    public void testAcquireAfterLostServerOnRestart() throws Exception {
        final int sessionTimout = 4000;
        final int connectionTimout = 2000;
        try (CuratorFramework client = 
CuratorFrameworkFactory.newClient(server.getConnectString(), sessionTimout, 
connectionTimout, new RetryNTimes(0, 1))) {
            client.start();
            client.blockUntilConnected();
            final InterProcessSemaphoreV2 semaphore = new 
InterProcessSemaphoreV2(client, "/1", 1);
            assertNotNull(semaphore.acquire());
            CountDownLatch lost = new CountDownLatch(1);
            client.getConnectionStateListenable().addListener((client1, 
newState) -> {
                if (newState == ConnectionState.LOST) {
                    lost.countDown();
                }
            });
            server.stop();
            lost.await();
        }
        server.restart();
        try (CuratorFramework client = 
CuratorFrameworkFactory.newClient(server.getConnectString(), sessionTimout, 
connectionTimout, new RetryNTimes(0, 1))) {
            client.start();
            client.blockUntilConnected();
            final InterProcessSemaphoreV2 semaphore = new 
InterProcessSemaphoreV2(client, "/1", 1);
            final int serverTick = ZooKeeperServer.DEFAULT_TICK_TIME
            Thread.sleep(sessionTimout + serverTick);
            assertNotNull(semaphore.acquire(0, TimeUnit.SECONDS));
        }
    }
{code}
And this is not passing: the doc of InterProcessSemaphoreV2 state that 
bq. "However, if the client session drops (crash, etc.), any leases held by the 
client are automatically closed and made available to other clients." 
maybe I'm missing something obvious on the ZK server config instead.

Just checked out that by running on separated processes the same test:
# start server on process A
# start lease acquire on process B, listening for LOST events before suicide
# restart server on Process A cause process B to suicide (as expected)
# start lease acquire on process C, now succeed

It seems that there is something going on in the intra-process case that's not 
working as expected (to me, at least).




> InterProcessSemaphoreV2 LOST isn't releasing permits for other clients
> ----------------------------------------------------------------------
>
>                 Key: CURATOR-595
>                 URL: https://issues.apache.org/jira/browse/CURATOR-595
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Recipes
>    Affects Versions: 5.1.0
>            Reporter: Francesco Nigro
>            Assignee: Jordan Zimmerman
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
