[
https://issues.apache.org/jira/browse/CURATOR-326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gerd Behrmann updated CURATOR-326:
----------------------------------
Description:
Calling CuratorFramework#createContainers on a non-existing path while the
client is not yet connected to the server (e.g. server is down while client is
starting) fails silently if enough time goes by before the server is started.
The following unit test demonstrates the issue:
{code:java}
package dmg.cells.zookeeper;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.RetryForever;
import org.apache.curator.test.TestingServer;
import org.apache.curator.test.Timing;
import org.junit.Test;

import static junit.framework.TestCase.assertNotNull;

public class CuratorTest
{
    @Test
    public void createContainersTest() throws Exception
    {
        // Create the test server without starting it.
        TestingServer server = new TestingServer(false);
        Timing timing = new Timing();
        CuratorFramework client =
            CuratorFrameworkFactory.newClient(server.getConnectString(), timing.session(),
                                              timing.connection(), new RetryForever(100));
        try {
            // Start the server in the background after a 30-second delay.
            new Thread() {
                @Override
                public void run()
                {
                    try {
                        Thread.sleep(30000);
                        server.start();
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }.start();

            // The client starts while the server is still down.
            client.start();
            client.createContainers("/this/does/not/exist");
            assertNotNull(client.checkExists().forPath("/this/does/not/exist"));
        } finally {
            client.close();
            server.stop();
        }
    }
}
{code}
The length of the delay before the server is started matters. With a sleep of
only 10 seconds, the unit test passes. A sleep of 30 seconds triggers a code
path in Curator in which CuratorFramework#createContainers waits until the
server is started, yet returns without an exception and without creating the
path, so the assertion fails.
I tracked the issue down to ExistsBuilderImpl#pathInForeground, in which the
call to ZKPaths#mkdirs is wrapped in a try-catch that ignores the exception.
Thus the failed operation is neither retried nor propagated.
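To illustrate why an ignored exception makes the failure invisible to the caller, here is a minimal, self-contained sketch. It is not Curator's code: ConnectionLossException, Operation, ensureSwallowing and ensurePropagating are hypothetical names introduced only for this illustration.

```java
class SwallowedFailureSketch {
    /** Stand-in for a failed server call; hypothetical, not a ZooKeeper or Curator class. */
    static class ConnectionLossException extends Exception {
        ConnectionLossException(String message) { super(message); }
    }

    /** Hypothetical operation that may fail, such as creating parent nodes. */
    interface Operation {
        void run() throws ConnectionLossException;
    }

    /** The reported pattern: the exception is caught and dropped. */
    static void ensureSwallowing(Operation op) {
        try {
            op.run();
        } catch (ConnectionLossException e) {
            // ignored: the caller cannot tell that the operation failed
        }
    }

    /** Alternative: let the failure reach the caller, who can retry or report it. */
    static void ensurePropagating(Operation op) throws ConnectionLossException {
        op.run();
    }

    /** Runs an always-failing operation through one of the two variants. */
    static String outcome(boolean swallow) {
        Operation failing = () -> { throw new ConnectionLossException("not connected"); };
        try {
            if (swallow) {
                ensureSwallowing(failing);
            } else {
                ensurePropagating(failing);
            }
            return "returned normally";
        } catch (ConnectionLossException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("swallowing:  " + outcome(true));
        System.out.println("propagating: " + outcome(false));
    }
}
```

With the swallowing variant the caller sees a normal return even though nothing was created, which matches the behavior the unit test exposes.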
Specifically, this causes silent problems with PathChildrenCache, as it uses
EnsureContainers#ensure during startup to ensure that the path exists. This
internally calls the above createContainers. When that call fails silently, the
recipe fails to register a watcher on the non-existing path, and consequently
the cache stays empty even when the server is finally started and the path is
populated by another client.
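One defensive pattern a caller could use is to verify that the path exists after the ensure step and fail loudly otherwise, much as the assertion in the unit test does. The sketch below assumes a hypothetical PathClient interface and an in-memory FakeClient; neither is part of Curator, and this is not how the recipe is implemented.

```java
import java.util.HashSet;
import java.util.Set;

class EnsureThenVerifySketch {
    /** Hypothetical minimal view of a client; not Curator's API. */
    interface PathClient {
        void createContainers(String path); // may fail silently, as reported
        boolean exists(String path);
    }

    /** Ensures the path, then verifies it instead of trusting a normal return. */
    static void ensureOrFail(PathClient client, String path) {
        client.createContainers(path);
        if (!client.exists(path)) {
            throw new IllegalStateException("Path was not created: " + path);
        }
    }

    /** In-memory fake: when not connected, createContainers silently does nothing. */
    static class FakeClient implements PathClient {
        private final Set<String> paths = new HashSet<>();
        private final boolean connected;

        FakeClient(boolean connected) {
            this.connected = connected;
        }

        @Override
        public void createContainers(String path) {
            if (connected) {
                paths.add(path);
            }
        }

        @Override
        public boolean exists(String path) {
            return paths.contains(path);
        }
    }

    public static void main(String[] args) {
        ensureOrFail(new FakeClient(true), "/this/does/not/exist");
        System.out.println("connected: path ensured");
        try {
            ensureOrFail(new FakeClient(false), "/this/does/not/exist");
        } catch (IllegalStateException e) {
            System.out.println("disconnected: " + e.getMessage());
        }
    }
}
```

A check like this turns the silent failure into an explicit one, so a caller has the chance to retry before registering watchers on a path that does not exist.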
> createContainers fails silently if client is not connected
> ----------------------------------------------------------
>
> Key: CURATOR-326
> URL: https://issues.apache.org/jira/browse/CURATOR-326
> Project: Apache Curator
> Issue Type: Bug
> Components: Framework
> Affects Versions: 2.10.0
> Reporter: Gerd Behrmann
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)