[ 
https://issues.apache.org/jira/browse/CURATOR-328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan Zimmerman resolved CURATOR-328.
--------------------------------------
    Resolution: Fixed

> PathChildrenCache fails silently if server is unavailable for sufficient time 
> when client starts
> ------------------------------------------------------------------------------------------------
>
>                 Key: CURATOR-328
>                 URL: https://issues.apache.org/jira/browse/CURATOR-328
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Recipes
>    Affects Versions: 3.1.0, 2.10.0
>            Reporter: Gerd Behrmann
>            Assignee: Jordan Zimmerman
>             Fix For: 2.10.1, 3.1.1
>
>
> When initializing the PathChildrenCache, if the curator client is not yet 
> connected to the ZooKeeper server (e.g. the server is down or the network 
> connection is unavailable), then the internal initialization of the cache 
> will eventually fail silently and the cache stays empty even after the client 
> finally connects to the server and the path is populated with znodes.
> The following unit test demonstrates the problem (the unit test is ugly as 
> the problem depends on timing, but it suffices to demonstrate the issue):
> {code:java}
>     @Test
>     public void pathChildrenCacheTest() throws Exception
>     {
>         // final so the anonymous Thread below can capture it on pre-Java 8 compilers
>         final TestingServer server = new TestingServer(false);
>         Timing timing = new Timing();
>         CuratorFramework client = CuratorFrameworkFactory.newClient(
>                 server.getConnectString(), timing.session(),
>                 timing.connection(), new ExponentialBackoffRetry(1000, 3));
>         try {
>             new Thread() {
>                 @Override
>                 public void run()
>                 {
>                     try {
>                         Thread.sleep(60000);
>                         server.start();
>                     } catch (Exception e) {
>                         e.printStackTrace();
>                     }
>                 }
>             }.start();
>             client.start();
>             PathChildrenCache cache = new PathChildrenCache(client, "/", true);
>             cache.start();
>             client.blockUntilConnected();
>             client.create().creatingParentContainersIfNeeded()
>                     .forPath("/baz", new byte[] {1, 2, 3});
>             assertNotNull("/baz does not exist",
>                     client.checkExists().forPath("/baz"));
>             /* Ugly hack for this test to ensure the cache got time to update itself. */
>             Thread.sleep(1000);
>             assertNotNull("cache doesn't see /baz",
>                     cache.getCurrentData("/baz"));
>         } finally {
>             client.close();
>             server.stop();
>         }
>     }
> {code}
> Here the server startup is delayed until after the curator client has been 
> started and the recipe has been created. Eventually the server starts and the 
> path is populated with data. The cache is given some time to update itself, 
> yet no data is visible and the second assertion fails.
> If the startup delay is reduced to, say, 20 seconds, the test passes.
> If the client is allowed to connect to the server before the recipe is 
> created, and then disconnects and reconnects afterwards, the test passes too.
> I tracked the problem down to the state change listener of the recipe. If the 
> connection to the server is down for long enough, the refresh call during the 
> background initialization eventually fails (ensurePath throws an exception). 
> That alone would not be a problem, because the recipe has a state change 
> listener and is notified when the client eventually connects to the server. 
> However, the handleStateChange method does not react to a CONNECTED event, 
> only to a RECONNECTED event. So if the client has been connected to the 
> server in the past, everything works; if this is its first connection, the 
> recipe does not react to the event and never refreshes itself.
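
The difference described in the report can be illustrated with a small, self-contained sketch. It does not use Curator and is not the patch applied for CURATOR-328. The `State` enum is a hypothetical stand-in that only mirrors the constant names of Curator's `ConnectionState`, and the class and method names are invented for illustration.

```java
public class StateChangeSketch {
    // Hypothetical stand-in; mirrors the constant names of Curator's ConnectionState.
    public enum State { CONNECTED, SUSPENDED, RECONNECTED, LOST, READ_ONLY }

    // Behaviour as described in the report: only RECONNECTED triggers a refresh,
    // so a cache started before the very first connection is never refreshed.
    public static boolean refreshesAsReported(State newState) {
        return newState == State.RECONNECTED;
    }

    // One possible remedy (an assumption, not the committed fix):
    // treat the first connection the same way as a reconnection.
    public static boolean refreshesWithFirstConnect(State newState) {
        switch (newState) {
            case CONNECTED:
            case RECONNECTED:
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        // Print, for every state, whether each variant would refresh the cache.
        for (State s : State.values()) {
            System.out.println(s + ": reported=" + refreshesAsReported(s)
                    + ", withFirstConnect=" + refreshesWithFirstConnect(s));
        }
    }
}
```

In this sketch the two variants differ only for `CONNECTED`, which is exactly the event a client emits when it reaches the server for the first time.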



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
