[
https://issues.apache.org/jira/browse/ZOOKEEPER-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gunnar Wagenknecht resolved ZOOKEEPER-1065.
-------------------------------------------
Resolution: Invalid
Thanks very much. I didn't understand the logs properly. That help me a lot in
setting the right breakpoint and understanding the issue further. It's within
my application, well, sort of.
> Possible timing issue in embedded server
> ----------------------------------------
>
> Key: ZOOKEEPER-1065
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1065
> Project: ZooKeeper
> Issue Type: Bug
> Components: java client, server
> Affects Versions: 3.3.3
> Environment: Windows 7, 32bit, Core2 Duo T9300, JDK 1.6.0_24,
> ZooKeeper data on 500GB hybrid Seagate HDD with 4GB SSD cache
> Reporter: Gunnar Wagenknecht
> Attachments: zookeeper-nonode-issue.log
>
>
> I have an application that uses ZooKeeper. There is an ensemble in
> production. But in order to simplify development the application will
> start an embedded ZooKeeper server when started in development mode. We
> are experiencing a timing issue with ZooKeeper 3.3.3 and I was wondering
> if this is allowed to be happen or if we did something wrong when
> starting the embedded server.
> Basically, we have a watch registered using an #exists call and watch
> code like the following.
> {code}
> @Override
> public void process(final WatchedEvent event) {
> switch (event.getType()) {
> ...
> case NodeCreated:
> pathCreated(event.getPath());
> break;
> ...
> }
> }
> @Override
> protected void pathCreated(final String path) {
> // process events only for this node
> if (!isMyPath(path))
> return;
> try {
> loadNode(); // calls zk.getData(String, Watcher, Stat)
> } catch (final Exception e) {
> // got NoNodeException here (but not when debugging)
> log(..., e)
> }
> }
> {code}
> From inspecting the logs we noticed a NoNodeException. When setting
> breakpoints on #loadNode and stepping through we don't get the
> exception. But when setting a breakpoint on #log only we got a hit and
> could confirm the issue this way.
> The path is actually some levels deep. All the parent paths don't exist
> either so they are created as well. However, no exception is thrown fro
> them. The sequence is as follows.
> {noformat}
> /l1 --> watch triggered, getData, no exception
> /l1/l2 --> watch triggered, getData, no exception
> /l1/l2/l3 --> watch triggered, getData, no exception
> /l1/l2/l3/l4 --> watch triggered, getData, no exception
> /l1/l2/l3/l4/l5 --> watch triggered, getData, no exception
> /l1/l2/l3/l4/l5/l6 --> watch triggered, getData, NoNodeException
> {noformat}
> The only difference is that all paths up to including l5 do not actually
> have any data. Only l6 has some data. Could there be some latency issues?
> For completeness, the embedded server is started as follows.
> {code}
> // disable LOG4J JMX stuff
> System.setProperty("zookeeper.jmx.log4j.disable", Boolean.TRUE.toString());
> // get directories
> final File dataDir = new File(config.getDataLogDir());
> final File snapDir = new File(config.getDataDir());
> // clean old logs
> PurgeTxnLog.purge(dataDir, snapDir, 3);
> // create standalone server
> zkServer = new ZooKeeperServer();
> zkServer.setTxnLogFactory(new FileTxnSnapLog(dataDir, snapDir));
> zkServer.setTickTime(config.getTickTime());
> zkServer.setMinSessionTimeout(config.getMinSessionTimeout());
> zkServer.setMaxSessionTimeout(config.getMaxSessionTimeout());
> factory = new NIOServerCnxn.Factory(config.getClientPortAddress(),
> config.getMaxClientCnxns());
> // start server
> LOG.info("Starting ZooKeeper standalone server.");
> try {
> factory.startup(zkServer);
> } catch (final InterruptedException e) {
> LOG.warn("Interrupted during server start.", e);
> Thread.currentThread().interrupt();
> }
> {code}
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira