Ethan Li created STORM-2674:
-------------------------------
Summary: NoNodeException when ZooKeeper tries to delete nodes
Key: STORM-2674
URL: https://issues.apache.org/jira/browse/STORM-2674
Project: Apache Storm
Issue Type: Bug
Reporter: Ethan Li
When the [StormClusterStateImpl reportError
function|https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/cluster/StormClusterStateImpl.java#L652-L660]
is called, it gets all the children of
{code:java}
/storm/errors/<topo-id>/count/
{code}
and deletes the oldest znodes so that only the latest 10 errors are kept. A
NoNodeException can occur when another executor has already deleted one of
those znodes.
{code:java}
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /errors/fastwc-halferrors-1-1501689263/count/e0000000562
    at org.apache.storm.utils.Utils$2.run(Utils.java:345)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /errors/fastwc-halferrors-1-1501689263/count/e0000000562
    at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:489)
    at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:455)
    at org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:98)
    at org.apache.storm.utils.Utils$2.run(Utils.java:335)
    ... 1 more
Caused by: java.lang.RuntimeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /errors/fastwc-halferrors-1-1501689263/count/e0000000562
    at org.apache.storm.utils.Utils.wrapInRuntime(Utils.java:413)
    at org.apache.storm.zookeeper.ClientZookeeper.deleteNode(ClientZookeeper.java:165)
    at org.apache.storm.cluster.ZKStateStorage.delete_node(ZKStateStorage.java:139)
    at org.apache.storm.cluster.StormClusterStateImpl.reportError(StormClusterStateImpl.java:655)
    at org.apache.storm.executor.error.ReportError.report(ReportError.java:69)
    at org.apache.storm.executor.bolt.BoltOutputCollectorImpl.reportError(BoltOutputCollectorImpl.java:154)
    at org.apache.storm.task.OutputCollector.reportError(OutputCollector.java:234)
    at org.apache.storm.topology.BasicOutputCollector.reportError(BasicOutputCollector.java:70)
    at org.apache.storm.starter.FastWordCountTopology$WordCount.execute(FastWordCountTopology.java:113)
    at org.apache.storm.topology.BasicBoltExecutor.execute(BasicBoltExecutor.java:50)
    at org.apache.storm.executor.bolt.BoltExecutor.tupleActionFn(BoltExecutor.java:125)
    at org.apache.storm.executor.Executor.onEvent(Executor.java:255)
    at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:476)
    ... 4 more
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /errors/fastwc-halferrors-1-1501689263/count/e0000000562
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
    at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:250)
    at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:244)
    at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:109)
    at org.apache.curator.framework.imps.DeleteBuilderImpl.pathInForeground(DeleteBuilderImpl.java:241)
    at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:225)
    at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:35)
    at org.apache.storm.zookeeper.ClientZookeeper.deleteNode(ClientZookeeper.java:158)
    ... 15 more
{code}
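One possible way to tolerate this race is to treat a missing znode as already deleted, since the goal of the cleanup (the znode being gone) has been met either way. The sketch below is not Storm's code and does not use the ZooKeeper or Curator APIs: `ZnodeStore`, `InMemoryStore`, `pruneOldErrors`, and the local `NoNodeException` are hypothetical stand-ins chosen so the example is self-contained. It also assumes error znodes have fixed-width sequential names, like `e0000000562` in the trace, so that a lexicographic sort puts the oldest first.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ErrorNodePruner {

    /** Hypothetical stand-in for ZooKeeper's NoNodeException. */
    public static class NoNodeException extends Exception {
        public NoNodeException(String path) {
            super("KeeperErrorCode = NoNode for " + path);
        }
    }

    /** Hypothetical minimal view of the state storage used by the cleanup. */
    public interface ZnodeStore {
        List<String> getChildren(String parent);
        void delete(String path) throws NoNodeException;
    }

    /** In-memory store, for illustration only. Assumes a flat layout under each parent. */
    public static class InMemoryStore implements ZnodeStore {
        private final Set<String> nodes = ConcurrentHashMap.newKeySet();

        public void create(String path) {
            nodes.add(path);
        }

        @Override
        public List<String> getChildren(String parent) {
            String prefix = parent + "/";
            List<String> children = new ArrayList<>();
            for (String node : nodes) {
                if (node.startsWith(prefix)) {
                    children.add(node.substring(prefix.length()));
                }
            }
            return children;
        }

        @Override
        public void delete(String path) throws NoNodeException {
            // Mirrors the failure in the report: deleting a missing node throws.
            if (!nodes.remove(path)) {
                throw new NoNodeException(path);
            }
        }
    }

    /**
     * Deletes the oldest children of parent so that at most keep remain.
     * Returns how many znodes this caller actually deleted.
     */
    public static int pruneOldErrors(ZnodeStore store, String parent, int keep) {
        List<String> children = new ArrayList<>(store.getChildren(parent));
        Collections.sort(children); // fixed-width sequential names: oldest first
        int deleted = 0;
        for (int i = 0; i < children.size() - keep; i++) {
            try {
                store.delete(parent + "/" + children.get(i));
                deleted++;
            } catch (NoNodeException e) {
                // Another executor removed this znode between the listing and
                // the delete; nothing is left to do for it, so carry on.
            }
        }
        return deleted;
    }
}
```

With this shape, two executors that list the same children and prune concurrently both finish normally; each simply reports fewer deletions than it attempted. Whether to ignore the exception silently or log it at a low level is a design choice left open here.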