[ https://issues.apache.org/jira/browse/STORM-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
P. Taylor Goetz resolved STORM-1696.
------------------------------------
    Resolution: Fixed

> Backpressure flag not sync if zookeeper connection errors
> ---------------------------------------------------------
>
>                 Key: STORM-1696
>                 URL: https://issues.apache.org/jira/browse/STORM-1696
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 1.0.0, 2.0.0
>            Reporter: Zhuo Liu
>            Assignee: Zhuo Liu
>            Priority: Blocker
>             Fix For: 2.0.0, 1.0.1
>
>
> When a zk exception happens during worker-backpressure!, the worker is
> left in a bad state that can block the topology from running normally.
> The root cause is in worker/mk-backpressure-handler: if
> worker-backpressure! fails once due to a zk connection exception, then the
> next time this method gets called by WorkerBackpressureThread, the condition
> (not= prev-backpressure-flag curr-backpressure-flag) will never be true,
> so the remote zk node cannot be synced with the local state.
> This also explains why we do not see any problem when testing in a stable
> environment (where zk never fails).
> The solution is quite straightforward: first change the zk status and, only
> if that succeeds, change the local status.
> This fixes the hidden bug and removes redundant flags in executor-data and
> worker-data (since we can get the executor status directly from the
> "_throttleOn" boolean in the DisruptorQueue).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
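
The ordering problem described in the issue can be illustrated with a small, self-contained Java sketch. This is a simplified model and not the actual Storm code: `RemoteFlagStore`, `FlakyStore`, `LocalFirstHandler` and `RemoteFirstHandler` are hypothetical names, and the "zk write" is simulated by an in-memory store that throws on its first call. It only aims to show why updating local state before the remote write can leave the two permanently out of sync, and why writing remotely first allows a retry.

```java
import java.io.IOException;

public class BackpressureSyncSketch {

    /** Hypothetical stand-in for the remote (zk) write; not a Storm API. */
    interface RemoteFlagStore {
        void write(boolean flag) throws IOException;
    }

    /** Simulated store that fails the first `failures` writes, then succeeds. */
    static final class FlakyStore implements RemoteFlagStore {
        private int failuresLeft;
        boolean remoteFlag = false;

        FlakyStore(int failures) {
            this.failuresLeft = failures;
        }

        @Override
        public void write(boolean flag) throws IOException {
            if (failuresLeft > 0) {
                failuresLeft--;
                throw new IOException("simulated zk connection error");
            }
            remoteFlag = flag;
        }
    }

    /** Problematic ordering: local flag changes even when the remote write fails. */
    static final class LocalFirstHandler {
        private boolean prev = false;

        void sync(boolean curr, RemoteFlagStore store) {
            if (prev != curr) {
                prev = curr; // local state updated first
                try {
                    store.write(curr);
                } catch (IOException e) {
                    // Remote is now stale, and prev == curr blocks any retry.
                }
            }
        }
    }

    /** Ordering from the issue: change remote first, then local only on success. */
    static final class RemoteFirstHandler {
        private boolean prev = false;

        void sync(boolean curr, RemoteFlagStore store) {
            if (prev != curr) {
                try {
                    store.write(curr);
                    prev = curr; // reached only if the remote write succeeded
                } catch (IOException e) {
                    // prev is unchanged, so the next call retries the write.
                }
            }
        }
    }

    public static void main(String[] args) {
        FlakyStore a = new FlakyStore(1);
        LocalFirstHandler localFirst = new LocalFirstHandler();
        localFirst.sync(true, a); // remote write fails
        localFirst.sync(true, a); // skipped: flags look equal locally
        System.out.println("local-first remote flag:  " + a.remoteFlag); // false

        FlakyStore b = new FlakyStore(1);
        RemoteFirstHandler remoteFirst = new RemoteFirstHandler();
        remoteFirst.sync(true, b); // remote write fails, local unchanged
        remoteFirst.sync(true, b); // retried and succeeds
        System.out.println("remote-first remote flag: " + b.remoteFlag); // true
    }
}
```

With a store that never fails, both handlers behave identically, which matches the observation in the issue that the problem does not show up in a stable test environment.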