Neha Gupta created STORM-3705:
---------------------------------
Summary: Storm UI and nimbus CLI not working
Key: STORM-3705
URL: https://issues.apache.org/jira/browse/STORM-3705
Project: Apache Storm
Issue Type: Bug
Components: storm-core
Affects Versions: 1.2.1
Reporter: Neha Gupta
While deploying a topology on nimbus, a wrong configuration caused errors in
the topology, and every request in it was failing. Our code has logic that
issues the storm deactivate command for a topology to nimbus once that topology
observes more errors than a particular threshold.
The topologies were being restarted via a script.
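For reference, the threshold-based deactivation described above could look
roughly like the sketch below. This is not our production code: the class name
ErrorThresholdGuard, its methods, and the Runnable callback are hypothetical
and only illustrate the idea. The callback stands in for whatever actually
issues the deactivate request to nimbus.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: counts errors and fires a deactivation action once. */
class ErrorThresholdGuard {
    private final int threshold;
    // Assumption: stands in for the call that asks nimbus to deactivate the topology.
    private final Runnable deactivateAction;
    private final AtomicInteger errorCount = new AtomicInteger();
    private final AtomicBoolean fired = new AtomicBoolean(false);

    ErrorThresholdGuard(int threshold, Runnable deactivateAction) {
        this.threshold = threshold;
        this.deactivateAction = deactivateAction;
    }

    /** Records one error; returns true if this call triggered deactivation. */
    boolean recordError() {
        int count = errorCount.incrementAndGet();
        // Fire only when the count exceeds the threshold, and only once,
        // so repeated errors do not send repeated deactivate requests.
        if (count > threshold && fired.compareAndSet(false, true)) {
            deactivateAction.run();
            return true;
        }
        return false;
    }

    /** Number of errors recorded so far. */
    int errorCount() {
        return errorCount.get();
    }
}
```

In a setup like ours, the callback would be the piece that runs the storm
deactivate command for the topology. Firing it only once matters here, since a
nimbus that is already unresponsive would otherwise receive a new request for
every further error.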
The scenario was:
Two topologies were restarted successfully but hit errors because of the wrong
configuration, so the deactivate command was submitted to nimbus for them.
A third topology was killed successfully, and nimbus successfully received the
command to resubmit it.
At this point, the Storm UI stopped responding completely. When we tried to run
the kill command via the CLI on nimbus, it did not work either and hung.
The two topologies with errors were still running, because their deactivation
via nimbus had not succeeded, and the third topology had not been restarted
via nimbus.
Any other command run on nimbus also hung until nimbus was stopped on the
leader machine and another nimbus was made leader.
There are no helpful entries in the nimbus, supervisor, or worker logs of the
affected topologies, apart from the zookeeper INFO log below:
{color:#FF0000}zookeeper [INFO] exception
org.apache.storm.shade.org.apache.zookeeper.KeeperException$NoNodeException:
KeeperErrorCode = NoNode for
/errors/<AFFECTED-TOPOLOGY-ID>/<BOLTNAME-WITH_ERRORS>/e0000001494{color}

Storm version: 1.2.1
--
This message was sent by Atlassian Jira
(v8.3.4#803005)