Neha Gupta created STORM-3705:
---------------------------------

             Summary: Storm UI and nimbus CLI not working
                 Key: STORM-3705
                 URL: https://issues.apache.org/jira/browse/STORM-3705
             Project: Apache Storm
          Issue Type: Bug
          Components: storm-core
    Affects Versions: 1.2.1
            Reporter: Neha Gupta


While deploying topologies through nimbus, the topologies hit errors caused by 
a wrong configuration, so every request in those topologies was failing. Our 
code has logic that issues the storm deactivate command to nimbus for a 
topology once that topology observes more errors than a particular threshold.
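
The threshold logic above can be sketched roughly as follows. This is a minimal illustration, not our actual code: the class name ErrorThresholdGuard, the helper run_storm_cli, and the parameter names are hypothetical, and it assumes the storm CLI is on the PATH so that "storm deactivate <topology-name>" can be invoked.

```python
import subprocess
from typing import Callable, Sequence


def run_storm_cli(args: Sequence[str]) -> int:
    """Run the storm CLI and return its exit code (assumes `storm` is on PATH)."""
    return subprocess.run(["storm", *args], check=False).returncode


class ErrorThresholdGuard:
    """Hypothetical helper: deactivate a topology once errors exceed a threshold."""

    def __init__(self, topology_name: str, threshold: int,
                 run_cli: Callable[[Sequence[str]], int] = run_storm_cli):
        self.topology_name = topology_name
        self.threshold = threshold
        self.run_cli = run_cli          # injectable so the logic can be tested
        self.error_count = 0
        self.deactivate_requested = False

    def record_error(self) -> bool:
        """Count one failed request; return True if this call issued the deactivate."""
        self.error_count += 1
        if self.error_count > self.threshold and not self.deactivate_requested:
            # Issue the command only once, even if errors keep arriving.
            self.deactivate_requested = True
            # Equivalent to running: storm deactivate <topology-name>
            self.run_cli(["deactivate", self.topology_name])
            return True
        return False
```

With a wrong configuration every request fails, so a guard like this crosses its threshold quickly and each affected topology sends a deactivate request to nimbus shortly after being restarted.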

The topologies were being restarted via a script.

The scenario was:
2 topologies were restarted successfully but were facing errors due to the 
wrong configuration. Because of this, the deactivate command was submitted to 
nimbus.
The 3rd topology was killed successfully, and the command to resubmit it was 
received successfully by nimbus.

At this point, the Storm UI stopped responding completely. When we tried to run 
the kill command via the CLI on nimbus, it did not work either and stayed stuck.

At this point, the 2 topologies with errors were still running because their 
deactivation through nimbus had not succeeded, and the 3rd topology had not 
been restarted by nimbus.

Any other command run on nimbus also stayed stuck until nimbus was stopped on 
the leader machine and another nimbus became leader.

There are no helpful entries in the nimbus, supervisor, or worker logs of the 
affected topologies, apart from the zookeeper info log below:
{color:#FF0000}zookeeper [INFO] exceptionorg.apache.storm.shade.org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /errors/<AFFECTED-TOPOLOGY-ID>/<BOLTNAME-WITH_ERRORS>/e0000001494{color}

Storm version: 1.2.1



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
