Anton Alfred created STORM-2817:
-----------------------------------

             Summary: Topology Restart Counts are not maintained in Storm UI
                 Key: STORM-2817
                 URL: https://issues.apache.org/jira/browse/STORM-2817
             Project: Apache Storm
          Issue Type: Improvement
          Components: storm-ui
    Affects Versions: 1.0.2
         Environment: CentOS7, Docker
            Reporter: Anton Alfred
            Priority: Minor


The Storm UI should show, for each topology, its Submission Time and 
Uptime, as well as how many times its worker processes have restarted 
since the last submission.

The reason is as follows. Say we have a Supervisor with 8 GB of RAM 
and 4 Slots.
We submit 4 Topologies, each with a worker memory of 3 GB. That commits a 
total of 12 GB against 8 GB of RAM, on the assumption that the topologies 
will not all use their full allocation at the same time.
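
The oversubscription in this scenario can be checked with a few lines of 
arithmetic. This is only a sketch using the numbers above; the variable 
names are illustrative and are not Storm configuration keys.

```python
# Figures taken from the scenario described above.
supervisor_ram_gb = 8    # RAM on the Supervisor
topologies = 4           # one topology worker per slot
worker_memory_gb = 3     # worker memory configured per topology

# Memory committed if every worker uses its full allocation.
committed_gb = topologies * worker_memory_gb
oversubscription = committed_gb / supervisor_ram_gb

print(f"committed: {committed_gb} GB of {supervisor_ram_gb} GB")
print(f"oversubscription ratio: {oversubscription:.2f}")
```

Any ratio above 1.0 means the workers can only coexist as long as they do 
not all reach their configured limit at once.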

Now we find that topologies are dying behind the scenes due to out-of-memory 
errors, and Storm Nimbus keeps restarting them.

The uptime request in [STORM-2816] 
(https://issues.apache.org/jira/browse/STORM-2816) addresses the uptime, 
but uptime alone won't tell us that there is a deeper issue and that the 
topologies are restarting behind the scenes. Adding this restart counter 
would help flag such issues.

The counts should be shown at two levels. The first is per topology, for 
example:

Topology 1 
     Submission Time T1
     Uptime T2
     Restarts 4 (Possibly with log links showing why it restarted)

The other should be at the overall Storm UI level, for example:

Total Topologies : 20
Total Topology Restarts since Submission : 12 (Possible links to topologies 
that got restarted)
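
To make the two levels concrete, here is a minimal sketch of how the 
per-topology counters could roll up into the UI-level totals. It is not 
Storm code: the TopologyStatus class, its field names, and the 
cluster_summary helper are all hypothetical. The request leaves open 
whether the UI-level figure counts restarts or restarted topologies, so 
the sketch reports both.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class TopologyStatus:
    """Hypothetical per-topology record; not an existing Storm type."""
    name: str
    submitted_at: datetime       # Submission Time
    worker_restarts: int = 0     # Restarts since last submission

    def uptime_seconds(self, now: Optional[datetime] = None) -> float:
        """Uptime measured from the submission time."""
        now = now or datetime.now(timezone.utc)
        return (now - self.submitted_at).total_seconds()


def cluster_summary(topologies: List[TopologyStatus]) -> Dict[str, object]:
    """Aggregate the per-topology counters into UI-level totals."""
    restarted = [t.name for t in topologies if t.worker_restarts > 0]
    return {
        "total_topologies": len(topologies),
        "total_restarts": sum(t.worker_restarts for t in topologies),
        "restarted_topologies": restarted,  # candidates for links in the UI
    }


submitted = datetime(2017, 11, 1, tzinfo=timezone.utc)  # made-up timestamp
statuses = [
    TopologyStatus("topology-1", submitted, worker_restarts=4),
    TopologyStatus("topology-2", submitted),
]
print(cluster_summary(statuses))
```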

This way monitoring and alerting systems can hook into these counts and alert 
when things go wrong.
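
As a rough illustration of such a hook, the sketch below filters a JSON 
payload for topologies whose restart count has reached a threshold. The 
payload shape, the "restarts" field, and the threshold value are 
assumptions made for this example; Storm UI does not expose a restart 
count today, which is the point of this request.

```python
import json
from typing import List

RESTART_THRESHOLD = 3  # hypothetical alerting threshold


def topologies_to_alert(payload: str, threshold: int = RESTART_THRESHOLD) -> List[str]:
    """Return names of topologies whose restart count is at or above the threshold."""
    data = json.loads(payload)
    return [
        t["name"]
        for t in data["topologies"]
        if t.get("restarts", 0) >= threshold  # "restarts" is a hypothetical field
    ]


# Made-up payload standing in for whatever the UI would eventually expose.
sample = json.dumps({
    "topologies": [
        {"name": "topology-1", "restarts": 4},
        {"name": "topology-2", "restarts": 0},
    ]
})
print(topologies_to_alert(sample))
```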




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
