Anton Alfred created STORM-2817:
-----------------------------------
Summary: Topology Restart Counts are not maintained in Storm UI
Key: STORM-2817
URL: https://issues.apache.org/jira/browse/STORM-2817
Project: Apache Storm
Issue Type: Improvement
Components: storm-ui
Affects Versions: 1.0.2
Environment: CentOS7, Docker
Reporter: Anton Alfred
Priority: Minor
On the Storm UI, we need the ability to see a Topology Submission Time, a
Topology Uptime, and how many times a Topology worker process has
restarted since the last Submission.
The reason being: let's say we have a Supervisor with 8 GB RAM.
We also have 4 Slots on this Supervisor.
We submit 4 Topologies, each with worker memory of 3 GB, for a total of
12 GB requested against 8 GB available, on the assumption that not all
topologies will use all of their memory at the same time.
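To make the arithmetic concrete, here is a minimal Python sketch of the oversubscription in this scenario. The figures come from the description above; the variable and function names are made up for illustration and are not Storm configuration keys.

```python
# Figures taken from the scenario above; names are illustrative only.
SUPERVISOR_RAM_GB = 8
SLOTS = 4                 # one worker per slot, one topology per worker
WORKER_MEMORY_GB = 3      # configured maximum per worker


def oversubscription_ratio(ram_gb, workers, worker_memory_gb):
    """Return requested worker memory divided by physical memory."""
    return (workers * worker_memory_gb) / ram_gb


requested_gb = SLOTS * WORKER_MEMORY_GB
ratio = oversubscription_ratio(SUPERVISOR_RAM_GB, SLOTS, WORKER_MEMORY_GB)
print(f"requested {requested_gb} GB on a {SUPERVISOR_RAM_GB} GB host, ratio {ratio:.2f}")
```

Any ratio above 1.0 means the workers can exhaust the host's memory if enough of them reach their configured maximum at once.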
Now we find that topologies are dying behind the scenes due to out-of-memory
errors, and Storm Nimbus keeps restarting them.
The uptime requested as part of [STORM-2816]
(https://issues.apache.org/jira/browse/STORM-2816) would address the uptime,
but it still won't tell us that there is a deeper issue and that the topologies
are restarting behind the scenes. Adding this counter would help to flag such
issues.
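As a rough illustration of why uptime alone is not enough, and of what a restart counter adds, the sketch below derives a restart count from a series of polled worker uptimes. This is only an assumption about how an external monitor might approximate the counter; it is not how Storm tracks workers, and count_restarts is a hypothetical helper.

```python
def count_restarts(uptime_samples):
    """Count how often the observed uptime dropped between consecutive polls.

    uptime_samples: worker uptime in seconds, in the order it was polled.
    A drop from one sample to the next implies the worker was restarted.
    """
    restarts = 0
    for previous, current in zip(uptime_samples, uptime_samples[1:]):
        if current < previous:
            restarts += 1
    return restarts


# Uptime resets twice in this series (20 -> 5 and 15 -> 3).
print(count_restarts([10, 20, 5, 15, 3, 8]))
```

A poller like this misses restarts that happen more than once between two polls, which is one reason a counter maintained by Storm itself would be more reliable.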
The counts should be shown at two levels. One is the per-topology level, like:
Topology 1
Submission Time T1
Uptime T2
Restarts 4 (Possible log links to why restarted)
The other should be at the Storm UI level:
Total Topologies : 20
Total Topology Restarts since Submission : 12 (Possible links to topologies
that got restarted)
This way monitoring and alerting systems can hook into these counts and alert
when things go wrong.
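A minimal sketch of how a monitoring system might consume such counts, assuming they were available. The TopologyStatus record and its restarts field are hypothetical: they stand in for the counter this issue requests and do not correspond to any existing Storm UI field.

```python
from dataclasses import dataclass


@dataclass
class TopologyStatus:
    name: str
    restarts: int  # hypothetical: the per-topology counter requested here


def summarize(topologies, threshold=0):
    """Build the cluster-level view from per-topology restart counts."""
    restarted = [t.name for t in topologies if t.restarts > threshold]
    return {
        "total_topologies": len(topologies),
        "total_restarts": sum(t.restarts for t in topologies),
        "restarted_topologies": restarted,
    }


def should_alert(summary):
    """Alert as soon as any topology exceeded the restart threshold."""
    return bool(summary["restarted_topologies"])


sample = [
    TopologyStatus("topology-1", 4),
    TopologyStatus("topology-2", 0),
    TopologyStatus("topology-3", 8),
]
summary = summarize(sample)
print(summary, should_alert(summary))
```

The threshold parameter lets an operator tolerate a few restarts before alerting; the default of 0 flags any restart at all.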
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)