errose28 commented on PR #9400:
URL: https://github.com/apache/ozone/pull/9400#issuecomment-3667660934
Thanks for adding this. I pulled up the Grafana chart in docker to look
around.
> IMO it would be even better if you can display "In Safe Mode" and "Exited
> Safe Mode" instead of the numerical 0 and 1.
+1 to Wei-Chiu's comment here. We can have text labels and see enter/exit
safemode trends over time with Grafana's [state
timeline](https://grafana.com/docs/grafana/latest/visualizations/panels-visualizations/visualizations/state-timeline/).
Can we switch the binary plot to use this instead? A red block would indicate
when an SCM was in safemode, and a green block would indicate that it is out.
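As a rough illustration of the state timeline suggestion, here is a minimal sketch that builds the panel JSON with value mappings for the text labels and colors. It assumes the metric reports 1 while the SCM is in safemode and 0 once it has exited; the metric name `scm_in_safemode` and the `hostname` legend label are hypothetical placeholders, not names taken from this PR.

```python
import json

# Hypothetical metric name, used only for illustration.
SAFEMODE_EXPR = "scm_in_safemode"


def safemode_state_timeline(expr: str = SAFEMODE_EXPR) -> dict:
    """Build a Grafana state timeline panel that labels 0/1 as text.

    Assumption: 1 means the SCM is in safemode, 0 means it has exited.
    """
    return {
        "type": "state-timeline",
        "title": "SCM Safe Mode",
        # One row per SCM, assuming the series carries a hostname label.
        "targets": [{"expr": expr, "legendFormat": "{{hostname}}"}],
        "fieldConfig": {
            "defaults": {
                "mappings": [
                    {
                        "type": "value",
                        "options": {
                            "1": {"text": "In Safe Mode", "color": "red", "index": 0},
                            "0": {"text": "Exited Safe Mode", "color": "green", "index": 1},
                        },
                    }
                ]
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(safemode_state_timeline(), indent=2))
```

The printed JSON is only a fragment of a panel definition; the rest of the dashboard in this PR would stay as it is.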
For the threshold to exit safemode on each rule, the two solid lines on top
of each other are difficult to read. We can either use a dashed line for the
target value, or use a gradient fill where the area at/above the threshold is
green and the area below is red. Also, the thresholds are expected to be the
same for all SCMs with HA. I think it would be easier to read if we just take
the max of thresholds returned by each SCM as a way to reduce this to a single
number, and plot that as the exit criteria without a corresponding hostname
label.
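A small sketch of the reduction described above, assuming the per-SCM thresholds are available as a hostname-to-value mapping. The function names and the metric name passed in are hypothetical. In PromQL, a `max(...)` aggregation with no `by` clause would give the same single series without a hostname label.

```python
from typing import Mapping


def exit_threshold(per_scm_thresholds: Mapping[str, float]) -> float:
    """Collapse the per-SCM thresholds into one exit criterion.

    The thresholds are expected to match across SCMs in HA, so taking
    the max is just a way to reduce them to a single number.
    """
    if not per_scm_thresholds:
        raise ValueError("no thresholds reported")
    return max(per_scm_thresholds.values())


def exit_threshold_query(metric: str) -> str:
    """Return a PromQL expression that drops all labels, including hostname."""
    return f"max({metric})"


if __name__ == "__main__":
    # Hypothetical sample values for a three-node SCM HA cluster.
    sample = {"scm1": 0.99, "scm2": 0.99, "scm3": 0.99}
    print(exit_threshold(sample))
    print(exit_threshold_query("some_rule_threshold_metric"))
```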
Can you share screenshots of what the updated dashboards look like in an SCM
HA cluster?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]