errose28 commented on PR #9400:
URL: https://github.com/apache/ozone/pull/9400#issuecomment-3667660934

   Thanks for adding this. I pulled up the Grafana chart in docker to look 
around.
   
   > IMO it would be even better if you can display "In Safe Mode" and "Exited 
Safe Mode" instead of the numerical 0 and 1.
    
   +1 to Wei-Chiu's comment here. We can have text labels and see enter/exit 
safemode trends over time with Grafana's [state 
timeline](https://grafana.com/docs/grafana/latest/visualizations/panels-visualizations/visualizations/state-timeline/).
 Can we switch the binary plot to use this instead? A red block would indicate 
when an SCM was in safemode, and a green block would indicate that it is out.
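
   As a rough sketch of that suggestion, the Python snippet below builds a state timeline panel definition with value mappings for the two states. The metric name `scm_safemode_in_safe_mode`, the `hostname` legend label, and the convention that 1 means "in safe mode" are assumptions for illustration and are not taken from this PR.

```python
import json

# Assumption: the metric reports 1 while the SCM is in safe mode and 0 after
# it exits. Swap the two keys if the dashboard's metric uses the opposite
# convention.
IN_SAFE_MODE = "1"
EXITED_SAFE_MODE = "0"


def safemode_state_timeline_panel(expr: str) -> dict:
    """Return a Grafana panel dict that renders a 0/1 series as labeled blocks."""
    return {
        "type": "state-timeline",
        "title": "SCM Safe Mode",
        # One series per SCM; the label name "hostname" is hypothetical.
        "targets": [{"expr": expr, "legendFormat": "{{hostname}}"}],
        "fieldConfig": {
            "defaults": {
                "mappings": [
                    {
                        "type": "value",
                        "options": {
                            IN_SAFE_MODE: {
                                "text": "In Safe Mode",
                                "color": "red",
                                "index": 0,
                            },
                            EXITED_SAFE_MODE: {
                                "text": "Exited Safe Mode",
                                "color": "green",
                                "index": 1,
                            },
                        },
                    }
                ]
            }
        },
    }


# Hypothetical metric name, used only to make the example concrete.
panel = safemode_state_timeline_panel("scm_safemode_in_safe_mode")
print(json.dumps(panel, indent=2))
```

   The printed JSON is only a fragment; a real dashboard file would also need a datasource, grid position, and panel ID.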
   
   For the threshold to exit safemode on each rule, the two solid lines on top 
of each other are difficult to read. We can either use a dashed line for the 
target value, or use a gradient fill where the area at/above the threshold is 
green and the area below is red. Also, the thresholds are expected to be the 
same for all SCMs with HA. I think it would be easier to read if we just take 
the max of thresholds returned by each SCM as a way to reduce this to a single 
number, and plot that as the exit criteria without a corresponding hostname 
label.
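
   The reduction described above can be illustrated with a minimal Python sketch. The hostnames and threshold values are made up, and `exit_threshold` is a hypothetical helper rather than anything in Ozone; in a Prometheus-backed panel the same idea would normally be a `max(...)` aggregation over the threshold metric.

```python
from typing import Mapping


def exit_threshold(thresholds_by_scm: Mapping[str, float]) -> float:
    """Collapse per-SCM thresholds for one rule into a single exit criterion.

    With SCM HA every SCM is expected to report the same threshold, so taking
    the max simply picks that shared value and drops the hostname dimension.
    """
    if not thresholds_by_scm:
        raise ValueError("no SCM reported a threshold")
    return max(thresholds_by_scm.values())


# Illustrative values only: three SCMs reporting the same threshold for a rule.
reported = {"scm1": 0.99, "scm2": 0.99, "scm3": 0.99}
single_line = exit_threshold(reported)
print(single_line)
```

   If one SCM ever reported a different threshold, this would plot the strictest (largest) value, so the single line would stay a conservative exit criterion.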
   
   Can you share screenshots of what the updated dashboards look like in an SCM 
HA cluster?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

