Hello Friends,

Greetings!!

We are currently running Aurora 0.17.0 and have a use case where we want to 
continuously monitor the following SLA metrics for our clusters to detect 
anomalies:

  *   Median Time To Assigned (MTTA): <http://aurora.apache.org/documentation/latest/features/sla-metrics/#median-time-to-assigned-(mtta)>
  *   Median Time To Starting (MTTS): <http://aurora.apache.org/documentation/latest/features/sla-metrics/#median-time-to-starting-(mtts)>
  *   Median Time To Running (MTTR): <http://aurora.apache.org/documentation/latest/features/sla-metrics/#median-time-to-running-(mttr)>

Currently, sla_stat_refresh_interval is set to its default of 1 minute for us.

When we fetch the SLA metrics from the /vars API endpoint, Aurora samples the 
data used to calculate these metrics only for the last minute, once every 
minute. It won’t give us historical data for these metrics.

Does Aurora expose any API endpoint that provides historical data for these 
metrics over a configurable period of time? Is there any metric in the 
/graphview endpoint for this?

Also, it would be great if anyone could suggest ideas for monitoring these 
metrics. At present, I am planning to poll the /vars endpoint at a regular 
interval for data collection and use the ELK stack for graphing and alerting.

Thanks in advance for your time!

Regards,
Bipra.
