Maxim Khutornenko created AURORA-290:
----------------------------------------
Summary: Expose basic SLA job stats from scheduler
Key: AURORA-290
URL: https://issues.apache.org/jira/browse/AURORA-290
Project: Aurora
Issue Type: Epic
Components: Scheduler
Reporter: Maxim Khutornenko
Collect and monitor Aurora job SLA (Service Level Agreement) metrics, which
define the contractual relationship between the Aurora/Mesos platform and
the services it hosts. Specifically, collect the following stats:
- ARD (Aggregate Regrettable Downtime) stat:
    - Per job
    - Per cluster
- MTTA (Mean Time To Assigned) and MTTR (Mean Time To Running) stats:
    - Per job
    - Per cluster
    - Per instance size (small, medium, large, x-large)
        - By CPU
        - By RAM
        - By DISK
- Job uptime stats (see V1 for more) at 99, 95, 90 and 75 percentiles (to be
  refined)
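
The MTTA, MTTR and uptime-percentile stats listed above could be computed
roughly as in the sketch below. It is an illustration only, not the scheduler's
implementation. The `TaskTimeline` record, its field names and the three
functions are hypothetical. It assumes each task carries timestamps for when it
became pending, assigned and running, and it measures both means from the
pending timestamp. It also assumes one possible reading of "uptime at percentile
p": the uptime that at least p percent of instances have reached. The ticket
marks the percentiles as "to be refined", so that reading may not match the
final definition. ARD is left out because the ticket does not say how it is
calculated.

```python
import math
from dataclasses import dataclass
from statistics import mean
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class TaskTimeline:
    """Hypothetical per-task record; timestamps are in seconds."""
    job: str
    pending_at: float
    assigned_at: Optional[float] = None  # None if the task was never assigned
    running_at: Optional[float] = None   # None if the task never started running


def mean_time_to_assigned(tasks: Iterable[TaskTimeline]) -> Optional[float]:
    """Mean of (assigned - pending) over tasks that were assigned (assumed definition)."""
    waits = [t.assigned_at - t.pending_at for t in tasks if t.assigned_at is not None]
    return mean(waits) if waits else None


def mean_time_to_running(tasks: Iterable[TaskTimeline]) -> Optional[float]:
    """Mean of (running - pending) over tasks that reached running (assumed definition)."""
    waits = [t.running_at - t.pending_at for t in tasks if t.running_at is not None]
    return mean(waits) if waits else None


def uptime_at_percentile(uptimes_secs: Iterable[float], percentile: int) -> Optional[float]:
    """Largest uptime that at least `percentile` percent of instances have reached."""
    ordered: List[float] = sorted(uptimes_secs, reverse=True)
    if not ordered:
        return None
    # Number of instances needed to cover `percentile` percent, rounded up.
    needed = max(1, math.ceil(len(ordered) * percentile / 100))
    return ordered[needed - 1]


tasks = [
    TaskTimeline(job="web", pending_at=0, assigned_at=10, running_at=30),
    TaskTimeline(job="web", pending_at=0, assigned_at=20, running_at=50),
    TaskTimeline(job="batch", pending_at=5),  # still pending, so it is skipped
]
cluster_mtta = mean_time_to_assigned(tasks)
cluster_mttr = mean_time_to_running(tasks)
web_mtta = mean_time_to_assigned(t for t in tasks if t.job == "web")
uptime_p75 = uptime_at_percentile([100, 200, 300, 400], 75)
```

Passing every task gives the per-cluster figure, and filtering by `job` first
gives the per-job figure. The per-instance-size breakdown would work the same
way, given a size attribute on the record.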