Maxim Khutornenko created AURORA-290:
----------------------------------------

             Summary: Expose basic SLA job stats from scheduler
                 Key: AURORA-290
                 URL: https://issues.apache.org/jira/browse/AURORA-290
             Project: Aurora
          Issue Type: Epic
          Components: Scheduler
            Reporter: Maxim Khutornenko


Be able to collect and monitor Aurora job SLA (Service Level Agreement)
metrics that define the contractual relationship between the Aurora/Mesos
platform and hosted services. Specifically, collect the following stats:

ARD (Aggregate Regrettable Downtime) stat:
  Per job
  Per cluster

MTTA (Mean Time To Assigned) and MTTR (Mean Time To Running) stats:
  Per job
  Per cluster
  Per instance size (small, medium, large, x-large)
    By CPU
    By RAM
    By disk
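
To illustrate the kind of aggregation the MTTA/MTTR stats imply, here is a minimal Python sketch. It is not the scheduler's implementation. The `TaskTimes` record, its field names, and the `mean_times` helper are hypothetical, and the sketch assumes both intervals are measured from the moment a task becomes PENDING, which this issue does not specify.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class TaskTimes:
    """Hypothetical record of one task's state-transition timestamps (seconds)."""
    job: str
    cluster: str
    pending_at: float
    assigned_at: float
    running_at: float


def mean_times(tasks, key):
    """Group tasks by key(task) and return {group: (mtta, mttr)} in seconds.

    Assumption: both intervals start when the task enters PENDING.
    """
    groups = defaultdict(list)
    for task in tasks:
        groups[key(task)].append(task)
    return {
        group: (
            mean(t.assigned_at - t.pending_at for t in members),
            mean(t.running_at - t.pending_at for t in members),
        )
        for group, members in groups.items()
    }


# Made-up sample data, for illustration only.
tasks = [
    TaskTimes("web", "east", 0.0, 2.0, 10.0),
    TaskTimes("web", "east", 5.0, 9.0, 25.0),
    TaskTimes("db", "east", 1.0, 2.0, 5.0),
]
per_job = mean_times(tasks, key=lambda t: t.job)
per_cluster = mean_times(tasks, key=lambda t: t.cluster)
```

The same helper could cover the per-instance-size breakdown by passing a key function that maps a task's CPU, RAM, or disk request to a size bucket. The bucket boundaries would have to be defined elsewhere.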

Job uptime stats (see V1 for more) at the 99th, 95th, 90th, and 75th
percentiles (to be refined)
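
One plausible reading of a percentile uptime stat, sketched in Python below, is "the longest uptime that at least p% of a job's instances currently meet or exceed". This interpretation is an assumption, since the issue leaves the definition to be refined, and the function name and sample values are hypothetical.

```python
import math


def uptime_at_percentile(uptimes_sec, percentile):
    """Return the largest uptime that at least `percentile`% of instances
    meet or exceed (assumed interpretation, not a confirmed definition)."""
    if not uptimes_sec:
        raise ValueError("at least one instance uptime is required")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be in (0, 100]")
    # Rank instances from longest-running to shortest-running.
    ranked = sorted(uptimes_sec, reverse=True)
    # Smallest number of instances that covers `percentile`% of the job.
    k = math.ceil(percentile * len(ranked) / 100)
    return ranked[k - 1]


# Made-up per-instance uptimes in seconds.
uptimes = [100, 200, 300, 400]
p75 = uptime_at_percentile(uptimes, 75)
p99 = uptime_at_percentile(uptimes, 99)
```

With these four sample instances, three of them (75%) have been up for at least 200 seconds, so the 75th-percentile value is 200. Covering 99% requires all four, which gives 100.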



