Maxim Khutornenko created AURORA-208:
----------------------------------------

             Summary: Add sla_list_safe_domain command into aurora_admin client
                 Key: AURORA-208
                 URL: https://issues.apache.org/jira/browse/AURORA-208
             Project: Aurora
          Issue Type: Task
          Components: Client
            Reporter: Maxim Khutornenko
            Assignee: Maxim Khutornenko


sla_list_safe_domain
Usage: sla_list_safe_domain --cluster=cluster --attribute={rack | host}
percentage duration [--override_jobs=filename]
[--exclude_attr=filename] [--list_jobs]
Returns a list of racks or hosts where it would be safe to kill tasks without
violating their job SLA. The SLA is the percentage of a job's tasks that stayed
up within the last “duration” (secs|mins|hrs|days). It can be specified globally
per cluster as a pair of percentage and duration values, or per job in a file.
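
To make the "safe" criterion concrete, here is a minimal sketch in Python. It
is not the aurora_admin implementation. It assumes a task counts as "up" when
its uptime covers the whole duration window, and that killing every task on a
host removes those tasks from the job's up count. The function name
safe_hosts and the (job_key, host, uptime_secs) input shape are hypothetical.

```python
from collections import defaultdict


def safe_hosts(tasks, percentage, window_secs):
    """Return hosts whose tasks could be killed without breaking any job SLA.

    tasks: iterable of (job_key, host, uptime_secs) tuples (hypothetical shape).
    percentage: required percentage of up tasks per job.
    window_secs: SLA duration window in seconds.
    """
    total = defaultdict(int)         # job -> number of tasks
    up = defaultdict(int)            # job -> tasks up for the whole window
    up_on_host = defaultdict(int)    # (host, job) -> up tasks on that host
    jobs_on_host = defaultdict(set)  # host -> jobs with a task on that host

    for job, host, uptime in tasks:
        total[job] += 1
        jobs_on_host[host].add(job)
        if uptime >= window_secs:
            up[job] += 1
            up_on_host[(host, job)] += 1

    safe = []
    for host, jobs in jobs_on_host.items():
        # Projected SLA per job if every task on this host were killed.
        projected = [
            100.0 * (up[job] - up_on_host[(host, job)]) / total[job]
            for job in jobs
        ]
        if all(p >= percentage for p in projected):
            safe.append(host)
    return sorted(safe)
```

The same check would apply to racks by passing the rack name in place of the
host name in each tuple.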

--cluster: 
Aurora cluster name.
--attribute: 
Currently supported attributes: “host” and “rack”.
percentage:
Percentage of tasks required to be up within the duration. Applied to all jobs
except those listed in the --override_jobs file.
duration:
Time interval (from now - value to now) over which the percentage of up tasks
is measured. Applied to all jobs except those listed in the --override_jobs
file. Format: <value>{secs|mins|hrs|days}, for example 10mins.
--override_jobs: 
An optional file with job-specific SLAs that override the cluster-wide
percentage and duration values given on the command line. The file can have
multiple lines in the following format:
role/env/job percentage duration
--exclude_attr:
An optional text file listing attribute values (one per line) to exclude from 
the result set if found.
--list_jobs:
Lists all affected job keys with projected new SLAs if their tasks get killed.
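
The duration format and the --override_jobs file format described above can be
parsed along these lines. This is a sketch only, not the client's code: the
function names parse_duration and parse_override_lines are hypothetical, and
it assumes the value is a whole number and that override fields are separated
by whitespace.

```python
import re

# Seconds per unit, for the units named in the duration format.
_UNIT_SECS = {"secs": 1, "mins": 60, "hrs": 3600, "days": 86400}
_DURATION_RE = re.compile(r"(\d+)(secs|mins|hrs|days)")


def parse_duration(text):
    """Convert '<value>{secs|mins|hrs|days}' into a number of seconds."""
    match = _DURATION_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError("invalid duration: %r" % text)
    value, unit = match.groups()
    return int(value) * _UNIT_SECS[unit]


def parse_override_lines(lines):
    """Parse 'role/env/job percentage duration' lines.

    Returns a dict of job key -> (percentage, duration in seconds).
    Blank lines are skipped (an assumption, not stated in the issue).
    """
    overrides = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        job_key, percentage, duration = line.split()
        overrides[job_key] = (float(percentage), parse_duration(duration))
    return overrides
```

With such a mapping, a job found in the overrides would use its own pair of
values, and every other job would fall back to the command-line percentage and
duration.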



Examples: 
sla_list_safe_domain --cluster=smf1 --attribute=host 85 10mins
sla_list_safe_domain --cluster=smf1 --filename=~/rack.txt


Example output (--attribute=rack):
aurora_admin sla_list_safe_domain --cluster=smf1 --attribute=rack 95 2hrs
aau smf1-aau-15-sr3.prod.twitter.com
aau smf1-aau-29-sr2.prod.twitter.com
aau smf1-aau-30-sr3.prod.twitter.com
aev smf1-aev-02-sr2.prod.twitter.com
aev smf1-aev-11-sr2.prod.twitter.com
cnm smf1-cnm-26-sr3.prod.twitter.com
cnm smf1-cnm-27-sr3.prod.twitter.com
cnm smf1-cnm-28-sr3.prod.twitter.com

Example output (--attribute=host):
aurora_admin sla_list_safe_domain --cluster=smf1 --attribute=host 95 2hrs
smf1-ayz-29-sr3.prod.twitter.com
smf1-cgk-11-sr2.prod.twitter.com
smf1-cga-22-sr2.prod.twitter.com
smf1-cnk-03-sr3.prod.twitter.com

Example output (--list_jobs):
aurora_admin sla_list_safe_domain --cluster=smf1 --attribute=rack 95 2hrs --list_jobs
ayz smf1-ayz-29-sr3.prod.twitter.com mesos/prod/labrat   96.00 2hrs
ayz smf1-ayz-29-sr3.prod.twitter.com mesos/prod/caliper  97.65 2hrs
ayz smf1-ayz-29-sr3.prod.twitter.com mesos/prod/packer   95.05 2hrs




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
