[ 
https://issues.apache.org/jira/browse/AURORA-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976885#comment-14976885
 ] 

David McLaughlin commented on AURORA-279:
-----------------------------------------

Is this feature necessary though? Why is a task reporting unhealthy if keeping 
it alive is better than restarting it? It should always be a no-op to kill 
something that is reporting unhealthy in terms of overall service health. This 
feels like giving service owners the rope to hang themselves with (i.e., instance 
restarts being a healthy part of service operation).

The use case presented by Brian seems like something a traffic load balancer 
should do for you (redirecting traffic to 'healthier' instances based on 
sub-optimal performance of certain nodes). 

> Allow scheduler to decide how to respond to task health check failures
> ----------------------------------------------------------------------
>
>                 Key: AURORA-279
>                 URL: https://issues.apache.org/jira/browse/AURORA-279
>             Project: Aurora
>          Issue Type: Story
>          Components: Executor, Scheduler
>            Reporter: Bill Farner
>            Priority: Minor
>
> The executor is currently autonomous in deciding to kill tasks that have 
> failed health checks.  If health check failures synchronize across a service, 
> the service could suffer an outage.  SLA considerations may also need to be 
> made before deciding to kill a task for health check failures.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
