mhenke1 commented on issue #3040: Adjust controller side action time-out to 
avoid invokers marked as unhealthy 
URL: 
https://github.com/apache/incubator-openwhisk/pull/3040#issuecomment-348910022
 
 
   @rabbah Let's take the case of two controllers in HA. If state 
synchronization between the controllers is imperfect, more than 16 actions 
can be scheduled to one invoker. 16 of them get executed and the others are 
queued. 
   
   When all 16 executing actions run to their maximum run time and expire, 
the next batch (the one waiting in the queue) can start after about one 
minute of elapsed time. If these next actions also run to their maximum run 
time, they return after about two minutes 
   (1 minute waiting for the first set of actions to stop plus 1 minute run 
time plus some overhead). 
   
   At the moment the controller only allows two minutes for an action to 
finish. So we might have the case that actions complete shortly after that 
two-minute limit and are regarded as failed by the controller. If we have too 
many of those cases, the given invoker will be regarded as unhealthy. 
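   
   The timing above can be sketched as a small calculation. This is only an 
illustration, not the controller's code: the function names, the flat 
`overhead_minutes` value and the assumption that every action runs to a 
one-minute limit are made up for the example. 

```python
# Illustrative sketch only: names and the overhead value are hypothetical.
SLOTS_PER_INVOKER = 16          # actions an invoker runs concurrently (from the comment)
MAX_RUN_MINUTES = 1.0           # assumed maximal run time of an action
CONTROLLER_LIMIT_MINUTES = 2.0  # time the controller currently allows


def completion_minutes(position, overhead_minutes=0.0):
    """Finish time of the action at 0-based queue `position`,
    assuming every action runs to its maximal run time."""
    batch = position // SLOTS_PER_INVOKER  # 0 for the first batch
    # Wait for all earlier batches, then run, plus a flat overhead (a simplification).
    return (batch + 1) * MAX_RUN_MINUTES + overhead_minutes


def regarded_as_failed(position, overhead_minutes=0.0):
    """True if the action completes after the controller's limit."""
    return completion_minutes(position, overhead_minutes) > CONTROLLER_LIMIT_MINUTES
```

   With any non-zero overhead, the 17th action (position 16) finishes just 
after the two-minute limit, while the first 16 finish well within it. 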
   
   Over the last few days we have seen a lot of these cases, where invokers 
were marked as unhealthy and recovered after a short time. In those cases the 
invokers were busy with batches of actions that were all timing out at nearly 
the same time.
   
   If we add more controllers to the HA setup, the summed-up waiting time 
might go up even more. Let's take the most pathological case, in which the 
HA state is not synced at all*. 
   In this case all n controllers might place actions on one and the same 
invoker. The resulting overall wait time is the time for the first batch to be 
executed plus the time for the (n-1) batches waiting in the queue. Therefore, 
with this PR the wait time is calculated as (1 + (n-1)) minutes + some 
overhead.
   
   *Of course this last case is unlikely and hopefully a theoretical edge case. 
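   
   As a rough sketch of the wait-time estimate above (not the code in this 
PR): the function name and both keyword parameters are hypothetical, and the 
overhead is left as a parameter because its value is not stated here. 

```python
# Illustrative sketch only: not the PR's implementation.
def worst_case_wait_minutes(n_controllers, max_run_minutes=1.0, overhead_minutes=0.0):
    """Worst-case wait when all n controllers schedule onto the same invoker."""
    if n_controllers < 1:
        raise ValueError("need at least one controller")
    first_batch = max_run_minutes                            # batch that runs immediately
    queued_batches = (n_controllers - 1) * max_run_minutes   # (n-1) batches waiting in the queue
    return first_batch + queued_batches + overhead_minutes
```

   For two controllers and no overhead this gives 2 minutes, matching the 
two-controller case described at the top. 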

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
