Hi Markus,

Per our discussion on Slack, I'm documenting below the concerns we discussed. (And thanks for fixing my math bug.)
The approach of being more introspective to detect overload is a good improvement over the ad hoc value set today, and thanks for bringing it up and for the review and discussion on Slack. I do, however, have a concern about tying system overload (and queuing depth) to active acks, because active acks also affect other components. Please allow me to explain so we can see if there's a real concern.

1. The execution of sequences (and conductor actions) waits for an activation's result before dispatching the next action. If you're willing to tolerate a longer active ack, should the composition wait just as long?

2. We have also had issues in the past where improper accounting of a user's outstanding requests, due to delayed or missing active acks, would penalize and throttle a subject.

3. There is a backup mechanism for detecting completed activations from the activation store; longer active-ack timeouts mean longer polls and more load on the database. The need for this mechanism suggests that the health protocol, which uses pings alone, is not sufficient and needs this secondary mechanism. As noted above, the active acks now have a few intertwined dependencies.

4. Since the definition of "overloaded" here is tied to active acks timing out, I also think we would be changing the behavior of the system overall: requests that would have been accepted and queued in the past would be rejected much more eagerly. This makes the issue related to re-architecting the system with the overflow queue, as previously discussed on the dev list, because there are requests for which _waiting_ is acceptable (e.g., batch invocations and triggers) versus blocking requests (e.g., web actions) for which waiting too long is not acceptable.

5. Of course this is related to the capacity in the system and assumes static capacity. Shameless plug for The Serverless Contract: https://medium.com/openwhisk/the-serverless-contract-44329fab10fb.
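To make the first concern concrete, here is a rough back-of-the-envelope sketch (the function names and the timeout shape are my assumptions for illustration, not the actual implementation): if each action's active ack can take up to L * C + epsilon before it is declared missing, then a sequence that waits on each component's active ack before dispatching the next one can, in the worst case, stall for roughly the sum of those timeouts.

```python
# Hypothetical illustration of concern 1: a sequence waits on each
# component's active ack, so a longer ack timeout stretches the whole
# composition. Names are illustrative, not from the OpenWhisk code base.

def ack_timeout(max_duration_sec, fudge_factor, epsilon_sec=60.0):
    """Per-activation active-ack timeout: L * C + epsilon."""
    return max_duration_sec * fudge_factor + epsilon_sec

def worst_case_sequence_wait(component_durations, fudge_factor):
    """Upper bound on a sequence's stall time if every component's
    active ack arrives just before its timeout fires."""
    return sum(ack_timeout(d, fudge_factor) for d in component_durations)

# A sequence of three 60-second actions with C = 2 and a 60s epsilon
# (the "2 * max runtime + 1 minute" default mentioned below):
print(worst_case_sequence_wait([60, 60, 60], 2))  # 540.0 seconds
```

The point is only that C is not a free parameter: raising it to tolerate more queuing also raises how long compositions can hang.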
If you detect overload and add capacity, that's a different discussion (not rejecting requests subject to a maximum elasticity vs. rejecting requests for a fixed capacity).

Say an active ack for an activation i times out after time T(i) = L(i) x C + epsilon, where L(i) is the action's maximum duration for activation i and C is a constant fudge factor (which indirectly bounds the wait time in the queue for this activation). Let an invoker have S slots, all of which are occupied with activations whose maximum duration L(j) >= L(i) for every j in the container pool; that is, all the slots in the assigned pool are busy and the hold time on each slot will be at least L(i). Since the active-ack timeout T(i) is oblivious to the requests ahead of it in the queue, it takes more than C x S requests ahead of activation i for the request to time out (the S slots drain about S requests per interval of length L(i), so a wait of C x L(i) clears about C x S requests). I think without loss of generality we can ignore epsilon (C x S + 1 requests ahead, for example, would cover it), and we can ignore the actual execution time of activation i.

The system is therefore overloaded once it has accepted (K x S) + (K x (S x C + 1)) activations, where K is the number of invokers, S is the number of slots per invoker, and C (>= 0) is the queuing factor, assuming all actions have the same expected hold time.

Some numbers: with K = 1 invoker, S = 16 slots per invoker, and C = 2, the system will overload (and reject requests) after 49 activations are accepted. With K = 10 that becomes 490, and with K = 100 it is 4900.

If we increase C to tolerate more queuing, we indirectly also affect the execution of compositions and the quota accounting. I agree we should, as you suggest, have a mechanism for detecting overload correctly, so this is a better approach given where we are. We should caution that deployments with a disproportionately high overload setting in their configuration will need to be aware of this change.
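For the record, here is the arithmetic above written out as a small sketch (the function names are mine, chosen for illustration; only the formula comes from the derivation):

```python
# Sketch of the overload threshold derived above:
# K * S activations running plus K * (S * C + 1) queued before
# every invoker's active acks start timing out.

def overload_threshold(invokers, slots_per_invoker, queuing_factor):
    """Activations accepted before the system rejects new requests:
    (K x S) + (K x (S x C + 1))."""
    running = invokers * slots_per_invoker
    queued = invokers * (slots_per_invoker * queuing_factor + 1)
    return running + queued

# The examples from the text:
print(overload_threshold(1, 16, 2))    # 49
print(overload_threshold(10, 16, 2))   # 490
print(overload_threshold(100, 16, 2))  # 4900
```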
-r

> On Thu, Jul 12, 2018 at 11:38 AM, Markus Thoemmes
> <[email protected]> wrote:
>
> Hi OpenWhiskers,
>
> Today, we have an arbitrary system-wide limit of maximum concurrent
> connections in the system. In general that is fine, but it doesn't have a
> direct correlation to what's actually happening in the system.
>
> I propose to add a new state to each monitored invoker: Overloaded. An
> invoker will go into overloaded state if active-acks start to time out.
> Eventually, if the system is really overloaded, all invokers will be in
> overloaded state, which will cause the loadbalancer to return a failure.
> This failure now results in a `503 - System overloaded` message back to the
> user. The system-wide concurrency limit would be removed.
>
> The organic system limit will be adjustable by a timeout factor, which is
> made adjustable in https://github.com/apache/incubator-openwhisk/pull/3767.
> The default is 2 * maximumActionRuntime + 1 minute. For the vast majority
> of use-cases, this means that there are 3x more activations in the system
> than it can handle, or put differently: activations need to wait for
> minutes until they are executed. I think it's safe to say that the system
> is overloaded if this is true for all invokers in your system.
>
> Note: We used to handle active-ack timeouts as system errors and take
> invokers into unhealthy state. With the old non-consistent loadbalancer,
> that caused a lot of "flappy" states in the invokers. With the new
> consistent implementation, active-ack timeouts should only occur in
> problematic situations (either the invoker itself is having problems, or
> queueing). Taking the invoker out of the loadbalancer if there are
> active-acks missing on that invoker is generally helpful, because missing
> active-acks also means inconsistent state in the loadbalancer (it updates
> its state as if the active-ack arrived correctly).
>
> A first stab at the implementation can be found here:
> https://github.com/apache/incubator-openwhisk/pull/3875.
>
> Any concerns with that approach to place an upper bound on the system?
>
> Cheers,
> Markus
