bdoyle0182 opened a new issue, #5257:
URL: https://github.com/apache/openwhisk/issues/5257

   On the old scheduler, concurrency throttling means a namespace can have at
most x activations in the system at once, regardless of which actions they
belong to. If the namespace is getting 429s for concurrency on the old
scheduler, the entire workload will still eventually be processed so long as
the application keeps retrying those requests.
   
   On the new scheduler, that isn't necessarily the case. Say you have two
dependent functions, A -> B. A is high traffic and fans out containers up to
the concurrency limit for that namespace. When B then attempts to start
running, it can't process any requests, since the namespace is throttled and B
didn't get any containers before the limit was hit. The A -> B workflow then
deadlocks because A keeps receiving requests but B can never run. Using
OpenWhisk will now require a user to do much more fine-grained capacity
planning based on throughput calculations to prevent this, whereas previously
the user could depend on OpenWhisk slowing them down without halting
processing of a specific action. This capacity planning can be hard to do
because the scheduler doesn't always make optimal decisions about whether to
fan out new containers for an action. It is best effort, so it's hard for a
user to plan around it and never breach the threshold.
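
   To make the starvation concrete, here is a toy model in Python. It is not
OpenWhisk code: `allocate` and its first come, first served policy are
assumptions made only to illustrate how A can take every container before B
asks for one.

```python
def allocate(requests, namespace_limit):
    """Grant one container per request, in arrival order, until the
    namespace-wide limit is reached (hypothetical policy)."""
    containers = {}
    for action in requests:
        if sum(containers.values()) >= namespace_limit:
            # Namespace is throttled: record the action but grant nothing.
            containers.setdefault(action, 0)
            continue
        containers[action] = containers.get(action, 0) + 1
    return containers


# A fans out first and consumes the whole limit; B arrives afterwards.
result = allocate(["A"] * 10 + ["B"] * 3, namespace_limit=10)
print(result)  # {'A': 10, 'B': 0}
```

   As long as A keeps its 10 containers busy, retrying B does not help in
this model, because B never holds a container it could reuse.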
   
   My initial thought for a short-term fix: when namespace throttling is
turned on and the namespace has no space left, allow the creation of one
container for an action that currently has 0 containers, to give it some
throughput. It may get action throttled, which is okay, but at least it is
able to process eventually, which prevents deadlocking of interdependent
functions within a namespace.
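
   A minimal sketch of that admission rule, in Python for brevity. The
function name `may_create_container` and the dictionary of per-action
container counts are hypothetical, not part of the scheduler's API.

```python
def may_create_container(action, containers, namespace_limit):
    """Decide whether a new container may be created for `action`.

    containers: mapping of action name -> containers it currently holds.
    """
    in_use = sum(containers.values())
    if in_use < namespace_limit:
        # Normal case: the namespace still has room.
        return True
    # Proposed exception: the namespace is full, but this action has no
    # container at all, so let it have exactly one.
    return containers.get(action, 0) == 0


print(may_create_container("B", {"A": 10}, namespace_limit=10))  # True
print(may_create_container("A", {"A": 10}, namespace_limit=10))  # False
```

   One consequence of this rule, as sketched, is that a namespace can exceed
its limit by up to one container per action that would otherwise be starved.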
   
   For a longer-term fix, we really should start planning an action-level
concurrency limit implementation that is hierarchical between action and
namespace. An action can be provisioned with some of the concurrency from its
namespace concurrency pool, guaranteeing that it will always get at least that
much concurrency. The current namespace limit only really makes sense for the
operator of the system; the user of the namespace should be able to better
control the flow of traffic. I think with the new scheduler this becomes much
more feasible for us to finally implement.
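
   One possible shape for such a hierarchical check, sketched in Python. The
names `can_admit`, `in_flight`, and `reserved` are assumptions for
illustration, and the sketch assumes the reservations sum to no more than the
namespace limit.

```python
def can_admit(action, in_flight, reserved, namespace_limit):
    """in_flight: action -> current concurrency.
    reserved: action -> concurrency guaranteed out of the namespace pool."""
    current = in_flight.get(action, 0)
    # Within its own reservation an action is always admitted.
    if current < reserved.get(action, 0):
        return True
    # Beyond the reservation, compete for the unreserved part of the pool.
    shared_pool = namespace_limit - sum(reserved.values())
    shared_in_use = sum(
        max(0, n - reserved.get(name, 0)) for name, n in in_flight.items()
    )
    return shared_in_use < shared_pool


reserved = {"B": 2}
# A has used up the shared pool (10 - 2 = 8), yet B is still admitted.
print(can_admit("A", {"A": 8}, reserved, namespace_limit=10))  # False
print(can_admit("B", {"A": 8}, reserved, namespace_limit=10))  # True
```

   With a reservation of 2 for B, A is capped at the remaining 8, so the
A -> B workflow keeps making progress even when A saturates its share.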
   


