bdoyle0182 opened a new issue, #5257: URL: https://github.com/apache/openwhisk/issues/5257
On the old scheduler, concurrent throttling works such that you can only have x activations in the system at once for your namespace, regardless of which actions they are for. If the namespace is getting 429s for concurrency on the old scheduler, the entire workload will eventually process so long as the application keeps retrying those requests.

On the new scheduler, that isn't necessarily the case. Say you have two functions that depend on one another, A -> B. A is high traffic and fans out containers up to the concurrency limit for that namespace. Then when B attempts to start running, it can't process any requests, since the namespace is throttled and B didn't get any containers before the limit was hit. The A -> B workflow then deadlocks: A is still receiving requests, but B can never run.

Using OpenWhisk will now require a user to do much more fine-grained capacity planning based on throughput calculations to prevent this from happening, whereas previously the user could just depend on OpenWhisk slowing them down but not halting processing of a specific action. This capacity planning can be hard to do because the scheduler isn't always making optimal decisions on whether to fan out new containers for an action. It's best effort, so it's hard for a user to plan around it and never breach the threshold.

My initial thought for a short-term fix: if namespace throttling is turned on and there is no space left for the namespace, but there are 0 containers for that action, allow the creation of one container to give it some throughput. Maybe it gets action throttled, which is okay, but at least it's able to process eventually, and that prevents deadlocking of interdependent functions within a namespace.

For a longer-term fix, we really should start planning an action-level concurrency limit implementation that is hierarchical between action and namespace.
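
Stepping back to the short-term idea for a moment, here is a rough sketch of the check it implies. Everything in it (`NamespaceState`, `can_create_container`, the `allow_first_container` flag) is a made-up name for illustration and not actual OpenWhisk scheduler code; it only models the decision of whether one more container may be created.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class NamespaceState:
    """Hypothetical bookkeeping for one namespace (not an OpenWhisk type)."""
    concurrency_limit: int
    # Maps action name -> number of containers it currently holds.
    containers: Dict[str, int] = field(default_factory=dict)

    def total(self) -> int:
        return sum(self.containers.values())


def can_create_container(ns: NamespaceState, action: str,
                         allow_first_container: bool) -> bool:
    """Decide whether one more container may be started for `action`."""
    if ns.total() < ns.concurrency_limit:
        return True  # the namespace still has room
    # The namespace is full. Proposed escape hatch: an action that holds
    # zero containers may still get exactly one, so it is never starved.
    return allow_first_container and ns.containers.get(action, 0) == 0


# A has fanned out to the whole namespace limit; B has nothing yet.
ns = NamespaceState(concurrency_limit=10, containers={"A": 10})
print(can_create_container(ns, "B", allow_first_container=False))  # False
print(can_create_container(ns, "B", allow_first_container=True))   # True
print(can_create_container(ns, "A", allow_first_container=True))   # False
```

One consequence of this sketch is that the namespace can overshoot its limit by up to one container for each action that currently has none, which is the trade-off for never fully starving an action.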
An action could be provisioned with some of the concurrency from its namespace's concurrency pool, guaranteeing that it will always get at least that much concurrency. The current namespace limit only really makes sense for the operator of the system; the user of the namespace should be able to better control the flow of traffic. And I think with the new scheduler, this becomes much more feasible for us to finally implement.
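
To illustrate what a hierarchical limit might look like, here is a minimal sketch, assuming each action can reserve a slice of the namespace limit and everything unreserved is shared. `NamespacePool`, `try_acquire`, and `release` are hypothetical names, not an existing OpenWhisk API, and the real design could well differ.

```python
from typing import Dict


class NamespacePool:
    """Hypothetical limiter: a namespace limit with per-action reservations."""

    def __init__(self, namespace_limit: int, reserved: Dict[str, int]):
        if sum(reserved.values()) > namespace_limit:
            raise ValueError("reservations exceed the namespace limit")
        self.namespace_limit = namespace_limit
        self.reserved = dict(reserved)
        self.in_flight: Dict[str, int] = {}

    def _shared_in_use(self) -> int:
        # Concurrency each action is using beyond its own reservation.
        return sum(max(0, n - self.reserved.get(a, 0))
                   for a, n in self.in_flight.items())

    def try_acquire(self, action: str) -> bool:
        used = self.in_flight.get(action, 0)
        if used < self.reserved.get(action, 0):
            self.in_flight[action] = used + 1  # guaranteed slot
            return True
        shared_capacity = self.namespace_limit - sum(self.reserved.values())
        if self._shared_in_use() < shared_capacity:
            self.in_flight[action] = used + 1  # borrowed from the shared pool
            return True
        return False  # would be throttled (429)

    def release(self, action: str) -> None:
        if self.in_flight.get(action, 0) > 0:
            self.in_flight[action] -= 1


# Namespace limit of 10, with 2 slots reserved for B.
pool = NamespacePool(10, {"B": 2})
a_granted = sum(pool.try_acquire("A") for _ in range(20))
print(a_granted)              # 8: A can only use the shared part
print(pool.try_acquire("B"))  # True: B's reservation is untouched
```

In this model A can saturate the shared part of the pool but can never take B's reserved slots, so the A -> B workflow keeps making progress while total concurrency stays within the namespace limit.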
