style95 commented on issue #5526: URL: https://github.com/apache/openwhisk/issues/5526#issuecomment-2597061549
Yes, with the FPCPoolBalancer, a dedicated queue is dynamically created/deleted for each action. At the very beginning there would be no queue for your action, and the pool balancer will try to create one. Activations are concurrently sent to the scheduler via Kafka, and once the queue is created, activations are forwarded to it. The scheduler then tries to create one or more containers for the given action. The number of containers to be created depends on the number of in-flight activations, the average execution time of the action observed in the past, and so on. The queue (scheduler) considers those factors and periodically decides how many containers to create, if any (see the rough sketch at the end of this comment). Once a container is created, it repeatedly accesses the queue to fetch activations and execute them. So the initial invocation is expected to be slow because there is no queue yet, but subsequent invocations are much faster.

The queue keeps running for [10 minutes](https://github.com/apache/openwhisk/blob/master/ansible/group_vars/all#L545) by default even if there is no activation at all. After 10 minutes it goes into the idle state, and after another [10 minutes](https://github.com/apache/openwhisk/blob/master/ansible/group_vars/all#L547C15-L547C51) it stops. Both timeouts are configurable and are designed to keep queues running long enough to avoid overly frequent queue creation/deletion. This is a different aspect from the network-level "warming up".

As you can see, an invocation could theoretically take longer because of the additional network hop through the queue component. But in the real world you have many concurrent requests, and populating the queue and consuming from it are fully asynchronous: the FPCPoolBalancer keeps sending requests to the queue while containers keep fetching requests as soon as they finish each invocation. With the ShardingPoolBalancer, the two parts are combined, and performance is severely degraded when multiple actions are invoked. For more information, you can refer to [this document](https://cwiki.apache.org/confluence/display/OPENWHISK/New+architecture+proposal) and [my paper](https://ieeexplore.ieee.org/document/9499544).

Aside from this, how many invokers and prewarm containers did you use? What is the value of `userMemory` on your invokers? How many concurrent activations did you make at a given time? I think there are still many factors that could affect your results.
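To illustrate the container-count decision mentioned above, here is a minimal Scala sketch. This is not OpenWhisk's actual scheduling code; the object/method names, parameters, and the formula are assumptions made purely for illustration of how in-flight activations and the observed average execution time could drive the decision.

```scala
// Hypothetical sketch only: NOT OpenWhisk's actual scheduling logic.
// It shows how a per-action queue could periodically derive a container
// count from in-flight activations and average execution time.
object ContainerEstimator {

  /**
   * @param inflight           activations currently waiting in the queue
   * @param avgDurationMs      observed average execution time of the action (ms)
   * @param schedulingWindowMs interval between scheduling decisions (ms)
   * @param existing           containers already running for this action
   * @return how many additional containers to request (0 means do nothing)
   */
  def additionalContainers(inflight: Int,
                           avgDurationMs: Double,
                           schedulingWindowMs: Double,
                           existing: Int): Int = {
    if (inflight <= 0) 0
    else {
      // Roughly how many activations one container can drain within one window.
      val perContainer = math.max(schedulingWindowMs / math.max(avgDurationMs, 1.0), 1.0)
      val needed = math.ceil(inflight / perContainer).toInt
      math.max(needed - existing, 0)
    }
  }
}
```

In practice the real decision also has to respect other constraints (for example available invoker memory such as `userMemory`, prewarm containers, and limits), which is why the questions above matter for interpreting your results.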
