bdoyle0182 opened a new issue, #5256:
URL: https://github.com/apache/openwhisk/issues/5256

   This discussion originated on slack. I'm moving here for more formal 
discussion from the community on the topic.
   
   Original post:
   
   @bdoyle0182: Just wanted to open a discussion about the new scheduler. It's heavily optimized for short-running requests, i.e. a few milliseconds. However, for very long-running functions, i.e. 10+ seconds, it scales out very quickly, since the container throughput calculation will simply conclude that it needs a new container for each added level of concurrency. This is obviously a very minuscule fraction of FaaS use cases, but it's supported nonetheless. These use cases are much more async than the normal use case of sync responses expected within a reasonable HTTP request time of a few milliseconds, so some latency while waiting for available space should be much more acceptable. For example, if a function takes 10 seconds to run, the user of that function won't really care if it has to wait 2-3 seconds for available space, and both the namespace and the operator would likely prefer latency over uncontrolled fan-out of concurrency. The problem imo is that the activation staleness value is constant for all function types (currently 100ms). 100ms definitely makes sense for anything that runs within a second, but do we think we could make this value dynamic based on the average duration of that function? Or am I on the right track here about how we could potentially control fan-out of long-running functions and prefer latency over fan-out?
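   For illustration only, here is a minimal sketch of what a duration-aware staleness threshold could look like. This is not OpenWhisk's actual code; the object and method names, the 10% scaling factor, and the 5-second cap are all assumptions made up for this sketch.

```scala
import scala.concurrent.duration._

object StalenessSketch {
  // Assumed values; today the scheduler uses a single constant threshold (100 ms).
  val baseThresholdMs: Long = 100
  val maxThresholdMs: Long = 5000

  /** Hypothetical per-action staleness threshold: a fraction of the action's
    * observed average duration, clamped between the constant base and a cap. */
  def stalenessThreshold(avgDurationMs: Long): FiniteDuration = {
    val scaled = (avgDurationMs * 0.1).toLong
    math.min(math.max(scaled, baseThresholdMs), maxThresholdMs).milliseconds
  }
}
```

   Under these assumptions, a 10-second action would tolerate roughly 1 second of queue wait before its activations count as stale, while sub-second actions keep the current 100ms behavior.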
   
   @style95: Yes, it's worth discussing.
   What you have said is correct. When designing the new scheduler, we 
prioritized latency over resources. It was based on the thought that public 
clouds like AWS would try to minimize latency no matter which type of functions 
are running and we also wanted to reduce the latency as much as we can. But it 
can lead to too many containers being provisioned at once. And it caused some 
trouble in our environment too when there are not many invoker nodes. This 
issue especially sticks out for long-running action as container provision 
takes generally more than 100ms. So even if more containers are being 
provisioned, messages become easily staled because all running containers are 
already handling activations that will take more than 100ms, and container 
provision also takes more than 100ms in turn activations in the queue generally 
wait for more than 100ms. One guard here is the scheduler does not provision 
containers more than the number of messages.
   So when there are 4 waiting messages, it only creates containers of up to 4. 
But if the concurrent limit is big(it's common for public clouds) and a huge 
number of messages are incoming, it will try to create a huge number of 
containers at once.
   
   We need more ways to do fine-grained control of provisioning.
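   As a rough illustration of the guard described above, here is a sketch of the cap; the function name and the in-flight adjustment are illustrative assumptions, not the actual scheduler code.

```scala
object ProvisionSketch {
  /** Cap new container requests by the number of waiting (stale) messages,
    * never exceeding the action's concurrency limit. Subtracting creations
    * already in flight is an assumption added here for illustration. */
  def containersToProvision(staleMessages: Int,
                            inProgressCreations: Int,
                            concurrencyLimit: Int): Int = {
    val wanted = staleMessages - inProgressCreations
    math.max(0, math.min(wanted, concurrencyLimit))
  }
}
```

   With a small queue (e.g. 4 stale messages) this stays bounded, but with a large concurrency limit and a flood of incoming messages, `wanted` can still be huge, which is exactly the fan-out problem discussed here.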
   
   @bdoyle0182: `This issue especially sticks out for long-running actions, as container provisioning generally takes more than 100ms`
   Yes, this is exactly what I'm finding. Container provisioning takes anywhere from 500ms to 2 seconds, so when the staleness threshold is 100ms the fan-out of containers can be particularly bad: the scheduler checks every 100ms and provisions more each time, and no activations will complete for a couple of seconds.
   
   And creating a huge number of containers at once can slow down the Docker daemon, making provisioning even slower (though with the new scheduler, container provisioning is balanced across hosts, unlike the old scheduler; that's just one of many huge wins for keeping the Docker daemon under control :slightly_smiling_face:)
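   To put rough numbers on that timing mismatch (using the figures quoted in this thread, not measured constants), a tiny sketch:

```scala
object FanOutSketch extends App {
  val stalenessCheckMs = 100            // current constant staleness threshold
  val provisionTimesMs = Seq(500, 2000) // provision times observed in this thread

  provisionTimesMs.foreach { p =>
    // Number of scheduling cycles that can fire, each potentially requesting
    // more containers, before the first newly provisioned container finishes anything.
    println(s"provision ${p}ms -> up to ${p / stalenessCheckMs} cycles with zero completions")
  }
}
```

   So between 5 and 20 scheduling cycles can pass before any new capacity actually absorbs load, and each of those cycles may keep asking for more containers.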
   
   @style95: Yes. So my naive thought was that we need to control the number of concurrent container provisions. If it does not impact the whole system, we can still provision many containers for actions. But if the scheduler tries to create too many containers and that is expected to cause issues for the whole system, we can throttle it. I haven't thought it through deeply yet, though.
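   A minimal sketch of what such a throttle could look like, assuming a simple global cap on in-flight container creations; none of these names exist in OpenWhisk today, and a real implementation would likely need per-invoker or per-namespace awareness.

```scala
import java.util.concurrent.Semaphore

/** Hypothetical throttle on concurrent container creations. */
class ProvisioningThrottle(maxConcurrentCreations: Int) {
  private val inFlight = new Semaphore(maxConcurrentCreations)

  /** Reserve a slot if a new container creation may start now; returns false
    * if the cap is reached and the request should be deferred. */
  def tryStartCreation(): Boolean = inFlight.tryAcquire()

  /** Release the slot once the container is running (or creation failed). */
  def finishCreation(): Unit = inFlight.release()
}
```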
   
   @rabbah: How do things look for functions that run for minutes? I'll check out the discussion on GitHub. I'm curious whether there should be multiple schedulers, each tailored to a function modality.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
