> The prototype PR from Tyson was based on a fixed capacity of concurrent
> activations per container. From that, I presume once the limit is reached,
> the load balancer would roll over to allocate a new container.

+1. This is indeed the intent in this proposal.
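To make the roll-over behavior concrete, here is a minimal sketch of how I read it. It is not taken from Tyson's PR; `Container`, `Pool`, `acquire` and `release` are hypothetical names used only for illustration, and the per-container capacity is assumed to be a fixed number.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Container:
    """Hypothetical stand-in for a warm action container."""
    capacity: int        # fixed limit of concurrent activations (assumption)
    in_flight: int = 0   # activations currently running in this container

    def has_room(self) -> bool:
        return self.in_flight < self.capacity


@dataclass
class Pool:
    """Hypothetical pool of containers for a single action."""
    capacity_per_container: int
    containers: List[Container] = field(default_factory=list)

    def acquire(self) -> Container:
        # Reuse the first container that still has a free slot.
        for container in self.containers:
            if container.has_room():
                container.in_flight += 1
                return container
        # Every container is at its limit: roll over to a new one.
        container = Container(capacity=self.capacity_per_container, in_flight=1)
        self.containers.append(container)
        return container

    def release(self, container: Container) -> None:
        # Called when an activation finishes and frees its slot.
        container.in_flight -= 1
```

With a capacity of 2, the third concurrent activation is what triggers a second container; once a slot is released, the first container is reused.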
> a much higher level of complexity and traditional behavior than what you
> described.

Thanks for bringing up this point as well. IMO this also makes sense, and it's not in conflict with the current proposal; it's rather an addition to it, for a later time.

If OW doesn't monitor activity at the action container level, it's going to be hard to ensure reliable resource allocation across action containers. Based on my tests, memory is the only parameter we can correctly allocate for Docker containers:

- CPU: unless "--cpuset-cpus" is used, CPU is a shared resource across actions, so one action may impact other actions.
- Network: unless we implement custom network drivers for containers, the bandwidth is shared between actions, so one action can congest the network and impact other actions as well.
- Disk I/O: same problem.

So my point is that without monitoring, resource isolation (beyond memory) remains theoretical at this point.

In an ideal picture, OW would closely monitor any available parameters when invoking actions, through tracing, container monitoring, and anything else that's available. Then, through machine learning, OW can learn what a normal "SLA" for an action is, maybe by simply learning the normal distribution of response times if CPU and other parameters are too much to analyze. If the action then doesn't behave normally for an Nth percentile, OW can take one of 2 courses of action:

1) Observe whether the action has been impacted by other actions, and re-schedule it on other VMs if that's the case. Today OW tries to achieve some isolation through load balancer and invoker settings, but the rules are not dynamic.
2) Otherwise, notify the developer that an anomaly is happening for one of the actions.

These examples are out of scope for the current proposal. I only shared them so that we don't take monitoring out of the picture later. It's worth a separate conversation on this DL, and it's not as pressing as the performance topic is right now.
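As a rough illustration of the "learn the normal response times, then react" idea, here is a minimal sketch. Nothing in it exists in OW today: `LatencyBaseline`, `react`, the window size, the percentile and the `neighbours_busy` flag are all assumptions, and a plain rolling percentile stands in for whatever learning technique would really be used.

```python
import statistics
from collections import deque


class LatencyBaseline:
    """Hypothetical rolling baseline of response times for one action."""

    def __init__(self, window=1000, percentile=99, min_samples=30):
        self.samples = deque(maxlen=window)  # most recent latencies, in ms
        self.percentile = percentile         # the "Nth percentile" to compare against
        self.min_samples = min_samples       # do not judge before this many samples

    def threshold(self):
        """Return the current Nth-percentile latency, or None if still warming up."""
        if len(self.samples) < self.min_samples:
            return None
        # quantiles(n=100) returns 99 cut points; index p-1 is the p-th percentile.
        cuts = statistics.quantiles(self.samples, n=100)
        return cuts[self.percentile - 1]

    def observe(self, latency_ms):
        """Record one latency; return True if it exceeded the learned baseline."""
        limit = self.threshold()
        is_anomaly = limit is not None and latency_ms > limit
        self.samples.append(latency_ms)
        return is_anomaly


def react(is_anomaly, neighbours_busy):
    """Pick one of the two courses of action described above."""
    if not is_anomaly:
        return "ok"
    # neighbours_busy is an assumed signal that co-located actions are noisy.
    return "reschedule" if neighbours_busy else "notify_developer"
```

The hard part is of course the `neighbours_busy` signal, which is exactly where container-level monitoring would come in.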
Dragos

On Thu, Jul 6, 2017 at 4:40 AM Michael M Behrendt <[email protected]> wrote:

> thx for clarifying, very helpful. The approach you described could be
> really interesting. I was thrown off by Dragos' comment saying:
>
> "What stops Openwhisk to be smart in observing the response times, CPU
> consumption memory consumption of the running containers ? Doing so it
> could learn automatically how many concurrent requests 1 action can
> handle."
>
> ...which in my mind would have implied a much higher level of complexity
> and traditional behavior than what you described.
>
> Dragos,
> did I misinterpret you?
>
> Thanks & best regards
> Michael
>
> From: Rodric Rabbah <[email protected]>
> To: [email protected]
> Date: 07/06/2017 01:04 PM
> Subject: Re: Improving support for UI driven use cases
>
> The prototype PR from Tyson was based on a fixed capacity of concurrent
> activations per container. From that, I presume once the limit is reached,
> the load balancer would roll over to allocate a new container.
>
> -r
>
> > On Jul 6, 2017, at 6:09 AM, Michael M Behrendt
> > <[email protected]> wrote:
> >
> > Hi Michael,
> >
> > thx for checking. I wasn't referring to adding/removing VMs, but rather
> > activation containers. In today's model that is done intrinsically, while
> > I *think* in what Dragos described, the containers would have to be
> > monitored somehow so this new component can decide (based on
> > cpu/mem/io/etc load within the containers) when to add/remove containers.
> >
> > Thanks & best regards
> > Michael
