Re: Improving support for UI driven use cases

Dascalita Dragos Thu, 06 Jul 2017 13:01:02 -0700

> The prototype PR from Tyson was based on a fixed capacity of concurrent
> activations per container. From that, I presume once the limit is reached,
> the load balancer would roll over to allocate a new container.
+1. This is indeed the intent in this proposal.
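
To make the rollover behavior concrete, here is a minimal sketch in Python.
It is not the code from Tyson's PR; the names Container, ContainerPool and
max_concurrent are hypothetical and only illustrate "fixed capacity per
container, then allocate a new one".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Container:
    """Hypothetical stand-in for an action container."""
    name: str
    max_concurrent: int
    in_flight: int = 0

    def has_capacity(self) -> bool:
        return self.in_flight < self.max_concurrent


@dataclass
class ContainerPool:
    """Routes activations to containers with a fixed concurrency limit."""
    max_concurrent: int
    containers: List[Container] = field(default_factory=list)

    def acquire(self) -> Container:
        # Reuse the first container that still has a free slot.
        for c in self.containers:
            if c.has_capacity():
                c.in_flight += 1
                return c
        # Every container is at its limit: roll over to a new one.
        c = Container(name=f"container-{len(self.containers)}",
                      max_concurrent=self.max_concurrent,
                      in_flight=1)
        self.containers.append(c)
        return c

    def release(self, container: Container) -> None:
        # Called when an activation finishes, freeing one slot.
        container.in_flight -= 1
```

With max_concurrent=2, the first two calls to acquire() land on the same
container and the third one creates a second container.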

> a much higher level of complexity and traditional behavior than what you
> described.

Thanks for bringing up this point as well. IMO this also makes sense, and
it's not in conflict with the current proposal, but rather an addition to
it, for a later time. If OW doesn't monitor the activity at the action
container level, then it's going to be hard to ensure reliable resource
allocation across action containers. Based on my tests, memory is the only
parameter we can correctly allocate for Docker containers. For CPU, unless
"--cpuset-cpus" is used, CPU is a shared resource across actions; one action
may impact other actions. For network, unless we implement custom network
drivers for containers, the bandwidth is shared between actions; one action
can congest the network, impacting other actions as well. Disk I/O has the
same problem.

So my point is that without monitoring, resource isolation (beyond memory)
remains theoretical at this point.
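
As an illustration of the memory and CPU points above, here is a small
Python sketch that builds a "docker run" command line. "--memory" and
"--cpuset-cpus" are existing Docker flags; the helper docker_run_args and
the image name are hypothetical, and this is not how the OW invoker starts
containers.

```python
from typing import List, Optional


def docker_run_args(image: str, memory: str,
                    cpuset_cpus: Optional[str] = None) -> List[str]:
    """Build a `docker run` command line with resource limits."""
    # The memory limit is always set, e.g. "256m".
    args = ["docker", "run", "--rm", "--memory", memory]
    if cpuset_cpus is not None:
        # Pin the container to specific cores, e.g. "0" or "0-1".
        # Without this, CPU is shared with the other containers.
        args += ["--cpuset-cpus", cpuset_cpus]
    args.append(image)
    return args


# Hypothetical image name, for illustration only.
print(" ".join(docker_run_args("my-action-image", "256m", cpuset_cpus="0")))
```

Network bandwidth and disk I/O are left out of the sketch on purpose.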

In an ideal picture, OW would closely monitor any available parameters when
invoking actions, through tracing, container monitoring, or anything else
that's available. Then through machine learning OW can learn what's a
normal "SLA" for an action, maybe by simply learning the normal
distribution of response times, if CPU and other parameters are too much to
analyze. Then, if the action falls outside its normal behavior at the Nth
percentile, take one of two courses of action:
1) observe if the action has been impacted by other actions, and
re-schedule it on other VMs if that's the case. Today OW tries to achieve
some isolation through load balancer and invoker settings, but the rules
are not dynamic.
2) otherwise, notify the developer that an anomaly is happening for one of
the actions
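
A rough sketch of the idea in Python, assuming response times are roughly
normally distributed. The function names, the neighbors_busy signal and the
returned strings are all hypothetical; nothing like this exists in OW today.

```python
import statistics
from typing import List


def latency_threshold(samples_ms: List[float], z: float = 2.326) -> float:
    """Upper bound of 'normal' latency under a normal-distribution model.

    z = 2.326 corresponds roughly to the 99th percentile.
    """
    mean = statistics.mean(samples_ms)
    stdev = statistics.stdev(samples_ms)
    return mean + z * stdev


def handle_activation(latency_ms: float, threshold_ms: float,
                      neighbors_busy: bool) -> str:
    """Pick one of the two courses of action for a slow activation."""
    if latency_ms <= threshold_ms:
        return "ok"
    if neighbors_busy:
        # 1) Impacted by other actions: move it to another VM.
        return "reschedule"
    # 2) Otherwise the anomaly is in the action itself.
    return "notify_developer"
```

How neighbors_busy would be derived (CPU, network, disk I/O of co-located
containers) is exactly the monitoring question raised above.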

These examples are out of scope for the current proposal. I only shared
them so that we don't take monitoring out of the picture later. It's worth
a separate conversation on this DL, and it's not as pressing as the
performance topic is right now.

Dragos

On Thu, Jul 6, 2017 at 4:40 AM Michael M Behrendt <
[email protected]> wrote:

> thx for clarifying, very helpful. The approach you described could be
> really interesting. I was thrown off by Dragos' comment saying:
>
>  "What stops Openwhisk to be smart in observing the response times, CPU
> consumption memory consumption of the running containers ? Doing so it
> could learn automatically how many concurrent requests 1 action can
> handle."
>
> ...which in my mind would have implied a much higher level of complexity
> and traditional behavior than what you described.
>
> Dragos,
> did I misinterpret you?
>
>
>
> Thanks & best regards
> Michael
>
>
>
>
> From:   Rodric Rabbah <[email protected]>
> To:     [email protected]
> Date:   07/06/2017 01:04 PM
> Subject:        Re: Improving support for UI driven use cases
>
>
>
> The prototype PR from Tyson was based on a fixed capacity of concurrent
> activations per container. From that, I presume once the limit is reached,
> the load balancer would roll over to allocate a new container.
>
> -r
>
> > On Jul 6, 2017, at 6:09 AM, Michael M Behrendt <[email protected]> wrote:
> >
> > Hi Michael,
> >
> > thx for checking. I wasn't referring to adding/removing VMs, but rather
> > activation containers. In today's model that is done intrinsically, while
> > I *think* in what Dragos described, the containers would have to be
> > monitored somehow so this new component can decide (based on
> > cpu/mem/io/etc load within the containers) when to add/remove containers.
> >
> >
> > Thanks & best regards
> > Michael
