Re: Improving support for UI driven use cases

2017-07-08 Thread Nate DAmico
Exciting to see use cases pushing the limits of OpenWhisk and the proposal
for UI and lower-latency use cases.

A lot of points are going on in this thread, and I assume more will be fleshed
out here as well as on the wiki page. I wanted to add one more consideration
the group should take into account; sorry if this muddies the waters some.

With the group's efforts to port OpenWhisk to Kubernetes, this low
latency/scale-out use case seems like it would be affected partially or greatly
depending on how the implementation approach comes out. Anything that deals
with spin-up/scale-out/scale-down and, generally, scheduling of containers
would be affected when running on Kubernetes or the like. In some cases the
underlying orchestrator would provide some "batteries included" that OpenWhisk
could possibly leverage or get "for free", such as Kubernetes horizontal pod
autoscaling driven by CPU or other Whisk-provided metrics for managing the
pool of request-handling containers:

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale
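
For a rough sense of what that "for free" behavior amounts to, here is a
minimal sketch of the replica calculation the horizontal pod autoscaler
performs (the formula is from the Kubernetes docs linked above; feeding it a
whisk-specific metric such as in-flight activations per pod is purely my
assumption, not an existing integration):

    import math

    def desired_replicas(current_replicas: int,
                         current_metric: float,
                         target_metric: float) -> int:
        # Scale the pod count proportionally to how far the observed
        # metric is from its target value.
        if current_replicas == 0 or target_metric <= 0:
            return current_replicas
        return math.ceil(current_replicas * (current_metric / target_metric))

    # Hypothetical numbers: 10 pods averaging 150 in-flight activations each,
    # with a target of 100 per pod -> the autoscaler would ask for 15 pods.
    print(desired_replicas(current_replicas=10, current_metric=150, target_metric=100))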

Anyway, I didn't want to muddy the waters, but since both Kubernetes support
and this proposal are still at the proposal stage, the talk of making parts of
OpenWhisk more pluggable for Kubernetes support could have an effect on this
approach. It would be great if, once both land, one wouldn't have to run
"native whisk" in order to take advantage of this proposal but could run it on
the "kubernetes whisk" as well; I understand, though, that sometimes things
happen in multiple stages of design and implementation.

Nate


On Thu, Jul 6, 2017 at 1:00 PM, Dascalita Dragos <ddrag...@gmail.com> wrote:

> > The prototype PR from Tyson was based on a fixed capacity of concurrent
> activations per container. From that, I presume once the limit is reached,
> the load balancer would roll over to allocate a new container.
> +1. This is indeed the intent in this proposal.
>
> > a much higher level of complexity and traditional behavior than what you
> described.
>
> Thanks for bringing up this point as well. IMO this also makes sense, and
> it's not in conflict with the current proposal, but rather an addition to
> it, for a later time.  if OW doesn't monitor the activity at the action
> container level, then it's going to be hard to ensure reliable resource
> allocation across action containers. Based on my tests Memory is the only
> parameter we can correctly allocate for Docker containers. For CPU, unless
> "--cpu-set" is used, CPU is a shared resource across actions; one action
> may impact other actions. For network, unless we implement custom network
> drivers for containers, the bandwidth is shared between actions; one action
> can congest the network, impacting other actions as well . Disk I/O, same
> problem.
>
> So my point is that without monitoring, resource isolation (beyond memory)
> remains theoretical at this point.
>
> In an ideal picture OW would monitor closely any available parameters when
> invoking actions, through Tracing, monitoring containers, etc, anything
> that's available. Then through machine learning OW can learn what's a
> normal "SLA" for an action, maybe by simply learning the normal
> distribution of response times, if CPU and other parameters are too much to
> analyze. Then if the action doesn't behave normally for an Nth percentile,
> take 2 courses of action:
> 1) observe if the action has been impacted by other actions, and
> re-schedule it on other VMs if that's the case. Today OW tries to achieve
> some isolation through load balancer and invoker settings, but the rules
> are not dynamic.
> 2) otherwise, notify the developer that an anomaly is happening for one of
> the actions
>
> These examples are out of the scope for the current proposal. I only shared
> them so that we don't take monitoring out of the picture later. It's worth
> a separate conversation on this DL, and it's not as pressing as the
> performance topic is right now.
>
> Dragos
>
>
> On Thu, Jul 6, 2017 at 4:40 AM Michael M Behrendt <
> michaelbehre...@de.ibm.com> wrote:
>
> > thx for clarifying, very helpful. The approach you described could be
> > really interesting. I was thrown off by Dragos' comment saying:
> >
> >  "What stops Openwhisk to be smart in observing the response times, CPU
> > consumption memory consumption of the running containers ? Doing so it
> > could learn automatically how many concurrent requests 1 action can
> > handle."
> >
> > ...which in my mind would have implied a much higher level of complexity
> > and traditional behavior than what you described.
> >
> > Dragos,
> > did I misinterpret you?
> >
> >
> >
> > Thanks & best regards
> > Michael
> &

Re: Improving support for UI driven use cases

2017-07-06 Thread Dascalita Dragos
> The prototype PR from Tyson was based on a fixed capacity of concurrent
activations per container. From that, I presume once the limit is reached,
the load balancer would roll over to allocate a new container.
+1. This is indeed the intent in this proposal.
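
(For illustration only, here is a minimal sketch of that roll-over behavior;
the names and structure are my own reading of the idea, not code from Tyson's
PR:)

    class ContainerSlot:
        """A warm action container with a fixed cap on concurrent activations."""
        def __init__(self, max_concurrent: int):
            self.max_concurrent = max_concurrent
            self.in_flight = 0

        def has_capacity(self) -> bool:
            return self.in_flight < self.max_concurrent

    def route_activation(containers, max_concurrent, start_container):
        # Prefer a warm container with spare capacity; once every container
        # is at its fixed limit, roll over and allocate a new one.
        for c in containers:
            if c.has_capacity():
                c.in_flight += 1
                return c
        new_c = start_container(max_concurrent)  # e.g. ask an invoker for a container
        new_c.in_flight = 1
        containers.append(new_c)
        return new_c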

> a much higher level of complexity and traditional behavior than what you
described.

Thanks for bringing up this point as well. IMO this also makes sense, and
it's not in conflict with the current proposal, but rather an addition to
it for a later time. If OW doesn't monitor the activity at the action
container level, then it's going to be hard to ensure reliable resource
allocation across action containers. Based on my tests, memory is the only
parameter we can correctly allocate for Docker containers. For CPU, unless
"--cpuset-cpus" is used, CPU is a shared resource across actions; one action
may impact other actions. For network, unless we implement custom network
drivers for containers, the bandwidth is shared between actions; one action
can congest the network, impacting other actions as well. Disk I/O has the
same problem.

So my point is that without monitoring, resource isolation (beyond memory)
remains theoretical at this point.

In an ideal picture, OW would closely monitor any available parameters when
invoking actions: tracing, container-level metrics, anything that's
available. Then, through machine learning, OW could learn what a normal
"SLA" looks like for an action, maybe by simply learning the normal
distribution of response times if CPU and other parameters are too much to
analyze. If an action then stops behaving normally for an Nth percentile,
take one of two courses of action (a rough sketch follows below):
1) observe whether the action has been impacted by other actions, and
re-schedule it on other VMs if that's the case. Today OW tries to achieve
some isolation through load balancer and invoker settings, but the rules
are not dynamic.
2) otherwise, notify the developer that an anomaly is happening for one of
the actions.
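
(A rough sketch of the response-time part of this idea, for illustration only;
the class, the sample threshold, and the outlier rule are all invented here
and say nothing about how OW would actually collect the data:)

    import statistics

    class ResponseTimeProfile:
        """Learns a per-action baseline of response times and flags outliers."""
        def __init__(self, threshold_stddevs: float = 3.0):
            self.samples = []
            self.threshold_stddevs = threshold_stddevs

        def record(self, duration_ms: float) -> None:
            self.samples.append(duration_ms)

        def is_anomalous(self, duration_ms: float) -> bool:
            # Wait for enough history, then flag anything far outside the
            # learned distribution (roughly the "Nth percentile" idea above).
            if len(self.samples) < 100:
                return False
            mean = statistics.mean(self.samples)
            stdev = statistics.stdev(self.samples)
            return duration_ms > mean + self.threshold_stddevs * stdev

On an anomaly, course 1) would check co-located actions and possibly
re-schedule; course 2) would notify the developer.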

These examples are out of scope for the current proposal. I only shared
them so that we don't take monitoring out of the picture later. It's worth
a separate conversation on this DL, and it's not as pressing as the
performance topic is right now.

Dragos


On Thu, Jul 6, 2017 at 4:40 AM Michael M Behrendt <
michaelbehre...@de.ibm.com> wrote:

> thx for clarifying, very helpful. The approach you described could be
> really interesting. I was thrown off by Dragos' comment saying:
>
>  "What stops Openwhisk to be smart in observing the response times, CPU
> consumption memory consumption of the running containers ? Doing so it
> could learn automatically how many concurrent requests 1 action can
> handle."
>
> ...which in my mind would have implied a much higher level of complexity
> and traditional behavior than what you described.
>
> Dragos,
> did I misinterpret you?
>
>
>
> Thanks & best regards
> Michael
>
>
>
>
> From:   Rodric Rabbah <rod...@gmail.com>
> To: dev@openwhisk.apache.org
> Date:   07/06/2017 01:04 PM
> Subject:Re: Improving support for UI driven use cases
>
>
>
> The prototype PR from Tyson was based on a fixed capacity of concurrent
> activations per container. From that, I presume once the limit is reached,
> the load balancer would roll over to allocate a new container.
>
> -r
>
> > On Jul 6, 2017, at 6:09 AM, Michael M Behrendt
> <michaelbehre...@de.ibm.com> wrote:
> >
> > Hi Michael,
> >
> > thx for checking. I wasn't referring to adding/removing VMs, but rather
> > activation containers. In today's model that is done intrinsically,
> while
> > I *think* in what Dragos described, the containers would have to be
> > monitored somehow so this new component can decide (based on
> > cpu/mem/io/etc load within the containers) when to add/remove
> containers.
> >
> >
> > Thanks & best regards
> > Michael
>
>
>
>
>
>


Re: Improving support for UI driven use cases

2017-07-06 Thread Michael M Behrendt

07/13 would work for me, though 9 am would be better.

Would that work for you?

Sent from my iPhone

> On 6. Jul 2017, at 19:05, Tyson Norris  wrote:
>
>
>> On Jul 5, 2017, at 10:44 PM, Tyson Norris 
wrote:
>>
>> I meant to add : I will work out with Dragos a time to propose asap, and
get back to the group so that we can negotiate a meeting time that will
work for everyone who wants to attend in realtime.
>>
>
> For a call, can people make July 13 8am PDT work? I think July 17 is the
next available time slot for us.
>
> Thanks
> Tyson


Re: Improving support for UI driven use cases

2017-07-06 Thread Tyson Norris

> On Jul 5, 2017, at 10:44 PM, Tyson Norris  wrote:
> 
> I meant to add : I will work out with Dragos a time to propose asap, and get 
> back to the group so that we can negotiate a meeting time that will work for 
> everyone who wants to attend in realtime.
> 

For a call, can people make July 13 8am PDT work? I think July 17 is the next 
available time slot for us.

Thanks
Tyson

Re: Improving support for UI driven use cases

2017-07-06 Thread Michael M Behrendt
thx for clarifying, very helpful. The approach you described could be 
really interesting. I was thrown off by Dragos' comment saying:

 "What stops Openwhisk to be smart in observing the response times, CPU 
consumption memory consumption of the running containers ? Doing so it 
could learn automatically how many concurrent requests 1 action can 
handle."

...which in my mind would have implied a much higher level of complexity 
and traditional behavior than what you described.

Dragos,
did I misinterpret you?



Thanks & best regards
Michael




From:   Rodric Rabbah <rod...@gmail.com>
To: dev@openwhisk.apache.org
Date:   07/06/2017 01:04 PM
Subject:    Re: Improving support for UI driven use cases



The prototype PR from Tyson was based on a fixed capacity of concurrent 
activations per container. From that, I presume once the limit is reached, 
the load balancer would roll over to allocate a new container.

-r

> On Jul 6, 2017, at 6:09 AM, Michael M Behrendt 
<michaelbehre...@de.ibm.com> wrote:
> 
> Hi Michael,
> 
> thx for checking. I wasn't referring to adding/removing VMs, but rather 
> activation containers. In today's model that is done intrinsically, 
while 
> I *think* in what Dragos described, the containers would have to be 
> monitored somehow so this new component can decide (based on 
> cpu/mem/io/etc load within the containers) when to add/remove 
containers.
> 
> 
> Thanks & best regards
> Michael







Re: Improving support for UI driven use cases

2017-07-06 Thread Michael M Behrendt
Hi Michael,

thx for checking. I wasn't referring to adding/removing VMs, but rather 
activation containers. In today's model that is done intrinsically, while 
I *think* in what Dragos described, the containers would have to be 
monitored somehow so this new component can decide (based on 
cpu/mem/io/etc load within the containers) when to add/remove containers.


Thanks & best regards
Michael

---
IBM Distinguished Engineer
Chief Architect, Serverless / FaaS & OpenWhisk
Mobile: +49-170-7993527
michaelbehre...@de.ibm.com |  @michael_beh

IBM Deutschland Research & Development GmbH / Chairman of the Supervisory 
Board: Martina Koederitz
Managing Director: Dirk Wittkopp 
Registered office: Böblingen / Commercial register: Amtsgericht Stuttgart, 
HRB 243294 



From:   Michael Marth <mma...@adobe.com.INVALID>
To: "dev@openwhisk.apache.org" <dev@openwhisk.apache.org>
Date:   07/05/2017 08:28 PM
Subject:    Re: Improving support for UI driven use cases



Hi Michael,

To make sure we mean the same thing with the word “autoscaling” in the 
context of this thread and in the context of OpenWhisk: I refer to the 
(automated) increase/decrease of the VMs that run the action containers.
Is that what you also refer to?

If so, then the proposal at hand is orthogonal to autoscaling. At its core 
it is about increasing the density of executing actions within one 
container and in that sense independent of how many containers, VMs, etc 
there are in the system or how the system is shrunk/grown.

In practical terms there is still a connection between proposal and 
scaling the VMs: if the density of executing actions is increased by 
orders of magnitude then the topic of scaling the VMs becomes a much less 
pressing topic (at least for the types of workload I described 
previously). But this practical consideration should not be mistaken for 
this being a discussion of autoscaling.

Please let me know if I misunderstood your use of the term autoscaling or 
if the above does not explain well.

Thanks!
Michael 




On 05/07/17 16:57, "Michael M Behrendt" <michaelbehre...@de.ibm.com> 
wrote:

>
>
>Hi Michael/Rodric,
>
>I'm struggling to understand how a separate invoker pool helps us 
avoiding
>to implement traditional autoscaling if we process multiple activations 
as
>threads within a shared process. Can you pls elaborate / provide an
>example?
>
>Sent from my iPhone
>
>> On 5. Jul 2017, at 16:53, Michael Marth <mma...@adobe.com.INVALID> 
wrote:
>>
>> Michael B,
>> Re your question: exactly what Rodric said :)
>>
>>
>>
>>> On 05/07/17 12:32, "Rodric Rabbah" <rod...@gmail.com> wrote:
>>>
>>> The issue at hand is precisely because there isn't any autoscaling of
>capacity (N invokers provide M containers per invoker). Once all those
>slots are consumed any new requests are queued - as previously discussed.
>>>
>>> Adding more density per vm is one way of providing additional capacity
>over finite resources. This is the essence of the initial proposal.
>>>
>>> As noted in previous discussions on this topic, this should be viewed 
as
>managing a different resource pool (and not the same pool of containers 
as
>ephemeral actions). Once you buy into that, generalization to other
>resource pools becomes natural.
>>>
>>> Going further, serverless becomes the new PaaS.
>>>
>>> -r
>>>
>>>> On Jul 5, 2017, at 6:11 AM, Michael M Behrendt
><michaelbehre...@de.ibm.com> wrote:
>>>>
>>>> Hi Michael,
>>>>
>>>> thanks for the feedback -- glad you like my stmt re value prop :-)
>>>>
>>>> I might not yet have fully gotten my head around Steve's proposal --
>what
>>>> are your thoughts on how this would help avoiding the 
reimplementation
>of
>>>> an autoscaling / feedback loop mechanism, as we know it from more
>>>> traditional runtime platforms?
>>>>
>>>>
>>>> Thanks & best regards
>>>> Michael
>>>>
>>>>
>>>>
>>>> From:   Michael Marth <mma...@adobe.com.INVALID>
>>>> To: "dev@openwhisk.apache.org" <dev@openwhisk.apache.org>
>>>> Date:   07/05/2017 11:25 AM
>>>> Subject:Re: Improving support for UI driven use cases
>>>>
>>>>
>>>>
>>>> Hi Michael,
>>>>
>>>> Totally agree with your statement
>>>> “value prop of serverless is that folks don't have to care about 
that"
>>>>
>>>> Again, the proposal at hand does not intend to change 

Re: Improving support for UI driven use cases

2017-07-05 Thread Tyson Norris
Thanks everyone for the feedback.

I’d be happy to join a call -


A couple of details on the proposal that may or may not be clear:
- no changes to existing behavior without explicit adoption by the action 
developer or function client (e.g. the developer would have to “allow” the 
function to receive concurrent activations)
- integrate this support at the load balancer level - instead of publishing to 
a Kafka topic for an invoker, publish directly to a container that was launched 
by an invoker (a rough sketch follows below). There is also no reason that 
multiple load balancers cannot be active, lending to “no changes to existing 
behavior”
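
A minimal sketch of how that load-balancer-level decision could look, purely
illustrative; the opt-in flag, the direct HTTP call to the container, and the
Kafka fallback are my assumptions about the proposal, and the collaborators
(warm_containers, container.post, publish_to_invoker) are placeholders:

    def dispatch(activation, action, warm_containers, publish_to_invoker):
        # If the developer opted the action into concurrency, try to post the
        # activation directly to a warm container that still has capacity.
        if action.get("allow_concurrency"):
            for container in warm_containers.get(action["name"], []):
                if container.in_flight < action.get("max_concurrency", 1):
                    container.in_flight += 1
                    return container.post("/run", activation)
        # Otherwise keep today's behavior: publish to an invoker's Kafka topic.
        return publish_to_invoker(action, activation)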




On Jul 4, 2017, at 6:55 AM, Michael Marth 
> wrote:

Hi Jeremias, all,

Tyson and Dragos are travelling this week, so that I don’t know by when they 
get to respond. I have worked with them on this topic, so let me jump in and 
comment until they are able to reply.

From my POV having a call like you suggest is a really good idea. Let’s wait 
for Tyson & Dragos to chime in to find a date.

As you mention the discussion so far was jumping across different topics, 
especially the use case, the problem to be solved and the proposed solution. In 
preparation of the call I think we can clarify use case and problem on the 
list. Here’s my view:

Use Case

For us the use case can be summarised with “dynamic, high performance 
websites/mobile apps”. This implies:
1 High concurrency, i.e. Many requests coming in at the same time
2 The code to be executed is the same code across these different requests (as 
opposed to a long tail distribution of many different actions being executed 
concurrently). In our case “many” would mean “hundreds” or a few thousand.
3 The latency (time to start execution) matters, because human users are 
waiting for the response. Ideally, at these orders of magnitude of concurrent 
requests the latency should not change much.

All 3 requirements need to be satisfied for this use case.
In the discussion so far it was mentioned that there are other use cases which 
might have similar requirements. That’s great and I do not want to rule them 
out, obviously. The above is just to make it clear from where we are coming 
from.

At this point I would like to mention that it is my understanding that this use 
case is within OpenWhisk’s strike zone, i.e. Something that we all think is 
reasonable to support. Please speak up if you disagree.

The Problem

One can look at the problem in two ways:
Either you keep the resources of the OW system constant (i.e. No scaling). In 
that case latency increases very quickly as demonstrated by Tyson’s tests.
Or you increase the system’s capacity. In that case the amount of machines to 
satisfy this use case quickly becomes prohibitively expensive to run for the OW 
operator – where expensive is defined as “compared to traditional web servers” 
(in our case a standard Node.js server). Meaning, you need 100-1000 concurrent 
action containers to serve what can be served by 1 or 2 Node.js containers.

Of course, the proposed solution is not a fundamental “fix” for the above. It 
would only move the needle ~2 orders of magnitude – so that the current problem 
would not be a problem in reality anymore (and simply remain as a theoretical 
problem). For me that would be good enough.

The solution approach

Would not like to comment on the proposed solution’s details (and leave that to 
Dragos and Tyson). However, it was mentioned that the approach would change the 
programming model for users:
Our mindset and approach was that we explicitly do not want to change how 
OpenWhisk exposes itself to users. Meaning, users should still be able to use 
NPMs, etc. - i.e. this would be an internal implementation detail that is not 
visible to users. (We could make things more explicit to users and e.g. have 
them request a special concurrent runtime if we wish to do so – so far we 
tried to make it transparent to users, though.)

Many thanks
Michael



On 03/07/17 14:48, "Jeremias Werner" 
>
 wrote:

Hi

Thanks for the write-up and the proposal. I think this is a nice idea and
sounds like a nice way of increasing throughput. Reading through the thread
it feels like there are different topics/problems mixed-up and the
discussion is becoming very complex already.

Therefore I would like to suggest that we streamline the discussion a bit,
maybe in a zoom.us session where we first give Tyson and Dragos 
the chance
to walk through the proposal and clarify questions of the audience. Once we
are all on the same page we could think of a discussion about the benefits
(improved throughput, latency) vs. challenges (resource sharing, crash
model, container lifetime, programming model) on the core of the proposal:
running multiple activations in a single user container. Once we have a
common understanding on that part we 

Re: Improving support for UI driven use cases

2017-07-05 Thread Michael Marth
Hi Alex,

That is a very interesting question.
If the programming model and guarantees that are exposed to developers involve 
guarantees on the amount of memory, then I still see two options:
- reserve the full capacity (i.e. the current model)
- or “overbook” a container (not exactly the spirit of the current proposal, 
but it would lead to similar results; see the sketch at the end of this note)
This leads into a more product-management-like discussion of whether asking 
developers to specify the amount of RAM they desire is a good thing in the 
first place. In the spirit of “devs shall not care about infra” it might be 
preferable to not even make devs think about that and just execute the code 
with just enough RAM (or whatever resources are needed).
I mean you can look at the fact that some serverless providers expose RAM, etc 
to the developers as actually breaking the abstraction and working against the 
core value prop.
TBH I am not sure if there is a “right” way to look at this topic. Might depend 
on circumstances of the OW deployment.
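
Purely to illustrate the “overbook” option, a toy capacity calculation; the
numbers and the overbooking factor are invented, and this is not how OW
accounts for memory today:

    def admissible_containers(invoker_memory_mb: int,
                              action_memory_mb: int,
                              overbook_factor: float = 1.0) -> int:
        # If we assume actions rarely use their full reservation, we can
        # admit overbook_factor times more containers per invoker.
        effective_per_action = action_memory_mb / overbook_factor
        return int(invoker_memory_mb // effective_per_action)

    # Reserving the full 256 MB per action on a 4 GB invoker -> 16 containers;
    # overbooking 4x -> 64 containers on the same machine.
    print(admissible_containers(4096, 256, overbook_factor=1.0))
    print(admissible_containers(4096, 256, overbook_factor=4.0))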

Michael






On 05/07/17 17:45, "Alex Glikson" <glik...@il.ibm.com> wrote:

>Once different 'flavors' of pools/invokers are supported, one could 
>implement whatever policy for resource allocation and/or isolation and/or 
>load balancing they want in an invoker (or group of invokers) - without 
>necessarily affecting the 'core' of OpenWhisk, as long as the programming 
>model remains the same.
>However, with containers handling multiple requests, I am not sure that 
>the latter will be still true -- in particular, whether the developer can 
>still assume dedicated resource allocation per action invocation 
>(primarily memory), or we would also need to surface heterogeneous 
>'flavors' of resources allocated for an action (which might be perceived 
>as a natural and good thing - or maybe the opposite, given that we are 
>trying to make the developer unaware of infrastructure).
>
>Regards,
>Alex
>
>
>
>
>From:   "Michael M Behrendt" <michaelbehre...@de.ibm.com>
>To:     dev@openwhisk.apache.org
>Date:   05/07/2017 05:58 PM
>Subject:Re: Improving support for UI driven use cases
>
>
>
>
>
>Hi Michael/Rodric,
>
>I'm struggling to understand how a separate invoker pool helps us avoiding
>to implement traditional autoscaling if we process multiple activations as
>threads within a shared process. Can you pls elaborate / provide an
>example?
>
>Sent from my iPhone
>
>> On 5. Jul 2017, at 16:53, Michael Marth <mma...@adobe.com.INVALID> 
>wrote:
>>
>> Michael B,
>> Re your question: exactly what Rodric said :)
>>
>>
>>
>>> On 05/07/17 12:32, "Rodric Rabbah" <rod...@gmail.com> wrote:
>>>
>>> The issue at hand is precisely because there isn't any autoscaling of
>capacity (N invokers provide M containers per invoker). Once all those
>slots are consumed any new requests are queued - as previously discussed.
>>>
>>> Adding more density per vm is one way of providing additional capacity
>over finite resources. This is the essence of the initial proposal.
>>>
>>> As noted in previous discussions on this topic, this should be viewed 
>as
>managing a different resource pool (and not the same pool of containers as
>ephemeral actions). Once you buy into that, generalization to other
>resource pools becomes natural.
>>>
>>> Going further, serverless becomes the new PaaS.
>>>
>>> -r
>>>
>>>> On Jul 5, 2017, at 6:11 AM, Michael M Behrendt
><michaelbehre...@de.ibm.com> wrote:
>>>>
>>>> Hi Michael,
>>>>
>>>> thanks for the feedback -- glad you like my stmt re value prop :-)
>>>>
>>>> I might not yet have fully gotten my head around Steve's proposal --
>what
>>>> are your thoughts on how this would help avoiding the reimplementation
>of
>>>> an autoscaling / feedback loop mechanism, as we know it from more
>>>> traditional runtime platforms?
>>>>
>>>>
>>>> Thanks & best regards
>>>> Michael
>>>>
>>>>
>>>>
>>>> From:   Michael Marth <mma...@adobe.com.INVALID>
>>>> To: "dev@openwhisk.apache.org" <dev@openwhisk.apache.org>
>>>> Date:   07/05/2017 11:25 AM
>>>> Subject:Re: Improving support for UI driven use cases
>>>>
>>>>
>>>>
>>>> Hi Michael,
>>>>
>>>> Totally agree with your statement
>>>> “value prop of serverless is that folks don't have to care about that"
>>>>
>>>> Again, the proposal at ha

Re: Improving support for UI driven use cases

2017-07-05 Thread Michael Marth
Hi Michael,

To make sure we mean the same thing with the word “autoscaling” in the context 
of this thread and in the context of OpenWhisk: I refer to the (automated) 
increase/decrease of the VMs that run the action containers.
Is that what you also refer to?

If so, then the proposal at hand is orthogonal to autoscaling. At its core it 
is about increasing the density of executing actions within one container and 
in that sense independent of how many containers, VMs, etc there are in the 
system or how the system is shrunk/grown.

In practical terms there is still a connection between proposal and scaling the 
VMs: if the density of executing actions is increased by orders of magnitude 
then the topic of scaling the VMs becomes a much less pressing topic (at least 
for the types of workload I described previously). But this practical 
consideration should not be mistaken for this being a discussion of autoscaling.

Please let me know if I misunderstood your use of the term autoscaling or if 
the above does not explain well.

Thanks!
Michael 




On 05/07/17 16:57, "Michael M Behrendt" <michaelbehre...@de.ibm.com> wrote:

>
>
>Hi Michael/Rodric,
>
>I'm struggling to understand how a separate invoker pool helps us avoiding
>to implement traditional autoscaling if we process multiple activations as
>threads within a shared process. Can you pls elaborate / provide an
>example?
>
>Sent from my iPhone
>
>> On 5. Jul 2017, at 16:53, Michael Marth <mma...@adobe.com.INVALID> wrote:
>>
>> Michael B,
>> Re your question: exactly what Rodric said :)
>>
>>
>>
>>> On 05/07/17 12:32, "Rodric Rabbah" <rod...@gmail.com> wrote:
>>>
>>> The issue at hand is precisely because there isn't any autoscaling of
>capacity (N invokers provide M containers per invoker). Once all those
>slots are consumed any new requests are queued - as previously discussed.
>>>
>>> Adding more density per vm is one way of providing additional capacity
>over finite resources. This is the essence of the initial proposal.
>>>
>>> As noted in previous discussions on this topic, this should be viewed as
>managing a different resource pool (and not the same pool of containers as
>ephemeral actions). Once you buy into that, generalization to other
>resource pools becomes natural.
>>>
>>> Going further, serverless becomes the new PaaS.
>>>
>>> -r
>>>
>>>> On Jul 5, 2017, at 6:11 AM, Michael M Behrendt
><michaelbehre...@de.ibm.com> wrote:
>>>>
>>>> Hi Michael,
>>>>
>>>> thanks for the feedback -- glad you like my stmt re value prop :-)
>>>>
>>>> I might not yet have fully gotten my head around Steve's proposal --
>what
>>>> are your thoughts on how this would help avoiding the reimplementation
>of
>>>> an autoscaling / feedback loop mechanism, as we know it from more
>>>> traditional runtime platforms?
>>>>
>>>>
>>>> Thanks & best regards
>>>> Michael
>>>>
>>>>
>>>>
>>>> From:   Michael Marth <mma...@adobe.com.INVALID>
>>>> To: "dev@openwhisk.apache.org" <dev@openwhisk.apache.org>
>>>> Date:   07/05/2017 11:25 AM
>>>> Subject:Re: Improving support for UI driven use cases
>>>>
>>>>
>>>>
>>>> Hi Michael,
>>>>
>>>> Totally agree with your statement
>>>> “value prop of serverless is that folks don't have to care about that"
>>>>
>>>> Again, the proposal at hand does not intend to change that at all. On
>the
>>>> contrary - in our mind it's a requirement that the developer should not
>
>>>> change or that internals of the execution engines get exposed.
>>>>
>>>> I find Stephen's comment about generalising the runtime behaviour very
>>>> exciting. It could open the door to very different types of workloads
>>>> (like training Tensorflow or running Spark jobs), but with the same
>value
>>>> prop: users do not have to care about the managing resources/servers.
>And
>>>> for providers of OW systems all the OW goodies would still apply (e.g.
>>>> running untrusted code). Moreover, if we split the Invoker into
>different
>>>> specialised Invokers then those different specialised workloads could
>live
>>>> independently from each other (in terms of code as well as resource
>>>> allocation in deployments).
>>>> You can probably tell I am

Re: Improving support for UI driven use cases

2017-07-05 Thread Michael M Behrendt


Again, this doesn't answer how to avoid implementing the autoscaling
function I described below...

Sent from my iPhone

> On 5. Jul 2017, at 17:45, Alex Glikson <glik...@il.ibm.com> wrote:
>
> Once different 'flavors' of pools/invokers are supported, one could
> implement whatever policy for resource allocation and/or isolation and/or

> load balancing they want in an invoker (or group of invokers) - without
> necessarily affecting the 'core' of OpenWhisk, as long as the programming

> model remains the same.
> However, with containers handling multiple requests, I am not sure that
> the latter will be still true -- in particular, whether the developer can

> still assume dedicated resource allocation per action invocation
> (primarily memory), or we would also need to surface heterogeneous
> 'flavors' of resources allocated for an action (which might be perceived
> as a natural and good thing - or maybe the opposite, given that we are
> trying to make the developer unaware of infrastructure).
>
> Regards,
> Alex
>
>
>
>
> From:   "Michael M Behrendt" <michaelbehre...@de.ibm.com>
> To: dev@openwhisk.apache.org
> Date:   05/07/2017 05:58 PM
> Subject:Re: Improving support for UI driven use cases
>
>
>
>
>
> Hi Michael/Rodric,
>
> I'm struggling to understand how a separate invoker pool helps us
avoiding
> to implement traditional autoscaling if we process multiple activations
as
> threads within a shared process. Can you pls elaborate / provide an
> example?
>
> Sent from my iPhone
>
>> On 5. Jul 2017, at 16:53, Michael Marth <mma...@adobe.com.INVALID>
> wrote:
>>
>> Michael B,
>> Re your question: exactly what Rodric said :)
>>
>>
>>
>>> On 05/07/17 12:32, "Rodric Rabbah" <rod...@gmail.com> wrote:
>>>
>>> The issue at hand is precisely because there isn't any autoscaling of
> capacity (N invokers provide M containers per invoker). Once all those
> slots are consumed any new requests are queued - as previously discussed.
>>>
>>> Adding more density per vm is one way of providing additional capacity
> over finite resources. This is the essence of the initial proposal.
>>>
>>> As noted in previous discussions on this topic, this should be viewed
> as
> managing a different resource pool (and not the same pool of containers
as
> ephemeral actions). Once you buy into that, generalization to other
> resource pools becomes natural.
>>>
>>> Going further, serverless becomes the new PaaS.
>>>
>>> -r
>>>
>>>> On Jul 5, 2017, at 6:11 AM, Michael M Behrendt
> <michaelbehre...@de.ibm.com> wrote:
>>>>
>>>> Hi Michael,
>>>>
>>>> thanks for the feedback -- glad you like my stmt re value prop :-)
>>>>
>>>> I might not yet have fully gotten my head around Steve's proposal --
> what
>>>> are your thoughts on how this would help avoiding the reimplementation
> of
>>>> an autoscaling / feedback loop mechanism, as we know it from more
>>>> traditional runtime platforms?
>>>>
>>>>
>>>> Thanks & best regards
>>>> Michael
>>>>
>>>>
>>>>
>>>> From:   Michael Marth <mma...@adobe.com.INVALID>
>>>> To: "dev@openwhisk.apache.org" <dev@openwhisk.apache.org>
>>>> Date:   07/05/2017 11:25 AM
>>>> Subject:Re: Improving support for UI driven use cases
>>>>
>>>>
>>>>
>>>> Hi Michael,
>>>>
>>>> Totally agree with your statement
>>>> “value prop of serverless is that folks don't have to care about that"
>>>>
>>>> Again, the proposal at hand does not intend to change that at all. On
> the
>>>> contrary - in our mind it's a requirement that the developer should
> not
>
>>>> change or that internals of the execution engines get exposed.
>>>>
>>>> I find Stephen's comment about generalising the runtime behaviour very
>>>> exciting. It could open the door to very different types of workloads
>>>> (like training Tensorflow or running Spark jobs), but with the same
> value
>>>> prop: users do not have to care about the managing resources/servers.
> And
>>>> for providers of OW systems all the OW goodies would still apply (e.g.
>>>> running untrusted code). Moreover, if we split the Invoker into
> different
>>>> specialised Invokers then those different spec

Re: Improving support for UI driven use cases

2017-07-05 Thread Michael M Behrendt
Hi Michael,

thanks for the feedback -- glad you like my stmt re value prop :-)

I might not yet have fully gotten my head around Steve's proposal -- what 
are your thoughts on how this would help avoiding the reimplementation of 
an autoscaling / feedback loop mechanism, as we know it from more 
traditional runtime platforms?


Thanks & best regards
Michael



From:   Michael Marth <mma...@adobe.com.INVALID>
To: "dev@openwhisk.apache.org" <dev@openwhisk.apache.org>
Date:   07/05/2017 11:25 AM
Subject:    Re: Improving support for UI driven use cases



Hi Michael,

Totally agree with your statement
“value prop of serverless is that folks don't have to care about that"

Again, the proposal at hand does not intend to change that at all. On the 
contrary - in our mind it's a requirement that the developer should not 
change or that internals of the execution engines get exposed.

I find Stephen's comment about generalising the runtime behaviour very 
exciting. It could open the door to very different types of workloads 
(like training Tensorflow or running Spark jobs), but with the same value 
prop: users do not have to care about the managing resources/servers. And 
for providers of OW systems all the OW goodies would still apply (e.g. 
running untrusted code). Moreover, if we split the Invoker into different 
specialised Invokers then those different specialised workloads could live 
independently from each other (in terms of code as well as resource 
allocation in deployments).
You can probably tell I am really excited about Stephen's idea :) I think 
it would be a great step forward in increasing the use cases for OW.

Cheers
Michael





On 04/07/17 20:15, "Michael M Behrendt" <michaelbehre...@de.ibm.com> 
wrote:

>Hi Dragos,
>
>> What stops
>> Openwhisk to be smart in observing the response times, CPU consumption,
>> memory consumption of the running containers ? 
>
>What are your thoughts on how this approach would be different from the 
many IaaS- and PaaS-centric autoscaling solutions that have been built 
over the last years? All of them require relatively complex policies (eg 
scale based on cpu or mem utilization, end-user response time, etc.? What 
are the thresholds for when to add/remove capacity?), and a value prop of 
serverless is that folks don't have to care about that.
>
>we should discuss more during the call, but wanted to get this out as 
food for thought.
>
>Sent from my iPhone
>
>On 4. Jul 2017, at 18:50, Dascalita Dragos <ddrag...@gmail.com> wrote:
>
>>> How could a developer understand how many requests per container to 
set
>> 
>> James, this is a good point, along with the other points in your email.
>> 
>> I think the developer doesn't need to know this info actually. What 
stops
>> Openwhisk to be smart in observing the response times, CPU consumption,
>> memory consumption of the running containers ? Doing so it could learn
>> automatically how many concurrent requests 1 action can handle. It 
might be
>> easier to solve this problem efficiently, instead of the other problem
>> which pushes the entire system to its limits when a couple of actions 
get a
>> lot of traffic.
>> 
>> 
>> 
>>> On Mon, Jul 3, 2017 at 10:08 AM James Thomas <jthomas...@gmail.com> 
wrote:
>>> 
>>> +1 on Markus' points about "crash safety" and "scaling". I can 
understand
>>> the reasons behind exploring this change but from a developer 
experience
>>> point of view this adds introduces a large amount of complexity to the
>>> programming model.
>>> 
>>> If I have a concurrent container serving 100 requests and one of the
>>> requests triggers a fatal error how does that affect the other 
requests?
>>> Tearing down the entire runtime environment will destroy all those
>>> requests.
>>> 
>>> How could a developer understand how many requests per container to 
set
>>> without a manual trial and error process? It also means you have to 
start
>>> considering things like race conditions or other challenges of 
concurrent
>>> code execution. This makes debugging and monitoring also more 
challenging.
>>> 
>>> Looking at the other serverless providers, I've not seen this featured
>>> requested before. Developers generally ask AWS to raise the concurrent
>>> invocations limit for their application. This keeps the platform doing 
the
>>> hard task of managing resources and being efficient and allows them to 
use
>>> the same programming model.
>>> 
>>>> On 2 July 2017 at 11:05, Markus Thömmes <markusthoem...@me.com> 
wrote:
>>>> 
>>>>

Re: Improving support for UI driven use cases

2017-07-05 Thread Michael Marth
Hi Michael,

Totally agree with your statement
“value prop of serverless is that folks don't have to care about that"

Again, the proposal at hand does not intend to change that at all. On the 
contrary - in our mind it’s a requirement that the developer should not change 
or that internals of the execution engines get exposed.

I find Stephen’s comment about generalising the runtime behaviour very 
exciting. It could open the door to very different types of workloads (like 
training Tensorflow or running Spark jobs), but with the same value prop: users 
do not have to care about the managing resources/servers. And for providers of 
OW systems all the OW goodies would still apply (e.g. running untrusted code). 
Moreover, if we split the Invoker into different specialised Invokers then 
those different specialised workloads could live independently from each other 
(in terms of code as well as resource allocation in deployments).
You can probably tell I am really excited about Stephen's idea :) I think it 
would be a great step forward in increasing the use cases for OW.

Cheers
Michael





On 04/07/17 20:15, "Michael M Behrendt"  wrote:

>Hi Dragos,
>
>> What stops
>> Openwhisk to be smart in observing the response times, CPU consumption,
>> memory consumption of the running containers ? 
>
>What are your thoughts on how this approach would be different from the many 
>IaaS- and PaaS-centric autoscaling solutions that have been built over the 
>last years? All of them require relatively complex policies (eg scale based on 
>cpu or mem utilization, end-user response time, etc.? What are the thresholds 
>for when to add/remove capacity?), and a value prop of serverless is that 
>folks don't have to care about that.
>
>we should discuss more during the call, but wanted to get this out as food for 
>thought.
>
>Sent from my iPhone
>
>On 4. Jul 2017, at 18:50, Dascalita Dragos  wrote:
>
>>> How could a developer understand how many requests per container to set
>> 
>> James, this is a good point, along with the other points in your email.
>> 
>> I think the developer doesn't need to know this info actually. What stops
>> Openwhisk to be smart in observing the response times, CPU consumption,
>> memory consumption of the running containers ? Doing so it could learn
>> automatically how many concurrent requests 1 action can handle. It might be
>> easier to solve this problem efficiently, instead of the other problem
>> which pushes the entire system to its limits when a couple of actions get a
>> lot of traffic.
>> 
>> 
>> 
>>> On Mon, Jul 3, 2017 at 10:08 AM James Thomas  wrote:
>>> 
>>> +1 on Markus' points about "crash safety" and "scaling". I can understand
>>> the reasons behind exploring this change but from a developer experience
>>> point of view this introduces a large amount of complexity to the
>>> programming model.
>>> 
>>> If I have a concurrent container serving 100 requests and one of the
>>> requests triggers a fatal error how does that affect the other requests?
>>> Tearing down the entire runtime environment will destroy all those
>>> requests.
>>> 
>>> How could a developer understand how many requests per container to set
>>> without a manual trial and error process? It also means you have to start
>>> considering things like race conditions or other challenges of concurrent
>>> code execution. This makes debugging and monitoring also more challenging.
>>> 
>>> Looking at the other serverless providers, I've not seen this featured
>>> requested before. Developers generally ask AWS to raise the concurrent
>>> invocations limit for their application. This keeps the platform doing the
>>> hard task of managing resources and being efficient and allows them to use
>>> the same programming model.
>>> 
 On 2 July 2017 at 11:05, Markus Thömmes  wrote:
 
 ...
 
>>> 
 
>>> To Rodric's points I think there are two topics to speak about and discuss:
 
 1. The programming model: The current model encourages users to break
 their actions apart in "functions" that take payload and return payload.
 Having a deployment model outlined could as noted encourage users to use
 OpenWhisk as a way to rapidly deploy/undeploy their usual webserver based
 applications. The current model is nice in that it solves a lot of
>>> problems
 for the customer in terms of scalability and "crash safeness".
 
 2. Raw throughput of our deployment model: Setting the concerns aside I
 think it is valid to explore concurrent invocations of actions on the
>>> same
 container. This does not necessarily mean that users start to deploy
 monolithic apps as noted above, but it certainly could. Keeping our
 JSON-in/JSON-out at least for now though, could encourage users to
>>> continue
 to think in functions. Having a toggle per action which is disabled by
 

Re: Improving support for UI driven use cases

2017-07-04 Thread Michael M Behrendt


To Adrian's definition (which I also like) -- if the instance only lives
half a second, how does that fit with the autoscaling behavior you outlined
below, which I _think_ relies on multi-threaded long-running processes?

Sent from my iPhone

On 4. Jul 2017, at 23:18, Dascalita Dragos  wrote:

>> how this approach would be different from the many IaaS- and
PaaS-centric
>
> I like Adrian Cockcroft's response (
> https://twitter.com/intent/like?tweet_id=736553530689998848 ) to this:
> *"...If your PaaS can efficiently start instances in 20ms that run for
half
> a second, then call it serverless..."*
>
> I think none of us here imagines that we're building a PaaS experience
for
> developers, nor does the current proposal intends to suggest we should. I
> also assume that none of us imagines to run in production a scalable
system
> with millions of concurrent users with the current setup that demands an
> incredible amount of resources.
>
> Quoting Michael M, which said it nicely, the intent is to make  "*the
> current problem ... not be a problem in reality anymore (and simply
remain
> as a theoretical problem)*".
>
> I think we have a way to be pragmatic about the current limitations, make
> it so that developers don't suffer b/c of this, and buy us enough time to
> implement the better model that should be used for serverless, where
> monitoring, "crash safety", "scaling" , and all of the concerns listed
> previously in this thread are addressed better, but at the same time, the
> performance doesn't have to suffer so much. This is the intent of this
> proposal.
>
>
>
> On Tue, Jul 4, 2017 at 11:15 AM Michael M Behrendt <
> michaelbehre...@de.ibm.com> wrote:
>
>> Hi Dragos,
>>
>>> What stops
>>> Openwhisk to be smart in observing the response times, CPU consumption,
>>> memory consumption of the running containers ?
>>
>> What are your thoughts on how this approach would be different from the
>> many IaaS- and PaaS-centric autoscaling solutions that have been built
over
>> the last years? All of them require relatively complex policies (eg
scale
>> based on cpu or mem utilization, end-user response time, etc.? What are
the
>> thresholds for when to add/remove capacity?), and a value prop of
>> serverless is that folks don't have to care about that.
>>
>> we should discuss more during the call, but wanted to get this out as
food
>> for thought.
>>
>> Sent from my iPhone
>>
>> On 4. Jul 2017, at 18:50, Dascalita Dragos  wrote:
>>
 How could a developer understand how many requests per container to
set
>>>
>>> James, this is a good point, along with the other points in your email.
>>>
>>> I think the developer doesn't need to know this info actually. What
stops
>>> Openwhisk to be smart in observing the response times, CPU consumption,
>>> memory consumption of the running containers ? Doing so it could learn
>>> automatically how many concurrent requests 1 action can handle. It
might
>> be
>>> easier to solve this problem efficiently, instead of the other problem
>>> which pushes the entire system to its limits when a couple of actions
>> get a
>>> lot of traffic.
>>>
>>>
>>>
 On Mon, Jul 3, 2017 at 10:08 AM James Thomas 
>> wrote:

 +1 on Markus' points about "crash safety" and "scaling". I can
>> understand
 the reasons behind exploring this change but from a developer
experience
 point of view this adds introduces a large amount of complexity to the
 programming model.

 If I have a concurrent container serving 100 requests and one of the
 requests triggers a fatal error how does that affect the other
requests?
 Tearing down the entire runtime environment will destroy all those
 requests.

 How could a developer understand how many requests per container to
set
 without a manual trial and error process? It also means you have to
>> start
 considering things like race conditions or other challenges of
>> concurrent
 code execution. This makes debugging and monitoring also more
>> challenging.

 Looking at the other serverless providers, I've not seen this featured
 requested before. Developers generally ask AWS to raise the concurrent
 invocations limit for their application. This keeps the platform doing
>> the
 hard task of managing resources and being efficient and allows them to
>> use
 the same programming model.

> On 2 July 2017 at 11:05, Markus Thömmes 
wrote:
>
> ...
>

>
 To Rodric's points I think there are two topics to speak about and
>> discuss:
>
> 1. The programming model: The current model encourages users to break
> their actions apart in "functions" that take payload and return
>> payload.
> Having a deployment model outlined could as noted encourage users to
>> use
> OpenWhisk as a way to rapidly deploy/undeploy their usual webserver
>> based
> 

Re: Improving support for UI driven use cases

2017-07-04 Thread Dascalita Dragos
>  how this approach would be different from the many IaaS- and PaaS-centric

I like Adrian Cockcroft's response (
https://twitter.com/intent/like?tweet_id=736553530689998848 ) to this:
*"...If your PaaS can efficiently start instances in 20ms that run for half
a second, then call it serverless..."*

I think none of us here imagines that we're building a PaaS experience for
developers, nor does the current proposal intend to suggest we should. I
also assume that none of us imagines running in production a scalable system
with millions of concurrent users on the current setup, which demands an
incredible amount of resources.

Quoting Michael M, who said it nicely, the intent is to make "*the
current problem ... not be a problem in reality anymore (and simply remain
as a theoretical problem)*".

I think we have a way to be pragmatic about the current limitations, make
it so that developers don't suffer b/c of this, and buy us enough time to
implement the better model that should be used for serverless, where
monitoring, "crash safety", "scaling" , and all of the concerns listed
previously in this thread are addressed better, but at the same time, the
performance doesn't have to suffer so much. This is the intent of this
proposal.



On Tue, Jul 4, 2017 at 11:15 AM Michael M Behrendt <
michaelbehre...@de.ibm.com> wrote:

> Hi Dragos,
>
> > What stops
> > Openwhisk to be smart in observing the response times, CPU consumption,
> > memory consumption of the running containers ?
>
> What are your thoughts on how this approach would be different from the
> many IaaS- and PaaS-centric autoscaling solutions that have been built over
> the last years? All of them require relatively complex policies (eg scale
> based on cpu or mem utilization, end-user response time, etc.? What are the
> thresholds for when to add/remove capacity?), and a value prop of
> serverless is that folks don't have to care about that.
>
> we should discuss more during the call, but wanted to get this out as food
> for thought.
>
> Sent from my iPhone
>
> On 4. Jul 2017, at 18:50, Dascalita Dragos  wrote:
>
> >> How could a developer understand how many requests per container to set
> >
> > James, this is a good point, along with the other points in your email.
> >
> > I think the developer doesn't need to know this info actually. What stops
> > Openwhisk to be smart in observing the response times, CPU consumption,
> > memory consumption of the running containers ? Doing so it could learn
> > automatically how many concurrent requests 1 action can handle. It might
> be
> > easier to solve this problem efficiently, instead of the other problem
> > which pushes the entire system to its limits when a couple of actions
> get a
> > lot of traffic.
> >
> >
> >
> >> On Mon, Jul 3, 2017 at 10:08 AM James Thomas 
> wrote:
> >>
> >> +1 on Markus' points about "crash safety" and "scaling". I can
> understand
> >> the reasons behind exploring this change but from a developer experience
> >> point of view this introduces a large amount of complexity to the
> >> programming model.
> >>
> >> If I have a concurrent container serving 100 requests and one of the
> >> requests triggers a fatal error how does that affect the other requests?
> >> Tearing down the entire runtime environment will destroy all those
> >> requests.
> >>
> >> How could a developer understand how many requests per container to set
> >> without a manual trial and error process? It also means you have to
> start
> >> considering things like race conditions or other challenges of
> concurrent
> >> code execution. This makes debugging and monitoring also more
> challenging.
> >>
> >> Looking at the other serverless providers, I've not seen this featured
> >> requested before. Developers generally ask AWS to raise the concurrent
> >> invocations limit for their application. This keeps the platform doing
> the
> >> hard task of managing resources and being efficient and allows them to
> use
> >> the same programming model.
> >>
> >>> On 2 July 2017 at 11:05, Markus Thömmes  wrote:
> >>>
> >>> ...
> >>>
> >>
> >>>
> >> To Rodric's points I think there are two topics to speak about and
> discuss:
> >>>
> >>> 1. The programming model: The current model encourages users to break
> >>> their actions apart in "functions" that take payload and return
> payload.
> >>> Having a deployment model outlined could as noted encourage users to
> use
> >>> OpenWhisk as a way to rapidly deploy/undeploy their usual webserver
> based
> >>> applications. The current model is nice in that it solves a lot of
> >> problems
> >>> for the customer in terms of scalability and "crash safeness".
> >>>
> >>> 2. Raw throughput of our deployment model: Setting the concerns aside I
> >>> think it is valid to explore concurrent invocations of actions on the
> >> same
> >>> container. This does not necessarily mean that users start to deploy
> >>> 

Re: Improving support for UI driven use cases

2017-07-04 Thread Michael M Behrendt
Hi Dragos,

> What stops
> Openwhisk to be smart in observing the response times, CPU consumption,
> memory consumption of the running containers ? 

What are your thoughts on how this approach would be different from the many 
IaaS- and PaaS-centric autoscaling solutions that have been built over the last 
years? All of them require relatively complex policies (eg scale based on cpu 
or mem utilization, end-user response time, etc.? What are the thresholds for 
when to add/remove capacity?), and a value prop of serverless is that folks 
don't have to care about that.

we should discuss more during the call, but wanted to get this out as food for 
thought.

Sent from my iPhone

On 4. Jul 2017, at 18:50, Dascalita Dragos  wrote:

>> How could a developer understand how many requests per container to set
> 
> James, this is a good point, along with the other points in your email.
> 
> I think the developer doesn't need to know this info actually. What stops
> Openwhisk to be smart in observing the response times, CPU consumption,
> memory consumption of the running containers ? Doing so it could learn
> automatically how many concurrent requests 1 action can handle. It might be
> easier to solve this problem efficiently, instead of the other problem
> which pushes the entire system to its limits when a couple of actions get a
> lot of traffic.
> 
> 
> 
>> On Mon, Jul 3, 2017 at 10:08 AM James Thomas  wrote:
>> 
>> +1 on Markus' points about "crash safety" and "scaling". I can understand
>> the reasons behind exploring this change but from a developer experience
>> point of view this introduces a large amount of complexity to the
>> programming model.
>> 
>> If I have a concurrent container serving 100 requests and one of the
>> requests triggers a fatal error how does that affect the other requests?
>> Tearing down the entire runtime environment will destroy all those
>> requests.
>> 
>> How could a developer understand how many requests per container to set
>> without a manual trial and error process? It also means you have to start
>> considering things like race conditions or other challenges of concurrent
>> code execution. This makes debugging and monitoring also more challenging.
>> 
>> Looking at the other serverless providers, I've not seen this featured
>> requested before. Developers generally ask AWS to raise the concurrent
>> invocations limit for their application. This keeps the platform doing the
>> hard task of managing resources and being efficient and allows them to use
>> the same programming model.
>> 
>>> On 2 July 2017 at 11:05, Markus Thömmes  wrote:
>>> 
>>> ...
>>> 
>> 
>>> 
>> To Rodric's points I think there are two topics to speak about and discuss:
>>> 
>>> 1. The programming model: The current model encourages users to break
>>> their actions apart in "functions" that take payload and return payload.
>>> Having a deployment model outlined could as noted encourage users to use
>>> OpenWhisk as a way to rapidly deploy/undeploy their usual webserver based
>>> applications. The current model is nice in that it solves a lot of
>> problems
>>> for the customer in terms of scalability and "crash safeness".
>>> 
>>> 2. Raw throughput of our deployment model: Setting the concerns aside I
>>> think it is valid to explore concurrent invocations of actions on the
>> same
>>> container. This does not necessarily mean that users start to deploy
>>> monolithic apps as noted above, but it certainly could. Keeping our
>>> JSON-in/JSON-out at least for now though, could encourage users to
>> continue
>>> to think in functions. Having a toggle per action which is disabled by
>>> default might be a good way to start here, since many users might need to
>>> change action code to support that notion and for some applications it
>>> might not be valid at all. I think it was also already noted, that this
>>> imposes some of the "old-fashioned" problems on the user, like: How many
>>> concurrent requests will my action be able to handle? That kinda defeats
>>> the seemless-scalability point of serverless.
>>> 
>>> Cheers,
>>> Markus
>>> 
>>> 
>> --
>> Regards,
>> James Thomas
>> 



Re: Improving support for UI driven use cases

2017-07-04 Thread Dascalita Dragos
Michael , +1 to how you summarized the problem.

> I’d suggest that the first step is to support “multiple heterogeneous
resource pools”

I'd like to reinforce Stephen's idea on "multiple resource pools". We've
been already using this idea in production systems successfully in other
setups, with Mesos, isolating the Spark workloads requiring state, from
other stateless workloads, or from GPU workloads. This idea would be a
perfect fit for Openwhisk. It can also be extended beyond the Invokers, to
other cluster managers like Mesos, and Kube.


On Tue, Jul 4, 2017 at 7:05 AM Stephen Fink  wrote:

> Hi all,
>
> I’ve been lurking a bit on this thread, but haven’t had time to fully
> digest all the issues.
>
> I’d suggest that the first step is to support “multiple heterogeneous
> resource pools”, where a resource pool is a set of invokers managed by a
> load balancer.  There are lots of reasons we may want to support invokers
> with different flavors:  long-running actions, invokers in a VPN, invokers
> with GPUs,  invokers with big memory, invokers which support concurrent
> execution, etc…  .  If we had a general way to plug in a new resource
> pool, folks could feel free to experiment with any new flavors they like
> without having to debate the implications on other flavors.
>
> I tend to doubt that there is a “one size fits all” solution here, so I’d
> suggest we bite the bullet and engineer for heterogeneity.
>
> SJF
>
>
> > On Jul 4, 2017, at 9:55 AM, Michael Marth 
> wrote:
> >
> > Hi Jeremias, all,
> >
> > Tyson and Dragos are travelling this week, so I don't know when they will
> get to respond. I have worked with them on this topic, so let me jump
> in and comment until they are able to reply.
> >
> > From my POV having a call like you suggest is a really good idea. Let’s
> wait for Tyson & Dragos to chime in to find a date.
> >
> > As you mention the discussion so far was jumping across different
> topics, especially the use case, the problem to be solved and the proposed
> solution. In preparation of the call I think we can clarify use case and
> problem on the list. Here’s my view:
> >
> > Use Case
> >
> > For us the use case can be summarised with “dynamic, high performance
> websites/mobile apps”. This implies:
> > 1 High concurrency, i.e. Many requests coming in at the same time
> > 2 The code to be executed is the same code across these different
> requests (as opposed to a long tail distribution of many different actions
> being executed concurrently). In our case “many” would mean “hundreds” or a
> few thousand.
> > 3 The latency (time to start execution) matters, because human users are
> waiting for the response. Ideally, in these order of magnitudes of
> concurrent requests the latency should not change much.
> >
> > All 3 requirements need to be satisfied for this use case.
> > In the discussion so far it was mentioned that there are other use cases
> which might have similar requirements. That’s great and I do not want to
> rule them out, obviously. The above is just to make it clear from where we
> are coming from.
> >
> > At this point I would like to mention that it is my understanding that
> this use case is within OpenWhisk’s strike zone, i.e. Something that we all
> think is reasonable to support. Please speak up if you disagree.
> >
> > The Problem
> >
> > One can look at the problem in two ways:
> > Either you keep the resources of the OW system constant (i.e. No
> scaling). In that case latency increases very quickly as demonstrated by
> Tyson’s tests.
> > Or you increase the system’s capacity. In that case the amount of
> machines to satisfy this use case quickly becomes prohibitively expensive
> to run for the OW operator – where expensive is defined as “compared to
> traditional web servers” (in our case a standard Node.js server). Meaning,
> you need 100-1000 concurrent action containers to serve what can be served
> by 1 or 2 Node.js containers.
> >
> > Of course, the proposed solution is not a fundamental “fix” for the
> above. It would only move the needle ~2 orders of magnitude – so that the
> current problem would not be a problem in reality anymore (and simply
> remain as a theoretical problem). For me that would be good enough.
> >
> > The solution approach
> >
> > Would not like to comment on the proposed solution’s details (and leave
> that to Dragos and Tyson). However, it was mentioned that the approach
> would change the programming model for users:
> > Our mindset and approach was that we explicitly do not want  to change
> how OpenWhisk exposes itself to users. Meaning, users should still be able
> to use NPMs, etc  - i.e. This would be an internal implementation detail
> that is not visible for users. (we can make things more explicit to users
> and e.g. have them request a special concurrent runtime if we wish to do
> so – so far we tried to make it transparent to users, though).
> >
> > Many thanks

Re: Improving support for UI driven use cases

2017-07-04 Thread Dascalita Dragos
> How could a developer understand how many requests per container to set

James, this is a good point, along with the other points in your email.

I think the developer doesn't actually need to know this. What stops
OpenWhisk from being smart about observing the response times, CPU consumption,
and memory consumption of the running containers? Doing so, it could learn
automatically how many concurrent requests one action can handle. It might be
easier to solve this problem efficiently, instead of the other problem, which
pushes the entire system to its limits when a couple of actions get a lot of
traffic.
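
To make that feedback loop concrete, here is a very rough sketch (hypothetical
names and thresholds, not an existing OpenWhisk component) of how a per-action
concurrency limit could be learned from observed response times:

    import scala.collection.mutable

    // Hypothetical sketch only -- names and thresholds are illustrative,
    // this is not an existing OpenWhisk API.
    class ConcurrencyEstimator(maxLimit: Int = 128, tolerableSlowdown: Double = 1.5) {
      // smoothed response time (ms) observed at each in-flight level
      private val avgLatency = mutable.Map.empty[Int, Double]
      private var limit = 1 // start with today's behaviour: one activation per container

      def record(inFlight: Int, latencyMs: Double): Unit = {
        val prev = avgLatency.getOrElse(inFlight, latencyMs)
        avgLatency(inFlight) = 0.8 * prev + 0.2 * latencyMs // exponential smoothing

        val baseline = avgLatency.getOrElse(1, latencyMs)
        if (avgLatency(inFlight) <= baseline * tolerableSlowdown && inFlight >= limit)
          limit = math.min(inFlight + 1, maxLimit) // latency still healthy: probe higher
        else if (avgLatency(inFlight) > baseline * tolerableSlowdown)
          limit = math.max(1, inFlight - 1)        // latency degraded: back off
      }

      def currentLimit: Int = limit
    }

The load balancer could then consult such an estimate when deciding whether to
route another activation to an already busy container or to allocate a new one.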



On Mon, Jul 3, 2017 at 10:08 AM James Thomas  wrote:

> +1 on Markus' points about "crash safety" and "scaling". I can understand
> the reasons behind exploring this change but from a developer experience
> point of view this introduces a large amount of complexity to the
> programming model.
>
> If I have a concurrent container serving 100 requests and one of the
> requests triggers a fatal error how does that affect the other requests?
> Tearing down the entire runtime environment will destroy all those
> requests.
>
> How could a developer understand how many requests per container to set
> without a manual trial and error process? It also means you have to start
> considering things like race conditions or other challenges of concurrent
> code execution. This makes debugging and monitoring also more challenging.
>
> Looking at the other serverless providers, I've not seen this feature
> requested before. Developers generally ask AWS to raise the concurrent
> invocations limit for their application. This keeps the platform doing the
> hard task of managing resources and being efficient and allows them to use
> the same programming model.
>
> On 2 July 2017 at 11:05, Markus Thömmes  wrote:
>
> > ...
> >
>
> >
> To Rodric's points I think there are two topics to speak about and discuss:
> >
> > 1. The programming model: The current model encourages users to break
> > their actions apart in "functions" that take payload and return payload.
> > Having a deployment model outlined could as noted encourage users to use
> > OpenWhisk as a way to rapidly deploy/undeploy their usual webserver based
> > applications. The current model is nice in that it solves a lot of
> problems
> > for the customer in terms of scalability and "crash safeness".
> >
> > 2. Raw throughput of our deployment model: Setting the concerns aside I
> > think it is valid to explore concurrent invocations of actions on the
> same
> > container. This does not necessarily mean that users start to deploy
> > monolithic apps as noted above, but it certainly could. Keeping our
> > JSON-in/JSON-out at least for now though, could encourage users to
> continue
> > to think in functions. Having a toggle per action which is disabled by
> > default might be a good way to start here, since many users might need to
> > change action code to support that notion and for some applications it
> > might not be valid at all. I think it was also already noted, that this
> > imposes some of the "old-fashioned" problems on the user, like: How many
> > concurrent requests will my action be able to handle? That kinda defeats
> > the seamless-scalability point of serverless.
> >
> > Cheers,
> > Markus
> >
> >
> --
> Regards,
> James Thomas
>


Re: Improving support for UI driven use cases

2017-07-04 Thread Michael Marth
I like that approach a lot!




On 04/07/17 16:05, "Stephen Fink"  wrote:

>Hi all,
>
>I’ve been lurking a bit on this thread, but haven’t had time to fully digest 
>all the issues.
>
>I’d suggest that the first step is to support “multiple heterogeneous resource 
>pools”, where a resource pool is a set of invokers managed by a load balancer. 
> There are lots of reasons we may want to support invokers with different 
>flavors:  long-running actions, invokers in a VPN, invokers with GPUs,  
>invokers with big memory, invokers which support concurrent execution, etc…  . 
> If we had a general way to plug in a new resource pool, folks could feel 
>free to experiment with any new flavors they like without having to debate the 
>implications on other flavors.
>
>I tend to doubt that there is a “one size fits all” solution here, so I’d 
>suggest we bite the bullet and engineer for heterogeneity.
>
>SJF
>
>
>> On Jul 4, 2017, at 9:55 AM, Michael Marth  wrote:
>> 
>> Hi Jeremias, all,
>> 
>> Tyson and Dragos are travelling this week, so I don't know when they will 
>> get to respond. I have worked with them on this topic, so let me jump in and 
>> comment until they are able to reply.
>> 
>> From my POV having a call like you suggest is a really good idea. Let’s wait 
>> for Tyson & Dragos to chime in to find a date.
>> 
>> As you mention the discussion so far was jumping across different topics, 
>> especially the use case, the problem to be solved and the proposed solution. 
>> In preparation of the call I think we can clarify use case and problem on 
>> the list. Here’s my view:
>> 
>> Use Case
>> 
>> For us the use case can be summarised with “dynamic, high performance 
>> websites/mobile apps”. This implies:
>> 1 High concurrency, i.e. Many requests coming in at the same time
>> 2 The code to be executed is the same code across these different requests 
>> (as opposed to a long tail distribution of many different actions being 
>> executed concurrently). In our case “many” would mean “hundreds” or a few 
>> thousand.
>> 3 The latency (time to start execution) matters, because human users are 
>> waiting for the response. Ideally, in these order of magnitudes of 
>> concurrent requests the latency should not change much.
>> 
>> All 3 requirements need to be satisfied for this use case.
>> In the discussion so far it was mentioned that there are other use cases 
>> which might have similar requirements. That’s great and I do not want to 
>> rule them out, obviously. The above is just to make it clear from where we 
>> are coming from.
>> 
>> At this point I would like to mention that it is my understanding that this 
>> use case is within OpenWhisk’s strike zone, i.e. Something that we all think 
>> is reasonable to support. Please speak up if you disagree.
>> 
>> The Problem
>> 
>> One can look at the problem in two ways:
>> Either you keep the resources of the OW system constant (i.e. No scaling). 
>> In that case latency increases very quickly as demonstrated by Tyson’s tests.
>> Or you increase the system’s capacity. In that case the amount of machines 
>> to satisfy this use case quickly becomes prohibitively expensive to run for 
>> the OW operator – where expensive is defined as “compared to traditional web 
>> servers” (in our case a standard Node.js server). Meaning, you need 100-1000 
>> concurrent action containers to serve what can be served by 1 or 2 Node.js 
>> containers.
>> 
>> Of course, the proposed solution is not a fundamental “fix” for the above. 
>> It would only move the needle ~2 orders of magnitude – so that the current 
>> problem would not be a problem in reality anymore (and simply remain as a 
>> theoretical problem). For me that would be good enough.
>> 
>> The solution approach
>> 
>> Would not like to comment on the proposed solution’s details (and leave that 
>> to Dragos and Tyson). However, it was mentioned that the approach would 
>> change the programming model for users:
>> Our mindset and approach was that we explicitly do not want  to change how 
>> OpenWhisk exposes itself to users. Meaning, users should still be able to 
>> use NPMs, etc  - i.e. This would be an internal implementation detail that 
>> is not visible for users. (we can make things more explicit to users and 
>> e.g. have them request a special concurrent runtime if we wish to do so – 
>> so far we tried to make it transparent to users, though).
>> 
>> Many thanks
>> Michael
>> 
>> 
>> 
>> On 03/07/17 14:48, "Jeremias Werner" 
>> > wrote:
>> 
>> Hi
>> 
>> Thanks for the write-up and the proposal. I think this is a nice idea and
>> sounds like a nice way of increasing throughput. Reading through the thread
>> it feels like there are different topics/problems mixed-up and the
>> discussion is becoming very complex already.
>> 
>> Therefore I would like to suggest that we streamline the 

Re: Improving support for UI driven use cases

2017-07-04 Thread Stephen Fink
Hi all,

I’ve been lurking a bit on this thread, but haven’t had time to fully digest 
all the issues.

I’d suggest that the first step is to support “multiple heterogeneous resource 
pools”, where a resource pool is a set of invokers managed by a load balancer.  
There are lots of reasons we may want to support invokers with different 
flavors:  long-running actions, invokers in a VPN, invokers with GPUs,  
invokers with big memory, invokers which support concurrent execution, etc…  .  
If we had a general way to plug in a new resource pool, folks could feel 
free to experiment with any new flavors they like without having to debate the 
implications on other flavors.

I tend to doubt that there is a “one size fits all” solution here, so I’d 
suggest we bite the bullet and engineer for heterogeneity.
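
To make "pluggable" concrete, the abstraction could be as small as the
following sketch (hypothetical names, not the current load balancer API), where
each pool advertises which activations it accepts and applies its own
scheduling policy internally:

    // Hypothetical sketch of a pluggable resource-pool abstraction; all names
    // here are illustrative and not part of the existing OpenWhisk code base.
    case class ActionMetadata(name: String, memoryMB: Int, needsGpu: Boolean, concurrent: Boolean)
    case class InvokerRef(id: Int)

    trait ResourcePool {
      def name: String
      // e.g. a GPU pool, a long-running pool, a pool of invokers supporting concurrency
      def accepts(action: ActionMetadata): Boolean
      // each pool is free to use its own scheduling policy internally
      def schedule(action: ActionMetadata): InvokerRef
    }

    class PooledLoadBalancer(pools: Seq[ResourcePool]) {
      def route(action: ActionMetadata): InvokerRef =
        pools.find(_.accepts(action))
          .getOrElse(sys.error(s"no resource pool accepts action ${action.name}"))
          .schedule(action)
    }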

SJF


> On Jul 4, 2017, at 9:55 AM, Michael Marth  wrote:
> 
> Hi Jeremias, all,
> 
> Tyson and Dragos are travelling this week, so I don't know when they will 
> get to respond. I have worked with them on this topic, so let me jump in and 
> comment until they are able to reply.
> 
> From my POV having a call like you suggest is a really good idea. Let’s wait 
> for Tyson & Dragos to chime in to find a date.
> 
> As you mention the discussion so far was jumping across different topics, 
> especially the use case, the problem to be solved and the proposed solution. 
> In preparation of the call I think we can clarify use case and problem on the 
> list. Here’s my view:
> 
> Use Case
> 
> For us the use case can be summarised with “dynamic, high performance 
> websites/mobile apps”. This implies:
> 1 High concurrency, i.e. Many requests coming in at the same time
> 2 The code to be executed is the same code across these different requests 
> (as opposed to a long tail distribution of many different actions being 
> executed concurrently). In our case “many” would mean “hundreds” or a few 
> thousand.
> 3 The latency (time to start execution) matters, because human users are 
> waiting for the response. Ideally, in these order of magnitudes of concurrent 
> requests the latency should not change much.
> 
> All 3 requirements need to be satisfied for this use case.
> In the discussion so far it was mentioned that there are other use cases 
> which might have similar requirements. That’s great and I do not want to rule 
> them out, obviously. The above is just to make it clear from where we are 
> coming from.
> 
> At this point I would like to mention that it is my understanding that this 
> use case is within OpenWhisk’s strike zone, i.e. Something that we all think 
> is reasonable to support. Please speak up if you disagree.
> 
> The Problem
> 
> One can look at the problem in two ways:
> Either you keep the resources of the OW system constant (i.e. No scaling). In 
> that case latency increases very quickly as demonstrated by Tyson’s tests.
> Or you increase the system’s capacity. In that case the amount of machines to 
> satisfy this use case quickly becomes prohibitively expensive to run for the 
> OW operator – where expensive is defined as “compared to traditional web 
> servers” (in our case a standard Node.js server). Meaning, you need 100-1000 
> concurrent action containers to serve what can be served by 1 or 2 Node.js 
> containers.
> 
> Of course, the proposed solution is not a fundamental “fix” for the above. It 
> would only move the needle ~2 orders of magnitude – so that the current 
> problem would not be a problem in reality anymore (and simply remain as a 
> theoretical problem). For me that would be good enough.
> 
> The solution approach
> 
> Would not like to comment on the proposed solution’s details (and leave that 
> to Dragos and Tyson). However, it was mentioned that the approach would 
> change the programming model for users:
> Our mindset and approach was that we explicitly do not want  to change how 
> OpenWhisk exposes itself to users. Meaning, users should still be able to use 
> NPMs, etc  - i.e. This would be an internal implementation detail that is not 
> visible for users. (We can make things more explicit to users and e.g. have 
> them request a special concurrent runtime if we wish to do so – so far we 
> tried to make it transparent to users, though).
> 
> Many thanks
> Michael
> 
> 
> 
> On 03/07/17 14:48, "Jeremias Werner" 
> > wrote:
> 
> Hi
> 
> Thanks for the write-up and the proposal. I think this is a nice idea and
> sounds like a nice way of increasing throughput. Reading through the thread
> it feels like there are different topics/problems mixed-up and the
> discussion is becoming very complex already.
> 
> Therefore I would like to suggest that we streamline the discussion a bit,
> maybe in a zoom.us session where we first give Tyson and Dragos the chance
> to walk through the proposal and clarify questions of the audience. Once we
> are all on the same page we 

Re: Improving support for UI driven use cases

2017-07-04 Thread Michael Marth
Hi Jeremias, all,

Tyson and Dragos are travelling this week, so I don't know when they will 
get to respond. I have worked with them on this topic, so let me jump in and 
comment until they are able to reply.

From my POV having a call like you suggest is a really good idea. Let’s wait 
for Tyson & Dragos to chime in to find a date.

As you mention the discussion so far was jumping across different topics, 
especially the use case, the problem to be solved and the proposed solution. In 
preparation of the call I think we can clarify use case and problem on the 
list. Here’s my view:

Use Case

For us the use case can be summarised with “dynamic, high performance 
websites/mobile apps”. This implies:
1. High concurrency, i.e. many requests coming in at the same time.
2. The code to be executed is the same code across these different requests (as 
opposed to a long-tail distribution of many different actions being executed 
concurrently). In our case “many” would mean “hundreds” or a few thousand.
3. The latency (time to start execution) matters, because human users are 
waiting for the response. Ideally, at these orders of magnitude of concurrent 
requests the latency should not change much.

All 3 requirements need to be satisfied for this use case.
In the discussion so far it was mentioned that there are other use cases which 
might have similar requirements. That’s great and I do not want to rule them 
out, obviously. The above is just to make it clear from where we are coming 
from.

At this point I would like to mention that it is my understanding that this use 
case is within OpenWhisk’s strike zone, i.e. Something that we all think is 
reasonable to support. Please speak up if you disagree.

The Problem

One can look at the problem in two ways:
Either you keep the resources of the OW system constant (i.e. No scaling). In 
that case latency increases very quickly as demonstrated by Tyson’s tests.
Or you increase the system’s capacity. In that case the amount of machines to 
satisfy this use case quickly becomes prohibitively expensive to run for the OW 
operator – where expensive is defined as “compared to traditional web servers” 
(in our case a standard Node.js server). Meaning, you need 100-1000 concurrent 
action containers to serve what can be served by 1 or 2 Node.js containers.

Of course, the proposed solution is not a fundamental “fix” for the above. It 
would only move the needle ~2 orders of magnitude – so that the current problem 
would not be a problem in reality anymore (and simply remain as a theoretical 
problem). For me that would be good enough.

The solution approach

Would not like to comment on the proposed solution’s details (and leave that to 
Dragos and Tyson). However, it was mentioned that the approach would change the 
programming model for users:
Our mindset and approach was that we explicitly do not want to change how 
OpenWhisk exposes itself to users. Meaning, users should still be able to use 
NPMs, etc. - i.e. this would be an internal implementation detail that is not 
visible to users. (We can make things more explicit to users and e.g. have 
them request a special concurrent runtime if we wish to do so – so far we 
tried to make it transparent to users, though.)

Many thanks
Michael



On 03/07/17 14:48, "Jeremias Werner" 
> wrote:

Hi

Thanks for the write-up and the proposal. I think this is a nice idea and
sounds like a nice way of increasing throughput. Reading through the thread
it feels like there are different topics/problems mixed-up and the
discussion is becoming very complex already.

Therefore I would like to suggest that we streamline the discussion a bit,
maybe in a zoom.us session where we first give Tyson and Dragos the chance
to walk through the proposal and clarify questions of the audience. Once we
are all on the same page we could think of a discussion about the benefits
(improved throughput, latency) vs. challenges (resource sharing, crash
model, container lifetime, programming model) on the core of the proposal:
running multiple activations in a single user container. Once we have a
common understanding on that part we could step-up in the architecture and
discuss what's needed on higher components like invoker/load-balancer to
get this integrated.

(I said zoom.us session since I liked the one we had a few weeks ago. It
was efficient and interactive. If you like I could volunteer to set up the
session and/or write the script/summary.)

what do you think?

Many thanks in advance!

Jeremias


On Sun, Jul 2, 2017 at 5:43 PM, Rodric Rabbah 
> wrote:

You're discounting with event driven all use cases that are still latency
sensitive because they complete a response by call back or actuation at
completion. IoT, chatbots, notifications, all examples in addition to ui
which are latency sensitive and having uniform expectations on queuing time
is 

Re: Improving support for UI driven use cases

2017-07-03 Thread James Thomas
+1 on Markus' points about "crash safety" and "scaling". I can understand
the reasons behind exploring this change but from a developer experience
point of view this introduces a large amount of complexity to the
programming model.

If I have a concurrent container serving 100 requests and one of the
requests triggers a fatal error how does that affect the other requests?
Tearing down the entire runtime environment will destroy all those
requests.

How could a developer understand how many requests per container to set
without a manual trial and error process? It also means you have to start
considering things like race conditions or other challenges of concurrent
code execution. This makes debugging and monitoring also more challenging.

Looking at the other serverless providers, I've not seen this feature
requested before. Developers generally ask AWS to raise the concurrent
invocations limit for their application. This keeps the platform doing the
hard task of managing resources and being efficient and allows them to use
the same programming model.

On 2 July 2017 at 11:05, Markus Thömmes  wrote:

> ...
>

>
To Rodric's points I think there are two topics to speak about and discuss:
>
> 1. The programming model: The current model encourages users to break
> their actions apart in "functions" that take payload and return payload.
> Having a deployment model outlined could as noted encourage users to use
> OpenWhisk as a way to rapidly deploy/undeploy their usual webserver based
> applications. The current model is nice in that it solves a lot of problems
> for the customer in terms of scalability and "crash safeness".
>
> 2. Raw throughput of our deployment model: Setting the concerns aside I
> think it is valid to explore concurrent invocations of actions on the same
> container. This does not necessarily mean that users start to deploy
> monolithic apps as noted above, but it certainly could. Keeping our
> JSON-in/JSON-out at least for now though, could encourage users to continue
> to think in functions. Having a toggle per action which is disabled by
> default might be a good way to start here, since many users might need to
> change action code to support that notion and for some applications it
> might not be valid at all. I think it was also already noted, that this
> imposes some of the "old-fashioned" problems on the user, like: How many
> concurrent requests will my action be able to handle? That kinda defeats
> the seamless-scalability point of serverless.
>
> Cheers,
> Markus
>
>
-- 
Regards,
James Thomas


Re: Improving support for UI driven use cases

2017-07-02 Thread Markus Thömmes

Right, I think the UI workflows are just an example of apps that are latency 
sensitive in general.

I had a discussion with Stephen Fink on the matter of detecting ourselves that 
an action is latency sensitive by using the blocking parameter or as mentioned 
the user's configuration in terms of web-action vs. non-web action. The 
conclusion there was that we probably cannot reliably detect latency 
sensitivity without asking the user to do so. Having such an option has 
implications on other aspects of the platform: Why would one not choose that 
option?

To Rodric's points I think there are two topics to speak about and discuss:

1. The programming model: The current model encourages users to break their actions apart in 
"functions" that take payload and return payload. Having a deployment model outlined 
could as noted encourage users to use OpenWhisk as a way to rapidly deploy/undeploy their usual 
webserver based applications. The current model is nice in that it solves a lot of problems for the 
customer in terms of scalability and "crash safeness".

2. Raw throughput of our deployment model: Setting the concerns aside I think it is valid 
to explore concurrent invocations of actions on the same container. This does not 
necessarily mean that users start to deploy monolithic apps as noted above, but it 
certainly could. Keeping our JSON-in/JSON-out at least for now though, could encourage 
users to continue to think in functions. Having a toggle per action which is disabled by 
default might be a good way to start here, since many users might need to change action 
code to support that notion and for some applications it might not be valid at all. I 
think it was also already noted that this imposes some of the "old-fashioned" 
problems on the user, like: How many concurrent requests will my action be able to 
handle? That kinda defeats the seamless-scalability point of serverless.
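
As a sketch of what such a per-action toggle could look like internally
(hypothetical field names, not the current limits schema), the default keeps
today's one-activation-per-container semantics unless the user explicitly opts in:

    // Hypothetical sketch of an opt-in concurrency limit per action.
    case class ActionLimits(memoryMB: Int = 256,
                            timeoutMs: Int = 60000,
                            maxConcurrentActivations: Int = 1) // disabled by default

    // A warm container only receives another activation if the action opted in.
    def mayReuseWarmContainer(inFlight: Int, limits: ActionLimits): Boolean =
      inFlight < limits.maxConcurrentActivations

A user who has verified that their action code is safe under concurrent
execution could then raise the limit explicitly, while everyone else keeps the
current semantics.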

Cheers,
Markus

On 2 July 2017 at 10:42, Rodric Rabbah  wrote:

The thoughts I shared around how to realize better packing with intrinsic 
actions are aligned with your goals: getting more compute density with a 
smaller number of machines. This is a very worthwhile goal.

I noted earlier that packing more activations into a single container warrants a different resource manager with its own container life cycle management (e.g., it's almost at the level of: provision a container for me quickly and let me have it to run my monolithic code for as long as I want). 

Already some challenges were mentioned, wrt sharing state, resource leaks and possible data races. Perhaps defining the resource isolation model intra container - processes, threads, "node vm", ... - is helpful as you refine your proposal. This can address how one might deal with intra container noisy neighbors as well. 


Hence in terms of resource management at the platform level, I think it would 
be a mistake to treat intra container concurrency the same way as ephemeral 
activations that are run and done. Once the architecture and scheduler 
support a heterogeneous mix of resources, then treating some actions as 
intrinsic operations becomes easier to realize; in other words complementary to 
the overall proposed direction if the architecture is done right.

To Alex's point, when you're optimizing for latency, you don't need to be 
constrained to UI applications. Maybe this is more of a practical motivation 
based on your workloads.

-r

On Jul 2, 2017, at 2:32 AM, Dascalita Dragos  wrote:

I think the opportunities for packing computation at finer granularity
will be there. In your approach you're tending, it seems, toward taking
monolithic codes and overlapping their computation. I tend to think this
will work better with another approach.

+1 to making the serverless system smarter in managing and running the code
at scale. I don't think the current state is there right now. There are
limitations which could be improved by simply allowing developers to
control which action can be invoked concurrently. We could also consider
designing the system to "learn" this intent by observing how the action is
configured by the developer: if it's an HTTP endpoint, or an event handler.

As long as today we can improve the performance by allowing concurrency in
actions, and by invoking them faster, why would we not benefit from this
now, and update the implementation later, once the system improves ? Or are
there better ways available now to match this performance that are not
captured in the proposal ?

Re: Improving support for UI driven use cases

2017-07-02 Thread Rodric Rabbah
The thoughts I shared around how to realize better packing with intrinsic 
actions are aligned with your goals: getting more compute density with a 
smaller number of machines. This is a very worthwhile goal.

I noted earlier that packing more activations into a single container warrants 
a different resource manager with its own container life cycle management 
(e.g., it's almost at the level of: provision a container for me quickly and 
let me have it to run my monolithic code for as long as I want). 

Already some challenges were mentioned, wrt sharing state, resource leaks and 
possible data races. Perhaps defining the resource isolation model intra 
container - processes, threads, "node vm", ... - is helpful as you refine your 
proposal. This can address how one might deal with intra container noisy 
neighbors as well. 

Hence in terms of resource management at the platform level, I think it would 
be a mistake to treat intra container concurrency the same way as ephemeral 
activations that are run and done. Once the architecture and scheduler 
support a heterogeneous mix of resources, then treating some actions as 
intrinsic operations becomes easier to realize; in other words complementary to 
the overall proposed direction if the architecture is done right.

To Alex's point, when you're optimizing for latency, you don't need to be 
constrained to UI applications. Maybe this is more of a practical motivation 
based on your workloads.

-r

On Jul 2, 2017, at 2:32 AM, Dascalita Dragos  wrote:

>> I think the opportunities for packing computation at finer granularity
> will be there. In your approach you're tending, it seems, toward taking
> monolithic codes and overlapping their computation. I tend to think this
> will work better with another approach.
> 
> +1 to making the serverless system smarter in managing and running the code
> at scale. I don't think the current state is there right now. There are
> limitations which could be improved by simply allowing developers to
> control which action can be invoked concurrently. We could also consider
> designing the system to "learn" this intent by observing how the action is
> configured by the developer: if it's an HTTP endpoint, or an event handler.
> 
> As long as today we can improve the performance by allowing concurrency in
> actions, and by invoking them faster, why would we not benefit from this
> now, and update the implementation later, once the system improves ? Or are
> there better ways available now to match this performance that are not
> captured in the proposal ?


Re: Improving support for UI driven use cases

2017-07-02 Thread Dascalita Dragos
>  I think the opportunities for packing computation at finer granularity
will be there. In your approach you're tending, it seems, toward taking
monolithic codes and overlapping their computation. I tend to think this
will work better with another approach.

+1 to making the serverless system smarter in managing and running the code
at scale. I don't think the current state is there right now. There are
limitations which could be improved by simply allowing developers to
control which action can be invoked concurrently. We could also consider
designing the system to "learn" this intent by observing how the action is
configured by the developer: if it's an HTTP endpoint, or an event handler.
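
A trivial sketch of that heuristic (illustrative only, the names are made up):
HTTP-facing web actions would default to allowing concurrency, plain event
handlers would keep the current behaviour, and an explicit developer setting
would always win:

    // Hypothetical heuristic -- not an existing OpenWhisk setting.
    def defaultConcurrency(isWebAction: Boolean, developerOverride: Option[Int]): Int =
      developerOverride.getOrElse(if (isWebAction) 100 else 1)

    // defaultConcurrency(isWebAction = true,  developerOverride = None)     == 100
    // defaultConcurrency(isWebAction = false, developerOverride = Some(20)) == 20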

As long as today we can improve the performance by allowing concurrency in
actions, and by invoking them faster, why would we not benefit from this
now, and update the implementation later, once the system improves ? Or are
there better ways available now to match this performance that are not
captured in the proposal ?


On Sat, Jul 1, 2017 at 10:29 PM Alex Glikson <glik...@il.ibm.com> wrote:

> My main point is - interactive Web applications are certainly not the only
> case which is sensitive to latency (or throughput) under variable load.
> Think of an event that a person presses 'emergency' button in an elevator,
> and we need to respond immediately (it might be even more important than
> occasionally getting a timeout on a web page). So, ideally, the solution
> should address *any* (or as many as possible of) such applications.
>
> Regards,
> Alex
>
>
>
> From:   Tyson Norris <tnor...@adobe.com.INVALID>
> To: "dev@openwhisk.apache.org" <dev@openwhisk.apache.org>
> Date:   02/07/2017 01:35 AM
> Subject:Re: Improving support for UI driven use cases
>
>
>
>
> > On Jul 1, 2017, at 2:07 PM, Alex Glikson <glik...@il.ibm.com> wrote:
> >
> >> a burst of users will quickly exhaust the system, which is only fine
> for
> > event handling cases, and not fine at all for UI use cases.
> >
> > Can you explain why is it fine for event handling cases?
> > I would assume that the key criteria would be, for example, around
> > throughput and/or latency (and their tradeoff with capacity), and not
> > necessarily the nature of the application per se.
> >
> > Regards,
> > Alex
>
> Sure - with event handling, where blocking=false, or where a timeout
> response of 202 (and fetch the response later) is tolerable,  exhausting
> container resources will simply mean that the latency goes up based on the
> number of events generated after the point of saturation.  If you can only
> process 100 events at one time, an arrival of 1000 events at the same time
> means that the second 100 events will only be processed after the first
> 100 (twice normal latency), third 100 events after that (3 times normal
> latency), 4th 100 events after that (4 times normal latency) etc. But if
> no user is sitting at a browser waiting for a response, it is unlikely
> they care whether the processing occurs 10ms or 10min after the triggering
> event. (This is exaggerating, but you get the point)
>
> In the case a user is staring at a browser waiting for response, such a
> variance in latency just due to the raw number of users in the system
> directly relating to the raw number of containers in the system, will not
> be usable. Consider concurrency not as a panacea for exhausting container
> pool resources, but rather a way to dampen the graph of user traffic
> increase vs required container pool increase, making it something like
> 1000:1 (1000 concurrent users requires 1 container) instead of it being a
> 1:1 relationship.
>
> Thanks
> Tyson
>
>
>
>
>


Re: Improving support for UI driven use cases

2017-07-01 Thread Alex Glikson
My main point is - interactive Web applications are certainly not the only 
case which is sensitive to latency (or throughput) under variable load. 
Think of an event that a person presses 'emergency' button in an elevator, 
and we need to respond immediately (it might be even more important than 
occasionally getting a timeout on a web page). So, ideally, the solution 
should address *any* (or as many as possible of) such applications.

Regards,
Alex



From:   Tyson Norris <tnor...@adobe.com.INVALID>
To: "dev@openwhisk.apache.org" <dev@openwhisk.apache.org>
Date:   02/07/2017 01:35 AM
Subject:    Re: Improving support for UI driven use cases




> On Jul 1, 2017, at 2:07 PM, Alex Glikson <glik...@il.ibm.com> wrote:
> 
>> a burst of users will quickly exhaust the system, which is only fine 
for 
> event handling cases, and not fine at all for UI use cases.
> 
> Can you explain why is it fine for event handling cases?
> I would assume that the key criteria would be, for example, around 
> throughput and/or latency (and their tradeoff with capacity), and not 
> necessarily the nature of the application per se.
> 
> Regards,
> Alex

Sure - with event handling, where blocking=false, or where a timeout 
response of 202 (and fetch the response later) is tolerable,  exhausting 
container resources will simply mean that the latency goes up based on the 
number of events generated after the point of saturation.  If you can only 
process 100 events at one time, an arrival of 1000 events at the same time 
means that the second 100 events will only be processed after the first 
100 (twice normal latency), third 100 events after that (3 times normal 
latency), 4th 100 events after that (4 times normal latency) etc. But if 
no user is sitting at a browser waiting for a response, it is unlikely 
they care whether the processing occurs 10ms or 10min after the triggering 
event. (This is exaggerating, but you get the point)

In the case a user is staring at a browser waiting for response, such a 
variance in latency just due to the raw number of users in the system 
directly relating to the raw number of containers in the system, will not 
be usable. Consider concurrency not as a panacea for exhausting container 
pool resources, but rather a way to dampen the graph of user traffic 
increase vs required container pool increase, making it something like 
1000:1 (1000 concurrent users requires 1 container) instead of it being a 
1:1 relationship. 

Thanks
Tyson






Re: Improving support for UI driven use cases

2017-07-01 Thread Rodric Rabbah

> I'm not sure how you would split out these network vs compute items without 
> action devs taking that responsibility (and not using libraries) or how it 
> would be done generically across runtimes.

You don't think this is already happening? When you use promises and chain 
promises together, you've already decomposed your computation into smaller 
operations. It is precisely this that makes the asynchronous model of computing 
work - even in a single threaded runtime. I think this can be exploited for a 
serverless polyglot composition. That in itself is a separate topic from the 
one you raised initially. I think the opportunities for packing computation at 
finer granularity will be there. In your approach you're tending, it seems, 
toward taking monolithic codes and overlapping their computation. I tend to 
think this will work better with another approach. In either case, figuring out 
how to manage several granularities of concurrent activations applies.
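
As a small analogy in Scala (Futures standing in for promise chains; this is
only an illustration of the idea, not an OpenWhisk API), the "request" step and
the "processing" step become separate links in the chain, so the cheap I/O
waits can be packed densely while the compute-heavy step is scheduled
separately:

    import java.util.concurrent.Executors
    import scala.concurrent.{ExecutionContext, Future}

    object DecompositionSketch {
      // many cheap I/O waits can share threads densely...
      implicit val io: ExecutionContext = ExecutionContext.global
      // ...while CPU-bound post-processing gets its own, smaller pool
      val cpu: ExecutionContext = ExecutionContext.fromExecutor(Executors.newFixedThreadPool(2))

      def fetch(url: String): Future[String] =
        Future { s"payload from $url" }            // stands in for the network request

      def process(payload: String): Future[Int] =
        Future { payload.length }(cpu)             // stands in for the compute step

      // chaining makes the request/processing split explicit
      val result: Future[Int] = fetch("https://example.com").flatMap(process)
    }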

-r

Re: Improving support for UI driven use cases

2017-07-01 Thread Tyson Norris


On Jul 1, 2017, at 3:31 PM, Rodric Rabbah  wrote:

>> I’m not sure it would be worth it to force developers down a path of 
>> configuring actions based on the network ops of the code within, compared to 
>> simply allowing concurrency. 
> 
> I think it will happen naturally: imagine a sequence where the first 
> operation is a request and the rest is the processing. All such request ops 
> can be packed densely. It's a way of increasing compute density without 
> exposing users to details. It was just an example of where I can see this 
> paying off. If all the actions are compute heavy there won't be any 
> concurrency in a nodejs runtime. Also among the thoughts I have is how one 
> can generalize this to apply to other language runtimes.
> 

Decomposing a set of network operations to a sequence of requests and 
operations implies not using any dependencies that use network operations. One 
convenience of allowing existing languages (instead of a language of sequenced 
steps) is that devs can use existing libraries that implement these operations 
in conventional ways. So while it may be possible to provide new ways to 
organize network and compute operations to maximize efficiency, there won't be 
a lot of ways to use conventional libraries without scaling based on 
conventional methods, like concurrent access to runtimes.

I'm not sure how you would split out these network vs compute items without 
action devs taking that responsibility (and not using libraries) or how it 
would be done generically across runtimes.




Re: Improving support for UI driven use cases

2017-07-01 Thread Tyson Norris

> On Jul 1, 2017, at 2:07 PM, Alex Glikson  wrote:
> 
>> a burst of users will quickly exhaust the system, which is only fine for 
> event handling cases, and not fine at all for UI use cases.
> 
> Can you explain why is it fine for event handling cases?
> I would assume that the key criteria would be, for example, around 
> throughput and/or latency (and their tradeoff with capacity), and not 
> necessarily the nature of the application per se.
> 
> Regards,
> Alex

Sure - with event handling, where blocking=false, or where a timeout response 
of 202 (and fetch the response later) is tolerable,  exhausting container 
resources will simply mean that the latency goes up based on the number of 
events generated after the point of saturation.  If you can only process 100 
events at one time, an arrival of 1000 events at the same time means that the 
second 100 events will only be processed after the first 100 (twice normal 
latency), third 100 events after that (3 times normal latency), 4th 100 events 
after that (4 times normal latency) etc. But if no user is sitting at a browser 
waiting for a response, it is unlikely they care whether the processing occurs 
10ms or 10min after the triggering event. (This is exaggerating, but you get 
the point)
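
Spelling that arithmetic out (illustrative numbers only):

    // With capacity c containers and a burst of n simultaneous events, the k-th
    // batch of c events waits roughly k * (latency of a single execution).
    def batchLatencies(burst: Int, capacity: Int, execMs: Double): Seq[Double] = {
      val batches = math.ceil(burst.toDouble / capacity).toInt
      (1 to batches).map(_ * execMs)
    }

    // burst = 1000, capacity = 100, 200 ms per execution:
    // Vector(200.0, 400.0, ..., 2000.0) -- the last events see 10x the normal
    // latency, which is tolerable for fire-and-forget events but not for a waiting user.
    batchLatencies(burst = 1000, capacity = 100, execMs = 200)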

In the case a user is staring at a browser waiting for response, such a 
variance in latency just due to the raw number of users in the system directly 
relating to the raw number of containers in the system, will not be usable. 
Consider concurrency not as a panacea for exhausting container pool resources, 
but rather a way to dampen the graph of user traffic increase vs required 
container pool increase, making it something like 1000:1 (1000 concurrent users 
requires 1 container) instead of it being a 1:1 relationship. 

Thanks
Tyson

Re: Improving support for UI driven use cases

2017-07-01 Thread Rodric Rabbah
> I’m not sure it would be worth it to force developers down a path of 
> configuring actions based on the network ops of the code within, compared to 
> simply allowing concurrency. 

I think it will happen naturally: imagine a sequence where the first operation 
is a request and the rest is the processing. All such request ops can be packed 
densely. It's a way of increasing compute density without exposing users to 
details. It was just an example of where I can see this paying off. If all the 
actions are compute heavy there won't be any concurrency in a nodejs runtime. 
Also among the thoughts I have is how one can generalize this to apply to other 
language runtimes.

-r


Re: Improving support for UI driven use cases

2017-07-01 Thread Rodric Rabbah

> it is quite common, for example to run nodejs applications that happily serve 
> hundreds or thousands of concurrent users, 

I can see opportunities for treating certain actions as intrinsic for which 
this kind of gain can be realized. Specifically actions which are performing 
network operations then computing on the result. One can split the asynchronous 
I/O and pack it into a more granular concurrency engine. Intuitively, I think 
it's more likely to benefit this way and protect against concurrency bugs and 
the like by using higher-level action primitives.

-r

Re: Improving support for UI driven use cases

2017-07-01 Thread Rodric Rabbah
> the concurrency issue is currently entangled with the controller
discussion, because sequential processing is enforced

How so? If you invoke N actions they don't run sequentially - each is its
own activation, unless you actually invoke a sequence. Can you clarify
this point?

-r


Re: Improving support for UI driven use cases

2017-07-01 Thread Tyson Norris
>> would determine the desired resource allocation for such container?
>> Wouldn't this re-introduce issues related to sizing, scaling and
>> fragmentation of resources - nicely avoided with single-tasked containers?
>> Also, I wonder what would be the main motivation to implement such a
>> policy compared to just having a number of hot containers, ready to
>> process incoming requests?
>> 
>> Regards,
>> Alex
>> 
>> 
>> 
>> From:   Rodric Rabbah <rod...@gmail.com>
>> To: dev@openwhisk.apache.org
>> Cc: Dragos Dascalita Haut <ddas...@adobe.com>
>> Date:   01/07/2017 06:56 PM
>> Subject:Re: Improving support for UI driven use cases
>> 
>> 
>> 
>> Summarizing the wiki notes:
>> 
>> 1. separate control and data plane so that data plane is routed directly
>> to
>> the container
>> 2. desire multiple concurrent function activations in the same container
>> 
>> On 1, I think this is inline with an outstanding desire and corresponding
>> issues to take the data flow out of the system and off the control message
>> critical path. As you pointed out, this has a lot of benefits - including
>> one you didn't mention: streaming (web socket style) in/out of the
>> container. Related issues although not complete in [1][2].
>> 
>> On 2, I think you are starting to see some of the issues as you think
>> through the limits on the action and its container and what that means for
>> the user flow and experience. Both in terms of "time limit" and "memory
>> limit". I think the logging issue can be solved to disentangle
>> activations.
>> But I also wonder if these are going to be longer running "actions" and
>> hence, the model is different: short running vs long running container for
>> which there are different life cycles and hence different scheduling
>> decisions from different container pools.
>> 
>> [1] https://github.com/apache/incubator-openwhisk/issues/788
>> [2] https://github.com/apache/incubator-openwhisk/issues/254
>> 



Re: Improving support for UI driven use cases

2017-07-01 Thread Markus Thömmes
Thanks for the very detailed writeup! Great job!

One thing I'd note: As Rodric pointed out we should break the issues apart and 
address them one by one.

For instance the proposed loadbalancer changes (Controllers know of Containers 
downstream) are desirable for every workload and not necessarily bound to the core 
of the proposal, which I'd say is the concurrency discussion. I agree that 100% 
warm container usage is crucial there, but every workload will benefit from high 
container reuse.

Just so the discussion doesn't get too broad and unfocused.

On the topic itself: I think it's a great idea to further push the performance 
and, as pointed out, make operating OpenWhisk more efficient. Most issues I see 
have already been pointed out so I won't repeat them. But I'm quite sure we can 
figure those out.

Have a great weekend!

Sent from my iPhone

> Am 01.07.2017 um 19:26 schrieb Tyson Norris <tnor...@adobe.com.INVALID>:
> 
> RE: separate policies - I agree that it would make sense for separating 
> container pools in some way by “event driven” and “ui driven” - I don’t think 
> anything precludes that from happening, but it's different from the notion of 
> “long running” and “short running”. e.g. if events flow into the system, the 
> container would be equally long running as if users are continually using the 
> system. I’m not suggesting changing the cold/warm start behavior, rather the 
> response and concurrency behavior, to be more in line with UI driven use 
> cases. In fact the “first user” experience would be exactly the same, it's the 
> “99 users that arrive before the first user is complete” experience that 
> would be different. (It may also be appealing to have a notion of “prehot” 
> containers, but I think existing “prewarm” is good enough for many cases). If 
> it's useful to cap the pool usage for either of these cases, nothing prevents 
> that from happening, but I would start with a (simpler) case where there is a 
> single pool that supports both usages - currently there is no pool that 
> reliably supports the ui case, since “bursty” traffic is immediately 
> excessively latent.
> 
> RE: concurrent requests resource usage: I would argue that you would 
> determine resource allocation in (nearly) the same way you should with 
> single-tasked containers. i.e. the only pragmatic way to estimate resource 
> usage is to measure it. In single task case, you might use curl to simulate a 
> single user. In concurrent tasks case, you might use wrk or gatling (or 
> something else) to simulate multiple users. Regardless of the tool, analyzing 
> code etc will not get close enough to accurate measurements compared to 
> empirical testing.
> 
> RE: motivation compared to having a number of hot containers - efficiency of 
> resource usage for the *OpenWhisk operator*. No one will be able to afford to 
> run OpenWhisk if they have to run 100 containers *per action* to support a 
> burst of 100 users using any particular action. Consider a burst of 1,000 or 
> 10,000 users, and 1,000 actions. If a single container can handle a burst 
> of 100 users, it will solve a lot of low-to-medium use cases efficiently, and in 
> the case of 10,000 users, running 100 containers will be more efficient than 
> the 10,000 containers you would have to run as single-task containers.
> 
> WDYT?
> 
> Thanks for the feedback!
> Tyson
> 
> 
> On Jul 1, 2017, at 9:36 AM, Alex Glikson 
> <glik...@il.ibm.com> wrote:
> 
> Having different policies for different container pools certainly makes
> sense. Moreover, enhancing the design/implementation so that there is more
> concurrency and less bottlenecks also sounds like an excellent idea.
> However, I am unsure specifically regarding the idea of handling multiple
> requests concurrently by the same container. For example, I wonder how one
> would determine the desired resource allocation for such container?
> Wouldn't this re-introduce issues related to sizing, scaling and
> fragmentation of resources - nicely avoided with single-tasked containers?
> Also, I wonder what would be the main motivation to implement such a
> policy compared to just having a number of hot containers, ready to
> process incoming requests?
> 
> Regards,
> Alex
> 
> 
> 
> From:   Rodric Rabbah <rod...@gmail.com>
> To:     dev@openwhisk.apache.org
> Cc: Dragos Dascalita Haut <ddas...@adobe.com>
> Date:   01/07/2017 06:56 PM
> Subject:Re: Improving support for UI driven use cases
> 
> 
> 
> Summarizing the wiki notes:
> 
> 1. separate control and data plane so that data plane is routed directly
> to
> the container
> 2

Re: Improving support for UI driven use cases

2017-07-01 Thread Tyson Norris
RE: separate policies - I agree that it would make sense for separating 
container pools in some way by “event driven” and “ui driven” - I don’t think 
anything precludes that from happening, but it's different from the notion of 
“long running” and “short running”. e.g. if events flow into the system, the 
container would be equally long running as if users are continually using the 
system. I’m not suggesting changing the cold/warm start behavior, rather the 
response and concurrency behavior, to be more in line with UI driven use cases. 
In fact the “first user” experience would be exactly the same, it's the “99 
users that arrive before the first user is complete” experience that would be 
different. (It may also be appealing to have a notion of “prehot” containers, 
but I think existing “prewarm” is good enough for many cases). If it's useful 
to cap the pool usage for either of these cases, nothing prevents that from 
happening, but I would start with a (simpler) case where there is a single pool 
that supports both usages - currently there is no pool that reliably supports 
the ui case, since “bursty” traffic is immediately excessively latent.

RE: concurrent requests resource usage: I would argue that you would determine 
resource allocation in (nearly) the same way you should with single-tasked 
containers. i.e. the only pragmatic way to estimate resource usage is to 
measure it. In single task case, you might use curl to simulate a single user. 
In concurrent tasks case, you might use wrk or gatling (or something else) to 
simulate multiple users. Regardless of the tool, analyzing code etc will not 
get close enough to accurate measurements compared to empirical testing.

RE: motivation compared to having a number of hot containers - efficiency of 
resource usage for the *OpenWhisk operator*. No one will be able to afford to 
run OpenWhisk if they have to run 100 containers *per action* to support a 
burst of 100 users using any particular action. Consider a burst of 1,000 or 
10,000 users, and 1,000 actions. If a single container can handle a burst of 
100 users, it will solve a lot of low-to-medium use cases efficiently, and in the 
case of 10,000 users, running 100 containers will be more efficient than the 
10,000 containers you would have to run as single-task containers.
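
Writing that back-of-envelope math down (numbers are illustrative, not a benchmark):

    // Containers needed to serve a burst of `users` concurrent requests.
    def containersNeeded(users: Int, perContainer: Int): Int =
      math.ceil(users.toDouble / perContainer).toInt

    containersNeeded(users = 100,   perContainer = 1)    // 100 containers per action today
    containersNeeded(users = 10000, perContainer = 1)    // 10,000 containers single-task
    containersNeeded(users = 10000, perContainer = 100)  // 100 containers with concurrency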

WDYT?

Thanks for the feedback!
Tyson


On Jul 1, 2017, at 9:36 AM, Alex Glikson 
<glik...@il.ibm.com> wrote:

Having different policies for different container pools certainly makes
sense. Moreover, enhancing the design/implementation so that there is more
concurrency and less bottlenecks also sounds like an excellent idea.
However, I am unsure specifically regarding the idea of handling multiple
requests concurrently by the same container. For example, I wonder how one
would determine the desired resource allocation for such container?
Wouldn't this re-introduce issues related to sizing, scaling and
fragmentation of resources - nicely avoided with single-tasked containers?
Also, I wonder what would be the main motivation to implement such a
policy compared to just having a number of hot containers, ready to
process incoming requests?

Regards,
Alex



From:   Rodric Rabbah <rod...@gmail.com>
To: dev@openwhisk.apache.org
Cc: Dragos Dascalita Haut <ddas...@adobe.com>
Date:   01/07/2017 06:56 PM
Subject:        Re: Improving support for UI driven use cases



Summarizing the wiki notes:

1. separate control and data plane so that data plane is routed directly
to
the container
2. desire multiple concurrent function activations in the same container

On 1, I think this is inline with an outstanding desire and corresponding
issues to take the data flow out of the system and off the control message
critical path. As you pointed out, this has a lot of benefits - including
one you didn't mention: streaming (web socket style) in/out of the
container. Related issues although not complete in [1][2].

On 2, I think you are starting to see some of the issues as you think
through the limits on the action and its container and what that means for
the user flow and experience. Both in terms of "time limit" and "memory
limit". I think the logging issue can be solved to disentangle
activations.
But I also wonder if these are going to be longer running "actions" and
hence, the model is different: short running vs long running container for
which there are different life cycles and hence different scheduling
decisions from different container pools.

[1] https://github.com/apache/incubator-openwhisk/issues/788
[2] https://github.com/apache/incubator-openwhisk/issues/254

Re: Improving support for UI driven use cases

2017-07-01 Thread Rodric Rabbah
Summarizing the wiki notes:

1. separate control and data plane so that data plane is routed directly to
the container
2. desire multiple concurrent function activations in the same container

On 1, I think this is inline with an outstanding desire and corresponding
issues to take the data flow out of the system and off the control message
critical path. As you pointed out, this has a lot of benefits - including
one you didn't mention: streaming (web socket style) in/out of the
container. Related issues although not complete in [1][2].

On 2, I think you are starting to see some of the issues as you think
through the limits on the action and its container and what that means for
the user flow and experience. Both in terms of "time limit" and "memory
limit". I think the logging issue can be solved to disentangle activations.
But I also wonder if these are going to be longer running "actions" and
hence, the model is different: short running vs long running container for
which there are different life cycles and hence different scheduling
decisions from different container pools.

[1] https://github.com/apache/incubator-openwhisk/issues/788
[2] https://github.com/apache/incubator-openwhisk/issues/254