I like that approach a lot!
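
Just to make sure I understand the "resource pool" idea correctly, here is a very rough Scala sketch of how I picture the plug-in point. To be clear, all of these names (Flavor, ResourcePool, PooledLoadBalancer, ActivationRequest, ActivationResult) are made up for illustration only and are not existing OpenWhisk APIs:

    import scala.concurrent.Future

    // Hypothetical sketch only -- none of these types exist in OpenWhisk today.
    // A "flavor" tags an activation with the kind of invoker it needs.
    sealed trait Flavor
    case object Default extends Flavor
    case object LongRunning extends Flavor
    case object Gpu extends Flavor
    case object ConcurrentCapable extends Flavor

    // Minimal stand-ins for the real activation message / result types.
    final case class ActivationRequest(actionName: String, flavor: Flavor)
    final case class ActivationResult(activationId: String)

    // A resource pool is a set of invokers behind its own scheduling policy.
    trait ResourcePool {
      def flavor: Flavor
      def publish(msg: ActivationRequest): Future[ActivationResult]
    }

    // The load balancer only routes each activation to the pool that
    // advertises the matching flavor, falling back to the default pool.
    class PooledLoadBalancer(pools: Seq[ResourcePool]) {
      private val byFlavor: Map[Flavor, ResourcePool] =
        pools.map(p => p.flavor -> p).toMap

      def publish(msg: ActivationRequest): Future[ActivationResult] =
        byFlavor.getOrElse(msg.flavor, byFlavor(Default)).publish(msg)
    }

That way a "concurrent" pool, a GPU pool, a long-running pool, etc. could each bring their own scheduling policy without the rest of the system having to care.
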
On 04/07/17 16:05, "Stephen Fink" <fink.step...@gmail.com> wrote:

>Hi all,
>
>I've been lurking a bit on this thread, but haven't had time to fully digest all the issues.
>
>I'd suggest that the first step is to support "multiple heterogeneous resource pools", where a resource pool is a set of invokers managed by a load balancer. There are lots of reasons we may want to support invokers with different flavors: long-running actions, invokers in a VPN, invokers with GPUs, invokers with big memory, invokers which support concurrent execution, etc. If we had a general way to plug in a new resource pool, folks could feel free to experiment with any new flavors they like without having to debate the implications on other flavors.
>
>I tend to doubt that there is a "one size fits all" solution here, so I'd suggest we bite the bullet and engineer for heterogeneity.
>
>SJF
>
>
>> On Jul 4, 2017, at 9:55 AM, Michael Marth <mma...@adobe.com.INVALID> wrote:
>>
>> Hi Jeremias, all,
>>
>> Tyson and Dragos are travelling this week, so I don't know when they will get to respond. I have worked with them on this topic, so let me jump in and comment until they are able to reply.
>>
>> From my POV having a call like you suggest is a really good idea. Let's wait for Tyson & Dragos to chime in to find a date.
>>
>> As you mention, the discussion so far has been jumping across different topics, especially the use case, the problem to be solved and the proposed solution. In preparation for the call I think we can clarify the use case and the problem on the list. Here's my view:
>>
>> Use Case
>>
>> For us the use case can be summarised as "dynamic, high-performance websites/mobile apps". This implies:
>> 1. High concurrency, i.e. many requests coming in at the same time.
>> 2. The code to be executed is the same code across these different requests (as opposed to a long-tail distribution of many different actions being executed concurrently). In our case "many" would mean "hundreds" or a few thousand.
>> 3. The latency (time to start execution) matters, because human users are waiting for the response. Ideally, at these orders of magnitude of concurrent requests the latency should not change much.
>>
>> All 3 requirements need to be satisfied for this use case.
>> In the discussion so far it was mentioned that there are other use cases which might have similar requirements. That's great and I do not want to rule them out, obviously. The above is just to make clear where we are coming from.
>>
>> At this point I would like to mention that it is my understanding that this use case is within OpenWhisk's strike zone, i.e. something that we all think is reasonable to support. Please speak up if you disagree.
>>
>> The Problem
>>
>> One can look at the problem in two ways:
>> Either you keep the resources of the OW system constant (i.e. no scaling). In that case latency increases very quickly, as demonstrated by Tyson's tests.
>> Or you increase the system's capacity. In that case the number of machines needed to satisfy this use case quickly becomes prohibitively expensive for the OW operator to run – where expensive is defined as "compared to traditional web servers" (in our case a standard Node.js server). Meaning, you need 100-1000 concurrent action containers to serve what can be served by 1 or 2 Node.js containers.
>>
>> Of course, the proposed solution is not a fundamental "fix" for the above. It would only move the needle ~2 orders of magnitude – so that the current problem would not be a problem in reality anymore (and would simply remain a theoretical problem). For me that would be good enough.
>>
>> The solution approach
>>
>> I would not like to comment on the proposed solution's details (and leave that to Dragos and Tyson). However, it was mentioned that the approach would change the programming model for users: our mindset and approach was that we explicitly do not want to change how OpenWhisk exposes itself to users. Meaning, users should still be able to use NPMs, etc. – i.e. this would be an internal implementation detail that is not visible to users. (We could make things more explicit to users, e.g. have them request a special concurrent runtime, if we wish to do so – so far we have tried to make it transparent to users, though.)
>>
>> Many thanks
>> Michael
>>
>>
>>
>> On 03/07/17 14:48, "Jeremias Werner" <jeremias.wer...@gmail.com> wrote:
>>
>> Hi
>>
>> Thanks for the write-up and the proposal. I think this is a nice idea and sounds like a nice way of increasing throughput. Reading through the thread it feels like there are different topics/problems mixed up and the discussion is already becoming very complex.
>>
>> Therefore I would like to suggest that we streamline the discussion a bit, maybe in a zoom.us session where we first give Tyson and Dragos the chance to walk through the proposal and clarify questions from the audience. Once we are all on the same page we could think of a discussion about the benefits (improved throughput, latency) vs. the challenges (resource sharing, crash model, container lifetime, programming model) of the core of the proposal: running multiple activations in a single user container. Once we have a common understanding on that part we could step up in the architecture and discuss what's needed in higher components like the invoker/load balancer to get this integrated.
>>
>> (I said zoom.us session since I liked the one we had a few weeks ago. It was efficient and interactive. If you like, I could volunteer to set up the session and/or write the script/summary.)
>>
>> What do you think?
>>
>> Many thanks in advance!
>>
>> Jeremias
>>
>>
>> On Sun, Jul 2, 2017 at 5:43 PM, Rodric Rabbah <rod...@gmail.com> wrote:
>>
>> With "event driven" you're discounting all the use cases that are still latency sensitive because they complete a response by callback or actuation at completion. IoT, chatbots and notifications are all examples, in addition to UI, which are latency sensitive, and having uniform expectations on queuing time is of value.
>>
>> -r
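
P.S. On the core mechanism Jeremias mentions (running multiple activations in a single user container): here is an equally rough Scala sketch of how I picture the invoker-side dispatch, again purely for illustration – ConcurrentContainerProxy and everything around it are made-up names, not existing OpenWhisk code:

    import java.util.concurrent.Semaphore
    import scala.concurrent.{ExecutionContext, Future}

    // Hypothetical sketch only -- not existing OpenWhisk code.
    // Wraps a single user container and lets up to `maxConcurrent` activations
    // run against it at the same time, instead of strictly one at a time.
    class ConcurrentContainerProxy(maxConcurrent: Int,
                                   runInContainer: String => Future[String])
                                  (implicit ec: ExecutionContext) {
      private val slots = new Semaphore(maxConcurrent)

      def activate(payload: String): Future[String] = {
        slots.acquire() // back-pressure: the dispatcher waits when the container is saturated
        runInContainer(payload).andThen { case _ => slots.release() }
      }
    }

The point being that the concurrency cap would live entirely on the system side; the action code the user writes stays exactly the same, modulo it having to be safe to run concurrently – which I take to be part of the resource sharing / crash model discussion above.
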