Markus, I appreciate the enhancements you mentioned in the wiki, and I'm very
much in line with the ideas you brought in there.



"...having the ContainerManager be a cluster singleton..."

I was just about to reply with the same idea :)

In addition, I was thinking we can leverage Akka Distributed Data [1] to keep
all ContainerRouter actors eventually consistent. When creating a new
container, the ContainerManager can write with "WriteAll" consistency; it
would be a little slower, but it would improve consistency.
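
To make the trade-off concrete, here is a minimal Python sketch of a
"write to all replicas" update. It only simulates the idea and does not use
Akka's API; RouterReplica, write_all and the "reachable" flag are hypothetical
names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RouterReplica:
    """Hypothetical stand-in for one ContainerRouter's local copy of the data."""
    name: str
    reachable: bool = True
    containers: dict = field(default_factory=dict)  # action -> set of container ids

    def apply(self, action, container_id):
        """Store the update locally and acknowledge it, unless unreachable."""
        if not self.reachable:
            return False  # no acknowledgement from this replica
        self.containers.setdefault(action, set()).add(container_id)
        return True


def write_all(replicas, action, container_id):
    """Send the update to every replica; succeed only if all acknowledged."""
    acks = [replica.apply(action, container_id) for replica in replicas]
    return all(acks)


routers = [RouterReplica(f"router-{i}") for i in range(3)]
ok = write_all(routers, "hello", "container-1")
print(ok, [sorted(r.containers["hello"]) for r in routers])
```

The write is only reported as successful once every replica has the new
container, which is where the extra latency comes from. A single unreachable
replica makes the write fail, even though the reachable ones already hold
the update.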


The "edge case" isn't clear to me because I'm coming from the assumption that
it doesn't matter which ContainerRouter handles the next request, given that
all actors have the same data. Could you help me better understand the
edge case?


Re the Knative approach, can you expand on why the execution layer/data plane
would be replaced entirely by Knative Serving? I think Knative Serving handles
some cases very well, like API requests, but it's not designed to guarantee
concurrency restrictions like "1 request at a time per container", something
that AI Actions need.
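
As an illustration of the kind of restriction I mean, here is a minimal Python
sketch of a pool that allows at most one in-flight request per container.
ContainerPool and its methods are hypothetical names, not OpenWhisk or Knative
code.

```python
class ContainerPool:
    """Hypothetical pool allowing at most `max_concurrent` in-flight requests per container."""

    def __init__(self, container_ids, max_concurrent=1):
        self.max_concurrent = max_concurrent
        self.in_flight = {cid: 0 for cid in container_ids}

    def acquire(self):
        """Return a container with spare capacity, or None if all are busy."""
        for cid, active in self.in_flight.items():
            if active < self.max_concurrent:
                self.in_flight[cid] = active + 1
                return cid
        return None

    def release(self, cid):
        """Mark one request on the given container as finished."""
        self.in_flight[cid] -= 1


pool = ContainerPool(["c1", "c2"])
first = pool.acquire()
second = pool.acquire()
third = pool.acquire()  # both containers are busy: the caller must queue or scale up
pool.release(first)
fourth = pool.acquire()
print(first, second, third, fourth)
```

The point is that the routing layer has to know exactly how many requests are
in flight on each container, and refuse or queue a request rather than
overcommit.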


Thanks,

dragos


[1] - https://doc.akka.io/docs/akka/2.5/distributed-data.html


________________________________
From: David P Grove <gro...@us.ibm.com>
Sent: Tuesday, August 14, 2018 2:15:13 PM
To: dev@openwhisk.apache.org
Subject: Re: Proposal on a future architecture of OpenWhisk




"Markus Thömmes" <markusthoem...@apache.org> wrote on 08/14/2018 10:06:49
AM:
>
> I just published a revision on the initial proposal I made. I still owe a
> lot of sequence diagrams for the container distribution, sorry for taking
> so long on that, I'm working on it.
>
> I did include a clear separation of concerns into the proposal, where
> user-facing abstractions and the execution (load balancing, scaling) of
> functions are loosely coupled. That enables us to exchange the execution
> system while not changing anything in the Controllers at all (to an
> extent). The interface to talk to the execution layer is HTTP.
>

Nice writeup!

For me, the part of the design I'm wondering about is the separation of the
ContainerManager and the ContainerRouter and having the ContainerManager be
a cluster singleton. With Kubernetes blinders on, it seems more natural to
me to fuse the ContainerManager into each of the ContainerRouter instances
(since there is very little to the ContainerManager except (a) talking to
Kubernetes and (b) keeping track of which Containers it has handed out to
which ContainerRouters -- a task which is eliminated if we fuse them).

The main challenge is dealing with your "edge case" where the optimal
number of containers to create to execute a function is less than the
number of ContainerRouters.  I suspect this is actually an important case
to handle well for large-scale deployments of OpenWhisk.  Having 20ish
ContainerRouters on a large cluster seems plausible, and then we'd expect a
long tail of functions where the optimal number of container instances is
less than 20.
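
A quick Python sketch of the arithmetic, assuming each container is handed to
exactly one ContainerRouter and containers are spread round-robin (both are
assumptions for illustration; assign_containers is a hypothetical helper):

```python
def assign_containers(router_names, container_ids):
    """Hand each container to one router, round-robin (an assumed policy)."""
    assignment = {name: [] for name in router_names}
    for index, cid in enumerate(container_ids):
        owner = router_names[index % len(router_names)]
        assignment[owner].append(cid)
    return assignment


routers = [f"router-{i}" for i in range(20)]
containers = ["c0", "c1", "c2"]  # a function that only needs 3 containers
assignment = assign_containers(routers, containers)
idle = [name for name, owned in assignment.items() if not owned]
print(len(idle), "routers hold no container for this function")
```

With 20 routers and 3 containers, most routers cannot serve a request for
this function locally.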

I wonder if we can partially mitigate this problem by doing some amount of
smart routing in the Controller.  For example, the first level of routing
could be based on the kind of the action (nodejs:6, python, etc).  That
could then vector to per-runtime ContainerRouters which dynamically
auto-scale based on load.  Since there doesn't have to be a fixed division
of actual execution resources to each ContainerRouter this could work.  It
also lets us easily keep stemcells for multiple runtimes without worrying about
wasting too many resources.
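
A minimal Python sketch of that first routing level, assuming the Controller
keeps a per-kind list of ContainerRouters and picks one round-robin. KindRouter
and the router names are hypothetical, and auto-scaling is reduced to
registering another router for a kind.

```python
from collections import defaultdict
from itertools import count


class KindRouter:
    """Hypothetical first-level router: action kind -> group of ContainerRouters."""

    def __init__(self):
        self.groups = defaultdict(list)     # kind -> router names
        self.counters = defaultdict(count)  # kind -> round-robin counter

    def register(self, kind, router):
        """Add a router to the group for a kind (e.g. when scaling up)."""
        self.groups[kind].append(router)

    def route(self, kind):
        """Pick the next router for this kind, round-robin within its group."""
        routers = self.groups.get(kind)
        if not routers:
            raise LookupError(f"no ContainerRouter registered for kind {kind!r}")
        index = next(self.counters[kind]) % len(routers)
        return routers[index]


controller = KindRouter()
controller.register("nodejs:6", "router-a")
controller.register("nodejs:6", "router-b")
controller.register("python", "router-c")
picks = [controller.route("nodejs:6") for _ in range(3)]
print(picks, controller.route("python"))
```

Each kind only fans out across its own group, so the number of routers a
rarely used function is spread over stays small.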

How do you want to deal with design alternatives?  Should I be adding to
the wiki page?  Doing something else?

--dave
