[CONF] OpenWhisk > OpenWhisk on Kubernetes

2019-05-09 Thread Dragos Dascalita Haut (Confluence)

There's 1 new edit on this page
OpenWhisk on Kubernetes
Dragos Dascalita Haut edited this page
Here's what changed:

... The Control Plane should provide an API used by the Data Plane to cold-start actions, and it should also emit events each time the resource allocation changes: whenever the GC removes idle containers or a new action is created, the Control Plane should notify all Data Plane instances of the change.
 
 
 
Info: To be detailed: the current OW system is "async by default", whereas the new design is "sync by default". The open question is how to handle async cases.
 
 
Data Plane

The data plane layer invokes actions as fast as possible. When an action needs to be cold-started, the data plane delegates this to the Control Plane and waits for the action to become ready before invoking it. Once an action is warmed up, the data plane is notified; if it was waiting for that event in order to invoke an activation, it should resume the execution. ...
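
The wait-then-resume behaviour described for the Data Plane could look roughly like the sketch below. It is only an illustration: `ControlPlaneStub`, `DataPlane`, `on_action_ready`, and the 10 ms delay are hypothetical names and values, not part of OpenWhisk, and the "invocation" is a placeholder string.

```python
import asyncio


class ControlPlaneStub:
    """Hypothetical stand-in for the Control Plane cold-start API."""

    def __init__(self, data_plane):
        self.data_plane = data_plane
        self.cold_starts = 0

    async def cold_start(self, action):
        self.cold_starts += 1
        await asyncio.sleep(0.01)                # pretend a container is being created
        self.data_plane.on_action_ready(action)  # event: the action is now warm


class DataPlane:
    def __init__(self):
        self.control_plane = ControlPlaneStub(self)
        self.warm = set()     # actions that have a ready container
        self.pending = {}     # action -> Event awaited by blocked activations
        self.tasks = set()    # keeps references to running cold-start tasks

    def on_action_ready(self, action):
        """Called when the Control Plane announces a warmed-up action."""
        self.warm.add(action)
        event = self.pending.pop(action, None)
        if event is not None:
            event.set()       # resume every activation waiting on this action

    async def invoke(self, action, payload):
        if action not in self.warm:
            event = self.pending.get(action)
            if event is None:
                # The first waiter triggers the cold-start; later ones only wait.
                event = self.pending[action] = asyncio.Event()
                task = asyncio.create_task(self.control_plane.cold_start(action))
                self.tasks.add(task)
                task.add_done_callback(self.tasks.discard)
            await event.wait()
        return f"{action}({payload})"   # placeholder for the real invocation
```

Two concurrent activations of the same cold action share a single cold-start in this sketch; warm actions never touch the Control Plane.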
 
 
  
 
 
  
 
 
  
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
This message was sent by Atlassian Confluence 6.15.2

[CONF] OpenWhisk > OpenWhisk on Kubernetes

2019-05-09 Thread Dragos Dascalita Haut (Confluence)

Here's what changed:

 ... 
 
The request arrives from a client
Authentication and Authorization
    The Container Router validates the Authorization header with the OpenWhisk Auth Service
    The response of the Auth Service is cached
Routing
    Check namespace limits
    Forward the request to a container selected from a list of warmed actions that the Action Router keeps.
        (new) Streaming the request to the action would be nice; OpenWhisk doesn't support this, and such a feature could remove the max payload limits.
        (new) Websockets could also be supported, another feature missing in OpenWhisk.
Container Proxy sidecar
    Check action concurrency limit
    Buffer a few more requests, queueing them into an overflow buffer; this may be useful when a cold-start could take longer than just queueing a few more requests. Blackbox actions that need to download the docker image may benefit the most. This idea is inspired by Knative Serving.
Invoke the action and return the response
    (new) Caching the action response could be another nice-to-have feature, which is not implemented in OpenWhisk. Caching should be controlled by the action response.
Collect activation info.
Sequence support.
    If the action is part of a sequence, then the Router should have logic to invoke the next action in the sequence.
    Other ideas to explore to support sequences, should support in the ContainerRouter prove too difficult to implement:
        ContainerProxy could "understand" sequences
        Or reuse Composer and implement sequence-as-an-action.
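
The sequence step in the Router could be sketched as follows. This is a minimal illustration, assuming actions are plain callables that take and return a dict, and assuming the chain stops at the first result carrying an "error" key; `invoke_sequence` is a hypothetical helper, not an OpenWhisk API.

```python
from typing import Callable, Dict, List

Action = Callable[[dict], dict]


def invoke_sequence(actions: Dict[str, Action], sequence: List[str], params: dict) -> dict:
    """Feed the output of each action into the next one in the sequence."""
    result = params
    for name in sequence:
        result = actions[name](result)
        if "error" in result:   # assumed policy: stop at the first failing step
            break
    return result
```

For example, a "split" action followed by a "count" action turns `{"text": "a b c"}` into the word list and then into its length.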
  
  
  
Flow for cold-start

When the Action Proxy is at capacity, it should return a 429 response back to the Container Router. A Retry-After  header could specify   or   for a CircuitBreaker in the ContainerRouter to avoid routing to that action. The time window for retry should ideally be computed from the response times observed by the Container Proxy. ...
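
The Container Proxy checks from the flow above (concurrency limit, overflow buffer, then 429) could be combined as in this sketch. The class, its fields, and the way the retry window is derived from an average response time are assumptions made for illustration; the text leaves the Retry-After value unspecified, and the sketch uses the delay-in-seconds form that HTTP allows.

```python
import math
from collections import deque


class ContainerProxy:
    """Hypothetical admission logic for the sidecar in front of one action."""

    def __init__(self, max_concurrency, overflow_size, avg_response_s):
        self.max_concurrency = max_concurrency
        self.overflow_size = overflow_size
        self.avg_response_s = avg_response_s  # would be measured in a real proxy
        self.overflow = deque()
        self.in_flight = 0

    def admit(self, request):
        """Return (decision, response) for an incoming request."""
        if self.in_flight < self.max_concurrency:
            self.in_flight += 1
            return "invoke", None
        if len(self.overflow) < self.overflow_size:
            self.overflow.append(request)      # wait here instead of cold-starting
            return "queued", None
        # At capacity: tell the router when it may be worth trying again.
        wait_s = math.ceil(self.avg_response_s * (len(self.overflow) + 1))
        return "reject", {"status": 429, "Retry-After": str(max(1, wait_s))}

    def complete(self):
        """Finish one activation and hand its slot to a buffered request, if any."""
        if self.overflow:
            return self.overflow.popleft()
        self.in_flight -= 1
        return None
```

Sizing the overflow buffer is the design question here: it only pays off while queueing is expected to be faster than a cold-start.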
 
 
  
 
 
  
 
 
  
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
[CONF] OpenWhisk > OpenWhisk on Kubernetes

2019-04-12 Thread Dragos Dascalita Haut (Confluence)

Here's what changed:

 ... 
 
Invoke an authN/authZ service, reusing the existing authN/authZ implementation in OpenWhisk
    It may be possible to implement this inside the proxy itself, e.g. NGINX supports extensibility through Lua thanks to OpenResty.
Routing
    Including support for sequences
Throttling
    Respect namespace limits
    Respect action-level concurrency
Caching the response, based on what the action returns. E.g. an action that validates an OAuth token could instruct the system to cache the response for that token until it expires.
Support API Management
    Otherwise the existing OpenWhisk Gateway can be reused
Support for Observability: metrics, activation info, tracing, etc.
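
The response-caching item in the list above could work along these lines. This is a sketch under stated assumptions: the `cache_ttl_s` field through which an action asks to be cached, the `ResponseCache` class, and `invoke_cached` are all hypothetical and not part of OpenWhisk.

```python
import time


class ResponseCache:
    """Caches action responses for as long as the action itself asks for."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.entries = {}   # key -> (expires_at, response)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self.clock() >= expires_at:
            del self.entries[key]       # expired: drop and report a miss
            return None
        return response

    def put(self, key, response):
        ttl = response.get("cache_ttl_s")   # hypothetical field set by the action
        if ttl:
            self.entries[key] = (self.clock() + ttl, response)


def invoke_cached(cache, action, key, params):
    """Serve from the cache when possible, otherwise invoke and store."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    response = action(params)
    cache.put(key, response)
    return response
```

In the OAuth example, the validating action would set the TTL to the remaining lifetime of the token, so the cached answer never outlives the token.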
 ... 
 
The request arrives from a client
Authentication and Authorization
    The Container Router validates the Authorization header with the OpenWhisk Auth Service
    The response of the Auth Service is cached
Routing
    Check namespace limits
    Forward the request to a container selected from a list of warmed actions that the Action Router keeps.
        (new) Streaming the request to the action would be nice; OpenWhisk doesn't support this, and such a feature could remove the max payload limits.
        (new) Websockets could also be supported, another feature missing in OpenWhisk.
Container Proxy sidecar
    Check action concurrency limit
    Buffer a few more requests, queueing them into an overflow buffer; this may be useful when a cold-start could take longer than just queueing a few more requests. Blackbox actions that need to download the docker image may benefit the most. This idea is inspired by Knative Serving.
Invoke the action and return the response
    (new) Caching the action response could be another nice-to-have feature, which is not implemented in OpenWhisk. Caching should be controlled by the action response.
Collect activation info.
Sequence support. If the action is part of a sequence, then the Router should have logic to invoke the next action in the sequence.
... When the Action Proxy is at capacity, it should return a 429 response back to the Container Router. A Retry-After  header could specify   or   for a CircuitBreaker in the ContainerRouter to avoid routing to that action. The time window for retry should ideally be computed from the response times observed by the Container Proxy.
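
On the router side, the CircuitBreaker mentioned above could be as small as the following sketch. The class and method names are hypothetical, and the sketch assumes the Retry-After header carries a delay in seconds.

```python
import time


class CircuitBreaker:
    """Skips containers that recently answered 429 until their retry window ends."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.open_until = {}   # container id -> time before which it is skipped

    def record_response(self, container, status, headers):
        if status == 429:
            delay = float(headers.get("Retry-After", 1))  # assumes delay-seconds form
            self.open_until[container] = self.clock() + delay
        else:
            self.open_until.pop(container, None)          # healthy again

    def is_available(self, container):
        return self.clock() >= self.open_until.get(container, 0.0)

    def pick(self, containers):
        """Return the first routable container, or None if all are backing off."""
        for container in containers:
            if self.is_available(container):
                return container
        return None   # the router would request a cold-start at this point
```

Returning None is where the cold-start flow would take over: no warm container can accept the activation, so a new one has to be requested.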
 
 
 
[Gliffy Diagram: OpenWhisk-ColdStart-ControlPlane]
 
 
  
 
 
The green steps are additional steps required for cold-start:

4. Container Proxy returns a 429 indicating the action has reached its max concurrency and can't take more activations. If there's no container running for that action, skip to step 5.
5. Container Router goes to the DistributedContainerPool to request a new container to be created.
6. After the container is created, all Container Router instances are informed, and the activation proceeds as in the flow for the warm container described above.

Control Plane

Candidates:
    OpenWhisk Controller and Invoker, refactored into a single service that meets the requirements

Control Plane concerns:
    Cold-start actions: allocate resources
    Garbage-collect idle actions: de-allocate resources

The Control Plane should be used by the Data Plane only when cold-starting new actions. ...

DistributedContainerPool

This component is at the core of the Control Plane. It should be concerned with the following:
 
globalPool
    Cluster-wide view of all running actions
    Distributed map with the minimum data about actions needed by the ResourceAllocator and the GC
    It should sync with Kubernetes from time to time to update the state, in case a container dies or a Kubernetes operation kills that container

resourceAllocator - SingletonActor
    It is in charge of starting containers on a node that has resources
    When allocating resources, Placement Strategies should consider CPU, MEM, GPU, Network, and other resources an action might consume.
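
One possible Placement Strategy is sketched below. It is a best-fit heuristic chosen purely for illustration; the text does not prescribe any strategy, and `fits`, `place`, and the resource keys are hypothetical. The leftover score adds up different units, which is deliberately crude.

```python
from typing import Dict, Optional

Resources = Dict[str, float]   # e.g. {"cpu": 2, "mem": 512, "gpu": 0}


def fits(free: Resources, request: Resources) -> bool:
    """A node fits when it has enough of every requested resource."""
    return all(free.get(kind, 0) >= amount for kind, amount in request.items())


def place(nodes: Dict[str, Resources], request: Resources) -> Optional[str]:
    """Best fit: choose the fitting node that would have the least left over."""
    candidates = [name for name, free in nodes.items() if fits(free, request)]
    if not candidates:
        return None   # no capacity: the cluster would have to scale up

    def leftover(name: str) -> float:
        free = nodes[name]
        return sum(free.get(kind, 0) - request.get(kind, 0) for kind in free)

    chosen = min(candidates, key=lambda name: (leftover(name), name))
    for kind, amount in request.items():
        nodes[chosen][kind] -= amount   # reserve the resources on that node
    return chosen
```

Packing new containers onto the fullest node that still fits keeps emptier nodes free, which is consistent with the goal of a compact resource pool.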
 
  GC  ... 
 
garbageCollector - SingletonActor
    It removes idle actions
    It needs to be a singleton so that, when deciding which resource to free, it can avoid fragmentation. In other words, it should free resources to make the free space as compact as possible.
        This is particularly important when scaling down the nodes running actions
    Its implementation should be configurable and swappable
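
A swappable GC policy that tries to avoid fragmentation might be sketched like this. The heuristic (remove idle containers from the least populated nodes first, so those nodes can drain) is an assumption for illustration, as are the function name and the container record layout.

```python
from collections import defaultdict


def pick_for_removal(containers, count):
    """Choose up to `count` idle containers, emptiest nodes first.

    `containers` is a list of dicts with "id", "node" and "idle" keys.
    """
    per_node = defaultdict(int)
    for container in containers:
        per_node[container["node"]] += 1

    idle = [c for c in containers if c["idle"]]
    # Nodes with the fewest containers come first, so they empty out completely
    # before fuller nodes lose anything; ties are broken by name for determinism.
    idle.sort(key=lambda c: (per_node[c["node"]], c["node"], c["id"]))
    return [c["id"] for c in idle[:count]]
```

Draining the emptiest nodes first is what makes node scale-down possible: a node with no containers left can be removed from the cluster.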
 
Management Plane

This can reuse the OpenWhisk implementation.

Candidates:
    OpenWhisk Controller, slimmed for Management APIs
 Previous discussions Provide 

[CONF] OpenWhisk > OpenWhisk on Kubernetes

2019-04-03 Thread Dragos Dascalita Haut (Confluence)

Here's what changed:

 ...  
 
 
 
[Gliffy Diagram: OpenWhisk-ManagementControlData-plane]
 
 
  
 
 
Management Plane

The management layer exposes an API that primarily serves developers who manage actions and APIs. The wsk CLI interacts with this layer.

OpenWhisk operators may also interact with this layer to manage namespaces.

Control Plane

The control plane layer has 2 main responsibilities: cold-starting new actions, and removing idle actions (Garbage Collector).

This layer could be implemented by reusing ideas from the JVM architecture. In a JVM, memory is allocated and later freed with the help of a GC component. Similarly, a FaaS system needs to allocate resources such as CPU, memory, and disk space to each action; then, when an action becomes idle, the FaaS system needs to scale it down to zero, freeing the resources.

The control plane should:
 
 use the Cluster Manager's API to start and stop actions;  
 inform the Data Plane when starting or removing an action;  
 provide a configurable GC which should avoid fragmentation where possible; the less fragmentation the more compact the pool of resources is.    
 use a mark-and-sweep GC logic to remove containers, to allow enough time for the Data Plane to stop sending traffic to the actions marked for removal  
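
The mark-and-sweep idea from the list above could be sketched as two passes separated by a grace period. Everything here is an assumption for illustration: the class name, the explicit `now` arguments, and the policy that a new activation clears the mark.

```python
class MarkAndSweepGC:
    """Marks idle containers first and removes them only after a grace period."""

    def __init__(self, idle_timeout, grace_period):
        self.idle_timeout = idle_timeout
        self.grace_period = grace_period
        self.last_used = {}   # container -> time of its last activation
        self.marked = {}      # container -> time it was marked for removal

    def touch(self, container, now):
        """Record an activation; activity clears the mark (one possible policy)."""
        self.last_used[container] = now
        self.marked.pop(container, None)

    def mark(self, now):
        """Mark containers idle for too long; the Data Plane stops routing to them."""
        newly_marked = [c for c, t in self.last_used.items()
                        if c not in self.marked and now - t >= self.idle_timeout]
        for container in newly_marked:
            self.marked[container] = now
        return newly_marked

    def sweep(self, now):
        """Remove containers whose mark is older than the grace period."""
        removed = [c for c, t in self.marked.items() if now - t >= self.grace_period]
        for container in removed:
            del self.marked[container]
            del self.last_used[container]
        return removed
```

The grace period between the two passes is what gives Data Plane instances time to stop sending traffic before a container actually disappears.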
The Control Plane should provide an API used by the Data Plane to cold-start actions, and it should also emit events each time the resource allocation changes: whenever the GC removes idle containers or a new action is created, the Control Plane should notify all Data Plane instances of the change.

Data Plane

The data plane layer invokes actions as fast as possible. When an action needs to be cold-started, the data plane delegates this to the Control Plane and waits for the action to become ready before invoking it. Once an action is warmed up, the data plane is notified; if it was waiting for that event in order to invoke an activation, it should resume the execution.

The Data Plane invokes warmed actions without going to the Control Plane. The only time the Control Plane is used in an activation flow is when a cold-start is required.

The Data Plane should stop sending traffic to actions that are marked for removal by the Control Plane. The only exception is when an action marked for removal receives an activation in the meantime, in which case the Data Plane informs the Control Plane, which may choose to remove the "mark for removal" and keep the action running, or recycle the action with a new one.

This layer should support sequences and should preserve the default FaaS execution model, which sends only 1 request at a time to an action.

CNCF Projects to integrate with

TBD

Previous discussions

Provide support for integration with Kubernetes. One approach could be to deploy and run the components on a Kubernetes provider as we do for Vagrant, Docker, Docker-Compose, and OpenStack. ...
 
 
  
 
 
  
 
 
  
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
This message was sent by Atlassian Confluence 6.14.2

[CONF] OpenWhisk > OpenWhisk on Kubernetes

2019-04-03 Thread Dragos Dascalita Haut (Confluence)

Here's what changed:

  Goals   
 
 
 
[Gliffy Diagram: OpenWhisk-w-ClusterManager-highlevel]
 
 
  
 
 
  
 
 Try to reuse Kubernetes components in OW  
 Keep the existing developer experience and CLI  
  Current deployment   
 
 
 
[Gliffy Diagram: OpenWhisk-with-ClusterManager-simplified]
 
 
  
 
 
Originally OpenWhisk was built with the assumption that each Invoker is responsible for a single VM in the cluster. With a Cluster Manager, this premise changes, as a single Invoker could be in charge of the entire cluster. The Cluster Manager is responsible for each VM. From the Invoker's perspective, the entire cluster looks like a single pool of resources.

The current OpenWhisk components, Controller and Invoker, have problems managing the same pool of resources. For example:
 
 when 2 or more Invokers manage the same resources, conflicts may arise because Invokers don't share any model
 the load balancing logic in the Controller becomes less important, given that it doesn't matter which Invoker executes a given action: it will still execute on the same pool of resources
 the max memory limit set per Invoker is also no longer useful
CNCF based Architecture

Given these new premises, and the experience the OpenWhisk community has in building a FaaS solution, can the OW system benefit from a Cluster Manager to simplify its implementation, while keeping the same developer experience?

This document looks at some possible options to achieve this with Kubernetes and other solutions from the CNCF landscape.

Management, Control, and Data Plane

The OpenWhisk system can be decomposed into 3 distinct areas of concern, inspired by the design of network devices and systems.
 
 
 
[Gliffy Diagram: OpenWhisk-ManagementControlData-plane]
 
 
  
 
 
Management Plane

TBD

Control Plane

TBD

Data Plane

TBD

Previous discussions

Provide support for integration with Kubernetes. One approach could be to deploy and run the components on a Kubernetes provider as we do for Vagrant, Docker, Docker-Compose, and OpenStack. ...
 
 
  
 
 
  
 
 
  
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 