On Mon, Apr 27, 2015 at 12:28:01PM -0400, Rabi Mishra wrote:
> Hi All,
> 
> Deploying a Kubernetes (k8s) cluster on any OpenStack-based cloud for 
> container-based workloads is a standard deployment pattern. However, 
> auto-scaling this cluster based on load would require some integration 
> between k8s and OpenStack components. While looking at the option of 
> leveraging Heat ASG to achieve autoscaling, I came across a few 
> requirements that the list can discuss and arrive at the best possible 
> solution.
> 
> A typical k8s deployment scenario on OpenStack would be as below.
> 
> - Master (single VM)
> - Minions/Nodes (AutoScalingGroup)
> 
> AutoScaling of the cluster would involve both scaling the minions/nodes 
> and scaling the Pods (ReplicationControllers).
> 
> 1. Scaling Nodes/Minions:
> 
> We already have utilization stats collected at the hypervisor level, as 
> ceilometer compute agent polls the local libvirt daemon to acquire 
> performance data for the local instances/nodes.

I really doubt that those metrics are useful enough to trigger a scaling
operation. My suspicion is based on two assumptions: 1) autoscaling
requests should come from the user application or service, not from the
control plane, since the application knows best whether scaling is
needed; 2) hypervisor-level metrics may be misleading in some cases. For
example, they cannot give an accurate CPU utilization number in the case
of CPU overcommit, which is a common practice.

> Also, the Kubelet (running on each node) collects cAdvisor stats. However, 
> cAdvisor stats are not fed back to the scheduler at present, and the 
> scheduler uses a simple round-robin method for scheduling.

It looks like a multi-layer resource management problem which needs a
holistic design. I'm not quite sure whether scheduling at the container
layer alone can help improve resource utilization.

> Req 1: We would need a way to push stats from the kubelet/cAdvisor to 
> ceilometer, directly or via the master (using heapster). Alarms based on 
> these stats can then be used to scale the ASG up/down.
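For context, a minimal HOT sketch of how the alarm-driven wiring for Req 1 could look. The resource names, meter name, image/flavor, and thresholds are all illustrative placeholders, not a tested template:

```yaml
heat_template_version: 2014-10-16

resources:
  minion_group:
    type: OS::Heat::AutoScalingGroup
    properties:
      min_size: 1
      max_size: 10
      resource:
        type: OS::Nova::Server      # real minion definition elided
        properties:
          image: fedora-k8s-minion  # placeholder image
          flavor: m1.medium

  scale_up_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: minion_group}
      scaling_adjustment: 1
      cooldown: 300

  cpu_alarm_high:
    type: OS::Ceilometer::Alarm
    properties:
      meter_name: cpu_util          # would be the pushed k8s meter instead
      statistic: avg
      period: 600
      evaluation_periods: 1
      threshold: 80
      comparison_operator: gt
      alarm_actions:
        - {get_attr: [scale_up_policy, alarm_url]}
```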

To send a sample to ceilometer for triggering autoscaling, we will need
some user credentials to authenticate with keystone (even with trusts).
We need to pass the project-id in and out so that ceilometer will know
the correct scope for evaluation. We also need a standard way to tag
samples with the stack ID and maybe also the ASG ID. I'd love to see
this done transparently, i.e. without the matching_metadata or query
confusion.
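As a sketch of what that tagging could look like, the snippet below builds the body of a sample an agent might send via python-ceilometerclient's `samples.create`. The meter name and the `user_metadata` keys (`stack`, `asg`) are assumptions, not an agreed convention:

```python
# Sketch: build a ceilometer sample body tagged with the Heat stack
# and ASG IDs so alarm evaluation can be scoped without manual
# matching_metadata gymnastics. Key names here are illustrative.

def build_sample(meter, value, resource_id, project_id,
                 stack_id, asg_id, unit='%'):
    """Return a dict shaped like a ceilometer v2 sample."""
    return {
        'counter_name': meter,          # e.g. 'k8s.node.cpu_util'
        'counter_type': 'gauge',
        'counter_unit': unit,
        'counter_volume': value,
        'resource_id': resource_id,     # nova instance id of the minion
        'project_id': project_id,       # scopes the alarm evaluation
        'resource_metadata': {
            'user_metadata': {
                'stack': stack_id,      # what the alarm would match on
                'asg': asg_id,
            },
        },
    }

sample = build_sample('k8s.node.cpu_util', 87.5,
                      'instance-uuid', 'project-uuid',
                      'stack-uuid', 'asg-uuid')
# With credentials obtained via a keystone trust, the agent could then
# send it, roughly:
#   ceilometerclient.client.get_client('2', **creds).samples.create(**sample)
```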

> There is an existing blueprint[1] for an inspector implementation for the 
> docker hypervisor (nova-docker). However, we would probably require an 
> agent running on the nodes or the master to send the cAdvisor or heapster 
> stats to ceilometer. I've seen some discussions on the possibility of 
> leveraging keystone trusts with the ceilometer client.

An agent is needed, definitely.

> Req 2: The AutoScaling group is expected to notify the master that a new 
> node has been added/removed. Before removing a node, the master/scheduler 
> has to mark the node as unschedulable.

A little bit confused here ... are we scaling the containers or the
nodes or both?
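For reference, marking a node unschedulable amounts to patching its spec through the apiserver. A minimal sketch that only builds the request (the apiserver endpoint, node name, and even the `/api/v1` path are placeholders that depend on the k8s release; sending the request is left to the agent):

```python
import json

def cordon_request(apiserver, node_name):
    """Build the PATCH that marks a k8s node unschedulable.

    Returns (url, headers, body). The merge-patch content type tells
    the apiserver to merge {'spec': {'unschedulable': True}} into the
    existing node object rather than replace it.
    """
    url = '%s/api/v1/nodes/%s' % (apiserver, node_name)
    headers = {'Content-Type': 'application/merge-patch+json'}
    body = json.dumps({'spec': {'unschedulable': True}})
    return url, headers, body

url, headers, body = cordon_request('http://master:8080', 'minion-3')
```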

> Req 3: Notify containers/pods that the node would be removed, so that they 
> can stop accepting traffic and persist data. It would also require a 
> cooldown period before the node removal.

There have been some discussions on sending messages, but so far I don't
think there is a conclusion on a generic solution.

Just my $0.02.

BTW, we have been looking into similar problems in the Senlin project.

Regards,
  Qiming

> Both requirements 2 and 3 would probably require generating scaling event 
> notifications/signals for the master and containers to consume, and 
> probably some ASG lifecycle hooks.
> 
> 
> Req 4: In case of too many 'pending' pods to be scheduled, the scheduler 
> would signal the ASG to scale up. This is similar to Req 1.
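A sketch of what Req 4 could look like on the scheduler side: compare the pending-pod backlog against a threshold and, if exceeded, POST to the pre-signed webhook a Heat scaling policy exposes as its `alarm_url` attribute. The threshold value and the `requests`/`alarm_url` usage in the comment are illustrative assumptions:

```python
def should_scale_up(pending_pods, threshold=10):
    """Decide whether the backlog of unscheduled pods warrants a new node.

    The threshold is illustrative; a real policy might also weigh how
    long the pods have been pending, to avoid reacting to a brief spike.
    """
    return pending_pods > threshold

# Heat generates a pre-signed webhook for each scaling policy (its
# alarm_url attribute); an empty POST to it triggers the adjustment,
# no extra credentials needed, e.g.:
#   if should_scale_up(len(pending)):
#       requests.post(alarm_url)   # 'requests' and alarm_url assumed
```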
> 
> 
> 2. Scaling Pods
> 
> Currently, manual scaling of pods is possible by resizing 
> ReplicationControllers. The k8s community is working on an abstraction, 
> AutoScaler[2], on top of ReplicationController (RC) that provides 
> intention/rule-based autoscaling. There would be a requirement to collect 
> cAdvisor/Heapster stats to signal the AutoScaler too. Probably this is 
> beyond the scope of OpenStack.
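For what it's worth, utilization-driven pod scaling typically reduces to a simple ratio rule over the RC's replica count; a sketch of that generic computation (this is the common formulation, not necessarily what the AutoScaler proposal in [2] will adopt):

```python
import math

def desired_replicas(current, observed_util, target_util):
    """Scale an RC so that per-pod utilization approaches the target.

    E.g. 4 replicas observed at 90% CPU with a 60% target gives
    ceil(4 * 0.9 / 0.6) = 6 replicas. A real controller would also
    clamp this to min/max bounds and apply a cooldown between resizes.
    """
    return int(math.ceil(current * observed_util / float(target_util)))
```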
> 
> Any thoughts and ideas on how to realize this use-case would be appreciated.
> 
> 
> [1] 
> https://review.openstack.org/gitweb?p=openstack%2Fceilometer-specs.git;a=commitdiff;h=6ea7026b754563e18014a32e16ad954c86bd8d6b
> [2] 
> https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/proposals/autoscaling.md
> 
> Regards,
> Rabi Mishra
> 
> 
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: [email protected]?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

