Re: [openstack-dev] [heat] Kubernetes AutoScaling with Heat AutoScalingGroup and Ceilometer

2015-04-28 Thread Qiming Teng
On Mon, Apr 27, 2015 at 12:28:01PM -0400, Rabi Mishra wrote:
 Hi All,
 
 Deploying a Kubernetes (k8s) cluster on any OpenStack-based cloud for
 container-based workloads is a standard deployment pattern. However,
 auto-scaling this cluster based on load would require some integration between
 k8s and OpenStack components. While looking at the option of leveraging a Heat
 ASG to achieve autoscaling, I came across a few requirements that the list can
 discuss to arrive at the best possible solution.
 
 A typical k8s deployment scenario on OpenStack would be as below.
 
 - Master (single VM)
 - Minions/Nodes (AutoScalingGroup)
 
 AutoScaling of the cluster would involve both scaling of minions/nodes and 
 scaling Pods(ReplicationControllers). 
 
 1. Scaling Nodes/Minions:
 
 We already have utilization stats collected at the hypervisor level, as 
 ceilometer compute agent polls the local libvirt daemon to acquire 
 performance data for the local instances/nodes.

I really doubt whether those metrics are useful for triggering a scaling
operation. My suspicion is based on two assumptions: 1) autoscaling
requests should come from the user application or service, not from the
control plane, since the application knows best whether scaling is needed;
2) hypervisor-level metrics may be misleading in some cases. For
example, they cannot give an accurate CPU utilization number in the case
of CPU overcommit, which is a common practice.

 Also, Kubelet (running on the node) collects the cAdvisor stats. However, 
 cAdvisor stats are not fed back to the scheduler at present and scheduler 
 uses a simple round-robin method for scheduling.

It looks like a multi-layer resource management problem which needs a
holistic design. I'm not quite sure whether scheduling at the container
layer alone can improve resource utilization.

 Req 1: We would need a way to push stats from the kubelet/cAdvisor to 
 ceilometer directly or via the master(using heapster). Alarms based on these 
 stats can then be used to scale up/down the ASG. 

To send a sample to ceilometer for triggering autoscaling, we will need
some user credentials to authenticate with keystone (even with trusts).
We need to pass the project-id in and out so that ceilometer will know
the correct scope for evaluation. We also need a standard way to tag
samples with the stack ID and maybe also the ASG ID. I'd love to see
this done transparently, i.e. no matching_metadata or query confusions.
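
To make this concrete, here is a minimal sketch of such a tagged sample push,
assuming python-ceilometerclient, trust-scoped credentials available to the
agent, and purely illustrative meter/metadata names (not an agreed design):

# Hedged sketch: push a cAdvisor-derived CPU reading to ceilometer, tagged so
# that an alarm scoped to one Heat stack/ASG could evaluate it. How the
# credentials, project id, stack id and ASG id reach the node is exactly the
# open question above.
from ceilometerclient import client as ceilo_client

ceilo = ceilo_client.get_client(
    '2',
    os_username='k8s-agent',                # assumed trust/service user
    os_password='secret',
    os_tenant_name='k8s-project',           # project that owns the stack
    os_auth_url='http://keystone:5000/v2.0',
)

ceilo.samples.create(
    counter_name='k8s.node.cpu_util',       # illustrative meter name
    counter_type='gauge',
    counter_unit='%',
    counter_volume=73.5,                    # value read from cAdvisor/heapster
    resource_id='minion-3',                 # node / nova instance identity
    resource_metadata={
        'stack_id': 'STACK_UUID',           # would have to be injected at boot
        'asg_id': 'ASG_RESOURCE_ID',        # so the alarm query can match it
    },
)

The alarm evaluating these samples would then have to match on that metadata,
which is exactly the matching_metadata/query awkwardness mentioned above.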

 There is an existing blueprint[1] for an inspector implementation for the docker
 hypervisor (nova-docker). However, we would probably require an agent running
 on the nodes or the master to send the cAdvisor or heapster stats to ceilometer.
 I've seen some discussions on the possibility of leveraging keystone trusts with
 the ceilometer client. 

An agent is needed, definitely.

 Req 2: The AutoScaling Group is expected to notify the master that a new node
 has been added/removed. Before removing a node, the master/scheduler has to
 mark the node as unschedulable. 

A little bit confused here ... are we scaling the containers or the
nodes or both?

 Req 3: Notify containers/pods that the node will be removed, so that they can
 stop accepting traffic and persist data. It would also require a cooldown
 period before the node removal. 

There have been some discussions on sending messages, but so far I don't
think there is a conclusion on the generic solution.

Just my $0.02.

BTW, we have been looking into similar problems in the Senlin project.

Regards,
  Qiming

 Both requirements 2 and 3 would probably require generating scaling event
 notifications/signals for the master and containers to consume, and probably
 some ASG lifecycle hooks.  
 
 
 Req 4: If there are too many 'pending' pods waiting to be scheduled, the
 scheduler would signal the ASG to scale up. This is similar to Req 1. 
 
 
 2. Scaling Pods
 
 Currently, manual scaling of pods is possible by resizing
 ReplicationControllers. The k8s community is working on an abstraction,
 AutoScaler[2], on top of ReplicationController (RC) that provides
 intention/rule-based autoscaling. There would also be a requirement to collect
 cAdvisor/Heapster stats to signal the AutoScaler. This is probably beyond
 the scope of OpenStack.
 
 Any thoughts and ideas on how to realize this use-case would be appreciated.
 
 
 [1] 
 https://review.openstack.org/gitweb?p=openstack%2Fceilometer-specs.git;a=commitdiff;h=6ea7026b754563e18014a32e16ad954c86bd8d6b
 [2] 
 https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/proposals/autoscaling.md
 
 Regards,
 Rabi Mishra
 
 
 __
 OpenStack Development Mailing List (not for usage questions)
 Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 



Re: [openstack-dev] [heat] Kubernetes AutoScaling with Heat AutoScalingGroup and Ceilometer

2015-04-28 Thread Georgy Okrokvertskhov
You can take a look at the Murano Kubernetes package. There is no autoscaling
out of the box, but it would be quite trivial to add a new action for that,
as there are functions to register new etcd and Kubernetes nodes on the master,
as well as a function to add a new VM.

Here is an example of a scaleUp action:
https://github.com/gokrokvertskhov/murano-app-incubator/blob/monitoring-ha/io.murano.apps.java.HelloWorldCluster/Classes/HelloWorldCluster.murano#L93

Here is Kubernetes scaleUp action:
https://github.com/openstack/murano-apps/blob/master/Docker/Kubernetes/KubernetesCluster/package/Classes/KubernetesCluster.yaml#L441

And here is the place where the Kubernetes master is updated with the new node info:
https://github.com/openstack/murano-apps/blob/master/Docker/Kubernetes/KubernetesCluster/package/Classes/KubernetesMinionNode.yaml#L90

Along the way, as you can see, cAdvisor is set up on each new node too.

Thanks
Gosha


On Tue, Apr 28, 2015 at 8:52 AM, Rabi Mishra ramis...@redhat.com wrote:


 - Original Message -
  On Mon, Apr 27, 2015 at 12:28:01PM -0400, Rabi Mishra wrote:
   Hi All,
  
   Deploying a Kubernetes (k8s) cluster on any OpenStack-based cloud for
   container-based workloads is a standard deployment pattern. However,
   auto-scaling this cluster based on load would require some integration
   between k8s and OpenStack components. While looking at the option of
   leveraging a Heat ASG to achieve autoscaling, I came across a few
   requirements that the list can discuss to arrive at the best possible
   solution.
  
   A typical k8s deployment scenario on OpenStack would be as below.
  
   - Master (single VM)
   - Minions/Nodes (AutoScalingGroup)
  
   AutoScaling of the cluster would involve both scaling of minions/nodes
 and
   scaling Pods(ReplicationControllers).
  
   1. Scaling Nodes/Minions:
  
   We already have utilization stats collected at the hypervisor level, as
   ceilometer compute agent polls the local libvirt daemon to acquire
   performance data for the local instances/nodes.
 
  I really doubt whether those metrics are useful for triggering a scaling
  operation. My suspicion is based on two assumptions: 1) autoscaling
  requests should come from the user application or service, not from the
  control plane, since the application knows best whether scaling is needed;
  2) hypervisor-level metrics may be misleading in some cases. For
  example, they cannot give an accurate CPU utilization number in the case
  of CPU overcommit, which is a common practice.

 I agree that correct utilization statistics are complex with virtual
 infrastructure.
 However, I think physical+hypervisor metrics (collected by the compute agent)
 should be a good starting point.

   Also, Kubelet (running on the node) collects the cAdvisor stats.
 However,
   cAdvisor stats are not fed back to the scheduler at present and
 scheduler
   uses a simple round-robin method for scheduling.
 
  It looks like a multi-layer resource management problem which needs a
  holistic design. I'm not quite sure whether scheduling at the container
  layer alone can improve resource utilization.

 The k8s scheduler is going to improve over time to use the cAdvisor/heapster
 metrics for better scheduling. IMO, we should leave that to k8s to handle.

 My point is about getting those metrics to ceilometer, either from the nodes
 or from the scheduler/master.

   Req 1: We would need a way to push stats from the kubelet/cAdvisor to
   ceilometer directly or via the master(using heapster). Alarms based on
   these stats can then be used to scale up/down the ASG.
 
  To send a sample to ceilometer for triggering autoscaling, we will need
  some user credentials to authenticate with keystone (even with trusts).
  We need to pass the project-id in and out so that ceilometer will know
  the correct scope for evaluation. We also need a standard way to tag
  samples with the stack ID and maybe also the ASG ID. I'd love to see
  this done transparently, i.e. no matching_metadata or query confusions.
 
   There is an existing blueprint[1] for an inspector implementation for the
   docker hypervisor (nova-docker). However, we would probably require an
   agent running on the nodes or the master to send the cAdvisor or heapster
   stats to ceilometer. I've seen some discussions on the possibility of
   leveraging keystone trusts with the ceilometer client.
 
  An agent is needed, definitely.
 
   Req 2: The AutoScaling Group is expected to notify the master that a new
   node has been added/removed. Before removing a node, the master/scheduler
   has to mark the node as unschedulable.
 
  A little bit confused here ... are we scaling the containers or the
  nodes or both?

 We would only be focusing on the nodes. However, adding/removing nodes
 without the k8s master/scheduler knowing about it (so that it can schedule
 pods on them or mark them unschedulable) would be useless.

   Req 3: Notify containers/pods that the node would be removed for them
 to
   stop accepting any traffic, 

Re: [openstack-dev] [heat] Kubernetes AutoScaling with Heat AutoScalingGroup and Ceilometer

2015-04-28 Thread Rabi Mishra

- Original Message -
 On Mon, Apr 27, 2015 at 12:28:01PM -0400, Rabi Mishra wrote:
  Hi All,
  
  Deploying a Kubernetes (k8s) cluster on any OpenStack-based cloud for
  container-based workloads is a standard deployment pattern. However,
  auto-scaling this cluster based on load would require some integration
  between k8s and OpenStack components. While looking at the option of
  leveraging a Heat ASG to achieve autoscaling, I came across a few requirements
  that the list can discuss to arrive at the best possible solution.
  
  A typical k8s deployment scenario on OpenStack would be as below.
  
  - Master (single VM)
  - Minions/Nodes (AutoScalingGroup)
  
  AutoScaling of the cluster would involve both scaling of minions/nodes and
  scaling Pods(ReplicationControllers).
  
  1. Scaling Nodes/Minions:
  
  We already have utilization stats collected at the hypervisor level, as
  ceilometer compute agent polls the local libvirt daemon to acquire
  performance data for the local instances/nodes.
 
 I really doubt whether those metrics are useful for triggering a scaling
 operation. My suspicion is based on two assumptions: 1) autoscaling
 requests should come from the user application or service, not from the
 control plane, since the application knows best whether scaling is needed;
 2) hypervisor-level metrics may be misleading in some cases. For
 example, they cannot give an accurate CPU utilization number in the case
 of CPU overcommit, which is a common practice.

I agree that correct utilization statistics are complex with virtual
infrastructure.
However, I think physical+hypervisor metrics (collected by the compute agent)
should be a good starting point.
 
  Also, Kubelet (running on the node) collects the cAdvisor stats. However,
  cAdvisor stats are not fed back to the scheduler at present and scheduler
  uses a simple round-robin method for scheduling.
 
 It looks like a multi-layer resource management problem which needs a
 holistic design. I'm not quite sure whether scheduling at the container
 layer alone can improve resource utilization.

The k8s scheduler is going to improve over time to use the cAdvisor/heapster
metrics for better scheduling. IMO, we should leave that to k8s to handle.

My point is about getting those metrics to ceilometer, either from the nodes or
from the scheduler/master.

  Req 1: We would need a way to push stats from the kubelet/cAdvisor to
  ceilometer directly or via the master(using heapster). Alarms based on
  these stats can then be used to scale up/down the ASG.
 
 To send a sample to ceilometer for triggering autoscaling, we will need
 some user credentials to authenticate with keystone (even with trusts).
 We need to pass the project-id in and out so that ceilometer will know
 the correct scope for evaluation. We also need a standard way to tag
 samples with the stack ID and maybe also the ASG ID. I'd love to see
 this done transparently, i.e. no matching_metadata or query confusions.
 
  There is an existing blueprint[1] for an inspector implementation for the
  docker hypervisor (nova-docker). However, we would probably require an
  agent running on the nodes or the master to send the cAdvisor or heapster
  stats to ceilometer. I've seen some discussions on the possibility of
  leveraging keystone trusts with the ceilometer client.
 
 An agent is needed, definitely.
 
  Req 2: The AutoScaling Group is expected to notify the master that a new node
  has been added/removed. Before removing a node, the master/scheduler has to
  mark the node as unschedulable.
 
 A little bit confused here ... are we scaling the containers or the
 nodes or both?

We would only be focusing on the nodes. However, adding/removing nodes without
the k8s master/scheduler knowing about it (so that it can schedule pods on them
or mark them unschedulable) would be useless.

  Req 3: Notify containers/pods that the node will be removed, so that they can
  stop accepting traffic and persist data. It would also require a cooldown
  period before the node removal.
 
 There have been some discussions on sending messages, but so far I don't
 think there is a conclusion on the generic solution.
 
 Just my $0.02.

Thanks Qiming.

 BTW, we have been looking into similar problems in the Senlin project.

Great. We can probably discuss these during the Summit? I assume there is 
already a session
on Senlin planned, right?

 
 Regards,
   Qiming
 
  Both requirements 2 and 3 would probably require generating scaling event
  notifications/signals for the master and containers to consume, and probably
  some ASG lifecycle hooks.
  
  
  Req 4: If there are too many 'pending' pods waiting to be scheduled, the
  scheduler would signal the ASG to scale up. This is similar to Req 1.
  
  
  2. Scaling Pods
  
  Currently, manual scaling of pods is possible by resizing
  ReplicationControllers. The k8s community is working on an abstraction,
  AutoScaler[2], on top of ReplicationController (RC) that provides
  intention/rule-based autoscaling. There would be a requirement to 

[openstack-dev] [heat] Kubernetes AutoScaling with Heat AutoScalingGroup and Ceilometer

2015-04-27 Thread Rabi Mishra
Hi All,

Deploying a Kubernetes (k8s) cluster on any OpenStack-based cloud for
container-based workloads is a standard deployment pattern. However,
auto-scaling this cluster based on load would require some integration between
k8s and OpenStack components. While looking at the option of leveraging a Heat
ASG to achieve autoscaling, I came across a few requirements that the list can
discuss to arrive at the best possible solution.

A typical k8s deployment scenario on OpenStack would be as below.

- Master (single VM)
- Minions/Nodes (AutoScalingGroup)

AutoScaling of the cluster would involve both scaling of minions/nodes and 
scaling Pods(ReplicationControllers). 

1. Scaling Nodes/Minions:

We already have utilization stats collected at the hypervisor level, as 
ceilometer compute agent polls the local libvirt daemon to acquire performance 
data for the local instances/nodes. Also, Kubelet (running on the node) 
collects the cAdvisor stats. However, cAdvisor stats are not fed back to the 
scheduler at present and scheduler uses a simple round-robin method for 
scheduling.

Req 1: We would need a way to push stats from the kubelet/cAdvisor to 
ceilometer directly or via the master(using heapster). Alarms based on these 
stats can then be used to scale up/down the ASG. 
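
A hedged sketch of how such an alarm could be wired to the scaling policy with
what exists today, assuming python-ceilometerclient; the meter name, thresholds
and the metadata query are illustrative only:

# Hedged sketch: threshold alarm on a hypothetical k8s node meter, pointing at
# the pre-signed alarm_url of an OS::Heat::ScalingPolicy.
from ceilometerclient import client as ceilo_client

ceilo = ceilo_client.get_client(
    '2', os_username='demo', os_password='secret',
    os_tenant_name='demo', os_auth_url='http://keystone:5000/v2.0')

scale_up_url = 'http://heat-api:8000/v1/signal/...'  # ScalingPolicy alarm_url

ceilo.alarms.create(
    name='k8s-minions-cpu-high',
    type='threshold',
    threshold_rule={
        'meter_name': 'k8s.node.cpu_util',       # illustrative meter
        'statistic': 'avg',
        'period': 60,
        'evaluation_periods': 2,
        'comparison_operator': 'gt',
        'threshold': 80.0,
        # scoping the alarm to one ASG via sample metadata is the awkward part
        'query': [{'field': 'metadata.asg_id', 'op': 'eq', 'value': 'ASG_ID'}],
    },
    alarm_actions=[scale_up_url],
    repeat_actions=True,
)

In a Heat template the equivalent would normally be an OS::Ceilometer::Alarm
resource next to the OS::Heat::ScalingPolicy rather than a client call.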

There is an existing blueprint[1] for an inspector implementation for the docker
hypervisor (nova-docker). However, we would probably require an agent running on
the nodes or the master to send the cAdvisor or heapster stats to ceilometer.
I've seen some discussions on the possibility of leveraging keystone trusts with
the ceilometer client. 

Req 2: The AutoScaling Group is expected to notify the master that a new node
has been added/removed. Before removing a node, the master/scheduler has to
mark the node as unschedulable. 
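
A hedged sketch of the "mark the node unschedulable" step, assuming direct
access to a v1-style Kubernetes API on the master (endpoint, API version and
auth are illustrative):

# Hedged sketch: set spec.unschedulable on a node so the k8s scheduler stops
# placing new pods there, before the ASG actually deletes the VM.
import json
import requests

K8S_API = 'http://k8s-master:8080/api/v1'       # assumed insecure local port

def mark_unschedulable(node_name):
    resp = requests.patch(
        '%s/nodes/%s' % (K8S_API, node_name),
        headers={'Content-Type': 'application/merge-patch+json'},
        data=json.dumps({'spec': {'unschedulable': True}}),
    )
    resp.raise_for_status()
    return resp.json()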

Req 3: Notify containers/pods that the node will be removed, so that they can
stop accepting traffic and persist data. It would also require a cooldown period
before the node removal. 
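
There is no clean "notify the pods" primitive; one hedged approximation,
continuing the sketch above, is to delete the pods bound to the node once it is
unschedulable so their ReplicationControllers recreate them elsewhere, and only
remove the VM after the cooldown:

# Hedged sketch: evict pods from a node that is about to be removed; their
# ReplicationControllers will reschedule replacements on other nodes. Graceful
# in-container shutdown is still up to the application itself.
import requests

K8S_API = 'http://k8s-master:8080/api/v1'       # same assumed endpoint

def drain_node(node_name):
    pods = requests.get(
        '%s/pods' % K8S_API,
        params={'fieldSelector': 'spec.nodeName=%s' % node_name},
    ).json().get('items', [])
    for pod in pods:
        meta = pod['metadata']
        requests.delete('%s/namespaces/%s/pods/%s'
                        % (K8S_API, meta['namespace'], meta['name']))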

Both requirements 2 and 3 would probably require generating scaling event
notifications/signals for the master and containers to consume, and probably
some ASG lifecycle hooks.  


Req 4: If there are too many 'pending' pods waiting to be scheduled, the
scheduler would signal the ASG to scale up. This is similar to Req 1. 
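
A hedged sketch of what that signal could look like with what exists today:
poll for pending pods and, above some threshold, POST to the scale-up policy's
pre-signed alarm_url (threshold, endpoints and the URL are illustrative):

# Hedged sketch: scale the ASG up when too many pods are stuck in Pending.
# The webhook comes from the alarm_url attribute of an OS::Heat::ScalingPolicy.
import requests

K8S_API = 'http://k8s-master:8080/api/v1'
SCALE_UP_URL = 'https://heat-api:8000/v1/signal/...'   # pre-signed webhook
PENDING_LIMIT = 5

def check_and_scale():
    pending = requests.get(
        '%s/pods' % K8S_API,
        params={'fieldSelector': 'status.phase=Pending'},
    ).json().get('items', [])
    if len(pending) > PENDING_LIMIT:
        requests.post(SCALE_UP_URL)      # an empty POST triggers the policy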
 

2. Scaling Pods

Currently, manual scaling of pods is possible by resizing ReplicationControllers.
The k8s community is working on an abstraction, AutoScaler[2], on top of
ReplicationController (RC) that provides intention/rule-based autoscaling. There
would also be a requirement to collect cAdvisor/Heapster stats to signal the
AutoScaler. This is probably beyond the scope of OpenStack.

Any thoughts and ideas on how to realize this use-case would be appreciated.


[1] 
https://review.openstack.org/gitweb?p=openstack%2Fceilometer-specs.git;a=commitdiff;h=6ea7026b754563e18014a32e16ad954c86bd8d6b
[2] 
https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/proposals/autoscaling.md

Regards,
Rabi Mishra


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev