Re: [openstack-dev] [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr] Gap analysis: Heat as a k8s orchestrator

2016-05-29 Thread Steve Baker

On 29/05/16 08:16, Hongbin Lu wrote:



-Original Message-
From: Zane Bitter [mailto:zbit...@redhat.com]
Sent: May-27-16 6:31 PM
To: OpenStack Development Mailing List
Subject: [openstack-dev] [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr]
Gap analysis: Heat as a k8s orchestrator

<snip>

There are ways to alleviate the credential handling issue. Fi

Re: [openstack-dev] [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr] Gap analysis: Heat as a k8s orchestrator

2016-05-29 Thread Hongbin Lu


> -Original Message-
> From: Steven Dake (stdake) [mailto:std...@cisco.com]
> Sent: May-29-16 3:29 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev]
> [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr] Gap analysis: Heat as a
> k8s orchestrator
>
> <snip>

Re: [openstack-dev] [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr] Gap analysis: Heat as a k8s orchestrator

2016-05-29 Thread Steven Dake (stdake)
Quick question below.

On 5/28/16, 1:16 PM, "Hongbin Lu"  wrote:

> <snip>

Re: [openstack-dev] [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr] Gap analysis: Heat as a k8s orchestrator

2016-05-29 Thread Steven Dake (stdake)
Hongbin,

Re network coverage, he is talking about the best-practice way to deploy
an OpenStack cloud.  I have a diagram here:

http://www.gliffy.com/go/publish/10486755

I think what Zane is getting at is that magically mapping the network
diagram above into Kubernetes is not possible at present, and may never be.

Regards
-steve


On 5/28/16, 1:16 PM, "Hongbin Lu"  wrote:

> <snip>

Re: [openstack-dev] [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr] Gap analysis: Heat as a k8s orchestrator

2016-05-28 Thread Hongbin Lu


> -Original Message-
> From: Zane Bitter [mailto:zbit...@redhat.com]
> Sent: May-27-16 6:31 PM
> To: OpenStack Development Mailing List
> Subject: [openstack-dev] [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr]
> Gap analysis: Heat as a k8s orchestrator
>
> <snip>

Re: [openstack-dev] [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr] Gap analysis: Heat as a k8s orchestrator

2016-05-27 Thread Fox, Kevin M
Hi Zane,

I've been working on the k8s side of the equation right now...

See these two PR's:
https://github.com/kubernetes/kubernetes/pull/25391
https://github.com/kubernetes/kubernetes/pull/25624

I'm still hopeful these can make k8s 1.3 as experimental plugins. There is
keystone username/password auth support in 1.2 & 1.3, but it is unsuitable for
Heat usage. It also does not support authorization at all.
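For reference, the existing (unsuitable) mechanism is a static username/password in the kubeconfig, with the apiserver pointed at Keystone for validation. A rough sketch follows; the flag name is the one I believe k8s 1.2 used, and the hostname and credentials are made up:

```yaml
# kube-apiserver would be started with something like:
#   --experimental-keystone-url=https://keystone.example.com:5000/v2.0
# (flag name as of k8s 1.2; may differ in later releases)

# kubeconfig user entry -- a plain static username/password, which is
# exactly why this is unsuitable for a service like Heat, and there is
# no authorization tied to it.
users:
- name: keystone-user
  user:
    username: demo      # hypothetical
    password: secret    # hypothetical
```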

Once these patches are in, Heat, Horizon, and Higgins should be able to use
the k8s API. I believe the patches are complete enough for testing now, though,
if you want to build Kubernetes yourself.

There will also need to be a small patch to Magnum to set the right flags to
bind the deployed k8s to the local cloud, if you want to use Magnum to deploy.

After the patches are in, I was thinking about taking a stab at a heat resource 
for deployments, but if you can get to it before I can, that would be great 
too. :)

Thanks,
Kevin

From: Zane Bitter [zbit...@redhat.com]
Sent: Friday, May 27, 2016 3:30 PM
To: OpenStack Development Mailing List
Subject: [openstack-dev] [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr] Gap 
analysis: Heat as a k8s orchestrator

<snip>

[openstack-dev] [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr] Gap analysis: Heat as a k8s orchestrator

2016-05-27 Thread Zane Bitter
I spent a bit of time exploring the idea of using Heat as an external 
orchestration layer on top of Kubernetes - specifically in the case of 
TripleO controller nodes but I think it could be more generally useful 
too - but eventually came to the conclusion it doesn't work yet, and 
probably won't for a while. Nevertheless, I think it's helpful to 
document a bit to help other people avoid going down the same path, and 
also to help us focus on working toward the point where it _is_ 
possible, since I think there are other contexts where it would be 
useful too.


We tend to refer to Kubernetes as a "Container Orchestration Engine" but 
it does not actually do any orchestration, unless you count just 
starting everything at roughly the same time as 'orchestration'. Which I 
wouldn't. You generally handle any orchestration requirements between 
services within the containers themselves, possibly using external 
services like etcd to co-ordinate. (The Kubernetes project refer to this 
as "choreography", and explicitly disclaim any attempt at orchestration.)


What Kubernetes *does* do is more like an actively-managed version of 
Heat's SoftwareDeploymentGroup (emphasis on the _Group_). Brief recap: 
SoftwareDeploymentGroup is a type of ResourceGroup; you give it a map of 
resource names to server UUIDs and it creates a SoftwareDeployment for 
each server. You have to generate the list of servers somehow to give it 
(the easiest way is to obtain it from the output of another 
ResourceGroup containing the servers). If e.g. a server goes down you 
have to detect that externally, and trigger a Heat update that removes 
it from the templates, redeploys a replacement server, and regenerates 
the server list before a replacement SoftwareDeployment is created. In 
contrast, Kubernetes is running on a cluster of servers, can use rules 
to determine where to run containers, and can very quickly redeploy 
without external intervention in response to a server or container 
falling over. (It also does rolling updates, which Heat can also do 
albeit in a somewhat hacky way when it comes to SoftwareDeployments - 
which we're planning to fix.)
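
As a rough illustration, the pattern described above might look like this minimal HOT sketch (resource names, image and flavor are made up; the `refs_map` attribute assumes a Heat version where ResourceGroup exposes a name-to-ID map):

```yaml
heat_template_version: 2015-10-15

resources:
  controller_group:
    type: OS::Heat::ResourceGroup
    properties:
      count: 3
      resource_def:
        type: OS::Nova::Server
        properties:
          image: centos-7      # hypothetical image name
          flavor: m1.large     # hypothetical flavor

  some_config:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      config: |
        #!/bin/sh
        echo "configure the service here"

  some_deployments:
    type: OS::Heat::SoftwareDeploymentGroup
    properties:
      config: {get_resource: some_config}
      # Map of names to server UUIDs, derived from the group above.
      # If a server dies, something external must notice and trigger
      # a stack update so that this map gets regenerated.
      servers: {get_attr: [controller_group, refs_map]}
```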


So this seems like an opportunity: if the dependencies between services 
could be encoded in Heat templates rather than baked into the containers 
then we could use Heat as the orchestration layer following the 
dependency-based style I outlined in [1]. (TripleO is already moving in 
this direction with the way that composable-roles uses 
SoftwareDeploymentGroups.) One caveat is that fully using this style 
likely rules out for all practical purposes the current Pacemaker-based 
HA solution. We'd need to move to a lighter-weight HA solution, but I 
know that TripleO is considering that anyway.
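
Encoding a dependency between services in the template rather than in the containers might look like this fragment (hypothetical resource names; sketch only):

```yaml
  deploy_database:
    type: OS::Heat::SoftwareDeployment
    properties:
      config: {get_resource: database_config}
      server: {get_param: controller_id}

  deploy_api:
    type: OS::Heat::SoftwareDeployment
    # Heat serializes the two deployments: the API service is not
    # configured until the database deployment has completed.
    depends_on: deploy_database
    properties:
      config: {get_resource: api_config}
      server: {get_param: controller_id}
```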


What's more though, assuming this could be made to work for a Kubernetes 
cluster, a couple of remappings in the Heat environment file should get 
you an otherwise-equivalent single-node non-HA deployment basically for 
free. That's particularly exciting to me because there are definitely 
deployments of TripleO that need HA clustering and deployments that 
don't and which wouldn't want to pay the complexity cost of running 
Kubernetes when they don't make any real use of it.


So you'd have a Heat resource type for the controller cluster that maps 
to either an OS::Nova::Server or (the equivalent of) an OS::Magnum::Bay, 
and a bunch of software deployments that map to either a 
OS::Heat::SoftwareDeployment that calls (I assume) docker-compose 
directly or a Kubernetes Pod resource to be named later.
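
The remapping could be expressed with an environment file per deployment flavour, e.g. (the `Tripleo::*` aliases and the Pod resource type are hypothetical; OS::Magnum::Bay is a real Heat resource type):

```yaml
# kubernetes.env -- clustered deployment
resource_registry:
  Tripleo::ControllerCluster: OS::Magnum::Bay
  Tripleo::ServiceDeployment: My::Kubernetes::Pod   # does not exist yet
```

A second environment file mapping the same aliases to OS::Nova::Server and OS::Heat::SoftwareDeployment would give the otherwise-equivalent single-node non-HA deployment from the same templates.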


The first obstacle is that we'd need that Kubernetes Pod resource in 
Heat. Currently there is no such resource type, and the OpenStack API 
that would be expected to provide it (Magnum's /container 
endpoint) is being deprecated, so that's not a long-term solution.[2] 
Some folks from the Magnum community may or may not be working on a 
separate project (which may or may not be called Higgins) to do that. 
It'd be some time away though.


An alternative, though not a good one, would be to create a Kubernetes 
resource type in Heat that has the credentials passed in somehow. I'm 
very against that though. Heat is just not good at handling credentials 
other than Keystone ones. We haven't ever created a resource type like 
this before, except for the Docker one in /contrib that serves as a 
prime example of what *not* to do. And if it doesn't make sense to wrap 
an OpenStack API around this then IMO it isn't going to make any more 
sense to wrap a Heat resource around it.


A third option might be a SoftwareDeployment, possibly on one of the 
controller nodes themselves, that calls the k8s client. (We could create 
a software deployment hook to make this easy.) That would suffer from 
all of the same issues that TripleO currently has about having to choose 
a server on which to deploy though.
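
That third option might look something like this (sketch only; it assumes kubectl and a working kubeconfig are already present on the chosen node, and the manifest path is made up):

```yaml
  kubectl_config:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      config: |
        #!/bin/sh
        # Drive Kubernetes from inside a software deployment rather
        # than via a native Heat resource type.
        kubectl create -f /etc/tripleo/pods/keystone.yaml

  kubectl_deployment:
    type: OS::Heat::SoftwareDeployment
    properties:
      config: {get_resource: kubectl_config}
      # The same problem TripleO already has: some one server must
      # be picked to run this on.
      server: {get_param: chosen_controller_id}
```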


The secondary obstacle is networking. TripleO has some pretty 
complicated networking require