Re: [openstack-dev] [magnum][keystone] clusters, trustees and projects

2018-03-01 Thread Ricardo Rocha
Hi.

I had added an item for this:
https://bugs.launchpad.net/magnum/+bug/1752433

after the last reply and a bit of searching around.

It's not urgent, but we have already had a couple of cases in our deployment.
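
To make it concrete, something along these lines is what we are after (a rough
sketch, assuming the coe ca rotate command from the OSC plugin; the ownership
transfer action Spyros mentions below does not exist yet, so that part is
purely hypothetical):

  # rotate the cluster CA (and, with the proposed change, the trust as well)
  $ openstack coe ca rotate mycluster

  # hypothetical, not implemented today: hand the cluster over to another user
  $ openstack coe cluster update mycluster replace owner=<new-user-id>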

Cheers,
Ricardo

On Thu, Mar 1, 2018 at 3:44 PM, Spyros Trigazis <strig...@gmail.com> wrote:
> Hello,
>
> After discussion with the keystone team at the above session, keystone
> will not provide a way to transfer trusts or application credentials,
> since that doesn't address the above problem (a member who leaves the team
> can still auth with keystone if they hold the trust/app-creds).
>
> In magnum we need a way for admins and the cluster owner to rotate the
> trust or app-creds and certificates.
>
> We can leverage the existing rotate_ca API to rotate the CA and, at the same
> time, the trust. Since this API is designed only to rotate the CA, we can add
> a cluster action to transfer ownership of the cluster. This action should only
> be allowed for the admin or the current owner of a given cluster.
>
> At the same time, the trust created by heat for every stack suffers from the
> same problem; we should check with the heat team what their plan is.
>
> Cheers,
> Spyros
>
> On 27 February 2018 at 20:53, Ricardo Rocha <rocha.po...@gmail.com> wrote:
>>
>> Hi Lance.
>>
>> On Mon, Feb 26, 2018 at 4:45 PM, Lance Bragstad <lbrags...@gmail.com>
>> wrote:
>> >
>> >
>> > On 02/26/2018 10:17 AM, Ricardo Rocha wrote:
>> >> Hi.
>> >>
>> >> We have an issue on the way Magnum uses keystone trusts.
>> >>
>> >> Magnum clusters are created in a given project using HEAT, and require
>> >> a trust token to communicate back with OpenStack services -  there is
>> >> also integration with Kubernetes via a cloud provider.
>> >>
>> >> This trust belongs to a given user, not the project, so whenever we
>> >> disable the user's account - for example when a user leaves the
>> >> organization - the cluster becomes unhealthy as the trust is no longer
>> >> valid. Given the token is available in the cluster nodes, accessible
>> >> by users, a trust linked to a service account is also not a viable
>> >> solution.
>> >>
>> >> Is there an existing alternative for this kind of use case? I guess
>> >> what we might need is a trust that is linked to the project.
>> > This was proposed in the original application credential specification
>> > [0] [1]. The problem is that you're sharing an authentication mechanism
>> > with multiple people when you associate it to the life cycle of a
>> > project. When a user is deleted or removed from the project, nothing
>> > would stop them from accessing OpenStack APIs if the application
>> > credential or trust isn't rotated out. Even if the credential or trust
>> > were scoped to the project's life cycle, it would need to be rotated out
>> > and replaced when users come and go for the same reason. So it would
>> > still be associated to the user life cycle, just indirectly. Otherwise
>> > you're allowing unauthorized access to something that should be
>> > protected.
>> >
>> > If you're at the PTG - we will be having a session on application
>> > credentials tomorrow (Tuesday) afternoon [2] in the identity-integration
>> > room [3].
>>
>> Thanks for the reply, i now understand the issue.
>>
>> I'm not at the PTG. Had a look at the etherpad but it seems app
>> credentials will have a similar lifecycle so not suitable for the use
>> case above - for the same reasons you mention.
>>
>> I wonder what's the alternative to achieve what we need in Magnum?
>>
>> Cheers,
>>   Ricardo
>>
>> > [0] https://review.openstack.org/#/c/450415/
>> > [1] https://review.openstack.org/#/c/512505/
>> > [2] https://etherpad.openstack.org/p/application-credentials-rocky-ptg
>> > [3] http://ptg.openstack.org/ptg.html
>> >>
>> >> I believe the same issue would be there using application credentials,
>> >> as the ownership is similar.
>> >>
>> >> Cheers,
>> >>   Ricardo
>> >>
>> >>
>> >> __
>> >> OpenStack Development Mailing List (not for usage questions)
>> >> Unsubscribe:
>> >> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [magnum][keystone] clusters, trustees and projects

2018-02-27 Thread Ricardo Rocha
Hi Lance.

On Mon, Feb 26, 2018 at 4:45 PM, Lance Bragstad <lbrags...@gmail.com> wrote:
>
>
> On 02/26/2018 10:17 AM, Ricardo Rocha wrote:
>> Hi.
>>
>> We have an issue on the way Magnum uses keystone trusts.
>>
>> Magnum clusters are created in a given project using HEAT, and require
>> a trust token to communicate back with OpenStack services -  there is
>> also integration with Kubernetes via a cloud provider.
>>
>> This trust belongs to a given user, not the project, so whenever we
>> disable the user's account - for example when a user leaves the
>> organization - the cluster becomes unhealthy as the trust is no longer
>> valid. Given the token is available in the cluster nodes, accessible
>> by users, a trust linked to a service account is also not a viable
>> solution.
>>
>> Is there an existing alternative for this kind of use case? I guess
>> what we might need is a trust that is linked to the project.
> This was proposed in the original application credential specification
> [0] [1]. The problem is that you're sharing an authentication mechanism
> with multiple people when you associate it to the life cycle of a
> project. When a user is deleted or removed from the project, nothing
> would stop them from accessing OpenStack APIs if the application
> credential or trust isn't rotated out. Even if the credential or trust
> were scoped to the project's life cycle, it would need to be rotated out
> and replaced when users come and go for the same reason. So it would
> still be associated to the user life cycle, just indirectly. Otherwise
> you're allowing unauthorized access to something that should be protected.
>
> If you're at the PTG - we will be having a session on application
> credentials tomorrow (Tuesday) afternoon [2] in the identity-integration
> room [3].

Thanks for the reply, I now understand the issue.

I'm not at the PTG. I had a look at the etherpad, but it seems app
credentials will have a similar lifecycle, so they are not suitable for the
use case above - for the same reasons you mention.

I wonder what the alternative is to achieve what we need in Magnum?
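
For completeness, this is roughly what the lifecycle looks like with
application credentials (a sketch assuming the OSC commands that come with the
spec; names and role are illustrative) - the credential belongs to the user who
creates it, which is exactly the limitation above:

  # created by (and owned by) the calling user, scoped to the current project
  $ openstack application credential create --role member magnum-mycluster

  # disabling the user invalidates it; rotation means delete + recreate
  $ openstack application credential delete magnum-mycluster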

Cheers,
  Ricardo

> [0] https://review.openstack.org/#/c/450415/
> [1] https://review.openstack.org/#/c/512505/
> [2] https://etherpad.openstack.org/p/application-credentials-rocky-ptg
> [3] http://ptg.openstack.org/ptg.html
>>
>> I believe the same issue would be there using application credentials,
>> as the ownership is similar.
>>
>> Cheers,
>>   Ricardo
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [magnum][keystone] clusters, trustees and projects

2018-02-26 Thread Ricardo Rocha
Hi.

We have an issue on the way Magnum uses keystone trusts.

Magnum clusters are created in a given project using Heat, and require
a trust token to communicate back with OpenStack services - there is
also integration with Kubernetes via a cloud provider.

This trust belongs to a given user, not the project, so whenever we
disable the user's account - for example when a user leaves the
organization - the cluster becomes unhealthy as the trust is no longer
valid. Given the token is available on the cluster nodes and accessible
to users, a trust linked to a service account is not a viable
solution either.

Is there an existing alternative for this kind of use case? I guess
what we might need is a trust that is linked to the project.
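
For reference, a rough sketch of how the trust is modelled today, using the
keystone v3 trust commands in OSC (IDs and the role are illustrative) - the
trustor is always a user, not the project:

  # delegate the trustor's roles on a project to the trustee user
  $ openstack trust create --project myproject --role heat_stack_owner \
      <trustor-user-id> <trustee-user-id>

  # tokens obtained from this trust stop working once the trustor is disabled
  $ openstack trust show <trust-id>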

I believe the same issue would exist with application credentials,
as the ownership model is similar.

Cheers,
  Ricardo

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Magnum] Docker Swarm Mode Support

2017-11-02 Thread Ricardo Rocha
Hi again.

On Wed, Nov 1, 2017 at 9:47 PM, Vahric MUHTARYAN <vah...@doruk.net.tr> wrote:
> Hello Ricardo ,
>
> Thanks for your explanation and answers.
> One more question: is it possible to keep using Newton (which I have right
> now) and use the latest Magnum features like swarm mode without upgrading
> OpenStack?

I don't think this functionality is available in Magnum Newton.

One option, though, is to upgrade only Magnum; there should be no
dependency on more recent versions of other components - assuming you
either have a separate control plane for Magnum or are able to split
it out.

Cheers,
  Ricardo

>
> Regards
> VM
>
> On 30.10.2017 01:19, "Ricardo Rocha" <rocha.po...@gmail.com> wrote:
>
> Hi Vahric.
>
> On Fri, Oct 27, 2017 at 9:51 PM, Vahric MUHTARYAN <vah...@doruk.net.tr> 
> wrote:
> > Hello All ,
> >
> >
> >
> > I found some blueprint about supporting Docker Swarm Mode
> > https://blueprints.launchpad.net/magnum/+spec/swarm-mode-support
> >
> >
> >
> > I understood that related development is not over yet and no any 
> Openstack
> > version or Magnum version to test it also looks like some more thing to 
> do.
> >
> > Could you pls inform when we should expect support of Docker Swarm Mode 
> ?
>
> Swarm mode is already available in Pike:
> https://docs.openstack.org/releasenotes/magnum/pike.html
>
> > Another question is fedora atomic is good but looks like its not 
> up2date for
> > docker , instead of use Fedora Atomic , why you do not use Ubuntu, or 
> some
> > other OS and directly install docker with requested version ?
>
> Atomic also has advantages (immutable, etc), it's working well for us
> at CERN. There are also Suse and CoreOS drivers, but i'm not familiar
> with those.
>
> Most pieces have moved to Atomic system containers, including all
> kubernetes components so the versions are decouple from the Atomic
> version.
>
> We've also deployed locally a patch running docker itself in a system
> container, this will get upstream with:
> https://bugs.launchpad.net/magnum/+bug/1727700
>
> With this we allow our users to deploy clusters with any docker
> version (selectable with a label), currently up to 17.09.
>
> > And last, to help to over waiting items “Next working items: ”  how we 
> could
> > help ?
>
> I'll let Spyros reply to this and give you more info on the above items 
> too.
>
> Regards,
>   Ricardo
>
> >
> >
> >
> > Regards
> >
> > Vahric Muhtaryan
> >
> >
> > 
> __
> > OpenStack Development Mailing List (not for usage questions)
> > Unsubscribe: 
> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Magnum] Docker Swarm Mode Support

2017-10-29 Thread Ricardo Rocha
Hi Vahric.

On Fri, Oct 27, 2017 at 9:51 PM, Vahric MUHTARYAN  wrote:
> Hello All ,
>
>
>
> I found some blueprint about supporting Docker Swarm Mode
> https://blueprints.launchpad.net/magnum/+spec/swarm-mode-support
>
>
>
> I understood that the related development is not finished yet and that there
> is no OpenStack or Magnum version to test it; it also looks like there is more
> work to do.
>
> Could you please let us know when we should expect support for Docker Swarm Mode?

Swarm mode is already available in Pike:
https://docs.openstack.org/releasenotes/magnum/pike.html
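
If you want to try it, a minimal sketch with the OSC plugin, assuming a
swarm-mode cluster template is already defined (names are illustrative):

  $ openstack coe cluster create myswarm \
      --cluster-template swarm-mode-template --node-count 3

  # writes the TLS certs and docker environment needed to talk to the cluster
  $ openstack coe cluster config myswarm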

> Another question: Fedora Atomic is good, but it looks like it is not up to date
> for docker. Instead of using Fedora Atomic, why do you not use Ubuntu or some
> other OS and directly install docker at the requested version?

Atomic also has advantages (immutable, etc.) and it's working well for us
at CERN. There are also Suse and CoreOS drivers, but I'm not familiar
with those.

Most pieces have moved to Atomic system containers, including all the
kubernetes components, so the versions are decoupled from the Atomic
version.

We've also deployed locally a patch running docker itself in a system
container; this will go upstream with:
https://bugs.launchpad.net/magnum/+bug/1727700

With this we allow our users to deploy clusters with any docker
version (selectable with a label), currently up to 17.09.
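
As a sketch of how that looks for users (the label name below is the one from
our local patch and may still change when it lands upstream):

  $ openstack coe cluster template create swarm-17.09 \
      --coe swarm-mode --image fedora-atomic-latest \
      --external-network public --labels docker_version=17.09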

> And last, to help with the pending items under “Next working items:”, how
> could we help?

I'll let Spyros reply to this and give you more info on the above items too.

Regards,
  Ricardo

>
>
>
> Regards
>
> Vahric Muhtaryan
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [magnum] spec for cluster federation

2017-08-03 Thread Ricardo Rocha
Hi.

We've recently started looking at federating kubernetes clusters,
using some of our internal Magnum clusters and others deployed in
external clouds. With kubernetes 1.7 most of the functionality we need
is already available.

Looking forward we submitted a spec to integrate this into Magnum:
https://review.openstack.org/#/c/489609/

We will work on this once it gets approved, but please review the spec
and provide feedback.

Regards,
  Ricardo

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [fuxi][kuryr] Where to commit codes for Fuxi-golang

2017-06-02 Thread Ricardo Rocha
Hi Hongbin.

Regarding your comments below, some quick clarifications for people
less familiar with Magnum.

1. Rexray / Cinder integration

- Magnum uses an alpine-based rexray image; the compressed size is 33MB
(the download size), so pretty good
- Deploying a full Magnum cluster of 128 nodes takes less than 5
minutes in our production environment. The issue you mention only
exists in the upstream builds and is valid for all container images,
and is due to nodes in infra having a combination of non-nested
virtualization and/or slow connectivity (there were several attempts
to fix this)
- Not sure about mystery bugs, but the ones we found were fixed by Mathieu:
https://github.com/codedellemc/libstorage/pull/243
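
For reference, this is roughly what the rexray integration gives you on a swarm
node once the driver is configured as in the Magnum templates (volume name and
size are illustrative):

  # creates a Cinder volume on demand and attaches it to the node
  $ docker volume create --driver rexray --name testvol --opt size=1
  $ docker run --rm --volume-driver rexray -v testvol:/data alpine ls /data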

2. Enterprise ready

Certainly this means different things for different people; at CERN we
run ~80 clusters in our production service covering many use cases.
Magnum currently lacks the ability to properly upgrade the COE version
for running clusters, which is a problem for long-lived services
(which are not the majority of our use cases today). This is the main
focus of the current cycle.

Hope this adds some relevant information.

Cheers,
  Ricardo

On Wed, May 31, 2017 at 5:55 PM, Hongbin Lu  wrote:
> Please find my replies inline.
>
>
>
> Best regards,
>
> Hongbin
>
>
>
> From: Spyros Trigazis [mailto:strig...@gmail.com]
> Sent: May-30-17 9:56 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [fuxi][kuryr] Where to commit codes for
> Fuxi-golang
>
>
>
>
>
>
>
> On 30 May 2017 at 15:26, Hongbin Lu  wrote:
>
> Please consider leveraging Fuxi instead.
>
>
>
> Is there a missing functionality from rexray?
>
>
>
> [Hongbin Lu] From my understanding, Rexray targets on the overcloud use
> cases and assumes that containers are running on top of nova instances. You
> mentioned Magnum is leveraging Rexray for Cinder integration. Actually, I am
> the core reviewer who reviewed and approved those Rexray patches. From what
> I observed, the functionalities provided by Rexray are minimal. What it was
> doing is simply calling Cinder API to search an existing volume, attach the
> volume to the Nova instance, and let docker bind-mount the volume to the
> container. At the time I was testing it, it seemed to have some mystery bugs
> that prevented me from getting the cluster to work. It was packaged as a large
> container image, which might take more than 5 minutes to pull down. With
> that said, Rexray might be a choice for someone who are looking for cross
> cloud-providers solution. Fuxi will focus on OpenStack and targets on both
> overcloud and undercloud use cases. That means Fuxi can work with
> Nova+Cinder or a standalone Cinder. As John pointed out in another reply,
> another benefit of Fuxi is to resolve the fragmentation problem of existing
> solutions. Those are the differentiators of Fuxi.
>
>
>
> Kuryr/Fuxi team is working very hard to deliver the docker network/storage
> plugins. I wish you will work with us to get them integrated with
> Magnum-provisioned cluster.
>
>
>
> Patches are welcome to support fuxi as an *option* instead of rexray, so
> users can choose.
>
>
>
> Currently, COE clusters provisioned by Magnum are far from
> enterprise-ready. I think the Magnum project will be better off if it can
> adopt Kuryr/Fuxi, which will give you better OpenStack integration.
>
>
>
> Best regards,
>
> Hongbin
>
>
>
> fuxi feature request: Add authentication using a trustee and a trustID.
>
>
>
> [Hongbin Lu] I believe this is already supported.
>
>
>
> Cheers,
> Spyros
>
>
>
>
>
> From: Spyros Trigazis [mailto:strig...@gmail.com]
> Sent: May-30-17 7:47 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [fuxi][kuryr] Where to commit codes for
> Fuxi-golang
>
>
>
> FYI, there is already a cinder volume driver for docker available, written
>
> in golang, from rexray [1].
>
>
> Our team recently contributed to libstorage [3], it could support manila
> too. Rexray
> also supports the popular cloud providers.
>
> Magnum's docker swarm cluster driver, already leverages rexray for cinder
> integration. [2]
>
> Cheers,
> Spyros
>
>
>
> [1] https://github.com/codedellemc/rexray/releases/tag/v0.9.0
>
> [2] https://github.com/codedellemc/libstorage/releases/tag/v0.6.0
>
> [3]
> http://git.openstack.org/cgit/openstack/magnum/tree/magnum/drivers/common/templates/swarm/fragments/volume-service.sh?h=stable/ocata
>
>
>
> On 27 May 2017 at 12:15, zengchen  wrote:
>
> Hi John & Ben:
>
>  I have committed a patch[1] to add a new repository to Openstack. Please
> take a look at it. Thanks very much!
>
>
>
>  [1]: https://review.openstack.org/#/c/468635
>
>
>
> Best Wishes!
>
> zengchen
>
>
>
>
> On 2017-05-26 21:30:48, "John Griffith" wrote:
>
>
>
>
>
> On Thu, May 25, 2017 at 10:01 PM, zengchen  wrote:
>
>
>
> 

Re: [openstack-dev] [magnum][containers] Size of userdata in drivers

2017-05-04 Thread Ricardo Rocha
Hi Kevin.

We've hit this locally in the past, and with core-dns being added I see the
same for kubernetes atomic.

Spyros is dropping some fragments that are not needed, to temporarily
get around the issue. Is there any trick in Heat we can use? Zipping
the fragments should give some gain - is this possible?
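
As a quick check of what zipping would buy us (just a sketch, the file name is
illustrative; whether Heat/Nova would actually accept it compressed is exactly
the open question):

  $ wc -c < userdata.rendered                      # current size vs the 64k limit
  $ gzip -9 -c userdata.rendered | wc -c           # raw compressed size
  $ gzip -9 -c userdata.rendered | base64 | wc -c  # size as it would be sent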

Cheers,
  Ricardo

On Mon, Apr 24, 2017 at 11:56 PM, Kevin Lefevre  wrote:
> Hi, I recently stumbled on this bug 
> https://bugs.launchpad.net/magnum/+bug/1680900 in which Spyros says we are 
> about to hit the 64k limit for Nova user-data.
>
> One way to prevent this is to reduce the size of software config. But there 
> is still many things to be added to templates.
>
> I’m talking only about Kubernetes for now :
>
> I know some other Kubernetes projects (on AWS for example with kube-aws) are 
> using object storage (AWS S3) to bypass the limit of AWS Cloudformation and 
> store stack-templates and user-data but I don’t think it is possible on 
> OpenStack with Nova/Swift
>
> Since we rely on an internet connection anyway (except when running local 
> copy of hypercube image) for a majority of deployment when pulling hypercube 
> and other Kubernetes components, maybe we could rely on upstream for some 
> user-data and save some space.
>
> A lot of driver maintenance include syncing Kubernetes manifest from upstream 
> changes, bumping version, this is fine for the core components for now (api, 
> proxy, controller, scheduler) but is bit more tricky when we start adding the 
> addons (which are bigger and take a lot more space).
>
> Kubernetes official salt base deployment already provides templating (sed) 
> for commons addons, e.g.:
>
> https://github.com/kubernetes/kubernetes/blob/release-1.6/cluster/addons/dns/kubedns-controller.yaml.sed
>
> These template are already versioned and maintained by upstream. Depending on 
> the Kubernetes branches used we could get directly the right addons from 
> upstream. This prevents errors and having to sync and upgrade the addons.
>
> This is just a thought and of course there are downsides to this and maybe it 
> goes against the project goal because we required internet access but we 
> could for example offer a way to pull addons or other config manifest from 
> local object storage.
>
> I know this also causes problems for idempotence and gate testing because we 
> cannot vouch for upstream changes but in theory Kubernetes releases and 
> addons are already tested against a specific version by their CI.
>
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [magnum][osc] What name to use for magnum commands in osc?

2017-03-22 Thread Ricardo Rocha
Hi.

One simplification would be:
openstack coe create/list/show/config/update
openstack coe template create/list/show/update
openstack coe ca show/sign

This covers all the required commands and is a bit less verbose. The
cluster word is too generic and probably adds no useful info.
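
To make the proposal concrete, this is how the shortened form would read in
practice (arguments are illustrative, none of these commands exist yet):

  $ openstack coe template create k8s --coe kubernetes --image fedora-atomic-latest
  $ openstack coe create mycluster --template k8s --node-count 3
  $ openstack coe ca sign mycluster csr.pem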

Whatever it is, Kerberos support for the magnum client is very much
needed and welcome! :)

Cheers,
  Ricardo

On Tue, Mar 21, 2017 at 2:54 PM, Spyros Trigazis  wrote:
> IMO, coe is a little confusing. It is a term used by people related somehow
> to the magnum community. When I describe to users how to use magnum,
> I spend a few moments explaining what we call coe.
>
> I prefer one of the following:
> * openstack magnum cluster create|delete|...
> * openstack mcluster create|delete|...
> * both the above
>
> It is very intuitive for users because they will be using an openstack cloud
> and they will want to use the magnum service. So, it only makes sense to
> type openstack magnum cluster or mcluster, which is shorter.
>
>
> On 21 March 2017 at 02:24, Qiming Teng  wrote:
>>
>> On Mon, Mar 20, 2017 at 03:35:18PM -0400, Jay Pipes wrote:
>> > On 03/20/2017 03:08 PM, Adrian Otto wrote:
>> > >Team,
>> > >
>> > >Stephen Watson has been working on an magnum feature to add magnum
>> > > commands to the openstack client by implementing a plugin:
>> > >
>> >
>> > > https://review.openstack.org/#/q/status:open+project:openstack/python-magnumclient+osc
>> > >
>> > >In review of this work, a question has resurfaced, as to what the
>> > > client command name should be for magnum related commands. Naturally, 
>> > > we’d
>> > > like to have the name “cluster” but that word is already in use by 
>> > > Senlin.
>> >
>> > Unfortunately, the Senlin API uses a whole bunch of generic terms as
>> > top-level REST resources, including "cluster", "event", "action",
>> > "profile", "policy", and "node". :( I've warned before that use of
>> > these generic terms in OpenStack APIs without a central group
>> > responsible for curating the API would lead to problems like this.
>> > This is why, IMHO, we need the API working group to be ultimately
>> > responsible for preventing this type of thing from happening.
>> > Otherwise, there ends up being a whole bunch of duplication and same
>> > terms being used for entirely different things.
>> >
>>
>> Well, I believe the name and namespaces used by Senlin is very clean.
>> Please see the following outputs. All commands are contained in the
>> cluster namespace to avoid any conflicts with any other projects.
>>
>> On the other hand, is there any document stating that Magnum is about
>> providing clustering service? Why Magnum cares so much about the top
>> level noun if it is not its business?
>
>
> From magnum's wiki page [1]:
> "Magnum uses Heat to orchestrate an OS image which contains Docker
> and Kubernetes and runs that image in either virtual machines or bare
> metal in a cluster configuration."
>
> Many services may offer clusters indirectly. Clusters are NOT magnum's focus,
> but we can't refer to a collection of virtual machines or physical servers with
> another name. Bay proved to be confusing to users. I don't think that magnum
> should reserve the cluster noun, even if it were available.
>
> [1] https://wiki.openstack.org/wiki/Magnum
>
>>
>>
>>
>> $ openstack --help | grep cluster
>>
>>   --os-clustering-api-version 
>>
>>   cluster action list  List actions.
>>   cluster action show  Show detailed info about the specified action.
>>   cluster build info  Retrieve build information.
>>   cluster check  Check the cluster(s).
>>   cluster collect  Collect attributes across a cluster.
>>   cluster create  Create the cluster.
>>   cluster delete  Delete the cluster(s).
>>   cluster event list  List events.
>>   cluster event show  Describe the event.
>>   cluster expand  Scale out a cluster by the specified number of nodes.
>>   cluster list   List the user's clusters.
>>   cluster members add  Add specified nodes to cluster.
>>   cluster members del  Delete specified nodes from cluster.
>>   cluster members list  List nodes from cluster.
>>   cluster members replace  Replace the nodes in a cluster with
>>   specified nodes.
>>   cluster node check  Check the node(s).
>>   cluster node create  Create the node.
>>   cluster node delete  Delete the node(s).
>>   cluster node list  Show list of nodes.
>>   cluster node recover  Recover the node(s).
>>   cluster node show  Show detailed info about the specified node.
>>   cluster node update  Update the node.
>>   cluster policy attach  Attach policy to cluster.
>>   cluster policy binding list  List policies from cluster.
>>   cluster policy binding show  Show a specific policy that is bound to
>>   the specified cluster.
>>   cluster policy binding update  Update a policy's properties on a
>>   cluster.
>>   cluster policy create  Create a policy.
>>   cluster policy 

Re: [openstack-dev] [neutron]

2017-01-27 Thread Ricardo Rocha
Hi.

Do you have a pointer to how you extended the driver to have this?

Thanks!

Ricardo

On Fri, Nov 18, 2016 at 2:02 PM, ZZelle  wrote:
> Hello,
>
> AFAIK, it's not possible.
>
> I did a similar thing by extending neutron iptables driver in order to set
> "pre-rules".
>
> Best regards,
>
>
> Cédric/ZZelle
>
> On Fri, Nov 18, 2016 at 1:58 PM, Iago Santos Pardo
>  wrote:
>>
>> Hello,
>>
>> We are using Neutron with the linuxbridge plugin and security groups
>> enabled and we have some custom rules in iptables running on the compute
>> nodes. When the agent rebuilds the firewall it changes the rules order,
>> putting the neutron chains on the top. Is there any way to preserve the
>> rules order and tell neutron to ignore our rules or keep them at the top?
>>
>> Thank you so much.
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [containers][magnum] Magnum team at Summit?

2017-01-19 Thread Ricardo Rocha
Hi.

It would be great to meet in any case.

We've been exploring Atomic system containers (as in 'atomic install
--system ...') for our internal plugins at CERN, and have been having some
issues with runc and selinux definitions, plus some atomic command
bugs. It's mostly due to config.json (or the config.json.template
passed to Atomic) being hard to build manually, especially
after we've gotten used to the nice docker usability by now :) In any
case the atomic blog posts are incredibly useful, thanks for that!

To explain why we're trying this: we're running all our internal
plugins inside containers (this is for support of internal systems we
add to upstream Magnum). Running them in docker is problematic for two
reasons:
* they are visible to the users of the cluster (which is confusing,
and allows them to easily shoot themselves in the foot by killing
them)
* they cause a race condition when restarting docker if volumes were
previously created, as docker tries to make the volumes available
before launching any container

Having them managed by systemd and run directly in runc solves both of
the issues above. I understand docker 1.13 has a new plugin API which
might (or might not) help with this, but I haven't had time to try it
(all of the above is with docker 1.12).
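
For anyone curious, a sketch of the workflow we follow for these plugins (image
and names are illustrative) - the plugin ends up as a systemd-managed runc
container instead of something visible through docker:

  # install as a system container; atomic generates the systemd unit from the
  # templates shipped in the image (config.json.template etc.)
  $ atomic install --system --name magnum-plugin registry.example.org/magnum/plugin:latest
  $ systemctl start magnum-plugin
  $ runc list    # the plugin runs outside the docker daemon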

Cheers,
  Ricardo

On Wed, Jan 18, 2017 at 7:18 PM, Josh Berkus  wrote:
> Magnum Devs:
>
> Is there going to be a magnum team meeting around OpenStack Summit in
> Boston?
>
> I'm the community manager for Atomic Host, so if you're going to have
> Magnum meetings, I'd like to send you some Atomic engineers to field any
> questions/issues at the Summit.
>
> --
> --
> Josh Berkus
> Project Atomic
> Red Hat OSAS
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [magnum] Managing cluster drivers as individual distro packages

2016-11-22 Thread Ricardo Rocha
Hi.

I think option 1 is the best one right now, mostly to reduce the
impact on ongoing development.

Upgrades, flattening, template versioning and node groups are supposed
to land a lot of patches in the next couple of months; moving the drivers
into separate repos now could be a distraction.

We can revisit this later... and maybe at the same time improve the
user experience of adding custom drivers, which is not yet great
(we're using the glance image metadata for this as there's no explicit
--driver option yet).
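
For reference, the glance metadata bit is just this (image name illustrative) -
magnum picks the driver based on the os_distro property of the image referenced
by the cluster template:

  $ openstack image create my-custom-atomic --disk-format qcow2 \
      --container-format bare --file my-custom-atomic.qcow2
  $ openstack image set my-custom-atomic --property os_distro=fedora-atomic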

Cheers,
Ricardo

On Fri, Nov 18, 2016 at 6:04 PM, Drago Rosson
 wrote:
> If we were to go with (2), what should happen to the common code?
>
> From: Spyros Trigazis 
> Reply-To: "OpenStack Development Mailing List (not for usage questions)"
> 
> Date: Friday, November 18, 2016 at 8:34 AM
> To: "OpenStack Development Mailing List (not for usage questions)"
> 
> Subject: [openstack-dev] [magnum] Managing cluster drivers as individual
> distro packages
>
> Hi all,
>
> In magnum, we implement cluster drivers for the different combinations
> of COEs (Container Orchestration Engines) and Operating Systems. The
> reasoning behind it is to better encapsulate driver-specific logic and to
> allow
> operators to deploy custom drivers with their deployment-specific changes.
>
> For example, operators might want to:
> * have only custom drivers and not install the upstream ones at all
> * offer user only some of the available drivers
> * create different combinations of  COE + os_distro
> * create new experimental/staging drivers
>
> It would be reasonable to manage magnum's cluster drivers as different
> packages, since they are designed to be treated as individual entities. To
> do
> so, we have two options:
>
> 1. in-tree:  remove the entrypoints from magnum/setup.cfg to not install
> them
> by default. This will require some plumbing to manage them like separate
> python
> packages, but allows magnum's development team to manage the official
> drivers
> inside the service repo.
>
> 2. separate repo: This option sounds cleaner, but requires more refactoring
> and
> will separate more the drivers from service, having significant impact in
> the
> development process.
>
> Thoughts?
>
> Spyros
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [magnum] 2 million requests / sec, 100s of nodes

2016-08-09 Thread Ricardo Rocha
On Tue, Aug 9, 2016 at 10:00 PM, Clint Byrum <cl...@fewbar.com> wrote:
> Excerpts from Ricardo Rocha's message of 2016-08-08 11:51:00 +0200:
>> Hi.
>>
>> On Mon, Aug 8, 2016 at 1:52 AM, Clint Byrum <cl...@fewbar.com> wrote:
>> > Excerpts from Steve Baker's message of 2016-08-08 10:11:29 +1200:
>> >> On 05/08/16 21:48, Ricardo Rocha wrote:
>> >> > Hi.
>> >> >
>> >> > Quick update is 1000 nodes and 7 million reqs/sec :) - and the number
>> >> > of requests should be higher but we had some internal issues. We have
>> >> > a submission for barcelona to provide a lot more details.
>> >> >
>> >> > But a couple questions came during the exercise:
>> >> >
>> >> > 1. Do we really need a volume in the VMs? On large clusters this is a
>> >> > burden, and local storage only should be enough?
>> >> >
>> >> > 2. We observe a significant delay (~10min, which is half the total
>> >> > time to deploy the cluster) on heat when it seems to be crunching the
>> >> > kube_minions nested stacks. Once it's done, it still adds new stacks
>> >> > gradually, so it doesn't look like it precomputed all the info in 
>> >> > advance
>> >> >
>> >> > Anyone tried to scale Heat to stacks this size? We end up with a stack
>> >> > with:
>> >> > * 1000 nested stacks (depth 2)
>> >> > * 22000 resources
>> >> > * 47008 events
>> >> >
>> >> > And already changed most of the timeout/retrial values for rpc to get
>> >> > this working.
>> >> >
>> >> > This delay is already visible in clusters of 512 nodes, but 40% of the
>> >> > time in 1000 nodes seems like something we could improve. Any hints on
>> >> > Heat configuration optimizations for large stacks very welcome.
>> >> >
>> >> Yes, we recommend you set the following in /etc/heat/heat.conf [DEFAULT]:
>> >> max_resources_per_stack = -1
>> >>
>> >> Enforcing this for large stacks has a very high overhead, we make this
>> >> change in the TripleO undercloud too.
>> >>
>> >
>> > Wouldn't this necessitate having a private Heat just for Magnum? Not
>> > having a resource limit per stack would leave your Heat engines
>> > vulnerable to being DoS'd by malicious users, since one can create many
>> > many thousands of resources, and thus python objects, in just a couple
>> > of cleverly crafted templates (which is why I added the setting).
>> >
>> > This makes perfect sense in the undercloud of TripleO, which is a
>> > private, single tenant OpenStack. But, for Magnum.. now you're talking
>> > about the Heat that users have access to.
>>
>> We have it already at -1 for these tests. As you say a malicious user
>> could DoS, right now this is manageable in our environment. But maybe
>> move it to a per tenant value, or some special policy? The stacks are
>> created under a separate domain for magnum (for trustees), we could
>> also use that for separation.
>>
>> A separate heat instance sounds like an overkill.
>>
>
> It does, but there's really no way around it. If Magnum users are going
> to create massive stacks, then all of the heat engines will need to be
> able to handle massive stacks anyway, and a quota system would just mean
> that only Magnum gets to fully utilize those engines, which doesn't
> really make much sense at all, does it?

The best might be to see if there are improvements possible either in
the Heat engine (lots of what Zane mentioned seems to be of help,
we're willing to try that) or in the way Magnum creates the stacks.

In any case, things work right now, just not perfectly yet. It's still OK to
get 1000 node clusters deployed in < 25 min; people can handle that :)

Thanks!

Ricardo

>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [magnum][heat] 2 million requests / sec, 100s of nodes

2016-08-08 Thread Ricardo Rocha
Hi.

On Mon, Aug 8, 2016 at 6:17 PM, Zane Bitter <zbit...@redhat.com> wrote:
> On 05/08/16 12:01, Hongbin Lu wrote:
>>
>> Add [heat] to the title to get more feedback.
>>
>>
>>
>> Best regards,
>>
>> Hongbin
>>
>>
>>
>> *From:*Ricardo Rocha [mailto:rocha.po...@gmail.com]
>> *Sent:* August-05-16 5:48 AM
>> *To:* OpenStack Development Mailing List (not for usage questions)
>> *Subject:* Re: [openstack-dev] [magnum] 2 million requests / sec, 100s
>> of nodes
>>
>>
>>
>> Hi.
>>
>>
>>
>> Quick update is 1000 nodes and 7 million reqs/sec :) - and the number of
>> requests should be higher but we had some internal issues. We have a
>> submission for barcelona to provide a lot more details.
>>
>>
>>
>> But a couple questions came during the exercise:
>>
>>
>>
>> 1. Do we really need a volume in the VMs? On large clusters this is a
>> burden, and local storage only should be enough?
>>
>>
>>
>> 2. We observe a significant delay (~10min, which is half the total time
>> to deploy the cluster) on heat when it seems to be crunching the
>> kube_minions nested stacks. Once it's done, it still adds new stacks
>> gradually, so it doesn't look like it precomputed all the info in advance
>>
>>
>>
>> Anyone tried to scale Heat to stacks this size? We end up with a stack
>> with:
>>
>> * 1000 nested stacks (depth 2)
>>
>> * 22000 resources
>>
>> * 47008 events
>
>
> Wow, that's a big stack :) TripleO has certainly been pushing the boundaries
> of how big a stack Heat can handle, but this sounds like another step up
> even from there.
>
>> And already changed most of the timeout/retrial values for rpc to get
>> this working.
>>
>>
>>
>> This delay is already visible in clusters of 512 nodes, but 40% of the
>> time in 1000 nodes seems like something we could improve. Any hints on
>> Heat configuration optimizations for large stacks very welcome.
>
>
> Y'all were right to set max_resources_per_stack to -1, because actually
> checking the number of resources in a tree of stacks is sloow. (Not as
> slow as it used to be when it was O(n^2), but still pretty slow.)
>
> We're actively working on trying to make Heat more horizontally scalable
> (even at the cost of some performance penalty) so that if you need to handle
> this kind of scale then you'll be able to reach it by adding more
> heat-engines. Another big step forward on this front is coming with Newton,
> as (barring major bugs) the convergence_engine architecture will be enabled
> by default.
>
> RPC timeouts are caused by the synchronous work that Heat does before
> returning a result to the caller. Most of this is validation of the data
> provided by the user. We've talked about trying to reduce the amount of
> validation done synchronously to a minimum (just enough to guarantee that we
> can store and retrieve the data from the DB) and push the rest into the
> asynchronous part of the stack operation alongside the actual create/update.
> (FWIW, TripleO typically uses a 600s RPC timeout.)
>
> The "QueuePool limit of size ... overflow ... reached" sounds like we're
> pulling messages off the queue even when we don't have threads available in
> the pool to pass them to. If you have a fix for this it would be much
> appreciated. However, I don't think there's any guarantee that just leaving
> messages on the queue can't lead to deadlocks. The problem with very large
> trees of nested stacks is not so much that it's a lot of stacks (Heat
> doesn't have _too_ much trouble with that) but that they all have to be
> processed simultaneously. e.g. to validate the top level stack you also need
> to validate all of the lower level stacks before returning the result. If
> higher-level stacks consume all of the thread pools then you'll get a
> deadlock as you'll be unable to validate any lower-level stacks. At this
> point you'd have maxed out the capacity of your Heat engines to process
> stacks simultaneously and you'd need to scale out to more Heat engines. The
> solution is probably to try limit the number of nested stack validations we
> send out concurrently.
>
> Improving performance at scale is a priority area of focus for the Heat team
> at the moment. That's been mostly driven by TripleO and Sahara, but we'd be
> very keen to hear about the kind of loads that Magnum is putting on Heat and
> working with folks across the community to figure out how to improve things
> for those use cases.

Thanks for the detailed reply, especially regarding the handling

Re: [openstack-dev] [magnum] 2 million requests / sec, 100s of nodes

2016-08-08 Thread Ricardo Rocha
On Mon, Aug 8, 2016 at 11:51 AM, Ricardo Rocha <rocha.po...@gmail.com> wrote:
> Hi.
>
> On Mon, Aug 8, 2016 at 1:52 AM, Clint Byrum <cl...@fewbar.com> wrote:
>> Excerpts from Steve Baker's message of 2016-08-08 10:11:29 +1200:
>>> On 05/08/16 21:48, Ricardo Rocha wrote:
>>> > Hi.
>>> >
>>> > Quick update is 1000 nodes and 7 million reqs/sec :) - and the number
>>> > of requests should be higher but we had some internal issues. We have
>>> > a submission for barcelona to provide a lot more details.
>>> >
>>> > But a couple questions came during the exercise:
>>> >
>>> > 1. Do we really need a volume in the VMs? On large clusters this is a
>>> > burden, and local storage only should be enough?
>>> >
>>> > 2. We observe a significant delay (~10min, which is half the total
>>> > time to deploy the cluster) on heat when it seems to be crunching the
>>> > kube_minions nested stacks. Once it's done, it still adds new stacks
>>> > gradually, so it doesn't look like it precomputed all the info in advance
>>> >
>>> > Anyone tried to scale Heat to stacks this size? We end up with a stack
>>> > with:
>>> > * 1000 nested stacks (depth 2)
>>> > * 22000 resources
>>> > * 47008 events
>>> >
>>> > And already changed most of the timeout/retrial values for rpc to get
>>> > this working.
>>> >
>>> > This delay is already visible in clusters of 512 nodes, but 40% of the
>>> > time in 1000 nodes seems like something we could improve. Any hints on
>>> > Heat configuration optimizations for large stacks very welcome.
>>> >
>>> Yes, we recommend you set the following in /etc/heat/heat.conf [DEFAULT]:
>>> max_resources_per_stack = -1
>>>
>>> Enforcing this for large stacks has a very high overhead, we make this
>>> change in the TripleO undercloud too.
>>>
>>
>> Wouldn't this necessitate having a private Heat just for Magnum? Not
>> having a resource limit per stack would leave your Heat engines
>> vulnerable to being DoS'd by malicious users, since one can create many
>> many thousands of resources, and thus python objects, in just a couple
>> of cleverly crafted templates (which is why I added the setting).
>>
>> This makes perfect sense in the undercloud of TripleO, which is a
>> private, single tenant OpenStack. But, for Magnum.. now you're talking
>> about the Heat that users have access to.
>
> We have it already at -1 for these tests. As you say a malicious user
> could DoS, right now this is manageable in our environment. But maybe
> move it to a per tenant value, or some special policy? The stacks are
> created under a separate domain for magnum (for trustees), we could
> also use that for separation.

For reference, we also changed max_stacks_per_tenant, which is:
# Maximum number of stacks any one tenant may have active at one time. (integer
# value)

For the 1000 node bay test we had to increase it.
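
For the record, the relevant heat.conf changes, shown here with crudini (the
max_stacks_per_tenant value is just an example - pick whatever fits your
largest bay):

  $ crudini --set /etc/heat/heat.conf DEFAULT max_resources_per_stack -1
  $ crudini --set /etc/heat/heat.conf DEFAULT max_stacks_per_tenant 10000
  # then restart the heat services (unit names depend on the distribution)
  $ systemctl restart openstack-heat-engine openstack-heat-api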

>
> A separate heat instance sounds like an overkill.
>
> Cheers,
> Ricardo
>
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [magnum] 2 million requests / sec, 100s of nodes

2016-08-08 Thread Ricardo Rocha
Hi.

On Mon, Aug 8, 2016 at 1:52 AM, Clint Byrum <cl...@fewbar.com> wrote:
> Excerpts from Steve Baker's message of 2016-08-08 10:11:29 +1200:
>> On 05/08/16 21:48, Ricardo Rocha wrote:
>> > Hi.
>> >
>> > Quick update is 1000 nodes and 7 million reqs/sec :) - and the number
>> > of requests should be higher but we had some internal issues. We have
>> > a submission for barcelona to provide a lot more details.
>> >
>> > But a couple questions came during the exercise:
>> >
>> > 1. Do we really need a volume in the VMs? On large clusters this is a
>> > burden, and local storage only should be enough?
>> >
>> > 2. We observe a significant delay (~10min, which is half the total
>> > time to deploy the cluster) on heat when it seems to be crunching the
>> > kube_minions nested stacks. Once it's done, it still adds new stacks
>> > gradually, so it doesn't look like it precomputed all the info in advance
>> >
>> > Anyone tried to scale Heat to stacks this size? We end up with a stack
>> > with:
>> > * 1000 nested stacks (depth 2)
>> > * 22000 resources
>> > * 47008 events
>> >
>> > And already changed most of the timeout/retrial values for rpc to get
>> > this working.
>> >
>> > This delay is already visible in clusters of 512 nodes, but 40% of the
>> > time in 1000 nodes seems like something we could improve. Any hints on
>> > Heat configuration optimizations for large stacks very welcome.
>> >
>> Yes, we recommend you set the following in /etc/heat/heat.conf [DEFAULT]:
>> max_resources_per_stack = -1
>>
>> Enforcing this for large stacks has a very high overhead, we make this
>> change in the TripleO undercloud too.
>>
>
> Wouldn't this necessitate having a private Heat just for Magnum? Not
> having a resource limit per stack would leave your Heat engines
> vulnerable to being DoS'd by malicious users, since one can create many
> many thousands of resources, and thus python objects, in just a couple
> of cleverly crafted templates (which is why I added the setting).
>
> This makes perfect sense in the undercloud of TripleO, which is a
> private, single tenant OpenStack. But, for Magnum.. now you're talking
> about the Heat that users have access to.

We have it already at -1 for these tests. As you say a malicious user
could DoS, right now this is manageable in our environment. But maybe
move it to a per tenant value, or some special policy? The stacks are
created under a separate domain for magnum (for trustees), we could
also use that for separation.

A separate heat instance sounds like an overkill.

Cheers,
Ricardo

>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [magnum] 2 million requests / sec, 100s of nodes

2016-08-07 Thread Ricardo Rocha
Hi Ton.

I think we should. Also, in cases where multiple volume types are available
(in our case with different iops), additional parameters would be
required to select the volume type. I'll add it this week.
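
As a sketch of what we have in mind, assuming a template label ends up being
the mechanism (the docker_volume_type label below is hypothetical, nothing like
it exists in Magnum today):

  # volume types with different iops, as exposed by cinder
  $ openstack volume type list

  # hypothetical label for picking the type - not implemented
  $ openstack coe cluster template create k8s-highiops --coe kubernetes \
      --image fedora-atomic-latest --docker-volume-size 50 \
      --labels docker_volume_type=io1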

It's a detail though - spawning container clusters with Magnum is now super
easy (and fast!).

Cheers,
  Ricardo

On Fri, Aug 5, 2016 at 5:11 PM, Ton Ngo <t...@us.ibm.com> wrote:

> Hi Ricardo,
> For your question 1, you can modify the Heat template to not create the
> Cinder volume and tweak the call to
> configure-docker-storage.sh to use local storage. It should be fairly
> straightforward. You just need to make
> sure the local storage of the flavor is sufficient to host the containers
> in the benchmark.
> If you think this is a common scenario, we can open a blueprint for this
> option.
> Ton,
>
>
> From: Ricardo Rocha <rocha.po...@gmail.com>
> To: "OpenStack Development Mailing List (not for usage questions)" <
> openstack-dev@lists.openstack.org>
> Date: 08/05/2016 04:51 AM
>
> Subject: Re: [openstack-dev] [magnum] 2 million requests / sec, 100s of
> nodes
> --
>
>
>
> Hi.
>
> Quick update is 1000 nodes and 7 million reqs/sec :) - and the number of
> requests should be higher but we had some internal issues. We have a
> submission for barcelona to provide a lot more details.
>
> But a couple questions came during the exercise:
>
> 1. Do we really need a volume in the VMs? On large clusters this is a
> burden, and local storage only should be enough?
>
> 2. We observe a significant delay (~10min, which is half the total time to
> deploy the cluster) on heat when it seems to be crunching the kube_minions
> nested stacks. Once it's done, it still adds new stacks gradually, so it
> doesn't look like it precomputed all the info in advance
>
> Anyone tried to scale Heat to stacks this size? We end up with a stack
> with:
> * 1000 nested stacks (depth 2)
> * 22000 resources
> * 47008 events
>
> And already changed most of the timeout/retrial values for rpc to get this
> working.
>
> This delay is already visible in clusters of 512 nodes, but 40% of the
> time in 1000 nodes seems like something we could improve. Any hints on Heat
> configuration optimizations for large stacks very welcome.
>
> Cheers,
>   Ricardo
>
> On Sun, Jun 19, 2016 at 10:59 PM, Brad Topol <*bto...@us.ibm.com*
> <bto...@us.ibm.com>> wrote:
>
>Thanks Ricardo! This is very exciting progress!
>
>--Brad
>
>
>Brad Topol, Ph.D.
>IBM Distinguished Engineer
>OpenStack
>(919) 543-0646
>Internet: *bto...@us.ibm.com* <bto...@us.ibm.com>
>Assistant: Kendra Witherspoon (919) 254-0680
>
>[image: Inactive hide details for Ton Ngo---06/17/2016 12:10:33
>PM---Thanks Ricardo for sharing the data, this is really encouraging! T]Ton
>Ngo---06/17/2016 12:10:33 PM---Thanks Ricardo for sharing the data, this is
>really encouraging! Ton,
>
>From: Ton Ngo/Watson/IBM@IBMUS
>To: "OpenStack Development Mailing List \(not for usage questions\)" <
>*openstack-dev@lists.openstack.org* <openstack-dev@lists.openstack.org>
>>
>Date: 06/17/2016 12:10 PM
>Subject: Re: [openstack-dev] [magnum] 2 million requests / sec, 100s
>of nodes
>
>
>--
>
>
>
>    Thanks Ricardo for sharing the data, this is really encouraging!
>Ton,
>
>[image: Inactive hide details for Ricardo Rocha ---06/17/2016 08:16:15
>AM---Hi. Just thought the Magnum team would be happy to hear :)]Ricardo
>Rocha ---06/17/2016 08:16:15 AM---Hi. Just thought the Magnum team would be
>happy to hear :)
>
>From: Ricardo Rocha <*rocha.po...@gmail.com* <rocha.po...@gmail.com>>
>To: "OpenStack Development Mailing List (not for usage questions)" <
>*openstack-dev@lists.openstack.org* <openstack-dev@lists.openstack.org>
>>
>Date: 06/17/2016 08:16 AM
>Subject: [openstack-dev] [magnum] 2 million requests / sec, 100s of
>nodes
>--
>
>
>
>Hi.
>
>Just thought the Magnum team would be happy to hear :)
>
>We had access to some hardware the last couple days, and tried some
>tests with Magnum and Kubernetes - following an original blog post
>from the kubernetes team.
>
>

Re: [openstack-dev] [magnum] 2 million requests / sec, 100s of nodes

2016-08-05 Thread Ricardo Rocha
Hi.

Quick update is 1000 nodes and 7 million reqs/sec :) - and the number of
requests should be higher but we had some internal issues. We have a
submission for barcelona to provide a lot more details.

But a couple questions came during the exercise:

1. Do we really need a volume in the VMs? On large clusters this is a
burden, and local storage only should be enough?

2. We observe a significant delay (~10min, which is half the total time to
deploy the cluster) on heat when it seems to be crunching the kube_minions
nested stacks. Once it's done, it still adds new stacks gradually, so it
doesn't look like it precomputed all the info in advance

Anyone tried to scale Heat to stacks this size? We end up with a stack with:
* 1000 nested stacks (depth 2)
* 22000 resources
* 47008 events

And we already changed most of the timeout/retry values for rpc to get this
working.

This delay is already visible in clusters of 512 nodes, but 40% of the time
in 1000 nodes seems like something we could improve. Any hints on Heat
configuration optimizations for large stacks very welcome.
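
For the record, the kind of rpc tuning mentioned above, shown with crudini
against heat.conf (values are illustrative, the right numbers depend on the
deployment):

  $ crudini --set /etc/heat/heat.conf DEFAULT rpc_response_timeout 600
  $ crudini --set /etc/heat/heat.conf DEFAULT num_engine_workers 8
  $ systemctl restart openstack-heat-engine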

Cheers,
  Ricardo

On Sun, Jun 19, 2016 at 10:59 PM, Brad Topol <bto...@us.ibm.com> wrote:

> Thanks Ricardo! This is very exciting progress!
>
> --Brad
>
>
> Brad Topol, Ph.D.
> IBM Distinguished Engineer
> OpenStack
> (919) 543-0646
> Internet: bto...@us.ibm.com
> Assistant: Kendra Witherspoon (919) 254-0680
>
>
> From: Ton Ngo/Watson/IBM@IBMUS
> To: "OpenStack Development Mailing List \(not for usage questions\)" <
> openstack-dev@lists.openstack.org>
> Date: 06/17/2016 12:10 PM
> Subject: Re: [openstack-dev] [magnum] 2 million requests / sec, 100s of
> nodes
>
> --
>
>
>
> Thanks Ricardo for sharing the data, this is really encouraging!
> Ton,
>
>
> From: Ricardo Rocha <rocha.po...@gmail.com>
> To: "OpenStack Development Mailing List (not for usage questions)" <
> openstack-dev@lists.openstack.org>
> Date: 06/17/2016 08:16 AM
> Subject: [openstack-dev] [magnum] 2 million requests / sec, 100s of nodes
> --
>
>
>
> Hi.
>
> Just thought the Magnum team would be happy to hear :)
>
> We had access to some hardware the last couple days, and tried some
> tests with Magnum and Kubernetes - following an original blog post
> from the kubernetes team.
>
> Got a 200 node kubernetes bay (800 cores) reaching 2 million requests /
> sec.
>
> Check here for some details:
>
> https://openstack-in-production.blogspot.ch/2016/06/scaling-magnum-and-kubernetes-2-million.html
>
> We'll try bigger in a couple weeks, also using the Rally work from
> Winnie, Ton and Spyros to see where it breaks. Already identified a
> couple issues, will add bugs or push patches for those. If you have
> ideas or suggestions for the next tests let us know.
>
> Magnum is looking pretty good!
>
> Cheers,
> Ricardo
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [magnum] 2 million requests / sec, 100s of nodes

2016-06-17 Thread Ricardo Rocha
Hi.

Just thought the Magnum team would be happy to hear :)

We had access to some hardware the last couple days, and tried some
tests with Magnum and Kubernetes - following an original blog post
from the kubernetes team.

Got a 200 node kubernetes bay (800 cores) reaching 2 million requests / sec.

Check here for some details:
https://openstack-in-production.blogspot.ch/2016/06/scaling-magnum-and-kubernetes-2-million.html

We'll try bigger in a couple weeks, also using the Rally work from
Winnie, Ton and Spyros to see where it breaks. Already identified a
couple issues, will add bugs or push patches for those. If you have
ideas or suggestions for the next tests let us know.

Magnum is looking pretty good!

Cheers,
Ricardo

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Magnum] The Magnum Midcycle

2016-06-15 Thread Ricardo Rocha
Hi Hongbin.

Thanks for considering it, maybe it works next time the offer stays :)

Cheers,
Ricardo

On Tue, Jun 14, 2016 at 10:26 PM, Hongbin Lu <hongbin...@huawei.com> wrote:
> Hi Tim,
>
>
>
> Thanks for providing the host. We discussed the midcycle location at the
> last team meeting. It looks like a significant number of Magnum team members
> have difficulties traveling to Geneva, so we are not able to hold the
> midcycle at CERN. Thanks again for the willingness to host us.
>
>
>
> Best regards,
>
> Hongbin
>
>
>
> From: Tim Bell [mailto:tim.b...@cern.ch]
> Sent: June-09-16 2:27 PM
>
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [Magnum] The Magnum Midcycle
>
>
>
> If we can confirm the dates and location, there is a reasonable chance we
> could also offer remote conferencing using Vidyo at CERN. While it is not
> the same as an F2F experience, it would provide the possibility for remote
> participation for those who could not make it to Geneva.
>
>
>
> We may also be able to organize tours, such as to the anti-matter factory
> and super conducting magnet test labs prior or afterwards if anyone is
> interested…
>
>
>
> Tim
>
>
>
> From: Spyros Trigazis <strig...@gmail.com>
> Reply-To: "OpenStack Development Mailing List (not for usage questions)"
> <openstack-dev@lists.openstack.org>
> Date: Wednesday 8 June 2016 at 16:43
> To: "OpenStack Development Mailing List (not for usage questions)"
> <openstack-dev@lists.openstack.org>
> Subject: Re: [openstack-dev] [Magnum] The Magnum Midcycle
>
>
>
> Hi Hongbin.
>
>
>
> CERN's location: https://goo.gl/maps/DWbDVjnAvJJ2
>
>
>
> Cheers,
>
> Spyros
>
>
>
>
>
> On 8 June 2016 at 16:01, Hongbin Lu <hongbin...@huawei.com> wrote:
>
> Ricardo,
>
> Thanks for the offer. Could you let me know the exact location?
>
> Best regards,
> Hongbin
>
>
>> -Original Message-
>> From: Ricardo Rocha [mailto:rocha.po...@gmail.com]
>> Sent: June-08-16 5:43 AM
>> To: OpenStack Development Mailing List (not for usage questions)
>> Subject: Re: [openstack-dev] [Magnum] The Magnum Midcycle
>>
>> Hi Hongbin.
>>
>> Not sure how this fits everyone, but we would be happy to host it at
>> CERN. How do people feel about it? We can add a nice tour of the place
>> as a bonus :)
>>
>> Let us know.
>>
>> Ricardo
>>
>>
>>
>> On Tue, Jun 7, 2016 at 10:32 PM, Hongbin Lu <hongbin...@huawei.com>
>> wrote:
>> > Hi all,
>> >
>> >
>> >
>> > Please find the Doodle poll below for selecting the Magnum midcycle date.
>> > Presumably, it will be a 2-day event. The location is undecided for now.
>> > The previous midcycles were hosted in the Bay Area, so I guess we will
>> > stay there this time.
>> >
>> >
>> >
>> > http://doodle.com/poll/5tbcyc37yb7ckiec
>> >
>> >
>> >
>> > In addition, the Magnum team is looking for a host for the midcycle.
>> > Please let us know if you are interested in hosting us.
>> >
>> >
>> >
>> > Best regards,
>> >
>> > Hongbin
>> >
>> >
>> >
>> __
>
>> >  OpenStack Development Mailing List (not for usage questions)
>
>> > Unsubscribe:
>> > openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> >
>>
>> ___
>> ___
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: OpenStack-dev-
>> requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Magnum] The Magnum Midcycle

2016-06-08 Thread Ricardo Rocha
Hi Hongbin.

Not sure how this fits everyone, but we would be happy to host it at
CERN. How do people feel about it? We can add a nice tour of the place
as a bonus :)

Let us know.

Ricardo



On Tue, Jun 7, 2016 at 10:32 PM, Hongbin Lu  wrote:
> Hi all,
>
>
>
> Please find the Doodle poll below for selecting the Magnum midcycle date.
> Presumably, it will be a 2-day event. The location is undecided for now.
> The previous midcycles were hosted in the Bay Area, so I guess we will stay
> there this time.
>
>
>
> http://doodle.com/poll/5tbcyc37yb7ckiec
>
>
>
> In addition, the Magnum team is looking for a host for the midcycle. Please
> let us know if you are interested in hosting us.
>
>
>
> Best regards,
>
> Hongbin
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [magnum] Discuss the idea of manually managing the bay nodes

2016-06-07 Thread Ricardo Rocha
+1 on this. Another use case would be 'fast storage' for dbs, 'any
storage' for memcache and web servers. Relying on labels for this
makes it really simple.

The alternative of doing it with multiple clusters adds complexity to
the cluster(s) description by users.
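
To make it concrete, with engine labels in place this is roughly what I have
in mind on the swarm side (untested sketch, label names made up):

    # docker daemon on the db nodes started with:      --label storage=ssd
    # docker daemon on the generic nodes started with: --label storage=any

    # pin the database containers to the fast nodes
    docker run -d -e constraint:storage==ssd mysql

    # memcache / web servers don't care, so no constraint is needed
    docker run -d memcached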

On Fri, Jun 3, 2016 at 1:54 AM, Fox, Kevin M  wrote:
> As an operator that has clouds that are partitioned into different host 
> aggregates with different flavors targeting them, I totally believe we will 
> have users that want to have a single k8s cluster span multiple different 
> flavor types. I'm sure once I deploy magnum, I will want it too. You could 
> have some special hardware on some nodes, not on others. but you can still 
> have cattle, if you have enough of them and the labels are set appropriately. 
> Labels allow you to continue to partition things when you need to, and ignore 
> it when you dont, making administration significantly easier.
>
> Say I have a tenant with 5 gpu nodes, and 10 regular nodes allocated into a 
> k8s cluster. I may want 30 instances of container x that doesn't care where 
> they land, and prefer 5 instances that need cuda. The former can be deployed 
> with a k8s deployment. The latter can be deployed with a daemonset. All 
> should work well and very non pet'ish. The whole tenant could be viewed with 
> a single pane of glass, making it easy to manage.
>
> Thanks,
> Kevin
> 
> From: Adrian Otto [adrian.o...@rackspace.com]
> Sent: Thursday, June 02, 2016 4:24 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [magnum] Discuss the idea of manually managing 
> the bay nodes
>
> I am really struggling to accept the idea of heterogeneous clusters. My 
> experience causes me to question whether a heterogeneous cluster makes sense 
> for Magnum. I will try to explain why I have this hesitation:
>
> 1) If you have a heterogeneous cluster, it suggests that you are using 
> external intelligence to manage the cluster, rather than relying on it to be 
> self-managing. This is an anti-pattern that I refer to as “pets" rather than 
> “cattle”. The anti-pattern results in brittle deployments that rely on 
> external intelligence to manage (upgrade, diagnose, and repair) the cluster. 
> The automation of the management is much harder when a cluster is 
> heterogeneous.
>
> 2) If you have a heterogeneous cluster, it can fall out of balance. This 
> means that if one of your “important” or “large” members fails, there may not 
> be adequate remaining members in the cluster to continue operating properly 
> in the degraded state. The logic of how to track and deal with this needs to 
> be handled. It’s much simpler in the homogeneous case.
>
> 3) Heterogeneous clusters are complex compared to homogeneous clusters. They 
> are harder to work with, and that usually means that unplanned outages are 
> more frequent, and last longer than they with a homogeneous cluster.
>
> Summary:
>
> Heterogeneous:
>   - Complex
>   - Prone to imbalance upon node failure
>   - Less reliable
>
> Homogeneous:
>   - Simple
>   - Don’t get imbalanced when a min_members concept is supported by the 
> cluster controller
>   - More reliable
>
> My bias is to assert that applications that want a heterogeneous mix of 
> system capacities at a node level should be deployed on multiple homogeneous 
> bays, not a single heterogeneous one. That way you end up with a composition 
> of simple systems rather than a larger complex one.
>
> Adrian
>
>
>> On Jun 1, 2016, at 3:02 PM, Hongbin Lu  wrote:
>>
>> Personally, I think this is a good idea, since it can address a set of 
>> similar use cases like below:
>> * I want to deploy a k8s cluster to 2 availability zone (in future 2 
>> regions/clouds).
>> * I want to spin up N nodes in AZ1, M nodes in AZ2.
>> * I want to scale the number of nodes in specific AZ/region/cloud. For 
>> example, add/remove K nodes from AZ1 (with AZ2 untouched).
>>
>> The use case above should be very common and universal everywhere. To 
>> address the use case, Magnum needs to support provisioning a heterogeneous set 
>> of nodes at deploy time and managing them at runtime. It looks like the proposed 
>> idea (manually managing individual nodes or individual groups of nodes) can 
>> address this requirement very well. Besides the proposed idea, I cannot 
>> think of an alternative solution.
>>
>> Therefore, I vote to support the proposed idea.
>>
>> Best regards,
>> Hongbin
>>
>>> -Original Message-
>>> From: Hongbin Lu
>>> Sent: June-01-16 11:44 AM
>>> To: OpenStack Development Mailing List (not for usage questions)
>>> Subject: RE: [openstack-dev] [magnum] Discuss the idea of manually
>>> managing the bay nodes
>>>
>>> Hi team,
>>>
>>> A blueprint was created for tracking this idea:
>>> https://blueprints.launchpad.net/magnum/+spec/manually-manage-bay-
>>> nodes . I won't approve the BP until 

Re: [openstack-dev] [magnum] Notes for Magnum design summit

2016-05-03 Thread Ricardo Rocha
Hi.

On Mon, May 2, 2016 at 7:11 PM, Cammann, Tom  wrote:
> Thanks for the write up Hongbin and thanks to all those who contributed to 
> the design summit. A few comments on the summaries below.
>
> 6. Ironic Integration: 
> https://etherpad.openstack.org/p/newton-magnum-ironic-integration
> - Start the implementation immediately
> - Prefer quick work-around for identified issues (cinder volume attachment, 
> variation of number of ports, etc.)
>
> We need to implement a bay template that can use a flat networking model as 
> this is the only networking model Ironic currently supports. Multi-tenant 
> networking is imminent. This should be done before work on an Ironic template 
> starts.
>
> 7. Magnum adoption challenges: 
> https://etherpad.openstack.org/p/newton-magnum-adoption-challenges
> - The challenges is listed in the etherpad above
>
> Ideally we need to turn this list into a set of actions which we can 
> implement over the cycle, i.e. create a BP to remove requirement for LBaaS.

There's one for floating IPs already:
https://blueprints.launchpad.net/magnum/+spec/bay-with-no-floating-ips

>
> 9. Magnum Heat template version: 
> https://etherpad.openstack.org/p/newton-magnum-heat-template-versioning
> - In each bay driver, version the template and template definition.
> - Bump template version for minor changes, and bump bay driver version for 
> major changes.
>
> We decided only bay driver versioning was required. The template and template 
> definition do not need versioning, because we can get Heat to pass back 
> the template which it used to create the bay.

This was also my understanding. We won't use heat template versioning,
just the bays.

> 10. Monitoring: https://etherpad.openstack.org/p/newton-magnum-monitoring
> - Add support for sending notifications to Ceilometer
> - Revisit bay monitoring and self-healing later
> - Container monitoring should not be done by Magnum, but it can be done by 
> cAdvisor, Heapster, etc.
>
> We split this topic into 3 parts – bay telemetry, bay monitoring, container 
> monitoring.
> Bay telemetry is done around actions such as bay/baymodel CRUD operations. 
> This is implemented using Ceilometer notifications.
> Bay monitoring is around monitoring health of individual nodes in the bay 
> cluster and we decided to postpone work as more investigation is required on 
> what this should look like and what users actually need.
> Container monitoring focuses on what containers are running in the bay and 
> general usage of the bay COE. We decided this will be completed by 
> Magnum by baking in access to cAdvisor/Heapster by default.

I think we're missing a blueprint for this one too.
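
For reference, the cAdvisor part basically amounts to running the standard
invocation from the cAdvisor docs on each node - shown here only as an
illustration of what "baking it in" would mean:

    docker run -d --name=cadvisor -p 8080:8080 \
      --volume=/:/rootfs:ro \
      --volume=/var/run:/var/run:rw \
      --volume=/sys:/sys:ro \
      --volume=/var/lib/docker/:/var/lib/docker:ro \
      google/cadvisor:latest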

Ricardo

>
> - Manually manage bay nodes (instead of being managed by Heat ResourceGroup): 
> It can address the use case of heterogeneity of bay nodes (i.e. different 
> availability zones, flavors), but need to elaborate the details further.
>
> The idea revolves around creating a heat stack for each node in the bay. This 
> idea shows a lot of promise but needs more investigation and isn’t a current 
> priority.
>
> Tom
>
>
> From: Hongbin Lu 
> Reply-To: "OpenStack Development Mailing List (not for usage questions)" 
> 
> Date: Saturday, 30 April 2016 at 05:05
> To: "OpenStack Development Mailing List (not for usage questions)" 
> 
> Subject: [openstack-dev] [magnum] Notes for Magnum design summit
>
> Hi team,
>
> For reference, below is a summary of the discussions/decisions in Austin 
> design summit. Please feel free to point out if anything is incorrect or 
> incomplete. Thanks.
>
> 1. Bay driver: https://etherpad.openstack.org/p/newton-magnum-bay-driver
> - Refactor existing code into bay drivers
> - Each bay driver will be versioned
> - Individual bay driver can have API extension and magnum CLI could load the 
> extensions dynamically
> - Work incrementally and support same API before and after the driver change
>
> 2. Bay lifecycle operations: 
> https://etherpad.openstack.org/p/newton-magnum-bays-lifecycle-operations
> - Support the following operations: reset the bay, rebuild the bay, rotate 
> TLS certificates in the bay, adjust storage of the bay, scale the bay.
>
> 3. Scalability: https://etherpad.openstack.org/p/newton-magnum-scalability
> - Implement Magnum plugin for Rally
> - Implement the spec to address the scalability of deploying multiple bays 
> concurrently: https://review.openstack.org/#/c/275003/
>
> 4. Container storage: 
> https://etherpad.openstack.org/p/newton-magnum-container-storage
> - Allow choice of storage driver
> - Allow choice of data volume driver
> - Work with Kuryr/Fuxi team to have data volume driver available in COEs 
> upstream
>
> 5. Container network: 
> https://etherpad.openstack.org/p/newton-magnum-container-network
> - Discuss how to scope/pass/store OpenStack 

Re: [openstack-dev] [magnum] High Availability

2016-04-21 Thread Ricardo Rocha
Hi.

The thread is a month old, but I sent a shorter version of this to
Daneyon before with some info on the things we dealt with to get
Magnum deployed successfully. We wrapped it up in a post (there's a
video linked there with some demos at the end):

http://openstack-in-production.blogspot.ch/2016/04/containers-and-cern-cloud.html

Hopefully the pointers to the relevant blueprints for some of the
issues we found will be useful for others.

Cheers,
  Ricardo

On Fri, Mar 18, 2016 at 3:42 PM, Ricardo Rocha <rocha.po...@gmail.com> wrote:
> Hi.
>
> We're running a Magnum pilot service - which means it's being
> maintained just like all other OpenStack services and running on the
> production infrastructure, but only available to a subset of tenants
> for a start.
>
> We're learning a lot in the process and will happily report on this in
> the next couple weeks.
>
> The quick summary is that it's looking good and stable with a few
> hiccups in the setup, which are handled by patches already under review.
> The one we need the most is the trustee user (USER_TOKEN in the bay
> heat params is preventing scaling after the token expires), but with
> the review in good shape we look forward to try it very soon.
>
> Regarding barbican we'll keep you posted, we're working on the missing
> puppet bits.
>
> Ricardo
>
> On Fri, Mar 18, 2016 at 2:30 AM, Daneyon Hansen (danehans)
> <daneh...@cisco.com> wrote:
>> Adrian/Hongbin,
>>
>> Thanks for taking the time to provide your input on this matter. After 
>> reviewing your feedback, my takeaway is that Magnum is not ready for 
>> production without implementing Barbican or some other future feature such 
>> as the Keystone option Adrian provided.
>>
>> All,
>>
>> Is anyone using Magnum in production? If so, I would appreciate your input.
>>
>> -Daneyon Hansen
>>
>>> On Mar 17, 2016, at 6:16 PM, Adrian Otto <adrian.o...@rackspace.com> wrote:
>>>
>>> Hongbin,
>>>
>>> One alternative we could discuss as an option for operators that have a 
>>> good reason not to use Barbican, is to use Keystone.
>>>
>>> Keystone credentials store: 
>>> http://specs.openstack.org/openstack/keystone-specs/api/v3/identity-api-v3.html#credentials-v3-credentials
>>>
>>> The contents are stored in plain text in the Keystone DB, so we would want 
>>> to generate an encryption key per bay, encrypt the certificate and store it 
>>> in keystone. We would then use the same key to decrypt it upon reading the 
>>> key back. This might be an acceptable middle ground for clouds that will 
>>> not or can not run Barbican. This should work for any OpenStack cloud since 
>>> Grizzly. The total amount of code in Magnum would be small, as the API 
>>> already exists. We would need a library function to encrypt and decrypt the 
>>> data, and ideally a way to select different encryption algorithms in case 
>>> one is judged weak at some point in the future, justifying the use of an 
>>> alternate.
>>>
>>> Adrian
>>>
>>>> On Mar 17, 2016, at 4:55 PM, Adrian Otto <adrian.o...@rackspace.com> wrote:
>>>>
>>>> Hongbin,
>>>>
>>>>> On Mar 17, 2016, at 2:25 PM, Hongbin Lu <hongbin...@huawei.com> wrote:
>>>>>
>>>>> Adrian,
>>>>>
>>>>> I think we need a boarder set of inputs in this matter, so I moved the 
>>>>> discussion from whiteboard back to here. Please check my replies inline.
>>>>>
>>>>>> I would like to get a clear problem statement written for this.
>>>>>> As I see it, the problem is that there is no safe place to put 
>>>>>> certificates in clouds that do not run Barbican.
>>>>>> It seems the solution is to make it easy to add Barbican such that it's 
>>>>>> included in the setup for Magnum.
>>>>> No, the solution is to explore a non-Barbican solution to store 
>>>>> certificates securely.
>>>>
>>>> I am seeking more clarity about why a non-Barbican solution is desired. 
>>>> Why is there resistance to adopting both Magnum and Barbican together? I 
>>>> think the answer is that people think they can make Magnum work with 
>>>> really old clouds that were set up before Barbican was introduced. That 
>>>> expectation is simply not reasonable. If there were a way to easily add 
>>>> Barbican to older clouds, perhaps this reluctance would melt away.
>>>>
>>>

Re: [openstack-dev] [Magnum] Magnum supports 2 Nova flavor to provision minion nodes

2016-04-20 Thread Ricardo Rocha
Hi Hongbin.

On Wed, Apr 20, 2016 at 8:13 PM, Hongbin Lu  wrote:
>
>
>
>
> From: Duan, Li-Gong (Gary, HPServers-Core-OE-PSC)
> [mailto:li-gong.d...@hpe.com]
> Sent: April-20-16 3:39 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: [openstack-dev] [Magnum] Magnum supports 2 Nova flavor to provision
> minion nodes
>
>
>
> Hi Folks,
>
>
>
> We are considering whether Magnum can supports 2 Nova flavors to provision
> Kubernetes and other COE minion nodes.
>
> This requirement comes from the below use cases:
>
> -  There are 2 kinds of baremetal machines at the customer site: legacy
> machines which don't support UEFI secure boot, and new machines which do.
> The user wants to use Magnum to provision a Kubernetes bay from both kinds
> of baremetal machines, and to boot the machines that support secure boot
> via UEFI secure boot. Two Kubernetes labels (secure-booted and
> non-secure-booted) are created, and the user can deploy their
> data-sensitive/critical workloads/containers/pods on the baremetal machines
> which are secure-booted.
>
>
>
> This requires Magnum to support 2 Nova flavors (one with
> “extra_spec: secure_boot=True” and the other without it), based on
> the Ironic
> feature (https://specs.openstack.org/openstack/ironic-specs/specs/kilo-implemented/uefi-secure-boot.html
> ).
>
>
>
> Could you kindly give me some comments on this requirement and whether it is
> reasonable from your point of view? If you agree, we can write a design spec
> and implement this feature.
>
>
>
> I think the requirement is reasonable, but I would like to solve the problem
> in a generic way. In particular, there could be another user who might ask
> for N nova flavors to provision COE nodes in the future. A challenge to
> support N groups of Nova instances is how to express an arbitrary number of
> resource groups (with different flavors) in a Heat template (Magnum uses
> Heat template to provision COE clusters). Heat doesn’t seem to support the
> logic of looping from 1 to N. There could be other challenges/complexities
> along the way. If the proposed design can address all the challenges and the
> implementation is clean, I am OK to add support for this feature. Thoughts
> from others?

This looks similar to the way we looked at passing a list of
availability zones. Mathieu asked and got a good answer:
http://lists.openstack.org/pipermail/openstack-dev/2016-March/088175.html

Something similar can probably be used to pass multiple flavors? Just
in case it helps.
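
If I read that thread correctly, the same trick would look roughly like this
for flavors - untested sketch, parameter and resource names made up:

    parameters:
      minion_flavors:
        type: comma_delimited_list
        default: "m1.small,m1.large"

    resources:
      kube_minions:
        type: OS::Heat::ResourceGroup
        properties:
          count: 2
          resource_def:
            type: OS::Nova::Server
            properties:
              flavor: {get_param: [minion_flavors, '%index%']}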

Cheers,
  Ricardo

>
>
>
> Regards,
>
> Gary
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Magnum]Cache docker images

2016-04-20 Thread Ricardo Rocha
Hi.

On Wed, Apr 20, 2016 at 5:43 PM, Fox, Kevin M  wrote:
> If the ops are deploying a cloud big enough to run into that problem, I
> think they can deploy a scaled out docker registry of some kind too, that
> the images can point to? Last I looked, it didn't seem very difficult. The
> native docker registry has ceph support now, so if you're running ceph for the
> backend, you can put an instance on each controller and have it stateless I
> think.

This is what we did, using registry v2. There's an issue with pulling from
a v2 registry anonymously:
https://github.com/docker/docker/issues/17317

but we've set up a dummy account to do it. Both this account and any
required CA certs can be configured by the operator, which was the
reasoning behind proposing this (we patch the templates for now):
https://blueprints.launchpad.net/magnum/+spec/allow-user-softwareconfig

Allowing an optional prefix to pull from a local registry sounds reasonable.
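
For reference, the registry itself is just the standard TLS-enabled v2
deployment from the docker docs, roughly the following (cert paths are
illustrative only):

    docker run -d -p 5000:5000 --restart=always --name registry \
      -v /etc/docker/registry-certs:/certs \
      -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/registry.crt \
      -e REGISTRY_HTTP_TLS_KEY=/certs/registry.key \
      registry:2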

Cheers,
  Ricardo

>
> Either way you would be hammering some storage service. Either glance or
> docker registry.
>
> Thanks,
> Kevin
> 
> From: Guz Egor [guz_e...@yahoo.com]
> Sent: Tuesday, April 19, 2016 7:20 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Cc: Fox, Kevin M
> Subject: Re: [openstack-dev] [Magnum]Cache docker images
>
> Kevin,
>
> I agree this is not an ideal solution, but it's probably the best option to
> deal with public cloud "stability" (e.g. we switched to the same model at
> AWS and
> got a really good boost in provisioning time and reduced the # of failures
> during cluster provisioning). And if an application needs a guaranteed
> "fresh" image, it uses
> the force-pull option in Marathon.
>
> ---
> Egor
>
> 
> From: "Fox, Kevin M" 
> To: OpenStack Development Mailing List (not for usage questions)
> 
> Sent: Tuesday, April 19, 2016 1:04 PM
>
> Subject: Re: [openstack-dev] [Magnum]Cache docker images
>
> I'm kind of uncomfortable as an op with the prebundled stuff. how do you
> upgrade things when needed if there is no way to pull updated images from a
> central place?
>
> Thanks,
> Kevin
> 
> From: Hongbin Lu [hongbin...@huawei.com]
> Sent: Tuesday, April 19, 2016 11:56 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [Magnum]Cache docker images
>
> Eli,
>
> The approach of pre-pulling docker images has a problem. It only works for
> a specific docker storage driver. In comparison, the tar file approach is
> portable across different storage drivers.
>
> Best regards,
> Hongbin
>
> From: taget [mailto:qiaoliy...@gmail.com]
> Sent: April-19-16 4:26 AM
> To: openstack-dev@lists.openstack.org
> Subject: Re: [openstack-dev] [Magnum]Cache docker images
>
> hi hello again
>
> I believe you are talking about this bp
> https://blueprints.launchpad.net/magnum/+spec/cache-docker-images
> then ignore my previous reply; that is probably another topic, about solving
> the network-limited problem.
>
> I think you are on the right track for building docker images, but the image
> can only be bootstrapped by cloud-init; without cloud-init
> the container image tar files are not loaded at all. Still, this may not
> be the best way.
>
> I'd suggest that maybe the best way is to pull the docker images while building
> the atomic image. Per my understanding, the
> image build process mounts the image in read/write mode on some tmp
> directory and chroots into that directory,
> so we can do some custom operations there.
>
> I can give it a try on the build process (I guess rpm-ostree should support
> some hook scripts)
>
> On 2016年04月19日 11:41, Eli Qiao wrote:
>
> @wanghua
>
> I think there were some discussion already , check
> https://blueprints.launchpad.net/magnum/+spec/support-private-registry
> and https://blueprints.launchpad.net/magnum/+spec/allow-user-softwareconfig
> On 2016年04月19日 10:57, 王华 wrote:
>
> Hi all,
>
> We want to eliminate pulling docker images over the Internet on bay
> provisioning. There are two problems with this approach:
> 1. Pulling docker images over the Internet is slow and fragile.
> 2. Some clouds don't have external Internet access.
>
> It is suggested to build all the required images into the cloud images to
> resolved the issue.
>
> Here is a solution:
> We export the docker images as tar files, and put the tar files into a dir
> in the image when we build the image. And we add scripts to load the tar
> files in cloud-init, so that we don't need to download the docker images.
>
> Any advice for this solution or any better solution?
>
> Regards,
> Wanghua
>
>
>
> __
>
> OpenStack Development Mailing List (not for usage questions)
>
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
> --
>
> Best 

Re: [openstack-dev] [magnum] Discuss the blueprint "support-private-registry"

2016-03-30 Thread Ricardo Rocha
Hi.

On Wed, Mar 30, 2016 at 3:59 AM, Eli Qiao  wrote:
>
> Hi Hongbin
>
> Thanks for starting this thread,
>
>
>
> I initially proposed this bp because I am in China, which is behind the Great
> Firewall and cannot access gcr.io directly. After checking our
> cloud-init scripts, I see that
>
> lots of the code is *hard coded* to use gcr.io; I personally think this is
> not a good idea. We cannot force users/customers to have internet access in
> their environment.
>
> I proposed using an insecure registry to give customers/users (Chinese, or
> anyone who doesn't have gcr.io access) a chance to switch to their own
> insecure registry to deploy
> a k8s/swarm bay.
>
> For your question:
>>  Is the private registry secure or insecure? If secure, how to handle
>> the authentication secrets. If insecure, is it OK to connect a secure bay to
>> an insecure registry?
> An insecure registry should still be a 'secure' one, since the customer needs
> to set it up and make sure it's a clean one; in this case it could be in a
> private cloud.
>
>>  Should we provide an instruction for users to pre-install the private
>> registry? If not, how to verify the correctness of this feature?
>
> The simple way to pre-install a private registry is using an insecure registry,
> and docker.io has very simple steps to start it [1].
> Otherwise, docker registry v2 also supports a TLS-enabled mode, but this
> requires telling the docker client about the key and crt files, which will make
> "support-private-registry" more complex.
>
> [1] https://docs.docker.com/registry/
> [2]https://docs.docker.com/registry/deploying/

'support-private-registry' and 'allow-insecure-registry' sound different to me.

We're using an internal docker registry at CERN (v2, TLS enabled), and
have the magnum nodes set up to use it.

We just install our CA certificates in the nodes (cp to
/etc/pki/ca-trust/source/anchors/, update-ca-trust) - had to change the
Heat templates for that, and submitted a blueprint to be able to do
similar things in a cleaner way:
https://blueprints.launchpad.net/magnum/+spec/allow-user-softwareconfig

That's all that is needed, the images are then prefixed with the
registry dns location when referenced - example:
docker.cern.ch/my-fancy-image.

Things we found on the way:
- registry v2 doesn't seem to allow anonymous pulls (you can always
add an account with read-only access everywhere, but it means you need
to always authenticate at least with this account)
https://github.com/docker/docker/issues/17317
- swarm 1.1 and k8s 1.0 allow authentication to the registry from
the client (which was good news, and it works fine), handy if you want
to push/pull with authentication.
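
For anyone wanting to reproduce the node side of this, it boils down to
something like the following (paths as in the RPM-based images we use,
registry and image names obviously ours):

    cp our-internal-ca.crt /etc/pki/ca-trust/source/anchors/
    update-ca-trust
    systemctl restart docker
    docker pull docker.cern.ch/my-fancy-image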

Cheers,
  Ricardo

>
>
>
> On 2016年03月30日 07:23, Hongbin Lu wrote:
>
> Hi team,
>
>
>
> This is the item we didn’t have time to discuss in our team meeting, so I
> started the discussion in here.
>
>
>
> Here is the blueprint:
> https://blueprints.launchpad.net/magnum/+spec/support-private-registry . Per
> my understanding, the goal of the BP is to allow users to specify the url of
> their private docker registry where the bays pull the kube/swarm images (if
> they are not able to access docker hub or other public registry). An
> assumption is that users need to pre-install their own private registry and
> upload all the required images there. There are several potential issues
> with this proposal:
>
> · Is the private registry secure or insecure? If secure, how to
> handle the authentication secrets. If insecure, is it OK to connect a secure
> bay to an insecure registry?
>
> · Should we provide an instruction for users to pre-install the
> private registry? If not, how to verify the correctness of this feature?
>
>
>
> Thoughts?
>
>
>
> Best regards,
>
> Hongbin
>
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
> --
> Best Regards, Eli Qiao (乔立勇)
> Intel OTC China
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [magnum] High Availability

2016-03-19 Thread Ricardo Rocha
Hi.

We're on the way, the API is using haproxy load balancing in the same
way all openstack services do here - this part seems to work fine.
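
Nothing magnum-specific in that part - it's roughly the usual stanza
(addresses and server names made up for illustration, 9511 being the
magnum-api port):

    listen magnum_api
      bind 192.0.2.10:9511
      balance roundrobin
      option tcpka
      server magnum-api-01 192.0.2.11:9511 check
      server magnum-api-02 192.0.2.12:9511 check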

For the conductor we're stopped due to bay certificates - we don't
currently have barbican so local was the only option. To get them
accessible on all nodes we're considering two options:
- store bay certs in a shared filesystem, meaning a new set of
credentials in the boxes (and a process to renew fs tokens)
- deploy barbican (some bits of puppet missing we're sorting out)

More news next week.

Cheers,
Ricardo

On Thu, Mar 17, 2016 at 6:46 PM, Daneyon Hansen (danehans)
 wrote:
> All,
>
> Does anyone have experience deploying Magnum in a highly-available fashion?
> If so, I’m interested in learning from your experience. My biggest unknown
> is the Conductor service. Any insight you can provide is greatly
> appreciated.
>
> Regards,
> Daneyon Hansen
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [magnum] High Availability

2016-03-19 Thread Ricardo Rocha
 only. In 
>>>> other words, this option is not for production. As a result, Barbican 
>>>> becomes the only option for production which is the root of the problem. 
>>>> It basically forces everyone to install Barbican in order to use Magnum.
>>>>
>>>> [1] https://review.openstack.org/#/c/212395/
>>>>
>>>>> It's probably a bad idea to replicate them.
>>>>> That's what Barbican is for. --adrian_otto
>>>> Frankly, I am surprised that you disagreed here. Back in July 2015, we all 
>>>> agreed to have two phases of implementation and the statement was made by 
>>>> you [2].
>>>>
>>>> 
>>>> #agreed Magnum will use Barbican for an initial implementation for 
>>>> certificate generation and secure storage/retrieval.  We will commit to a 
>>>> second phase of development to eliminating the hard requirement on 
>>>> Barbican with an alternate implementation that implements the functional 
>>>> equivalent implemented in Magnum, which may depend on libraries, but not 
>>>> Barbican.
>>>> 
>>>>
>>>> [2] 
>>>> http://lists.openstack.org/pipermail/openstack-dev/2015-July/069130.html
>>>
>>> The context there is important. Barbican was considered for two purposes: 
>>> (1) CA signing capability, and (2) certificate storage. My willingness to 
>>> implement an alternative was based on our need to get a certificate 
>>> generation and signing solution that actually worked, as Barbican did not 
>>> work for that at the time. I have always viewed Barbican as a suitable 
>>> solution for certificate storage, as that was what it was first designed 
>>> for. Since then, we have implemented certificate generation and signing 
>>> logic within a library that does not depend on Barbican, and we can use 
>>> that safely in production use cases. What we don’t have built in is what 
>>> Barbican is best at, secure storage for our certificates that will allow 
>>> multi-conductor operation.
>>>
>>> I am opposed to the idea that Magnum should re-implement Barbican for 
>>> certificate storage just because operators are reluctant to adopt it. If we 
>>> need to ship a Barbican instance along with each Magnum control plane, so 
>>> be it, but I don’t see the value in re-inventing the wheel. I promised the 
>>> OpenStack community that we were out to integrate with and enhance 
>>> OpenStack not to replace it.
>>>
>>> Now, with all that said, I do recognize that not all clouds are motivated 
>>> to use all available security best practices. They may be operating in 
>>> environments that they believe are already secure (because of a secure 
>>> perimeter), and that it’s okay to run fundamentally insecure software 
>>> within those environments. As misguided as this viewpoint may be, it’s 
>>> common. My belief is that it’s best to offer the best practice by default, 
>>> and only allow insecure operation when someone deliberately turns off 
>>> fundamental security features.
>>>
>>> With all this said, I also care about Magnum adoption as much as all of us, 
>>> so I’d like us to think creatively about how to strike the right balance 
>>> between re-implementing existing technology, and making that technology 
>>> easily accessible.
>>>
>>> Thanks,
>>>
>>> Adrian
>>>
>>>>
>>>> Best regards,
>>>> Hongbin
>>>>
>>>> -Original Message-
>>>> From: Adrian Otto [mailto:adrian.o...@rackspace.com]
>>>> Sent: March-17-16 4:32 PM
>>>> To: OpenStack Development Mailing List (not for usage questions)
>>>> Subject: Re: [openstack-dev] [magnum] High Availability
>>>>
>>>> I have trouble understanding that blueprint. I will put some remarks on 
>>>> the whiteboard. Duplicating Barbican sounds like a mistake to me.
>>>>
>>>> --
>>>> Adrian
>>>>
>>>>> On Mar 17, 2016, at 12:01 PM, Hongbin Lu <hongbin...@huawei.com> wrote:
>>>>>
>>>>> The problem of missing Barbican alternative implementation has been 
>>>>> raised several times by different people. IMO, this is a very serious 
>>>>> issue that will hurt Magnum adoption. I created a blueprint

Re: [openstack-dev] [magnum] containers across availability zones

2016-02-24 Thread Ricardo Rocha
Thanks, done.

https://blueprints.launchpad.net/magnum/+spec/magnum-availability-zones

We might have something already to expose the labels in the docker
daemon config.

On Wed, Feb 24, 2016 at 6:01 PM, Vilobh Meshram
<vilobhmeshram.openst...@gmail.com> wrote:
> +1 from me too for the idea. Please file a blueprint. Seems feasible and
> useful.
>
> -Vilobh
>
>
> On Tue, Feb 23, 2016 at 7:25 PM, Adrian Otto <adrian.o...@rackspace.com>
> wrote:
>>
>> Ricardo,
>>
>> Yes, that approach would work. I don’t see any harm in automatically
>> adding tags to the docker daemon on the bay nodes as part of the swarm heat
>> template. That would allow the filter selection you described.
>>
>> Adrian
>>
>> > On Feb 23, 2016, at 4:11 PM, Ricardo Rocha <rocha.po...@gmail.com>
>> > wrote:
>> >
>> > Hi.
>> >
>> > Has anyone looked into having magnum bay nodes deployed in different
>> > availability zones? The goal would be to have multiple instances of a
>> > container running on nodes across multiple AZs.
>> >
>> > Looking at docker swarm this could be achieved using (for example)
>> > affinity filters based on labels. Something like:
>> >
>> > docker run -it -d -p 80:80 --label nova.availability-zone=my-zone-a
>> > nginx
>> > https://docs.docker.com/swarm/scheduler/filter/#use-an-affinity-filter
>> >
>> > We can do this if we change the templates/config scripts to add to the
>> > docker daemon params some labels exposing availability zone or other
>> > metadata (taken from the nova metadata).
>> >
>> > https://docs.docker.com/engine/userguide/labels-custom-metadata/#daemon-labels
>> >
>> > It's a bit less clear how we would get heat to launch nodes across
>> > availability zones using ResourceGroup(s), but there are other heat
>> > resources that support it (i'm sure this can be done).
>> >
>> > Does this make sense? Any thoughts or alternatives?
>> >
>> > If it makes sense i'm happy to submit a blueprint.
>> >
>> > Cheers,
>> >  Ricardo
>> >
>> >
>> > __
>> > OpenStack Development Mailing List (not for usage questions)
>> > Unsubscribe:
>> > openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [magnum] containers across availability zones

2016-02-23 Thread Ricardo Rocha
Hi.

Has anyone looked into having magnum bay nodes deployed in different
availability zones? The goal would be to have multiple instances of a
container running on nodes across multiple AZs.

Looking at docker swarm this could be achieved using (for example)
affinity filters based on labels. Something like:

docker run -it -d -p 80:80 --label nova.availability-zone=my-zone-a nginx
https://docs.docker.com/swarm/scheduler/filter/#use-an-affinity-filter

We can do this if we change the templates/config scripts to add to the
docker daemon params some labels exposing availability zone or other
metadata (taken from the nova metadata).
https://docs.docker.com/engine/userguide/labels-custom-metadata/#daemon-labels
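
As an illustration of what that would look like (untested, and the exact
options file depends on the image used for the nodes):

    # on each bay node, e.g. in /etc/sysconfig/docker:
    OPTIONS="--label nova.availability-zone=my-zone-a"

    # then, from the swarm endpoint, pin containers to that zone:
    docker run -d -p 80:80 -e constraint:nova.availability-zone==my-zone-a nginx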

It's a bit less clear how we would get heat to launch nodes across
availability zones using ResourceGroup(s), but there are other heat
resources that support it (i'm sure this can be done).

Does this make sense? Any thoughts or alternatives?

If it makes sense i'm happy to submit a blueprint.

Cheers,
  Ricardo

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [magnum] Nesting /containers resource under /bays

2016-01-19 Thread Ricardo Rocha
Hi.

I agree with this. It's great that magnum does the setup and config of the
container cluster backends, but we could also call heat ourselves if
that were all there is to it.

Taking a common use case we have:
- create and expose a volume using an NFS backend so that multiple
clients can access the same data simultaneously (manila)
- launch a set of containers exposing and crunching data, mounting that
NFS endpoint

Is there a reason why magnum can't do this for me (or aim at doing
it)? Handling all the required linking of containers with block
storage or filesystems would be great (and we have multiple block
storage backends, credentials not available to clients). There will be
other cases where all we want is a kubernetes or swarm endpoint, but
here we want containers to integrate with all the rest openstack
already manages.
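
To make that use case concrete, today it means doing something like the
following by hand (names and sizes made up, and the NFS mount still has to be
wired into the containers ourselves):

    manila create NFS 100 --name container-data
    manila access-allow container-data ip 192.0.2.0/24
    # grab the export location, mount it on the bay nodes, then:
    docker run -d -v /mnt/container-data:/data my-crunching-image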

Ricardo

On Tue, Jan 19, 2016 at 11:10 PM, Hongbin Lu  wrote:
> I don't see why the existence of the /containers endpoint blocks your workflow. 
> However, with /containers gone, the alternate workflows are blocked.
>
> As a counterexample, some users want to manage containers through an 
> OpenStack API for various reasons (i.e. single integration point, lack of 
> domain knowledge of COEs, orchestration with other OpenStack resources: VMs, 
> networks, volumes, etc.):
>
> * Deployment of a cluster
> * Management of that cluster
> * Creation of a container
> * Management of that container
>
> As another counterexample, some users just want a container:
>
> * Creation of a container
> * Management of that container
>
> Then, should we remove the /bays endpoint as well? Magnum is currently in an 
> early stage, so workflows are diverse, non-static, and hypothetical. It is a 
> risk to have Magnum overfit into a specific workflow by removing others.
>
> For your analogies, Cinder is a block storage service so it doesn't abstract 
> the filesystems. Magnum is a container service [1] so it is reasonable to 
> abstract containers. Again, if your logic is applied, should Nova have an 
> endpoint that lets you work with an individual hypervisor? Probably not, because 
> Nova is a Compute service.
>
> [1] 
> https://github.com/openstack/magnum/blob/master/specs/containers-service.rst
>
> Best regards,
> Hongbin
>
> -Original Message-
> From: Kyle Kelley [mailto:kyle.kel...@rackspace.com]
> Sent: January-19-16 2:37 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [magnum] Nesting /containers resource under /bays
>
> With /containers gone, what Magnum offers is a workflow for consuming 
> container orchestration engines:
>
> * Deployment of a cluster
> * Management of that cluster
> * Key handling (creation, upload, revocation, etc.)
>
> The first two are handled underneath by Nova + Heat, the last is in the 
> purview of Barbican. That doesn't matter though.
>
> What users care about is getting access to these resources without having to 
> write their own heat template, create a backing key store, etc. They'd like 
> to get started immediately with container technologies that are proven.
>
> If you're looking for analogies Hongbin, this would be more like saying that 
> Cinder shouldn't have an endpoint that lets you work with individual files on 
> a volume. It would be unreasonable to try to abstract across filesystems in a 
> meaningful and sustainable way.
>
> 
> From: Hongbin Lu 
> Sent: Tuesday, January 19, 2016 9:43 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [magnum] Nesting /containers resource under /bays
>
> If your logic is applied, should Nova remove the endpoint for managing 
> VMs? Should Cinder remove the endpoint for managing volumes?
>
> I think the best way to deal with the heterogeneity is to introduce a common 
> abstraction layer, not to decouple from it. The real critical functionality 
> Magnum could offer to OpenStack is to provide a Container-as-a-Service. If 
> Magnum is a Deployment-as-a-service, it will be less useful and won't bring 
> too much value to the OpenStack ecosystem.
>
> Best regards,
> Hongbin
>
> -Original Message-
> From: Clark, Robert Graham [mailto:robert.cl...@hpe.com]
> Sent: January-19-16 5:19 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [magnum] Nesting /containers resource under /bays
>
> +1
>
> Doing this, and doing this well, provides critical functionality to OpenStack 
> while keeping said functionality reasonably decoupled from the COE API 
> vagaries that would inevitably encumber a solution that sought to provide 
> ‘one api to control them all’.
>
> -Rob
>
> From: Mike Metral
> Reply-To: OpenStack List
> Date: Saturday, 16 January 2016 02:24
> To: OpenStack List
> Subject: Re: [openstack-dev] [magnum] Nesting /containers resource under /bays
>
> The requirements that running a fully 

Re: [openstack-dev] New [puppet] module for Magnum project

2015-11-25 Thread Ricardo Rocha
Hi.

We've started implementing a similar module here; I just pushed it to:
https://github.com/cernops/puppet-magnum

It already does a working magnum-api/conductor, and we'll add
configuration for additional conf options this week - to allow
alternate heat templates for the bays.
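
Usage follows the usual puppet-openstack pattern - roughly something like the
below, assuming the usual api/conductor class split (parameters omitted, so
treat it as a sketch rather than a reference):

    class { '::magnum::api': }
    class { '::magnum::conductor': }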

I've done some work on puppet-ceph before and I'm happy to
start pushing patches to openstack/puppet-magnum. Is there already
something going on? I couldn't find anything in:
https://review.openstack.org/#/q/status:open+project:openstack/puppet-magnum,n,z

Cheers,
  Ricardo

On Thu, Oct 29, 2015 at 4:58 PM, Potter, Nathaniel
 wrote:
> Hi Adrian,
>
>
>
> Basically it would fall under the same umbrella as all of the other
> puppet-openstack projects, which use puppet automation to configure as well
> as manage various OpenStack projects. An example of a mature one is here for
> the Cinder project: https://github.com/openstack/puppet-cinder. Right now
> there are about 35-40 such puppet modules for different projects in
> OpenStack, so one example of people who might make use of this project are
> people who have already used the existing puppet modules to set up their
> cloud and wish to incorporate Magnum into their cloud using the same tool.
>
>
>
> Thanks,
>
> Nate
>
>
>
> From: Adrian Otto [mailto:adrian.o...@rackspace.com]
> Sent: Thursday, October 29, 2015 10:10 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] New [puppet] module for Magnum project
>
>
>
> Nate,
>
>
>
> On Oct 29, 2015, at 11:26 PM, Potter, Nathaniel 
> wrote:
>
>
>
> Hi everyone,
>
>
>
> I’m interested in starting up a puppet module that will handle the Magnum
> containers project. Would this be something the community might want?
> Thanks!
>
>
>
> Best,
>
> Nate Potter
>
>
>
> Can you elaborate a bit more about your concept? Who would use this? What
> function would it provide? My guess is that you are suggesting a puppet
> config for adding the Magnum service to an OpenStack cloud. Is that what you
> meant? If so, could you share a reference to an existing one that we could
> see as an example of what you had in mind?
>
>
>
> Thanks,
>
>
>
> Adrian
>
>
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev