Re: [openstack-dev] [tripleo] service validation during deployment steps
What about the ability for a service expert to plug in a remediation module?
If the remediation action succeeds, proceed; if not, stop. The remediation
module can be extended independently of the main flow.

Thanks,
Arkady

-----Original Message-----
From: Steven Hardy [mailto:sha...@redhat.com]
Sent: Wednesday, July 27, 2016 3:26 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [tripleo] service validation during deployment steps

Hi Emilien,

On Tue, Jul 26, 2016 at 03:59:33PM -0400, Emilien Macchi wrote:
> I would love to hear some feedback about $topic, thanks.

Sorry for the slow response, we did discuss this on IRC, but providing that
feedback and some other comments below:

> On Fri, Jul 15, 2016 at 11:31 AM, Emilien Macchi wrote:
> > Hi,
> >
> > Some people in the field brought interesting feedback:
> >
> > "As a TripleO User, I would like the deployment to stop immediately
> > after a resource creation failure during a step of the deployment and
> > be able to easily understand what service or resource failed to be
> > installed".
> >
> > Example:
> > If during step4 Puppet tries to deploy Neutron and OVS, but OVS fails
> > to start for some reason, deployment should stop at the end of the
> > step.

I don't think anyone will argue against this use-case, we absolutely want
to enable a better "fail fast" for deployment problems, as well as better
surfacing of why it failed.

> > So there are 2 things in this user story:
> >
> > 1) Be able to run some service validation within a step deployment.
> > Note about the implementation: make the validation composable per
> > service (OVS, nova, etc) and not per role (compute, controller, etc).

+1, now that we have composable services, we need any validations to be
associated with the services, not the roles.

That said, it's fairly easy to imagine an interface like
step_config/config_settings could be used to wire in composable service
validations on a per-role basis, e.g. similar to what we do here, but
per-step:

https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud.yaml#L1144

Similar to what was proposed (but never merged) here:

https://review.openstack.org/#/c/174150/15/puppet/controller-post-puppet.yaml

> > 2) Make this information readable and easy to access and understand
> > for our users.
> >
> > I have a proof-of-concept for 1) and partially 2), with the example of
> > OVS: https://review.openstack.org/#/c/342202/
> > This patch will make sure OVS is actually usable at step 4 by running
> > 'ovs-vsctl show' during the Puppet catalog run and if it's working, it
> > will create a Puppet anchor. This anchor is currently not useful but
> > could be in the future if we want to rely on it for orchestration.
> > I wrote the service validation in Puppet 2 years ago when doing Spinal
> > Stack with eNovance:
> > https://github.com/openstack/puppet-openstacklib/blob/master/manifests/service_validation.pp
> > I think we could re-use it very easily, it has been proven to work.
> > Also, the code is within our Puppet profiles, so it's by design
> > composable and we don't need to make any connection with our current
> > services with some magic. Validation will reside within Puppet
> > manifests.
> > If you look at my PoC, this code could even live in puppet-vswitch
> > itself (we already have this code for puppet-nova, and some others).

I think having the validations inside the puppet implementation is OK, but
ideally I think we do want it to be part of the puppet modules themselves
(not part of the puppet-tripleo abstraction layer).

The issue I'd have with putting it in puppet-tripleo is that if we're going
to do this in a tripleo specific way, it should probably be done via a
method that's more config tool agnostic. Otherwise we'll have to recreate
the same validations for future implementations (I'm thinking specifically
about containers here, and possibly ansible[1]).

So, in summary, I'm +1 on getting this integrated if it can be done with
little overhead and it's something we can leverage via the puppet modules
vs puppet-tripleo.

> >
> > Ok now, what if validation fails?
> > I'm testing it here: https://review.openstack.org/#/c/342205/
> > If you look at /var/log/messages, you'll see:
> >
> > Error: /Stage[main]/Tripleo::Profile::Base::Neutron::Ovs/Openstacklib::Service_validation[openvswitch]/Exec[execute openvswitch validation]/returns: change from notrun to 0 failed
> >
> > So it's pretty clear by looking at logs that openvswitch service
> > validation failed and something is wrong. You'll also notice in the
> > logs tha
Re: [openstack-dev] [tripleo] service validation during deployment steps
On Wed, Jul 27, 2016 at 4:25 AM, Steven Hardy wrote:
> Hi Emilien,
>
> On Tue, Jul 26, 2016 at 03:59:33PM -0400, Emilien Macchi wrote:
>> I would love to hear some feedback about $topic, thanks.
>
> Sorry for the slow response, we did discuss this on IRC, but providing that
> feedback and some other comments below:
>
>> On Fri, Jul 15, 2016 at 11:31 AM, Emilien Macchi wrote:
>> > Hi,
>> >
>> > Some people in the field brought interesting feedback:
>> >
>> > "As a TripleO User, I would like the deployment to stop immediately
>> > after a resource creation failure during a step of the deployment and
>> > be able to easily understand what service or resource failed to be
>> > installed".
>> >
>> > Example:
>> > If during step4 Puppet tries to deploy Neutron and OVS, but OVS fails
>> > to start for some reason, deployment should stop at the end of the
>> > step.
>
> I don't think anyone will argue against this use-case, we absolutely want
> to enable a better "fail fast" for deployment problems, as well as better
> surfacing of why it failed.
>
>> > So there are 2 things in this user story:
>> >
>> > 1) Be able to run some service validation within a step deployment.
>> > Note about the implementation: make the validation composable per
>> > service (OVS, nova, etc) and not per role (compute, controller, etc).
>
> +1, now that we have composable services, we need any validations to be
> associated with the services, not the roles.
>
> That said, it's fairly easy to imagine an interface like
> step_config/config_settings could be used to wire in composable service
> validations on a per-role basis, e.g. similar to what we do here, but
> per-step:
>
> https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud.yaml#L1144
>
> Similar to what was proposed (but never merged) here:
>
> https://review.openstack.org/#/c/174150/15/puppet/controller-post-puppet.yaml
>
>> > 2) Make this information readable and easy to access and understand
>> > for our users.
>> >
>> > I have a proof-of-concept for 1) and partially 2), with the example of
>> > OVS: https://review.openstack.org/#/c/342202/
>> > This patch will make sure OVS is actually usable at step 4 by running
>> > 'ovs-vsctl show' during the Puppet catalog run and if it's working, it
>> > will create a Puppet anchor. This anchor is currently not useful but
>> > could be in the future if we want to rely on it for orchestration.
>> > I wrote the service validation in Puppet 2 years ago when doing Spinal
>> > Stack with eNovance:
>> > https://github.com/openstack/puppet-openstacklib/blob/master/manifests/service_validation.pp
>> > I think we could re-use it very easily, it has been proven to work.
>> > Also, the code is within our Puppet profiles, so it's by design
>> > composable and we don't need to make any connection with our current
>> > services with some magic. Validation will reside within Puppet
>> > manifests.
>> > If you look at my PoC, this code could even live in puppet-vswitch
>> > itself (we already have this code for puppet-nova, and some others).
>
> I think having the validations inside the puppet implementation is OK, but
> ideally I think we do want it to be part of the puppet modules themselves
> (not part of the puppet-tripleo abstraction layer).
>
> The issue I'd have with putting it in puppet-tripleo is that if we're going
> to do this in a tripleo specific way, it should probably be done via a
> method that's more config tool agnostic. Otherwise we'll have to recreate
> the same validations for future implementations (I'm thinking specifically
> about containers here, and possibly ansible[1]).
>
> So, in summary, I'm +1 on getting this integrated if it can be done with
> little overhead and it's something we can leverage via the puppet modules
> vs puppet-tripleo.
>
>> >
>> > Ok now, what if validation fails?
>> > I'm testing it here: https://review.openstack.org/#/c/342205/
>> > If you look at /var/log/messages, you'll see:
>> >
>> > Error: /Stage[main]/Tripleo::Profile::Base::Neutron::Ovs/Openstacklib::Service_validation[openvswitch]/Exec[execute openvswitch validation]/returns: change from notrun to 0 failed
>> >
>> > So it's pretty clear by looking at logs that openvswitch service
>> > validation failed and something is wrong. You'll also notice in the
>> > logs that deployment stopped at step 4 since OVS is not considered to
>> > be running.
>> > It's partially addressing 2) because we need to make it more explicit
>> > and readable. Dan Prince had the idea to use
>> > https://github.com/ripienaar/puppet-reportprint to print a nice report
>> > of the Puppet catalog result (we haven't tried it yet). We could also
>> > use Operational Tools later to monitor Puppet logs and find Service
>> > validation failures.
>
> This all sounds good, but we do need to think beyond the puppet
> implementation, e.g. how will we enable similar validations in a container
> based deployment?
>
> I remember SpinalStack also
Re: [openstack-dev] [tripleo] service validation during deployment steps
Hi Emilien,

On Tue, Jul 26, 2016 at 03:59:33PM -0400, Emilien Macchi wrote:
> I would love to hear some feedback about $topic, thanks.

Sorry for the slow response, we did discuss this on IRC, but providing that
feedback and some other comments below:

> On Fri, Jul 15, 2016 at 11:31 AM, Emilien Macchi wrote:
> > Hi,
> >
> > Some people in the field brought interesting feedback:
> >
> > "As a TripleO User, I would like the deployment to stop immediately
> > after a resource creation failure during a step of the deployment and
> > be able to easily understand what service or resource failed to be
> > installed".
> >
> > Example:
> > If during step4 Puppet tries to deploy Neutron and OVS, but OVS fails
> > to start for some reason, deployment should stop at the end of the
> > step.

I don't think anyone will argue against this use-case, we absolutely want
to enable a better "fail fast" for deployment problems, as well as better
surfacing of why it failed.

> > So there are 2 things in this user story:
> >
> > 1) Be able to run some service validation within a step deployment.
> > Note about the implementation: make the validation composable per
> > service (OVS, nova, etc) and not per role (compute, controller, etc).

+1, now that we have composable services, we need any validations to be
associated with the services, not the roles.

That said, it's fairly easy to imagine an interface like
step_config/config_settings could be used to wire in composable service
validations on a per-role basis, e.g. similar to what we do here, but
per-step:

https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud.yaml#L1144

Similar to what was proposed (but never merged) here:

https://review.openstack.org/#/c/174150/15/puppet/controller-post-puppet.yaml

> > 2) Make this information readable and easy to access and understand
> > for our users.
> >
> > I have a proof-of-concept for 1) and partially 2), with the example of
> > OVS: https://review.openstack.org/#/c/342202/
> > This patch will make sure OVS is actually usable at step 4 by running
> > 'ovs-vsctl show' during the Puppet catalog run and if it's working, it
> > will create a Puppet anchor. This anchor is currently not useful but
> > could be in the future if we want to rely on it for orchestration.
> > I wrote the service validation in Puppet 2 years ago when doing Spinal
> > Stack with eNovance:
> > https://github.com/openstack/puppet-openstacklib/blob/master/manifests/service_validation.pp
> > I think we could re-use it very easily, it has been proven to work.
> > Also, the code is within our Puppet profiles, so it's by design
> > composable and we don't need to make any connection with our current
> > services with some magic. Validation will reside within Puppet
> > manifests.
> > If you look at my PoC, this code could even live in puppet-vswitch
> > itself (we already have this code for puppet-nova, and some others).

I think having the validations inside the puppet implementation is OK, but
ideally I think we do want it to be part of the puppet modules themselves
(not part of the puppet-tripleo abstraction layer).

The issue I'd have with putting it in puppet-tripleo is that if we're going
to do this in a tripleo specific way, it should probably be done via a
method that's more config tool agnostic. Otherwise we'll have to recreate
the same validations for future implementations (I'm thinking specifically
about containers here, and possibly ansible[1]).

So, in summary, I'm +1 on getting this integrated if it can be done with
little overhead and it's something we can leverage via the puppet modules
vs puppet-tripleo.

> >
> > Ok now, what if validation fails?
> > I'm testing it here: https://review.openstack.org/#/c/342205/
> > If you look at /var/log/messages, you'll see:
> >
> > Error: /Stage[main]/Tripleo::Profile::Base::Neutron::Ovs/Openstacklib::Service_validation[openvswitch]/Exec[execute openvswitch validation]/returns: change from notrun to 0 failed
> >
> > So it's pretty clear by looking at logs that openvswitch service
> > validation failed and something is wrong. You'll also notice in the
> > logs that deployment stopped at step 4 since OVS is not considered to
> > be running.
> > It's partially addressing 2) because we need to make it more explicit
> > and readable. Dan Prince had the idea to use
> > https://github.com/ripienaar/puppet-reportprint to print a nice report
> > of the Puppet catalog result (we haven't tried it yet). We could also
> > use Operational Tools later to monitor Puppet logs and find Service
> > validation failures.

This all sounds good, but we do need to think beyond the puppet
implementation, e.g. how will we enable similar validations in a container
based deployment?

I remember SpinalStack also used serverspec, can you describe the
differences between using that tool (was it only used for post-deploy
validation of the whole server, not per-step validation?)

I'm just wondering if the overhead of
Re: [openstack-dev] [tripleo] service validation during deployment steps
On Jul 26, 2016 10:02 PM, "Emilien Macchi" wrote:
>
> I would love to hear some feedback about $topic, thanks.

Ian, Rhys, Jacob - this sounds like a big step in the right direction to me,
what do you guys think?

-Hugh

>
> On Fri, Jul 15, 2016 at 11:31 AM, Emilien Macchi wrote:
> > Hi,
> >
> > Some people in the field brought interesting feedback:
> >
> > "As a TripleO User, I would like the deployment to stop immediately
> > after a resource creation failure during a step of the deployment and
> > be able to easily understand what service or resource failed to be
> > installed".
> >
> > Example:
> > If during step4 Puppet tries to deploy Neutron and OVS, but OVS fails
> > to start for some reason, deployment should stop at the end of the
> > step.
> >
> > So there are 2 things in this user story:
> >
> > 1) Be able to run some service validation within a step deployment.
> > Note about the implementation: make the validation composable per
> > service (OVS, nova, etc) and not per role (compute, controller, etc).
> >
> > 2) Make this information readable and easy to access and understand
> > for our users.
> >
> > I have a proof-of-concept for 1) and partially 2), with the example of
> > OVS: https://review.openstack.org/#/c/342202/
> > This patch will make sure OVS is actually usable at step 4 by running
> > 'ovs-vsctl show' during the Puppet catalog run and if it's working, it
> > will create a Puppet anchor. This anchor is currently not useful but
> > could be in the future if we want to rely on it for orchestration.
> > I wrote the service validation in Puppet 2 years ago when doing Spinal
> > Stack with eNovance:
> > https://github.com/openstack/puppet-openstacklib/blob/master/manifests/service_validation.pp
> > I think we could re-use it very easily, it has been proven to work.
> > Also, the code is within our Puppet profiles, so it's by design
> > composable and we don't need to make any connection with our current
> > services with some magic. Validation will reside within Puppet
> > manifests.
> > If you look at my PoC, this code could even live in puppet-vswitch
> > itself (we already have this code for puppet-nova, and some others).
> >
> > Ok now, what if validation fails?
> > I'm testing it here: https://review.openstack.org/#/c/342205/
> > If you look at /var/log/messages, you'll see:
> >
> > Error: /Stage[main]/Tripleo::Profile::Base::Neutron::Ovs/Openstacklib::Service_validation[openvswitch]/Exec[execute openvswitch validation]/returns: change from notrun to 0 failed
> >
> > So it's pretty clear by looking at logs that openvswitch service
> > validation failed and something is wrong. You'll also notice in the
> > logs that deployment stopped at step 4 since OVS is not considered to
> > be running.
> > It's partially addressing 2) because we need to make it more explicit
> > and readable. Dan Prince had the idea to use
> > https://github.com/ripienaar/puppet-reportprint to print a nice report
> > of the Puppet catalog result (we haven't tried it yet). We could also
> > use Operational Tools later to monitor Puppet logs and find Service
> > validation failures.
> >
> > So this email is a bootstrap of discussion, it's open for feedback.
> > Don't take my PoC as something we'll implement. It's an idea and I
> > think it's worth looking at.
> > I like it for 2 reasons:
> > - the validation code resides within our profiles, so it's composable
> > by design.
> > - it's flexible and allows us to test everything. It can be a bash
> > script, a shell command, a Puppet resource (provider, service, etc).
> >
> > Thanks for reading so far,
> > --
> > Emilien Macchi
>
> --
> Emilien Macchi

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [tripleo] service validation during deployment steps
I would love to hear some feedback about $topic, thanks.

On Fri, Jul 15, 2016 at 11:31 AM, Emilien Macchi wrote:
> Hi,
>
> Some people in the field brought interesting feedback:
>
> "As a TripleO User, I would like the deployment to stop immediately
> after a resource creation failure during a step of the deployment and
> be able to easily understand what service or resource failed to be
> installed".
>
> Example:
> If during step4 Puppet tries to deploy Neutron and OVS, but OVS fails
> to start for some reason, deployment should stop at the end of the
> step.
>
> So there are 2 things in this user story:
>
> 1) Be able to run some service validation within a step deployment.
> Note about the implementation: make the validation composable per
> service (OVS, nova, etc) and not per role (compute, controller, etc).
>
> 2) Make this information readable and easy to access and understand
> for our users.
>
> I have a proof-of-concept for 1) and partially 2), with the example of
> OVS: https://review.openstack.org/#/c/342202/
> This patch will make sure OVS is actually usable at step 4 by running
> 'ovs-vsctl show' during the Puppet catalog run and if it's working, it
> will create a Puppet anchor. This anchor is currently not useful but
> could be in the future if we want to rely on it for orchestration.
> I wrote the service validation in Puppet 2 years ago when doing Spinal
> Stack with eNovance:
> https://github.com/openstack/puppet-openstacklib/blob/master/manifests/service_validation.pp
> I think we could re-use it very easily, it has been proven to work.
> Also, the code is within our Puppet profiles, so it's by design
> composable and we don't need to make any connection with our current
> services with some magic. Validation will reside within Puppet
> manifests.
> If you look at my PoC, this code could even live in puppet-vswitch
> itself (we already have this code for puppet-nova, and some others).
>
> Ok now, what if validation fails?
> I'm testing it here: https://review.openstack.org/#/c/342205/
> If you look at /var/log/messages, you'll see:
>
> Error: /Stage[main]/Tripleo::Profile::Base::Neutron::Ovs/Openstacklib::Service_validation[openvswitch]/Exec[execute openvswitch validation]/returns: change from notrun to 0 failed
>
> So it's pretty clear by looking at logs that openvswitch service
> validation failed and something is wrong. You'll also notice in the
> logs that deployment stopped at step 4 since OVS is not considered to
> be running.
> It's partially addressing 2) because we need to make it more explicit
> and readable. Dan Prince had the idea to use
> https://github.com/ripienaar/puppet-reportprint to print a nice report
> of the Puppet catalog result (we haven't tried it yet). We could also
> use Operational Tools later to monitor Puppet logs and find Service
> validation failures.
>
> So this email is a bootstrap of discussion, it's open for feedback.
> Don't take my PoC as something we'll implement. It's an idea and I
> think it's worth looking at.
> I like it for 2 reasons:
> - the validation code resides within our profiles, so it's composable
> by design.
> - it's flexible and allows us to test everything. It can be a bash
> script, a shell command, a Puppet resource (provider, service, etc).
>
> Thanks for reading so far,
> --
> Emilien Macchi

--
Emilien Macchi
[openstack-dev] [tripleo] service validation during deployment steps
Hi,

Some people in the field brought interesting feedback:

"As a TripleO User, I would like the deployment to stop immediately
after a resource creation failure during a step of the deployment and
be able to easily understand what service or resource failed to be
installed".

Example:
If during step4 Puppet tries to deploy Neutron and OVS, but OVS fails
to start for some reason, deployment should stop at the end of the
step.

So there are 2 things in this user story:

1) Be able to run some service validation within a step deployment.
Note about the implementation: make the validation composable per
service (OVS, nova, etc) and not per role (compute, controller, etc).

2) Make this information readable and easy to access and understand
for our users.

I have a proof-of-concept for 1) and partially 2), with the example of
OVS: https://review.openstack.org/#/c/342202/
This patch will make sure OVS is actually usable at step 4 by running
'ovs-vsctl show' during the Puppet catalog run and if it's working, it
will create a Puppet anchor. This anchor is currently not useful but
could be in the future if we want to rely on it for orchestration.
I wrote the service validation in Puppet 2 years ago when doing Spinal
Stack with eNovance:
https://github.com/openstack/puppet-openstacklib/blob/master/manifests/service_validation.pp
I think we could re-use it very easily, it has been proven to work.
Also, the code is within our Puppet profiles, so it's by design
composable and we don't need to make any connection with our current
services with some magic. Validation will reside within Puppet
manifests.
If you look at my PoC, this code could even live in puppet-vswitch
itself (we already have this code for puppet-nova, and some others).

Ok now, what if validation fails?
I'm testing it here: https://review.openstack.org/#/c/342205/
If you look at /var/log/messages, you'll see:

Error: /Stage[main]/Tripleo::Profile::Base::Neutron::Ovs/Openstacklib::Service_validation[openvswitch]/Exec[execute openvswitch validation]/returns: change from notrun to 0 failed

So it's pretty clear by looking at logs that openvswitch service
validation failed and something is wrong. You'll also notice in the
logs that deployment stopped at step 4 since OVS is not considered to
be running.
It's partially addressing 2) because we need to make it more explicit
and readable. Dan Prince had the idea to use
https://github.com/ripienaar/puppet-reportprint to print a nice report
of the Puppet catalog result (we haven't tried it yet). We could also
use Operational Tools later to monitor Puppet logs and find Service
validation failures.

So this email is a bootstrap of discussion, it's open for feedback.
Don't take my PoC as something we'll implement. It's an idea and I
think it's worth looking at.
I like it for 2 reasons:
- the validation code resides within our profiles, so it's composable
by design.
- it's flexible and allows us to test everything. It can be a bash
script, a shell command, a Puppet resource (provider, service, etc).

Thanks for reading so far,
--
Emilien Macchi
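
For anyone who wants to see the shape of the idea outside Puppet, here is a
minimal sketch in Python of the per-service, per-step validation described in
the original message. It is not the PoC and not the
openstacklib::service_validation code. The names ServiceValidation,
run_validation, validate_step and ValidationFailed are hypothetical, and the
retry settings (tries, try_sleep) are assumptions made for illustration. The
only thing taken from the thread is the example check, 'ovs-vsctl show' at
step 4, and the requirement that a failure stops the step and names the
service.

```python
import subprocess
import time
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ServiceValidation:
    """One check attached to a service (not to a role). Hypothetical type."""
    service: str
    command: Sequence[str]
    step: int
    tries: int = 3          # assumed retry count, for illustration only
    try_sleep: float = 0.0  # assumed pause between attempts, in seconds


class ValidationFailed(Exception):
    """Raised when a service check fails; the message names service and step."""


def run_validation(check: ServiceValidation) -> None:
    """Run the check up to `tries` times; raise if every attempt fails."""
    last_error = ""
    for attempt in range(1, check.tries + 1):
        try:
            result = subprocess.run(
                list(check.command), capture_output=True, text=True
            )
        except OSError as exc:
            # For example, the validation binary is not installed.
            last_error = str(exc)
        else:
            if result.returncode == 0:
                return
            last_error = result.stderr.strip() or f"exit code {result.returncode}"
        if attempt < check.tries:
            time.sleep(check.try_sleep)
    raise ValidationFailed(
        f"step {check.step}: validation of service '{check.service}' failed "
        f"after {check.tries} attempt(s): {last_error}"
    )


def validate_step(step: int, checks: Sequence[ServiceValidation]) -> List[str]:
    """Run the checks registered for `step`, stopping at the first failure."""
    validated = []
    for check in checks:
        if check.step == step:
            run_validation(check)
            validated.append(check.service)
    return validated
```

With this shape, the OVS case from the thread would be registered as
ServiceValidation("openvswitch", ["ovs-vsctl", "show"], step=4) and run with
validate_step(4, checks). Because each check belongs to a service, a role only
runs the checks of the services it actually carries, and the exception text
gives the user the failing service and step directly.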