Re: [openstack-dev] [tripleo] service validation during deployment steps

2016-08-02 Thread Arkady_Kanevsky
What about the ability for a service expert to plug in a remediation module?
If the remediation action succeeds, proceed; if not, stop.
The remediation module can be extended independently of the main flow.
Thanks,
Arkady
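
A minimal sketch of the plug-in idea above, in Python purely for
illustration. The names Check and run_with_remediation, and the callables
passed to them, are hypothetical and not part of TripleO or any Puppet
module; the only point is that the remediation hook is supplied separately
from the main flow.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Check:
    """A service validation with an optional, separately supplied remediation."""
    service: str
    validate: Callable[[], bool]
    remediate: Optional[Callable[[], bool]] = None  # plugged in by a service expert


def run_with_remediation(check: Check) -> bool:
    """Return True to proceed with the deployment, False to stop."""
    if check.validate():
        return True
    if check.remediate is None:
        return False
    # The remediation must report success and the validation must pass afterwards.
    return check.remediate() and check.validate()
```

With this shape, a new remediation can be registered for one service
without touching the code that drives the deployment steps.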

-----Original Message-----
From: Steven Hardy [mailto:sha...@redhat.com]
Sent: Wednesday, July 27, 2016 3:26 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [tripleo] service validation during deployment steps

Hi Emilien,

On Tue, Jul 26, 2016 at 03:59:33PM -0400, Emilien Macchi wrote:
> I would love to hear some feedback about $topic, thanks.

Sorry for the slow response, we did discuss this on IRC, but providing that
feedback and some other comments below:

> On Fri, Jul 15, 2016 at 11:31 AM, Emilien Macchi wrote:
> > Hi,
> >
> > Some people on the field brought interesting feedback:
> >
> > "As a TripleO User, I would like the deployment to stop immediately
> > after an resource creation failure during a step of the deployment
> > and be able to easily understand what service or resource failed to
> > be installed".
> >
> > Example:
> > If during step4 Puppet tries to deploy Neutron and OVS, but OVS
> > fails to start for some reasons, deployment should stop at the end
> > of the step.

I don't think anyone will argue against this use-case, we absolutely want to 
enable a better "fail fast" for deployment problems, as well as better 
surfacing of why it failed.

> > So there are 2 things in this user story:
> >
> > 1) Be able to run some service validation within a step deployment.
> > Note about the implementation: make the validation composable per
> > service (OVS, nova, etc) and not per role (compute, controller, etc).

+1, now we have composable services we need any validations to be
associated with the services, not the roles.

That said, it's fairly easy to imagine an interface like 
step_config/config_settings could be used to wire in composable service 
validations on a per-role basis, e.g similar to what we do here, but
per-step:

https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud.yaml#L1144

Similar to what was proposed (but never merged) here:

https://review.openstack.org/#/c/174150/15/puppet/controller-post-puppet.yaml

> > 2) Make this information readable and easy to access and understand
> > for our users.
> >
> > I have a proof-of-concept for 1) and partially 2), with the example
> > of
> > OVS: https://review.openstack.org/#/c/342202/
> > This patch will make sure OVS is actually usable at step 4 by
> > running 'ovs-vsctl show' during the Puppet catalog and if it's
> > working, it will create a Puppet anchor. This anchor is currently
> > not useful but could be in future if we want to rely on it for 
> > orchestration.
> > I wrote the service validation in Puppet 2 years ago when doing
> > Spinal Stack with eNovance:
> > https://github.com/openstack/puppet-openstacklib/blob/master/manifests/service_validation.pp
> > I think we could re-use it very easily, it has been proven to work.
> > Also, the code is within our Puppet profiles, so it's by design
> > composable and we don't need to make any connection with our current
> > services with some magic. Validation will reside within Puppet
> > manifests.
> > If you look my PoC, this code could even live in puppet-vswitch
> > itself (we already have this code for puppet-nova, and some others).

I think having the validations inside the puppet implementation is OK, but 
ideally I think we do want it to be part of the puppet modules themselves (not 
part of the puppet-tripleo abstraction layer).

The issue I'd have with putting it in puppet-tripleo is that if we're going to 
do this in a tripleo specific way, it should probably be done via a method 
that's more config tool agnostic. Otherwise we'll have to recreate the same 
validations for future implementations (I'm thinking specifically about 
containers here, and possibly ansible[1]).

So, in summary, I'm +1 on getting this integrated if it can be done with little 
overhead and it's something we can leverage via the puppet modules vs 
puppet-tripleo.

> >
> > Ok now, what if validation fails?
> > I'm testing it here: https://review.openstack.org/#/c/342205/
> > If you look at /var/log/messages, you'll see:
> >
> > Error:
> > /Stage[main]/Tripleo::Profile::Base::Neutron::Ovs/Openstacklib::Service_validation[openvswitch]/Exec[execute
> > openvswitch validation]/returns: change from notrun to 0 failed
> >
> > So it's pretty clear by looking at logs that openvswitch service
> > validation failed and something is wrong. You'll also notice in the
> > logs tha

Re: [openstack-dev] [tripleo] service validation during deployment steps

2016-07-29 Thread Emilien Macchi
On Wed, Jul 27, 2016 at 4:25 AM, Steven Hardy wrote:
> Hi Emilien,
>
> On Tue, Jul 26, 2016 at 03:59:33PM -0400, Emilien Macchi wrote:
>> I would love to hear some feedback about $topic, thanks.
>
> Sorry for the slow response, we did discuss this on IRC, but providing that
> feedback and some other comments below:
>
>> On Fri, Jul 15, 2016 at 11:31 AM, Emilien Macchi wrote:
>> > Hi,
>> >
>> > Some people on the field brought interesting feedback:
>> >
>> > "As a TripleO User, I would like the deployment to stop immediately
>> > after an resource creation failure during a step of the deployment and
>> > be able to easily understand what service or resource failed to be
>> > installed".
>> >
>> > Example:
>> > If during step4 Puppet tries to deploy Neutron and OVS, but OVS fails
>> > to start for some reasons, deployment should stop at the end of the
>> > step.
>
> I don't think anyone will argue against this use-case, we absolutely want
> to enable a better "fail fast" for deployment problems, as well as better
> surfacing of why it failed.
>
>> > So there are 2 things in this user story:
>> >
>> > 1) Be able to run some service validation within a step deployment.
>> > Note about the implementation: make the validation composable per
>> > service (OVS, nova, etc) and not per role (compute, controller, etc).
>
> +1, now we have composable services we need any validations to be
> associated with the services, not the roles.
>
> That said, it's fairly easy to imagine an interface like
> step_config/config_settings could be used to wire in composable service
> validations on a per-role basis, e.g similar to what we do here, but
> per-step:
>
> https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud.yaml#L1144
>
> Similar to what was proposed (but never merged) here:
>
> https://review.openstack.org/#/c/174150/15/puppet/controller-post-puppet.yaml
>
>> > 2) Make this information readable and easy to access and understand
>> > for our users.
>> >
>> > I have a proof-of-concept for 1) and partially 2), with the example of
>> > OVS: https://review.openstack.org/#/c/342202/
>> > This patch will make sure OVS is actually usable at step 4 by running
>> > 'ovs-vsctl show' during the Puppet catalog and if it's working, it
>> > will create a Puppet anchor. This anchor is currently not useful but
>> > could be in future if we want to rely on it for orchestration.
>> > I wrote the service validation in Puppet 2 years ago when doing Spinal
>> > Stack with eNovance:
>> > https://github.com/openstack/puppet-openstacklib/blob/master/manifests/service_validation.pp
>> > I think we could re-use it very easily, it has been proven to work.
>> > Also, the code is within our Puppet profiles, so it's by design
>> > composable and we don't need to make any connection with our current
>> > services with some magic. Validation will reside within Puppet
>> > manifests.
>> > If you look my PoC, this code could even live in puppet-vswitch itself
>> > (we already have this code for puppet-nova, and some others).
>
> I think having the validations inside the puppet implementation is OK, but
> ideally I think we do want it to be part of the puppet modules themselves
> (not part of the puppet-tripleo abstraction layer).
>
> The issue I'd have with putting it in puppet-tripleo is that if we're going
> to do this in a tripleo specific way, it should probably be done via a
> method that's more config tool agnostic.  Otherwise we'll have to recreate
> the same validations for future implementations (I'm thinking specifically
> about containers here, and possibly ansible[1]).
>
> So, in summary, I'm +1 on getting this integrated if it can be done with
> little overhead and it's something we can leverage via the puppet modules
> vs puppet-tripleo.
>
>> >
>> > Ok now, what if validation fails?
>> > I'm testing it here: https://review.openstack.org/#/c/342205/
>> > If you look at /var/log/messages, you'll see:
>> >
>> > Error: 
>> > /Stage[main]/Tripleo::Profile::Base::Neutron::Ovs/Openstacklib::Service_validation[openvswitch]/Exec[execute
>> > openvswitch validation]/returns: change from notrun to 0 failed
>> >
>> > So it's pretty clear by looking at logs that openvswitch service
>> > validation failed and something is wrong. You'll also notice in the
>> > logs that deployed stopped at step 4 since OVS is not considered to
>> > run.
>> > It's partially addressing 2) because we need to make it more explicit
>> > and readable. Dan Prince had the idea to use
>> > https://github.com/ripienaar/puppet-reportprint to print a nice report
>> > of Puppet catalog result (we haven't tried it yet). We could also use
>> > Operational Tools later to monitor Puppet logs and find Service
>> > validation failures.
>
> This all sounds good, but we do need to think beyond the puppet
> implementation, e.g how will we enable similar validations in a container
> based deployment?
>
> I remember SpinalStack also 

Re: [openstack-dev] [tripleo] service validation during deployment steps

2016-07-27 Thread Steven Hardy
Hi Emilien,

On Tue, Jul 26, 2016 at 03:59:33PM -0400, Emilien Macchi wrote:
> I would love to hear some feedback about $topic, thanks.

Sorry for the slow response. We did discuss this on IRC, but I'm providing
that feedback and some other comments below:

> On Fri, Jul 15, 2016 at 11:31 AM, Emilien Macchi wrote:
> > Hi,
> >
> > Some people on the field brought interesting feedback:
> >
> > "As a TripleO User, I would like the deployment to stop immediately
> > after an resource creation failure during a step of the deployment and
> > be able to easily understand what service or resource failed to be
> > installed".
> >
> > Example:
> > If during step4 Puppet tries to deploy Neutron and OVS, but OVS fails
> > to start for some reasons, deployment should stop at the end of the
> > step.

I don't think anyone will argue against this use case; we absolutely want
to enable a better "fail fast" for deployment problems, as well as better
surfacing of why it failed.

> > So there are 2 things in this user story:
> >
> > 1) Be able to run some service validation within a step deployment.
> > Note about the implementation: make the validation composable per
> > service (OVS, nova, etc) and not per role (compute, controller, etc).

+1, now that we have composable services, we need any validations to be
associated with the services, not the roles.
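
To illustrate why attaching validations to services composes naturally,
here is a small Python sketch. The role layout and the nova check command
are assumptions made up for the example (only 'ovs-vsctl show' comes from
the PoC description); none of this is actual tripleo-heat-templates
content.

```python
from typing import Dict, List

# Validations are keyed by service, never by role.
SERVICE_VALIDATIONS: Dict[str, List[str]] = {
    "ovs": ["ovs-vsctl show"],           # command mentioned in the PoC
    "nova": ["nova-placeholder-check"],  # hypothetical placeholder command
}

# Hypothetical role definitions: a role is just a list of services.
ROLES: Dict[str, List[str]] = {
    "controller": ["ovs", "nova"],
    "compute": ["ovs"],
}


def validations_for_role(role: str) -> List[str]:
    """Derive a role's validations from the services it is composed of."""
    commands: List[str] = []
    for service in ROLES[role]:
        commands.extend(SERVICE_VALIDATIONS.get(service, []))
    return commands
```

Moving a service to another role then carries its validations along with
it, with no per-role validation list to keep in sync.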

That said, it's fairly easy to imagine that an interface like
step_config/config_settings could be used to wire in composable service
validations on a per-role basis, e.g. similar to what we do here, but
per-step:

https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud.yaml#L1144

Similar to what was proposed (but never merged) here:

https://review.openstack.org/#/c/174150/15/puppet/controller-post-puppet.yaml

> > 2) Make this information readable and easy to access and understand
> > for our users.
> >
> > I have a proof-of-concept for 1) and partially 2), with the example of
> > OVS: https://review.openstack.org/#/c/342202/
> > This patch will make sure OVS is actually usable at step 4 by running
> > 'ovs-vsctl show' during the Puppet catalog and if it's working, it
> > will create a Puppet anchor. This anchor is currently not useful but
> > could be in future if we want to rely on it for orchestration.
> > I wrote the service validation in Puppet 2 years ago when doing Spinal
> > Stack with eNovance:
> > https://github.com/openstack/puppet-openstacklib/blob/master/manifests/service_validation.pp
> > I think we could re-use it very easily, it has been proven to work.
> > Also, the code is within our Puppet profiles, so it's by design
> > composable and we don't need to make any connection with our current
> > services with some magic. Validation will reside within Puppet
> > manifests.
> > If you look my PoC, this code could even live in puppet-vswitch itself
> > (we already have this code for puppet-nova, and some others).

I think having the validations inside the puppet implementation is OK, but
ideally I think we do want it to be part of the puppet modules themselves
(not part of the puppet-tripleo abstraction layer).

The issue I'd have with putting it in puppet-tripleo is that if we're going
to do this in a TripleO-specific way, it should probably be done via a
method that's more config-tool agnostic.  Otherwise we'll have to recreate
the same validations for future implementations (I'm thinking specifically
about containers here, and possibly ansible[1]).

So, in summary, I'm +1 on getting this integrated if it can be done with
little overhead and it's something we can leverage via the puppet modules
vs puppet-tripleo.

> >
> > Ok now, what if validation fails?
> > I'm testing it here: https://review.openstack.org/#/c/342205/
> > If you look at /var/log/messages, you'll see:
> >
> > Error: 
> > /Stage[main]/Tripleo::Profile::Base::Neutron::Ovs/Openstacklib::Service_validation[openvswitch]/Exec[execute
> > openvswitch validation]/returns: change from notrun to 0 failed
> >
> > So it's pretty clear by looking at logs that openvswitch service
> > validation failed and something is wrong. You'll also notice in the
> > logs that deployed stopped at step 4 since OVS is not considered to
> > run.
> > It's partially addressing 2) because we need to make it more explicit
> > and readable. Dan Prince had the idea to use
> > https://github.com/ripienaar/puppet-reportprint to print a nice report
> > of Puppet catalog result (we haven't tried it yet). We could also use
> > Operational Tools later to monitor Puppet logs and find Service
> > validation failures.

This all sounds good, but we do need to think beyond the puppet
implementation, e.g. how will we enable similar validations in a
container-based deployment?

I remember SpinalStack also used serverspec. Can you describe how using
that tool differs from this approach (was it only used for post-deploy
validation of the whole server, not per-step validation)?

I'm just wondering if the overhead of 

Re: [openstack-dev] [tripleo] service validation during deployment steps

2016-07-27 Thread Hugh Brock
On Jul 26, 2016 10:02 PM, "Emilien Macchi" wrote:
>
> I would love to hear some feedback about $topic, thanks.

Ian, Rhys, Jacob - this sounds like a big step in the right direction to
me, what do you guys think?

-Hugh

>
> On Fri, Jul 15, 2016 at 11:31 AM, Emilien Macchi wrote:
> > Hi,
> >
> > Some people on the field brought interesting feedback:
> >
> > "As a TripleO User, I would like the deployment to stop immediately
> > after an resource creation failure during a step of the deployment and
> > be able to easily understand what service or resource failed to be
> > installed".
> >
> > Example:
> > If during step4 Puppet tries to deploy Neutron and OVS, but OVS fails
> > to start for some reasons, deployment should stop at the end of the
> > step.
> >
> > So there are 2 things in this user story:
> >
> > 1) Be able to run some service validation within a step deployment.
> > Note about the implementation: make the validation composable per
> > service (OVS, nova, etc) and not per role (compute, controller, etc).
> >
> > 2) Make this information readable and easy to access and understand
> > for our users.
> >
> > I have a proof-of-concept for 1) and partially 2), with the example of
> > OVS: https://review.openstack.org/#/c/342202/
> > This patch will make sure OVS is actually usable at step 4 by running
> > 'ovs-vsctl show' during the Puppet catalog and if it's working, it
> > will create a Puppet anchor. This anchor is currently not useful but
> > could be in future if we want to rely on it for orchestration.
> > I wrote the service validation in Puppet 2 years ago when doing Spinal
> > Stack with eNovance:
> > https://github.com/openstack/puppet-openstacklib/blob/master/manifests/service_validation.pp
> > I think we could re-use it very easily, it has been proven to work.
> > Also, the code is within our Puppet profiles, so it's by design
> > composable and we don't need to make any connection with our current
> > services with some magic. Validation will reside within Puppet
> > manifests.
> > If you look my PoC, this code could even live in puppet-vswitch itself
> > (we already have this code for puppet-nova, and some others).
> >
> > Ok now, what if validation fails?
> > I'm testing it here: https://review.openstack.org/#/c/342205/
> > If you look at /var/log/messages, you'll see:
> >
> > Error:
> > /Stage[main]/Tripleo::Profile::Base::Neutron::Ovs/Openstacklib::Service_validation[openvswitch]/Exec[execute
> > openvswitch validation]/returns: change from notrun to 0 failed
> >
> > So it's pretty clear by looking at logs that openvswitch service
> > validation failed and something is wrong. You'll also notice in the
> > logs that deployed stopped at step 4 since OVS is not considered to
> > run.
> > It's partially addressing 2) because we need to make it more explicit
> > and readable. Dan Prince had the idea to use
> > https://github.com/ripienaar/puppet-reportprint to print a nice report
> > of Puppet catalog result (we haven't tried it yet). We could also use
> > Operational Tools later to monitor Puppet logs and find Service
> > validation failures.
> >
> >
> > So this email is a bootstrap of discussion, it's open for feedback.
> > Don't take my PoC as something we'll implement. It's an idea and I
> > think it's worth to look at it.
> > I like it for 2 reasons:
> > - the validation code reside within our profiles, so it's composable by
> > design.
> > - it's flexible and allow us to test everything. It can be a bash
> > script, a shell command, a Puppet resource (provider, service, etc).
> >
> > Thanks for reading so far,
> > --
> > Emilien Macchi
>
>
>
> --
> Emilien Macchi
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] service validation during deployment steps

2016-07-26 Thread Emilien Macchi
I would love to hear some feedback about $topic, thanks.

On Fri, Jul 15, 2016 at 11:31 AM, Emilien Macchi wrote:
> Hi,
>
> Some people on the field brought interesting feedback:
>
> "As a TripleO User, I would like the deployment to stop immediately
> after an resource creation failure during a step of the deployment and
> be able to easily understand what service or resource failed to be
> installed".
>
> Example:
> If during step4 Puppet tries to deploy Neutron and OVS, but OVS fails
> to start for some reasons, deployment should stop at the end of the
> step.
>
> So there are 2 things in this user story:
>
> 1) Be able to run some service validation within a step deployment.
> Note about the implementation: make the validation composable per
> service (OVS, nova, etc) and not per role (compute, controller, etc).
>
> 2) Make this information readable and easy to access and understand
> for our users.
>
> I have a proof-of-concept for 1) and partially 2), with the example of
> OVS: https://review.openstack.org/#/c/342202/
> This patch will make sure OVS is actually usable at step 4 by running
> 'ovs-vsctl show' during the Puppet catalog and if it's working, it
> will create a Puppet anchor. This anchor is currently not useful but
> could be in future if we want to rely on it for orchestration.
> I wrote the service validation in Puppet 2 years ago when doing Spinal
> Stack with eNovance:
> https://github.com/openstack/puppet-openstacklib/blob/master/manifests/service_validation.pp
> I think we could re-use it very easily, it has been proven to work.
> Also, the code is within our Puppet profiles, so it's by design
> composable and we don't need to make any connection with our current
> services with some magic. Validation will reside within Puppet
> manifests.
> If you look my PoC, this code could even live in puppet-vswitch itself
> (we already have this code for puppet-nova, and some others).
>
> Ok now, what if validation fails?
> I'm testing it here: https://review.openstack.org/#/c/342205/
> If you look at /var/log/messages, you'll see:
>
> Error: 
> /Stage[main]/Tripleo::Profile::Base::Neutron::Ovs/Openstacklib::Service_validation[openvswitch]/Exec[execute
> openvswitch validation]/returns: change from notrun to 0 failed
>
> So it's pretty clear by looking at logs that openvswitch service
> validation failed and something is wrong. You'll also notice in the
> logs that deployed stopped at step 4 since OVS is not considered to
> run.
> It's partially addressing 2) because we need to make it more explicit
> and readable. Dan Prince had the idea to use
> https://github.com/ripienaar/puppet-reportprint to print a nice report
> of Puppet catalog result (we haven't tried it yet). We could also use
> Operational Tools later to monitor Puppet logs and find Service
> validation failures.
>
>
> So this email is a bootstrap of discussion, it's open for feedback.
> Don't take my PoC as something we'll implement. It's an idea and I
> think it's worth to look at it.
> I like it for 2 reasons:
> - the validation code reside within our profiles, so it's composable by 
> design.
> - it's flexible and allow us to test everything. It can be a bash
> script, a shell command, a Puppet resource (provider, service, etc).
>
> Thanks for reading so far,
> --
> Emilien Macchi



-- 
Emilien Macchi



[openstack-dev] [tripleo] service validation during deployment steps

2016-07-15 Thread Emilien Macchi
Hi,

Some people in the field brought interesting feedback:

"As a TripleO User, I would like the deployment to stop immediately
after a resource creation failure during a step of the deployment and
be able to easily understand what service or resource failed to be
installed".

Example:
If during step4 Puppet tries to deploy Neutron and OVS, but OVS fails
to start for some reason, deployment should stop at the end of the
step.

So there are 2 things in this user story:

1) Be able to run some service validation within a deployment step.
Note about the implementation: make the validation composable per
service (OVS, nova, etc.) and not per role (compute, controller, etc.).

2) Make this information readable and easy to access and understand
for our users.
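
As a rough illustration of these two points, the following Python sketch
runs every service validation registered for a step and stops the
deployment at the end of that step if any of them failed, naming the
services that failed. StepFailed, run_step and deploy are hypothetical
names used only for this example; the PoC itself does this in Puppet.

```python
from typing import Callable, Dict, List

Validations = Dict[str, Callable[[], bool]]


class StepFailed(Exception):
    """Raised at the end of a step when at least one service validation failed."""

    def __init__(self, step: int, failed: List[str]):
        self.step = step
        self.failed = failed
        super().__init__(f"step {step} failed validation for: {', '.join(failed)}")


def run_step(step: int, validations: Validations) -> None:
    """Run all validations for one step, then stop if any of them failed."""
    failed = [service for service, check in validations.items() if not check()]
    if failed:
        raise StepFailed(step, failed)


def deploy(steps: Dict[int, Validations]) -> int:
    """Run steps in order; return the number of the last step that completed."""
    last_completed = 0
    for step in sorted(steps):
        run_step(step, steps[step])
        last_completed = step
    return last_completed
```

Every validation in the failing step still runs before the exception is
raised, so the message lists all failed services of that step rather than
only the first one.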

I have a proof-of-concept for 1) and partially 2), with the example of
OVS: https://review.openstack.org/#/c/342202/
This patch will make sure OVS is actually usable at step 4 by running
'ovs-vsctl show' during the Puppet catalog run; if that works, it
creates a Puppet anchor. This anchor is currently not useful but
could be in the future if we want to rely on it for orchestration.
I wrote the service validation in Puppet 2 years ago when doing Spinal
Stack with eNovance:
https://github.com/openstack/puppet-openstacklib/blob/master/manifests/service_validation.pp
I think we could re-use it very easily; it has been proven to work.
Also, the code is within our Puppet profiles, so it's composable by
design and we don't need any magic to connect it to our current
services. Validation will reside within Puppet manifests.
If you look at my PoC, this code could even live in puppet-vswitch itself
(we already have this code for puppet-nova, and some others).

Ok now, what if validation fails?
I'm testing it here: https://review.openstack.org/#/c/342205/
If you look at /var/log/messages, you'll see:

Error: 
/Stage[main]/Tripleo::Profile::Base::Neutron::Ovs/Openstacklib::Service_validation[openvswitch]/Exec[execute
openvswitch validation]/returns: change from notrun to 0 failed

So it's pretty clear from the logs that the openvswitch service
validation failed and something is wrong. You'll also notice in the
logs that the deployment stopped at step 4, since OVS is not considered
to be running.
This only partially addresses 2), because we still need to make the
failure more explicit and readable. Dan Prince had the idea to use
https://github.com/ripienaar/puppet-reportprint to print a nice report
of the Puppet catalog results (we haven't tried it yet). We could also
use Operational Tools later to monitor Puppet logs and find service
validation failures.
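
To give an idea of what monitoring the logs for validation failures could
look like, here is a small Python sketch that pulls the failing service
name out of error lines shaped like the one shown above. It assumes each
error is on a single log line (the wrapping above is from this email), and
failed_validations is a hypothetical helper, not an existing tool.

```python
import re
from typing import Iterable, List

# Matches the resource path from the error above and captures the service name.
_FAILURE = re.compile(
    r"Error:.*Openstacklib::Service_validation\[([^\]]+)\].*failed"
)


def failed_validations(log_lines: Iterable[str]) -> List[str]:
    """Return the names of services whose validation failed, in log order."""
    failed: List[str] = []
    for line in log_lines:
        match = _FAILURE.search(line)
        if match and match.group(1) not in failed:
            failed.append(match.group(1))
    return failed
```

A report built from this list would answer the "what service failed"
question directly, instead of asking the user to read the resource path.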


So this email is meant to bootstrap the discussion; it's open for feedback.
Don't take my PoC as something we'll implement. It's an idea and I
think it's worth looking at.
I like it for 2 reasons:
- the validation code resides within our profiles, so it's composable by design.
- it's flexible and allows us to test everything. It can be a bash
script, a shell command, a Puppet resource (provider, service, etc.).

Thanks for reading so far,
-- 
Emilien Macchi
