Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-17 Thread Dmitry Tantsur

On 07/12/2017 04:18 AM, Steve Baker wrote:



On Wed, Jul 12, 2017 at 11:47 AM, James Slagle > wrote:


On Tue, Jul 11, 2017 at 6:53 PM, Steve Baker > wrote:
 >
 >
 > On Tue, Jul 11, 2017 at 6:51 AM, James Slagle >
 > wrote:
 >>
 >> On Mon, Jul 10, 2017 at 11:37 AM, Lars Kellogg-Stedman >
 >> wrote:
 >> > On Fri, Jul 7, 2017 at 1:50 PM, James Slagle >
 >> > wrote:
 >> >>
 >> >> There are also some ideas forming around pulling the Ansible playbooks
 >> >> and vars out of Heat so that they can be rerun (or run initially)
 >> >> independently from the Heat SoftwareDeployment delivery mechanism:
 >> >
 >> >
 >> > I think the closer we can come to "the operator runs ansible-playbook to
 >> > configure the overcloud" the better, but not because I think Ansible is
 >> > inherently a great tool: rather, I think the many layers of indirection
 >> > in our existing model make error reporting and diagnosis much more
 >> > complicated than it needs to be.  Combined with Puppet's "fail as late
 >> > as possible" model, this means that (a) operators waste time waiting for
 >> > a deployment that is ultimately going to fail but hasn't yet, and (b)
 >> > when it does fail, they need relatively intimate knowledge of our
 >> > deployment tools to backtrack through logs and find the root cause of
 >> > the failure.
 >> >
 >> > If we can offer a deployment mode that reduces the number of layers
 >> > between the operator and the actions being performed on the hosts I
 >> > think we would win on both fronts: faster failures and reporting errors
 >> > as close as possible to the actual problem will result in less
 >> > frustration across the board.
 >> >
 >> > I do like Steve's suggestion of a split model where Heat is responsible
 >> > for instantiating OpenStack resources while Ansible is used to perform
 >> > host configuration tasks.  Despite all the work done on Ansible's
 >> > OpenStack modules, they feel inflexible and frustrating to work with
 >> > when compared to Heat's state-aware, dependency ordered deployments.  A
 >> > solution that allows Heat to output configuration that can subsequently
 >> > be consumed by Ansible -- either running manually or perhaps via Mistral
 >> > for API-driven-deployments -- seems like an excellent goal.  Using Heat
 >> > as a "front-end" to the process means that we get to keep the parameter
 >> > validation and documentation that is missing in Ansible, while still
 >> > following the Unix philosophy of giving you enough rope to hang yourself
 >> > if you really want it.
 >>
 >> This is excellent input, thanks for providing it.
 >>
 >> I think it lends itself towards suggesting that we may like to pursue
 >> (again) adding native Ironic resources to Heat. If those were written
 >> in a way that also addressed some of the feedback about TripleO and
 >> the baremetal deployment side, then we could continue to get the
 >> advantages from Heat that you mention.
 >>
 >> My personal opinion to date is that Ansible's os_ironic* modules are
 >> superior in some ways to the Heat->Nova->Ironic model. However, just a
 >> Heat->Ironic model may work in a way that has the advantages of both.
 >
 >
 > I too would dearly like to get nova out of the picture. Our placement
 > needs mean the scheduler is something we need to work around, and it
 > discards basically all context for the operator when ironic can't deploy
 > for some reason.
 >
 > Whether we use a mistral workflow[1], a heat resource, or ansible
 > os_ironic, there will still need to be some python logic to build the
 > config drive ISO that injects the ssh keys and os-collect-config
 > bootstrap.
 >
 > Unfortunately ironic iPXE boot from iSCSI[2] doesn't support config-drive
 > (still?) so the only option to inject ssh keys is the nova ec2-metadata
 > service (or equivalent). I suspect if we can't make every ironic
 > deployment method support config-drive then we're stuck with nova.
 >
 > I don't have a strong preference for a heat resource vs mistral vs
 > ansible os_ironic, but given there is some python logic required anyway,
 > I would lean towards a heat resource. If the resource is general enough
 > we could propose it to heat upstream, otherwise we could carry it in
 > tripleo-common.
 >
 > Alternatively, we 

Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-12 Thread John Fulton
On Wed, Jul 12, 2017 at 2:04 AM, Giulio Fidente  wrote:

> On 07/12/2017 01:53 AM, James Slagle wrote:
>
>> On Tue, Jul 11, 2017 at 5:53 PM, Steve Baker  wrote:
>>
>>>
>>> 
>
>>> What would be nice is when a heat->mistral->ansible upgrade step fails,
>>> the
>>> operator is given an ansible-playbook command to run which skips
>>> directly to
>>> the failing step. This would dramatically reduce the debug cycle and also
>>> make it possible for the operator to automate any required fixes over
>>> every
>>> host in a role. This would likely mean rendering out ansible config
>>> files,
>>> playbooks, (and roles?) to the operator's working directory. What
>>> happens to
>>> these rendered files after deployment is an open question. Delete them?
>>> Encourage the operator to track them in source control?
>>>
>>
> interesting question, as long as we run playbooks from a filesystem, I
> suppose users can make customizations without "changing" anything in
> tripleo ... this is how we tested some of the ceph-ansible fixes!
>
> for upgrades we should maintain the tasks outside the templates to be able
> to do that though, assuming we want users to customize the upgrade tasks


I like this option too! Perhaps we could add a new task to a mistral
workflow that uses this approach to store the info that was generated
dynamically from Heat (e.g. inventory and extra-vars) somewhere (swift?)
and then make it easy for the user to get this info and run the playbook
manually. Kind of like a debug-on-error option with a dribble file [1] for
the deployer. Assuming they get it working again, and we have idempotence,
they should just be able to resume the deploy.
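A rough Python sketch of what such a dump task might produce (the file names, the playbook name, and the data shapes here are hypothetical; the real inventory and extra-vars would come from Heat via the Mistral execution context):

```python
import json
import os


def dump_debug_bundle(workdir, inventory, extra_vars,
                      playbook="deploy_steps_playbook.yaml"):
    """Write the Heat-generated inventory and extra-vars to a working
    directory and return the ansible-playbook command an operator could
    run to resume the deploy manually."""
    os.makedirs(workdir, exist_ok=True)
    inventory_path = os.path.join(workdir, "inventory.ini")
    vars_path = os.path.join(workdir, "extra_vars.json")
    # INI-style static inventory: one [group] section per role.
    with open(inventory_path, "w") as f:
        for group, hosts in inventory.items():
            f.write("[{}]\n".format(group))
            for host in hosts:
                f.write("{}\n".format(host))
            f.write("\n")
    with open(vars_path, "w") as f:
        json.dump(extra_vars, f, indent=2)
    # The returned command is what the dribble-file idea would surface
    # to the deployer on failure (-e @file is standard ansible syntax).
    return "ansible-playbook -i {} -e @{} {}".format(
        inventory_path, vars_path, playbook)
```

With idempotent playbooks, rerunning the returned command after fixing the failed host should be equivalent to resuming the Mistral-driven deploy.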

  John

[1]
https://ftp.gnu.org/old-gnu/Manuals/elisp-manual-20-2.5/html_chapter/elisp_18.html
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-12 Thread Giulio Fidente

On 07/12/2017 01:53 AM, James Slagle wrote:

On Tue, Jul 11, 2017 at 5:53 PM, Steve Baker  wrote:




[...]


I think it's important that we allow full support for both mistral-driven and
manually running playbooks. If there was no option to run ansible-playbook
directly then operators would miss one of the main benefits of using ansible
in the first place (which is leveraging their knowledge of inventory,
playbooks and roles to deploy things).


+1, I like this idea as well. If you have a few minutes could you
summarize it here:
https://etherpad.openstack.org/p/tripleo-ptg-queens-ansible


note that this is how option (3) currently operates; it runs an 
unmodified version of ceph-ansible installed on the undercloud, so on 
failure the user needs to look for the mistral task that triggered the 
playbook and rerun the command


what it misses, as pointed out by Steven, is a dump of the execution 
environment that provides the extra_vars given to the playbook ... heat 
has this data, so it should be possible to dump it to a file on the 
undercloud if we want to


I believe Steven is, with (4), trying to improve/reuse the mechanism


I'm attempting to capture some of the common requirements from this
thread for discussion at the ptg so we can consider them when choosing
solution(s).




What would be nice is when a heat->mistral->ansible upgrade step fails, the
operator is given an ansible-playbook command to run which skips directly to
the failing step. This would dramatically reduce the debug cycle and also
make it possible for the operator to automate any required fixes over every
host in a role. This would likely mean rendering out ansible config files,
playbooks, (and roles?) to the operator's working directory. What happens to
these rendered files after deployment is an open question. Delete them?
Encourage the operator to track them in source control?


interesting question, as long as we run playbooks from a filesystem, I 
suppose users can make customizations without "changing" anything in 
tripleo ... this is how we tested some of the ceph-ansible fixes!


for upgrades we should maintain the tasks outside the templates to be 
able to do that though, assuming we want users to customize the upgrade 
tasks

--
Giulio Fidente
GPG KEY: 08D733BA



Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-11 Thread Steve Baker
On Wed, Jul 12, 2017 at 11:47 AM, James Slagle 
wrote:

> On Tue, Jul 11, 2017 at 6:53 PM, Steve Baker  wrote:
> >
> >
> > On Tue, Jul 11, 2017 at 6:51 AM, James Slagle 
> > wrote:
> >>
> >> On Mon, Jul 10, 2017 at 11:37 AM, Lars Kellogg-Stedman  >
> >> wrote:
> >> > On Fri, Jul 7, 2017 at 1:50 PM, James Slagle 
> >> > wrote:
> >> >>
> >> >> There are also some ideas forming around pulling the Ansible
> playbooks
> >> >>
> >> >> and vars out of Heat so that they can be rerun (or run initially)
> >> >> independently from the Heat SoftwareDeployment delivery mechanism:
> >> >
> >> >
> >> > I think the closer we can come to "the operator runs ansible-playbook
> to
> >> > configure the overcloud" the better, but not because I think Ansible
> is
> >> > inherently a great tool: rather, I think the many layers of
> indirection
> >> > in
> >> > our existing model make error reporting and diagnosis much more
> >> > complicated
> >> > than it needs to be.  Combined with Puppet's "fail as late as
> possible"
> >> > model, this means that (a) operators waste time waiting for a
> deployment
> >> > that is ultimately going to fail but hasn't yet, and (b) when it does
> >> > fail,
> >> > they need relatively intimate knowledge of our deployment tools to
> >> > backtrack
> >> > through logs and find the root cause of the failure.
> >> >
> >> > If we can offer a deployment mode that reduces the number of layers
> >> > between
> >> > the operator and the actions being performed on the hosts I think we
> >> > would
> >> > win on both fronts: faster failures and reporting errors as close as
> >> > possible to the actual problem will result in less frustration across
> >> > the
> >> > board.
> >> >
> >> > I do like Steve's suggestion of a split model where Heat is
> responsible
> >> > for
> >> > instantiating OpenStack resources while Ansible is used to perform
> host
> >> > configuration tasks.  Despite all the work done on Ansible's OpenStack
> >> > modules, they feel inflexible and frustrating to work with when
> compared
> >> > to
> >> > Heat's state-aware, dependency ordered deployments.  A solution that
> >> > allows
> >> > Heat to output configuration that can subsequently be consumed by
> >> > Ansible --
> >> > either running manually or perhaps via Mistral for
> >> > API-driven-deployments --
> >> > seems like an excellent goal.  Using Heat as a "front-end" to the
> >> > process
> >> > means that we get to keep the parameter validation and documentation
> >> > that is
> >> > missing in Ansible, while still following the Unix philosophy of
> giving
> >> > you
> >> > enough rope to hang yourself if you really want it.
> >>
> >> This is excellent input, thanks for providing it.
> >>
> >> I think it lends itself towards suggesting that we may like to pursue
> >> (again) adding native Ironic resources to Heat. If those were written
> >> in a way that also addressed some of the feedback about TripleO and
> >> the baremetal deployment side, then we could continue to get the
> >> advantages from Heat that you mention.
> >>
> >> My personal opinion to date is that Ansible's os_ironic* modules are
> >> superior in some ways to the Heat->Nova->Ironic model. However, just a
> >> Heat->Ironic model may work in a way that has the advantages of both.
> >
> >
> > I too would dearly like to get nova out of the picture. Our placement
> needs
> > mean the scheduler is something we need to work around, and it discards
> > basically all context for the operator when ironic can't deploy for some
> > reason.
> >
> > Whether we use a mistral workflow[1], a heat resource, or ansible
> os_ironic,
> > there will still need to be some python logic to build the config drive
> ISO
> > that injects the ssh keys and os-collect-config bootstrap.
> >
> > Unfortunately ironic iPXE boot from iSCSI[2] doesn't support config-drive
> > (still?) so the only option to inject ssh keys is the nova ec2-metadata
> > service (or equivalent). I suspect if we can't make every ironic
> deployment
> > method support config-drive then we're stuck with nova.
> >
> > I don't have a strong preference for a heat resource vs mistral vs
> ansible
> > os_ironic, but given there is some python logic required anyway, I would
> > lean towards a heat resource. If the resource is general enough we could
> > propose it to heat upstream, otherwise we could carry it in
> tripleo-common.
> >
> > Alternatively, we can implement a config-drive builder in tripleo-common
> and
> > invoke that from mistral or ansible.
>
> Ironic's cli node-set-provision-state command has a --config-drive
> option where you just point it at a directory and it will automatically
> bundle that dir into the config drive ISO format.
>
> Ansible's os_ironic_node[1] also supports that via the config_drive
> parameter. Combining that with a couple of template tasks to create
> meta_data.json and user_data 

Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-11 Thread James Slagle
On Tue, Jul 11, 2017 at 5:53 PM, Steve Baker  wrote:
>
>
> On Tue, Jul 11, 2017 at 3:37 AM, Lars Kellogg-Stedman 
> wrote:
>>
>> On Fri, Jul 7, 2017 at 1:50 PM, James Slagle 
>> wrote:
>>>
>>> There are also some ideas forming around pulling the Ansible playbooks
>>>
>>> and vars out of Heat so that they can be rerun (or run initially)
>>> independently from the Heat SoftwareDeployment delivery mechanism:
>>
>>
>> I think the closer we can come to "the operator runs ansible-playbook to
>> configure the overcloud" the better, but not because I think Ansible is
>> inherently a great tool: rather, I think the many layers of indirection in
>> our existing model make error reporting and diagnosis much more complicated
>> than it needs to be.  Combined with Puppet's "fail as late as possible"
>> model, this means that (a) operators waste time waiting for a deployment
>> that is ultimately going to fail but hasn't yet, and (b) when it does fail,
>> they need relatively intimate knowledge of our deployment tools to backtrack
>> through logs and find the root cause of the failure.
>>
>> If we can offer a deployment mode that reduces the number of layers
>> between the operator and the actions being performed on the hosts I think we
>> would win on both fronts: faster failures and reporting errors as close as
>> possible to the actual problem will result in less frustration across the
>> board.
>>
>> I do like Steve's suggestion of a split model where Heat is responsible
>> for instantiating OpenStack resources while Ansible is used to perform host
>> configuration tasks.  Despite all the work done on Ansible's OpenStack
>> modules, they feel inflexible and frustrating to work with when compared to
>> Heat's state-aware, dependency ordered deployments.  A solution that allows
>> Heat to output configuration that can subsequently be consumed by Ansible --
>> either running manually or perhaps via Mistral for API-driven-deployments --
>> seems like an excellent goal.  Using Heat as a "front-end" to the process
>> means that we get to keep the parameter validation and documentation that is
>> missing in Ansible, while still following the Unix philosophy of giving you
>> enough rope to hang yourself if you really want it.
>
>
> I think this nicely sums up what we should be aiming for, but I'd like to
> elaborate on "either running manually or perhaps via Mistral for
> API-driven-deployments".
>
> I think it's important that we allow full support for both mistral-driven and
> manually running playbooks. If there was no option to run ansible-playbook
> directly then operators would miss one of the main benefits of using ansible
> in the first place (which is leveraging their knowledge of inventory,
> playbooks and roles to deploy things).

+1, I like this idea as well. If you have a few minutes could you
summarize it here:
https://etherpad.openstack.org/p/tripleo-ptg-queens-ansible

I'm attempting to capture some of the common requirements from this
thread for discussion at the ptg so we can consider them when choosing
solution(s).

> I'm thinking specifically about upgrade scenarios where a step fails.
> Currently the only option is a manual diagnosis of the problem, manual
> modification of state, then re-running the entire stack update to see if it
> can get past the failing step.
>
> What would be nice is when a heat->mistral->ansible upgrade step fails, the
> operator is given an ansible-playbook command to run which skips directly to
> the failing step. This would dramatically reduce the debug cycle and also
> make it possible for the operator to automate any required fixes over every
> host in a role. This would likely mean rendering out ansible config files,
> playbooks, (and roles?) to the operator's working directory. What happens to
> these rendered files after deployment is an open question. Delete them?
> Encourage the operator to track them in source control?




-- 
-- James Slagle
--



Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-11 Thread James Slagle
On Tue, Jul 11, 2017 at 6:53 PM, Steve Baker  wrote:
>
>
> On Tue, Jul 11, 2017 at 6:51 AM, James Slagle 
> wrote:
>>
>> On Mon, Jul 10, 2017 at 11:37 AM, Lars Kellogg-Stedman 
>> wrote:
>> > On Fri, Jul 7, 2017 at 1:50 PM, James Slagle 
>> > wrote:
>> >>
>> >> There are also some ideas forming around pulling the Ansible playbooks
>> >>
>> >> and vars out of Heat so that they can be rerun (or run initially)
>> >> independently from the Heat SoftwareDeployment delivery mechanism:
>> >
>> >
>> > I think the closer we can come to "the operator runs ansible-playbook to
>> > configure the overcloud" the better, but not because I think Ansible is
>> > inherently a great tool: rather, I think the many layers of indirection
>> > in
>> > our existing model make error reporting and diagnosis much more
>> > complicated
>> > than it needs to be.  Combined with Puppet's "fail as late as possible"
>> > model, this means that (a) operators waste time waiting for a deployment
>> > that is ultimately going to fail but hasn't yet, and (b) when it does
>> > fail,
>> > they need relatively intimate knowledge of our deployment tools to
>> > backtrack
>> > through logs and find the root cause of the failure.
>> >
>> > If we can offer a deployment mode that reduces the number of layers
>> > between
>> > the operator and the actions being performed on the hosts I think we
>> > would
>> > win on both fronts: faster failures and reporting errors as close as
>> > possible to the actual problem will result in less frustration across
>> > the
>> > board.
>> >
>> > I do like Steve's suggestion of a split model where Heat is responsible
>> > for
>> > instantiating OpenStack resources while Ansible is used to perform host
>> > configuration tasks.  Despite all the work done on Ansible's OpenStack
>> > modules, they feel inflexible and frustrating to work with when compared
>> > to
>> > Heat's state-aware, dependency ordered deployments.  A solution that
>> > allows
>> > Heat to output configuration that can subsequently be consumed by
>> > Ansible --
>> > either running manually or perhaps via Mistral for
>> > API-driven-deployments --
>> > seems like an excellent goal.  Using Heat as a "front-end" to the
>> > process
>> > means that we get to keep the parameter validation and documentation
>> > that is
>> > missing in Ansible, while still following the Unix philosophy of giving
>> > you
>> > enough rope to hang yourself if you really want it.
>>
>> This is excellent input, thanks for providing it.
>>
>> I think it lends itself towards suggesting that we may like to pursue
>> (again) adding native Ironic resources to Heat. If those were written
>> in a way that also addressed some of the feedback about TripleO and
>> the baremetal deployment side, then we could continue to get the
>> advantages from Heat that you mention.
>>
>> My personal opinion to date is that Ansible's os_ironic* modules are
>> superior in some ways to the Heat->Nova->Ironic model. However, just a
>> Heat->Ironic model may work in a way that has the advantages of both.
>
>
> I too would dearly like to get nova out of the picture. Our placement needs
> mean the scheduler is something we need to work around, and it discards
> basically all context for the operator when ironic can't deploy for some
> reason.
>
> Whether we use a mistral workflow[1], a heat resource, or ansible os_ironic,
> there will still need to be some python logic to build the config drive ISO
> that injects the ssh keys and os-collect-config bootstrap.
>
> Unfortunately ironic iPXE boot from iSCSI[2] doesn't support config-drive
> (still?) so the only option to inject ssh keys is the nova ec2-metadata
> service (or equivalent). I suspect if we can't make every ironic deployment
> method support config-drive then we're stuck with nova.
>
> I don't have a strong preference for a heat resource vs mistral vs ansible
> os_ironic, but given there is some python logic required anyway, I would
> lean towards a heat resource. If the resource is general enough we could
> propose it to heat upstream, otherwise we could carry it in tripleo-common.
>
> Alternatively, we can implement a config-drive builder in tripleo-common and
> invoke that from mistral or ansible.

Ironic's cli node-set-provision-state command has a --config-drive
option where you just point it at a directory and it will automatically
bundle that dir into the config drive ISO format.

Ansible's os_ironic_node[1] also supports that via the config_drive
parameter. Combining that with a couple of template tasks to create
meta_data.json and user_data files makes for a very easy-to-use
interface.


[1] http://docs.ansible.com/ansible/os_ironic_node_module.html
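For illustration, a hedged Python sketch of the directory such a config_drive parameter (or the CLI's --config-drive option) could point at. The openstack/latest layout follows the usual config-drive convention; the exact field names and contents are assumptions for this sketch, not TripleO's actual code:

```python
import json
import os


def build_config_drive_dir(root, hostname, ssh_keys, user_data):
    """Lay out a directory suitable for bundling into a config drive.

    ironic's node-set-provision-state --config-drive (and os_ironic_node's
    config_drive parameter) take a directory like this and build the
    config-drive image from it.
    """
    latest = os.path.join(root, "openstack", "latest")
    os.makedirs(latest, exist_ok=True)
    meta = {
        "hostname": hostname,
        # config-drive metadata carries ssh keys as a name->key mapping
        "public_keys": {str(i): key for i, key in enumerate(ssh_keys)},
    }
    with open(os.path.join(latest, "meta_data.json"), "w") as f:
        json.dump(meta, f, indent=2)
    # user_data would carry the os-collect-config bootstrap script
    # (e.g. cloud-init user data); its contents are up to the caller.
    with open(os.path.join(latest, "user_data"), "w") as f:
        f.write(user_data)
    return root
```

The two template tasks mentioned above would render meta_data.json and user_data into this layout before calling the module.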

-- 
-- James Slagle
--


Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-11 Thread Steve Baker
On Tue, Jul 11, 2017 at 6:51 AM, James Slagle 
wrote:

> On Mon, Jul 10, 2017 at 11:37 AM, Lars Kellogg-Stedman 
> wrote:
> > On Fri, Jul 7, 2017 at 1:50 PM, James Slagle 
> wrote:
> >>
> >> There are also some ideas forming around pulling the Ansible playbooks
> >>
> >> and vars out of Heat so that they can be rerun (or run initially)
> >> independently from the Heat SoftwareDeployment delivery mechanism:
> >
> >
> > I think the closer we can come to "the operator runs ansible-playbook to
> > configure the overcloud" the better, but not because I think Ansible is
> > inherently a great tool: rather, I think the many layers of indirection
> in
> > our existing model make error reporting and diagnosis much more
> complicated
> > than it needs to be.  Combined with Puppet's "fail as late as possible"
> > model, this means that (a) operators waste time waiting for a deployment
> > that is ultimately going to fail but hasn't yet, and (b) when it does
> fail,
> > they need relatively intimate knowledge of our deployment tools to
> backtrack
> > through logs and find the root cause of the failure.
> >
> > If we can offer a deployment mode that reduces the number of layers
> between
> > the operator and the actions being performed on the hosts I think we
> would
> > win on both fronts: faster failures and reporting errors as close as
> > possible to the actual problem will result in less frustration across the
> > board.
> >
> > I do like Steve's suggestion of a split model where Heat is responsible
> for
> > instantiating OpenStack resources while Ansible is used to perform host
> > configuration tasks.  Despite all the work done on Ansible's OpenStack
> > modules, they feel inflexible and frustrating to work with when compared
> to
> > Heat's state-aware, dependency ordered deployments.  A solution that
> allows
> > Heat to output configuration that can subsequently be consumed by
> Ansible --
> > either running manually or perhaps via Mistral for
> API-driven-deployments --
> > seems like an excellent goal.  Using Heat as a "front-end" to the process
> > means that we get to keep the parameter validation and documentation
> that is
> > missing in Ansible, while still following the Unix philosophy of giving
> you
> > enough rope to hang yourself if you really want it.
>
> This is excellent input, thanks for providing it.
>
> I think it lends itself towards suggesting that we may like to pursue
> (again) adding native Ironic resources to Heat. If those were written
> in a way that also addressed some of the feedback about TripleO and
> the baremetal deployment side, then we could continue to get the
> advantages from Heat that you mention.
>
> My personal opinion to date is that Ansible's os_ironic* modules are
> superior in some ways to the Heat->Nova->Ironic model. However, just a
> Heat->Ironic model may work in a way that has the advantages of both.
>

I too would dearly like to get nova out of the picture. Our placement needs
mean the scheduler is something we need to work around, and it discards
basically all context for the operator when ironic can't deploy for some
reason.

Whether we use a mistral workflow[1], a heat resource, or ansible
os_ironic, there will still need to be some python logic to build the
config drive ISO that injects the ssh keys and os-collect-config bootstrap.

Unfortunately ironic iPXE boot from iSCSI[2] doesn't support config-drive
(still?) so the only option to inject ssh keys is the nova ec2-metadata
service (or equivalent). I suspect if we can't make every ironic deployment
method support config-drive then we're stuck with nova.

I don't have a strong preference for a heat resource vs mistral vs ansible
os_ironic, but given there is some python logic required anyway, I would
lean towards a heat resource. If the resource is general enough we could
propose it to heat upstream, otherwise we could carry it in tripleo-common.

Alternatively, we can implement a config-drive builder in tripleo-common
and invoke that from mistral or ansible.

[1] https://review.openstack.org/#/c/313048/1
[2] http://specs.openstack.org/openstack/ironic-specs/
specs/approved/boot-from-volume-reference-drivers.html#
scenario-1-ipxe-boot-from-iscsi-volume


Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-11 Thread Steve Baker
On Tue, Jul 11, 2017 at 3:37 AM, Lars Kellogg-Stedman 
wrote:

> On Fri, Jul 7, 2017 at 1:50 PM, James Slagle 
> wrote:
>
>> There are also some ideas forming around pulling the Ansible playbooks
>>
> and vars out of Heat so that they can be rerun (or run initially)
>> independently from the Heat SoftwareDeployment delivery mechanism:
>>
>
> I think the closer we can come to "the operator runs ansible-playbook to
> configure the overcloud" the better, but not because I think Ansible is
> inherently a great tool: rather, I think the many layers of indirection in
> our existing model make error reporting and diagnosis much more complicated
> than it needs to be.  Combined with Puppet's "fail as late as possible"
> model, this means that (a) operators waste time waiting for a deployment
> that is ultimately going to fail but hasn't yet, and (b) when it does fail,
> they need relatively intimate knowledge of our deployment tools to
> backtrack through logs and find the root cause of the failure.
>
> If we can offer a deployment mode that reduces the number of layers
> between the operator and the actions being performed on the hosts I think
> we would win on both fronts: faster failures and reporting errors as close
> as possible to the actual problem will result in less frustration across
> the board.
>
> I do like Steve's suggestion of a split model where Heat is responsible
> for instantiating OpenStack resources while Ansible is used to perform host
> configuration tasks.  Despite all the work done on Ansible's OpenStack
> modules, they feel inflexible and frustrating to work with when compared to
> Heat's state-aware, dependency ordered deployments.  A solution that allows
> Heat to output configuration that can subsequently be consumed by Ansible
> -- either running manually or perhaps via Mistral for
> API-driven-deployments -- seems like an excellent goal.  Using Heat as a
> "front-end" to the process means that we get to keep the parameter
> validation and documentation that is missing in Ansible, while still
> following the Unix philosophy of giving you enough rope to hang yourself if
> you really want it.
>

I think this nicely sums up what we should be aiming for, but I'd like to
elaborate on "either running manually or perhaps via Mistral for
API-driven-deployments".

I think it's important that we allow full support for both mistral-driven
and manually running playbooks. If there was no option to run
ansible-playbook directly then operators would miss one of the main
benefits of using ansible in the first place (which is leveraging their
knowledge of inventory, playbooks and roles to deploy things).

I'm thinking specifically about upgrade scenarios where a step fails.
Currently the only option is a manual diagnosis of the problem, manual
modification of state, then re-running the entire stack update to see if it
can get past the failing step.

What would be nice is when a heat->mistral->ansible upgrade step fails, the
operator is given an ansible-playbook command to run which skips directly
to the failing step. This would dramatically reduce the debug cycle and
also make it possible for the operator to automate any required fixes over
every host in a role. This would likely mean rendering out ansible config
files, playbooks, (and roles?) to the operator's working directory. What
happens to these rendered files after deployment is an open question.
Delete them? Encourage the operator to track them in source control?
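The resume command handed to the operator could be assembled along these lines, in Python. This is only a sketch: --start-at-task and --limit are standard ansible-playbook flags, but the playbook, inventory, and group names here are placeholders, not rendered TripleO artifacts:

```python
def resume_command(playbook, failed_task, role_group,
                   inventory="inventory.ini"):
    """Build the ansible-playbook invocation an operator could be shown
    when an upgrade step fails.

    --start-at-task skips straight to the failing task, and --limit
    restricts the run to the hosts of the affected role, so a fix can be
    retried across every host in that role without rerunning the stack.
    """
    return ("ansible-playbook -i {inv} --limit {group} "
            "--start-at-task '{task}' {playbook}").format(
                inv=inventory, group=role_group,
                task=failed_task, playbook=playbook)
```

For example, resume_command("upgrade_steps_playbook.yaml", "Stop pacemaker cluster", "Controller") would print a single command the operator can rerun until the step passes.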


Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-10 Thread Giulio Fidente
On 07/10/2017 09:23 PM, James Slagle wrote:
> On Mon, Jul 10, 2017 at 2:54 PM, Giulio Fidente  wrote:
>> On 07/10/2017 07:06 PM, James Slagle wrote:
>>> On Mon, Jul 10, 2017 at 11:19 AM, Giulio Fidente  
>>> wrote:
 splitstack though requires changes in how the *existing* openstack
 services are deployed and we didn't want to do that just for the purpose
 of integrating ceph-ansible so I still believe (3) to be a sensible
 compromise to provide the needed functionalities and not breaking the
 existing deployment logic
>>>
>>> We might be talking about different definitions of "splitstack", as
>>> I'm not sure what changes are required for existing services. FWIW, I
>>> refer to what we do in CI with multinode to be splitstack in that the
>>> nodes are already provisioned and we deploy the services on those
>>> nodes using the same templates that we do for a "full" stack.
>>
>>> Those nodes could have just as easily been provisioned with our
>>> undercloud and the services deployed using 2 separate stacks, and that
>>> model works just as well.
>>
true, sorry for the misuse of the term splitstack; the existing
>> splitstack implementation continues to work well and option (3), like
>> the others, can be plugged on top of it
>>
>> what I had in mind was instead the "split stack" scenario described by
>> Steven, where the orchestration steps are moved outside heat, this is
>> what we didn't have, still don't have and can be discussed at the PTG
> 
> Ok, thanks for clarifying. So when you're saying split-stack in this
> context, you imply just deploying a baremetal stack, then use whatever
> tool we want (or may develop) to deploy the service configuration.

yes, but I am still assuming heat to be the tool providing the per-role and
per-service settings, while no longer being the tool orchestrating the steps

I also don't think we should assume puppet or ansible to be "the
deployment tool"; the past seems to be telling us that we changed the
tool once already, later decided to use a new one that better fits our
needs for upgrades, and yet resorted to a third, more generic 'workflow
triggering' mechanism to further decouple some services' configuration
from the general approach. I wouldn't give away that flexibility easily.
-- 
Giulio Fidente
GPG KEY: 08D733BA



Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-10 Thread Fox, Kevin M
I think the migration path to something like kolla-kubernetes would be fine,
as you have total control over the orchestration piece, Ansible, and the
config generation. Since it is all containerized and TripleO production
isn't, you should be able to 'upgrade' from non-containerized to
containerized while leaving all the existing services alone as a rollback
path. Something like: read in the old config, tweak it a bit as needed,
upload it as configmaps, then helm install some kolla packages?
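As a rough illustration only, the flow Kevin describes might expand to
commands along these lines. The service and chart names are made up; the
kubectl and helm invocations follow their 2017-era CLIs:

```python
# Hypothetical sketch of the config-migration flow: upload the existing
# (tweaked) config as a ConfigMap, then install the kolla chart for the
# service. Chart/service names are illustrative assumptions.

def migration_commands(service, config_path):
    configmap = ("kubectl create configmap {svc}-config "
                 "--from-file={cfg}").format(svc=service, cfg=config_path)
    install = "helm install kolla/{svc} --name {svc}".format(svc=service)
    # the old non-containerized service is left running as the rollback path
    return [configmap, install]

cmds = migration_commands("nova-api", "/etc/nova/nova.conf")
for cmd in cmds:
    print(cmd)
```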

Thanks,
Kevin

From: Emilien Macchi [emil...@redhat.com]
Sent: Monday, July 10, 2017 12:14 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [TripleO] Forming our plans around Ansible

On Mon, Jul 10, 2017 at 6:19 AM, Steven Hardy <sha...@redhat.com> wrote:
[...]
> 1. How to perform end-to-end configuration via ansible (outside of
> heat, but probably still using data and possibly playbooks generated
> by heat)

I guess we're talking about removing Puppet from TripleO and using more
Ansible to manage configuration files.

This is somewhat related to what Flavio (and team) are currently investigating:
https://github.com/flaper87/tripleo-apb-roles/tree/master/keystone-apb

Also see this thread for more context:
http://lists.openstack.org/pipermail/openstack-dev/2017-June/118417.html

We could imagine these APBs being used by Split Stack 2, applying the
software configuration (Ansible) to deploy OpenStack on already-deployed
baremetal nodes.
One of the challenges here is how do we get the data from Heat to
generate Ansible vars.

[...]

>> I think if we can form some broad agreement before the PTG, we have a
>> chance at making some meaningful progress during Queens.
>
> Agreed, although we probably do need to make some more progress on
> some aspects of this for container minor updates that we'll need for
> Pike.

++ Thanks for bringing this James.

Some other thoughts:
* I also agree that TripleO Quickstart is a separate topic. I was
also confused about why OOOQ was templating bash scripts, but it has
become clear we needed a way to run exactly the commands in our
documentation without abstraction (please tell me if I'm wrong), so we
had to do these templates. We could have been a bit more granular (run
commands in tasks instead of shell scripts), but I might have missed
the reason why we didn't do it that way.

* Kayobe and Kolla are great tools, though TripleO is looking for a
path to migrate to Ansible in a backward compatible way. Throwing a
third grenade here - I think these tools are too opinionated to allow
us to simply use them. I think we should work toward re-using the
maximum of bits when it makes sense, but folks need to keep in mind we
need to support our existing production deployments, manage upgrades
etc. We're already using some bits from Kolla and our team is already
willing to collaborate with other deployment tools when it makes
sense.

* I agree with some comments in this thread when I read "TripleO would
be a tool to deploy OpenStack Infrastructure as split stacks", like
we're doing in our multinode jobs, but even further. I'm interested in
the work done by Flavio and in seeing how we could use Split Stack 2 to
deploy Kubernetes with Ansible (eventually without Mistral calling
Heat calling Mistral calling Ansible).

* It might sound like we want to add more complexity in TripleO, but I
can confirm James's goal, which is a common goal in the team: to reduce
the number of tools used by TripleO. In other words, we hope we can
e.g. remove Puppet for managing configuration files (which could be
done by Ansible), remove some workflows usually done by Heat that could
be done by Ansible as well, etc. The idea of forming plans around
Ansible is excellent, and we need to converge our efforts so we can
address some of our operators' feedback.

Thanks,
--
Emilien Macchi



Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-10 Thread Alex Schultz
On Fri, Jul 7, 2017 at 11:50 AM, James Slagle  wrote:
> I proposed a session for the PTG
> (https://etherpad.openstack.org/p/tripleo-ptg-queens) about forming a
> common plan and vision around Ansible in TripleO.
>
> I think it's important however that we kick this discussion off more
> broadly before the PTG, so that we can hopefully have some agreement
> for deeper discussions and prototyping when we actually meet in
> person.
>
> Right now, we have multiple uses of Ansible in TripleO:
>
> (0) tripleo-quickstart which follows the common and well accepted
> approach to bundling a set of Ansible playbooks/roles.
>
> (1) Mistral calling Ansible. This is the approach used by
> tripleo-validations where Mistral directly executes ansible playbooks
> using a dynamic inventory. The inventory is constructed from the
> server related stack outputs of the overcloud stack.
>
> (2) Ansible running playbooks against localhost triggered by the
> heat-config Ansible hook. This approach is used by
> tripleo-heat-templates for upgrade tasks and various tasks for
> deploying containers.
>
> (3) Mistral calling Heat calling Mistral calling Ansible. In this
> approach, we have Mistral resources in tripleo-heat-templates that are
> created as part of the overcloud stack and in turn, the created
> Mistral action executions run ansible. This has been prototyped with
> using ceph-ansible to install Ceph as part of the overcloud
> deployment, and some of the work has already landed. There are also
> proposed WIP patches using this approach to install Kubernetes.
>
> There are also some ideas forming around pulling the Ansible playbooks
> and vars out of Heat so that they can be rerun (or run initially)
> independently from the Heat SoftwareDeployment delivery mechanism:
>
> (4) https://review.openstack.org/#/c/454816/
>
> (5) Another idea I'd like to prototype is a local tool that runs on
> the undercloud and pulls all of the SoftwareDeployment data out of
> Heat as the stack is being created and generates corresponding Ansible
> playbooks to apply those deployments. Once a given playbook is
> generated by the tool, the tool would signal back to Heat that the
> deployment is complete. Heat then creates the whole stack without
> actually applying a single deployment to an overcloud node. At that
> point, Ansible (or Mistral->Ansible for an API) would be used to do
> the actual deployment of the Overcloud with the Undercloud as the
> ansible runner.
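To make idea (5) a bit more concrete, here is a minimal sketch of translating
one SoftwareDeployment pulled from Heat into an equivalent Ansible play. The
deployment dict only loosely mirrors what a SoftwareDeployment carries (a
script, input values, a target server); the field names are illustrative, not
the actual Heat API schema:

```python
import json

# Sketch: one SoftwareDeployment -> one Ansible play. Inputs become
# environment variables for the script, mimicking how the script hook
# behaves today. All field names here are assumptions for illustration.

def deployment_to_play(deployment):
    return {
        "hosts": deployment["server_name"],
        "tasks": [{
            "name": "apply deployment %s" % deployment["name"],
            "shell": deployment["config"],
            # deployment inputs exposed as environment variables
            "environment": {i["name"]: i["value"]
                            for i in deployment["inputs"]},
        }],
    }

play = deployment_to_play({
    "name": "ControllerDeployment",
    "server_name": "overcloud-controller-0",
    "config": "#!/bin/bash\nsystemctl restart httpd",
    "inputs": [{"name": "update_identifier", "value": "1"}],
})
# a JSON document is also valid YAML, so this could be fed to ansible-playbook
print(json.dumps([play], indent=2))
```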
>
> All of this work has merit as we investigate longer term plans, and
> it's all at different stages with some being for dev/CI (0), some
> being used already in production (1 and 2), some just at the
> experimental stage (3 and 4), and some does not exist other than an
> idea (5).
>
> My intent with this mail is to start a discussion around what we've
> learned from these approaches and start discussing a consolidated plan
> around Ansible. And I'm not saying that whatever we come up with
> should only use Ansible a certain way. Just that we ought to look at
> how users/operators interact with Ansible and TripleO today and try
> and come up with the best solution(s) going forward.
>
> I think that (1) has been pretty successful, and my idea with (5)
> would use a similar approach once the playbooks were generated.
> Further, my idea with (5) would give us a fully backwards compatible
> solution with our existing template interfaces from
> tripleo-heat-templates. Longer term (or even in parallel for some
> time), the generated playbooks could stop being generated (and just
> exist in git), and we could consider moving away from Heat more
> permanently
>
> I recognize that saying "moving away from Heat" may be quite
> controversial. While it's not 100% the same discussion as what we are
> doing with Ansible, I think it is a big part of the discussion and if
> we want to continue with Heat as the primary orchestration tool in
> TripleO.
>
> I've been hearing a lot of feedback from various operators about how
> difficult the baremetal deployment is with Heat. While feedback about
> Ironic is generally positive, a lot of the negative feedback is around
> the Heat->Nova->Ironic interaction. And, if we also move more towards
> Ansible for the service deployment, I wonder if there is still a long
> term place for Heat at all.
>
> Personally, I'm pretty apprehensive about the approach taken in (3). I
> feel that it is a lot of complexity that could be done simpler if we
> took a step back and thought more about a longer term approach. I
> recognize that it's mostly an experiment/POC at this stage, and I'm
> not trying to directly knock down the approach. It's just that when I
> start to see more patches (Kubernetes installation) using the same
> approach, I figure it's worth discussing more broadly vs trying to
> have a discussion by -1'ing patch reviews, etc.
>
> I'm interested in all feedback of course. And I plan to take a shot at
> working on the prototype I 

Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-10 Thread James Slagle
On Mon, Jul 10, 2017 at 2:54 PM, Giulio Fidente  wrote:
> On 07/10/2017 07:06 PM, James Slagle wrote:
>> On Mon, Jul 10, 2017 at 11:19 AM, Giulio Fidente  wrote:
>>> splitstack though requires changes in how the *existing* openstack
>>> services are deployed and we didn't want to do that just for the purpose
>>> of integrating ceph-ansible so I still believe (3) to be a sensible
>>> compromise to provide the needed functionalities and not breaking the
>>> existing deployment logic
>>
>> We might be talking about different definitions of "splitstack", as
>> I'm not sure what changes are required for existing services. FWIW, I
>> refer to what we do in CI with multinode to be splitstack in that the
>> nodes are already provisioned and we deploy the services on those
>> nodes using the same templates that we do for a "full" stack.
>
>> Those nodes could have just as easily been provisioned with our
>> undercloud and the services deployed using 2 separate stacks, and that
>> model works just as well.
>
> true, sorry for the misuse of the term splitstack; the existing
> splitstack implementation continues to work well and option (3), like
> the others, can be plugged on top of it
>
> what I had in mind was instead the "split stack" scenario described by
> Steven, where the orchestration steps are moved outside heat, this is
> what we didn't have, still don't have and can be discussed at the PTG

Ok, thanks for clarifying. So when you're saying split-stack in this
context, you imply just deploying a baremetal stack, then use whatever
tool we want (or may develop) to deploy the service configuration.


-- 
-- James Slagle
--



Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-10 Thread Emilien Macchi
On Mon, Jul 10, 2017 at 6:19 AM, Steven Hardy  wrote:
[...]
> 1. How to perform end-to-end configuration via ansible (outside of
> heat, but probably still using data and possibly playbooks generated
> by heat)

I guess we're talking about removing Puppet from TripleO and using more
Ansible to manage configuration files.

This is somewhat related to what Flavio (and team) are currently investigating:
https://github.com/flaper87/tripleo-apb-roles/tree/master/keystone-apb

Also see this thread for more context:
http://lists.openstack.org/pipermail/openstack-dev/2017-June/118417.html

We could imagine these APBs being used by Split Stack 2, applying the
software configuration (Ansible) to deploy OpenStack on already-deployed
baremetal nodes.
One of the challenges here is how do we get the data from Heat to
generate Ansible vars.
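As one strawman for the Heat-data-to-Ansible-vars question: flatten the
per-role configuration from the overcloud stack outputs into one JSON vars
file per role (Ansible reads JSON vars files natively). The output structure
below is an assumption for illustration, not the real tripleo-heat-templates
output schema:

```python
import json

# Sketch: stack outputs (as a dict) -> per-role JSON vars files that an
# ansible-playbook run could consume with -e @controller_vars.json.
# The "RoleConfig" key and its shape are hypothetical.

def write_role_vars(stack_outputs, directory="."):
    paths = []
    for role, config in stack_outputs["RoleConfig"].items():
        path = "%s/%s_vars.json" % (directory, role.lower())
        with open(path, "w") as f:
            json.dump(config, f, indent=2)
        paths.append(path)
    return paths

outputs = {"RoleConfig": {
    "Controller": {"step": 1, "enabled_services": ["keystone", "nova-api"]},
    "Compute": {"step": 1, "enabled_services": ["nova-compute"]},
}}
paths = write_role_vars(outputs, "/tmp")
print(paths)
```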

[...]

>> I think if we can form some broad agreement before the PTG, we have a
>> chance at making some meaningful progress during Queens.
>
> Agreed, although we probably do need to make some more progress on
> some aspects of this for container minor updates that we'll need for
> Pike.

++ Thanks for bringing this James.

Some other thoughts:
* I also agree that TripleO Quickstart is a separate topic. I was
also confused about why OOOQ was templating bash scripts, but it has
become clear we needed a way to run exactly the commands in our
documentation without abstraction (please tell me if I'm wrong), so we
had to do these templates. We could have been a bit more granular (run
commands in tasks instead of shell scripts), but I might have missed
the reason why we didn't do it that way.

* Kayobe and Kolla are great tools, though TripleO is looking for a
path to migrate to Ansible in a backward compatible way. Throwing a
third grenade here - I think these tools are too opinionated to allow
us to simply use them. I think we should work toward re-using the
maximum of bits when it makes sense, but folks need to keep in mind we
need to support our existing production deployments, manage upgrades
etc. We're already using some bits from Kolla and our team is already
willing to collaborate with other deployment tools when it makes
sense.

* I agree with some comments in this thread when I read "TripleO would
be a tool to deploy OpenStack Infrastructure as split stacks", like
we're doing in our multinode jobs, but even further. I'm interested in
the work done by Flavio and in seeing how we could use Split Stack 2 to
deploy Kubernetes with Ansible (eventually without Mistral calling
Heat calling Mistral calling Ansible).

* It might sound like we want to add more complexity in TripleO, but I
can confirm James's goal, which is a common goal in the team: to reduce
the number of tools used by TripleO. In other words, we hope we can
e.g. remove Puppet for managing configuration files (which could be
done by Ansible), remove some workflows usually done by Heat that could
be done by Ansible as well, etc. The idea of forming plans around
Ansible is excellent, and we need to converge our efforts so we can
address some of our operators' feedback.

Thanks,
-- 
Emilien Macchi



Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-10 Thread Giulio Fidente
On 07/10/2017 07:06 PM, James Slagle wrote:
> On Mon, Jul 10, 2017 at 11:19 AM, Giulio Fidente  wrote:
>> On 07/10/2017 03:19 PM, Steven Hardy wrote:
>>> On Fri, Jul 7, 2017 at 6:50 PM, James Slagle  wrote:
>>
>> [...]
>>
>>> Yeah, I think the first step is to focus on a clean "split stack"
>>> model where the nodes/networks etc are still deployed via heat, then
>>> ansible handles the configuration of the nodes.
>>
>> +1
>>
>> as per my previous email, if splitstack was available we might have been
>> able to use that for the ceph-ansible integration : "if we had migrated
>> to splitstack already, it might have been possible"
> 
> Can you expand on what isn't available? I've primarily been the one
> working on different parts of splitstack, and I'm not sure what it
> can't do that you need it to do :).

the idea behind option (3) was to make it possible to run any mistral
workflow (or task) to deploy a service

we decoupled, on a per-service basis, how a given service is deployed
from the rest of the stack, yet maintained orchestration of the
overcloud deployment steps in heat; I know for sure that not everybody
liked this idea but it was the goal

as a result via option (3) you can deploy a new service in tripleo by
pointing it to a workflow ... and it doesn't matter if the workflow uses
ansible, puppet or simply returns 0

plus the workflow can be executed at a given deployment step, making it
possible to interleave its execution with the rest of the deployment
steps (the puppet apply steps); splitstack couldn't interleave the steps,
and even if we made it do so, we would still need to add the parts to
describe which workflow/task needed to be run

but now that option (3) is implemented, assuming we move outside heat
the capability to collect and run tasks/workflows for a given service,
it'll be trivial to remove the "mistral > heat > mistral" loop: we'd
just need to execute the service workflows from the
$new_tool_driving_the_deployment_steps
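A toy sketch of the decoupling option (3) provides: each service registers a
workflow to run at a given deployment step, and whatever drives the steps
(Heat today, $new_tool tomorrow) simply interleaves those workflows with the
generic steps. Service and workflow names below are made up:

```python
# Sketch: per-service workflows keyed by deployment step, interleaved with
# the generic (puppet apply) steps by a driver loop. The driver doesn't care
# whether a workflow uses ansible, puppet, or simply returns 0.

service_workflows = {
    "ceph": {"step": 2, "workflow": "tripleo.storage.v1.ceph-install"},
    "kubernetes": {"step": 4, "workflow": "tripleo.k8s.v1.install"},
}

def run_deploy_steps(num_steps, execute):
    """Run generic steps 1..num_steps, triggering any per-service workflow
    registered for each step; `execute` abstracts 'run this workflow'."""
    log = []
    for step in range(1, num_steps + 1):
        log.append("puppet-apply step %d" % step)
        for name, svc in service_workflows.items():
            if svc["step"] == step:
                log.append(execute(svc["workflow"]))
    return log

log = run_deploy_steps(5, lambda wf: "mistral execution: %s" % wf)
print("\n".join(log))
```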

>> splitstack though requires changes in how the *existing* openstack
>> services are deployed and we didn't want to do that just for the purpose
>> of integrating ceph-ansible so I still believe (3) to be a sensible
>> compromise to provide the needed functionalities and not breaking the
>> existing deployment logic
> 
> We might be talking about different definitions of "splitstack", as
> I'm not sure what changes are required for existing services. FWIW, I
> refer to what we do in CI with multinode to be splitstack in that the
> nodes are already provisioned and we deploy the services on those
> nodes using the same templates that we do for a "full" stack.

> Those nodes could have just as easily been provisioned with our
> undercloud and the services deployed using 2 separate stacks, and that
> model works just as well.

true, sorry for the misuse of the term splitstack; the existing
splitstack implementation continues to work well and option (3), like
the others, can be plugged on top of it

what I had in mind was instead the "split stack" scenario described by
Steven, where the orchestration steps are moved outside heat, this is
what we didn't have, still don't have and can be discussed at the PTG

-- 
Giulio Fidente
GPG KEY: 08D733BA



Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-10 Thread James Slagle
On Mon, Jul 10, 2017 at 11:37 AM, Lars Kellogg-Stedman  wrote:
> On Fri, Jul 7, 2017 at 1:50 PM, James Slagle  wrote:
>>
>> There are also some ideas forming around pulling the Ansible playbooks
>>
>> and vars out of Heat so that they can be rerun (or run initially)
>> independently from the Heat SoftwareDeployment delivery mechanism:
>
>
> I think the closer we can come to "the operator runs ansible-playbook to
> configure the overcloud" the better, but not because I think Ansible is
> inherently a great tool: rather, I think the many layers of indirection in
> our existing model make error reporting and diagnosis much more complicated
> than it needs to be.  Combined with Puppet's "fail as late as possible"
> model, this means that (a) operators waste time waiting for a deployment
> that is ultimately going to fail but hasn't yet, and (b) when it does fail,
> they need relatively intimate knowledge of our deployment tools to backtrack
> through logs and find the root cause of the failure.
>
> If we can offer a deployment mode that reduces the number of layers between
> the operator and the actions being performed on the hosts I think we would
> win on both fronts: faster failures and reporting errors as close as
> possible to the actual problem will result in less frustration across the
> board.
>
> I do like Steve's suggestion of a split model where Heat is responsible for
> instantiating OpenStack resources while Ansible is used to perform host
> configuration tasks.  Despite all the work done on Ansible's OpenStack
> modules, they feel inflexible and frustrating to work with when compared to
> Heat's state-aware, dependency ordered deployments.  A solution that allows
> Heat to output configuration that can subsequently be consumed by Ansible --
> either running manually or perhaps via Mistral for API-driven-deployments --
> seems like an excellent goal.  Using Heat as a "front-end" to the process
> means that we get to keep the parameter validation and documentation that is
> missing in Ansible, while still following the Unix philosophy of giving you
> enough rope to hang yourself if you really want it.

This is excellent input, thanks for providing it.

I think it lends itself to suggesting that we may want to pursue
(again) adding native Ironic resources to Heat. If those were written
in a way that also addressed some of the feedback about TripleO and
the baremetal deployment side, then we could continue to get the
advantages from Heat that you mention.

My personal opinion to date is that Ansible's os_ironic* modules are
superior in some ways to the Heat->Nova->Ironic model. However, just a
Heat->Ironic model may work in a way that has the advantages of both.

-- 
-- James Slagle
--



Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-10 Thread Mark Goddard
I'll throw a second grenade in.

Kayobe[1][2] is an OpenStack deployment tool based on kolla-ansible that
sounds in some ways similar to what you're describing. It roughly
follows the TripleO undercloud/overcloud model, with Bifrost used to deploy
the overcloud. Kayobe augments kolla-ansible with ansible playbooks for
configuration of the undercloud and overcloud hosts - networking, docker
storage, LVM. It also provides automation of some common workflows. There's
currently a focus on baremetal and scientific computing, but that's only
because that's been the focus up to now. Users drive kayobe using a CLI
which is mostly a wrapper around ansible-playbook.

I'm not suggesting that everyone should discard TripleO and adopt Kayobe -
clearly TripleO is a lot more mature. That said, if TripleO wants to move
to a more ansible-centric architecture, it might be prudent to see how
similar projects work, and if possible, share some code.

[1] https://github.com/stackhpc/kayobe
[2] http://kayobe.readthedocs.io/en/latest/

On 10 July 2017 at 17:44, Michał Jastrzębski  wrote:

> Hey,
>
> I'll just throw a grenade (pun intended) into your discussion - sorry!
> How about kolla-kubernetes? State awareness is done by kubernetes,
> it's designed for containers and we already have most of services
> ready and we'll be running ansible inside containers on top of k8s,
> for all the things that k8s is not natively good at. Sounds like
> somewhat you describe just switch heat with k8s.
>
> Cheers,
> Michal
>
> On 10 July 2017 at 08:37, Lars Kellogg-Stedman  wrote:
> > On Fri, Jul 7, 2017 at 1:50 PM, James Slagle 
> wrote:
> >>
> >> There are also some ideas forming around pulling the Ansible playbooks
> >>
> >> and vars out of Heat so that they can be rerun (or run initially)
> >> independently from the Heat SoftwareDeployment delivery mechanism:
> >
> >
> > I think the closer we can come to "the operator runs ansible-playbook to
> > configure the overcloud" the better, but not because I think Ansible is
> > inherently a great tool: rather, I think the many layers of indirection
> in
> > our existing model make error reporting and diagnosis much more
> complicated
> > than it needs to be.  Combined with Puppet's "fail as late as possible"
> > model, this means that (a) operators waste time waiting for a deployment
> > that is ultimately going to fail but hasn't yet, and (b) when it does
> fail,
> > they need relatively intimate knowledge of our deployment tools to
> backtrack
> > through logs and find the root cause of the failure.
> >
> > If we can offer a deployment mode that reduces the number of layers
> between
> > the operator and the actions being performed on the hosts I think we
> would
> > win on both fronts: faster failures and reporting errors as close as
> > possible to the actual problem will result in less frustration across the
> > board.
> >
> > I do like Steve's suggestion of a split model where Heat is responsible
> for
> > instantiating OpenStack resources while Ansible is used to perform host
> > configuration tasks.  Despite all the work done on Ansible's OpenStack
> > modules, they feel inflexible and frustrating to work with when compared
> to
> > Heat's state-aware, dependency ordered deployments.  A solution that
> allows
> > Heat to output configuration that can subsequently be consumed by
> Ansible --
> > either running manually or perhaps via Mistral for
> API-driven-deployments --
> > seems like an excellent goal.  Using Heat as a "front-end" to the process
> > means that we get to keep the parameter validation and documentation
> that is
> > missing in Ansible, while still following the Unix philosophy of giving
> you
> > enough rope to hang yourself if you really want it.
> >
> > --
> > Lars Kellogg-Stedman 
> >
> >
>


Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-10 Thread James Slagle
On Mon, Jul 10, 2017 at 11:19 AM, Giulio Fidente  wrote:
> On 07/10/2017 03:19 PM, Steven Hardy wrote:
>> On Fri, Jul 7, 2017 at 6:50 PM, James Slagle  wrote:
>
> [...]
>
>> Yeah, I think the first step is to focus on a clean "split stack"
>> model where the nodes/networks etc are still deployed via heat, then
>> ansible handles the configuration of the nodes.
>
> +1
>
> as per my previous email, if splitstack was available we might have been
> able to use that for the ceph-ansible integration : "if we had migrated
> to splitstack already, it might have been possible"

Can you expand on what isn't available? I've primarily been the one
working on different parts of splitstack, and I'm not sure what it
can't do that you need it to do :).

>
> splitstack though requires changes in how the *existing* openstack
> services are deployed and we didn't want to do that just for the purpose
> of integrating ceph-ansible so I still believe (3) to be a sensible
> compromise to provide the needed functionalities and not breaking the
> existing deployment logic

We might be talking about different definitions of "splitstack", as
I'm not sure what changes are required for existing services. FWIW, I
refer to what we do in CI with multinode to be splitstack in that the
nodes are already provisioned and we deploy the services on those
nodes using the same templates that we do for a "full" stack.

Those nodes could have just as easily been provisioned with our
undercloud and the services deployed using 2 separate stacks, and that
model works just as well.

-- 
-- James Slagle
--



Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-10 Thread James Slagle
On Mon, Jul 10, 2017 at 9:19 AM, Steven Hardy  wrote:
> On Fri, Jul 7, 2017 at 6:50 PM, James Slagle  wrote:
> Yeah so my idea with (4), and subsequent patches such as[1] is to
> gradually move the deploy steps performed to configure services (on
> baremetal and in containers) to a single ansible playbook.
>
> There's currently still heat orchestration around the host preparation
> (although this is performed via ansible) and iteration over each step
> (where we re-apply the same deploy-steps playbook with an incrementing
> step variable, but this could be replaced by e.g an ansible or mistral
> loop), but my idea was to enable end-to-end configuration of nodes via
> ansible-playbook, without the need for any special tools (e.g. we
> refactor t-h-t enough that we don't need any special tools, and we
> make deploy-steps-playbook.yaml the only method of deployment (for
> baremetal and container cases)
>
> [1] https://review.openstack.org/#/c/462211/
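The step iteration described above, re-applying the same deploy-steps
playbook with an incrementing step variable, reduces to a small driver loop.
A sketch (playbook/inventory paths and the step count are assumptions):

```python
# Sketch: the "incrementing step variable" loop as the list of
# ansible-playbook invocations it would issue, one per deploy step.

def step_commands(num_steps, playbook="deploy-steps-playbook.yaml",
                  inventory="inventory.yaml"):
    return [
        "ansible-playbook -i %s %s -e step=%d" % (inventory, playbook, step)
        for step in range(1, num_steps + 1)
    ]

cmds = step_commands(5)
for cmd in cmds:
    print(cmd)
```

Replacing the Heat orchestration around this with an ansible or mistral loop
would mean running exactly these commands without Heat in the middle.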
>
>> All of this work has merit as we investigate longer term plans, and
>> it's all at different stages with some being for dev/CI (0), some
>> being used already in production (1 and 2), some just at the
>> experimental stage (3 and 4), and some does not exist other than an
>> idea (5).
>
> I'd like to get the remaining work for (4) done so it's a supportable
> option for minor updates, but there's still a bit more t-h-t
> refactoring required to enable it I think, but I think we're already
> pretty close to being able to run end-to-end ansible for most of the
> PostDeploy steps without any special tooling.

Thanks for this context, I think it helps clarify where we could be
going with these patches. I'll take a closer look at what you've done
so far.

I think I may be missing the point of whether the playbooks would
still run in localhost mode on each node, or if the idea is that we
could eventually work towards a central ansible "runner" (such as the
undercloud) that could execute all the playbooks.

It sounds like the latter is possibly just an iterative step beyond
the former, so I like where this approach is going.

>> My intent with this mail is to start a discussion around what we've
>> learned from these approaches and start discussing a consolidated plan
>> around Ansible. And I'm not saying that whatever we come up with
>> should only use Ansible a certain way. Just that we ought to look at
>> how users/operators interact with Ansible and TripleO today and try
>> and come up with the best solution(s) going forward.
>>
>> I think that (1) has been pretty successful, and my idea with (5)
>> would use a similar approach once the playbooks were generated.
>> Further, my idea with (5) would give us a fully backwards compatible
>> solution with our existing template interfaces from
>> tripleo-heat-templates. Longer term (or even in parallel for some
>> time), the generated playbooks could stop being generated (and just
>> exist in git), and we could consider moving away from Heat more
>> permanently
>
> Yeah I think working towards aligning more TripleO configuration with
> the approach taken by tripleo-validations is fine, and we can e.g add
> more heat generated data about the nodes to the dynamic ansible
> inventory:
>
> https://github.com/openstack/tripleo-validations/blob/master/tripleo_validations/inventory.py
>
> We've been gradually adding data there, which I hope will enable a
> cleaner "split stack", where the nodes are deployed via heat, then
> ansible can do the configuration based on data exposed via stack
> outputs (which again is a pattern that I think has been proven to work
> quite well for tripleo-validations, and is also something I've been
> using locally for dev testing quite successfully).
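As a rough illustration of what the dynamic inventory mechanism amounts to (the group and host data below are invented stand-ins, not the actual tripleo-validations code), an inventory script is just an executable that prints the JSON ansible expects from `--list`:

```python
#!/usr/bin/env python
# Sketch of a dynamic inventory script. In tripleo-validations the
# data below is derived from the overcloud heat stack outputs; here it
# is hard-coded purely for illustration.
import json
import sys


def build_inventory():
    # Stand-in for data read from heat stack outputs: per-role server
    # addresses exposed by the overcloud stack.
    stack_outputs = {
        "Controller": ["192.0.2.10"],
        "Compute": ["192.0.2.20", "192.0.2.21"],
    }
    inventory = {"_meta": {"hostvars": {}}}
    for role, hosts in stack_outputs.items():
        inventory[role.lower()] = {"hosts": hosts}
        for host in hosts:
            inventory["_meta"]["hostvars"][host] = {"tripleo_role": role}
    return inventory


if __name__ == "__main__":
    # ansible invokes the script with --list (or --host <name>).
    if len(sys.argv) > 1 and sys.argv[1] == "--list":
        print(json.dumps(build_inventory(), indent=2))
```

ansible-playbook is then pointed at the script with `-i`, which is how the validations get run against the deployed nodes without heat in the execution path.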
>
>> I recognize that saying "moving away from Heat" may be quite
>> controversial. While it's not 100% the same discussion as what we are
>> doing with Ansible, I think it is a big part of the discussion and if
>> we want to continue with Heat as the primary orchestration tool in
>> TripleO.
>
> Yeah, I think the first step is to focus on a clean "split stack"
> model where the nodes/networks etc are still deployed via heat, then
> ansible handles the configuration of the nodes.
>
> In the long term I could see benefits in a "tripleo lite" model,
> where, say, we only used mistral+Ironic+ansible, but IMO we're not at
> the point yet where that's achievable, primarily because there's
> coupling between the heat parameter interfaces and multiple
> integrations we can't break (e.g users with environment files,
> tripleo-ui, vendor integrations, etc).
>
> It's a good discussion to kick off regardless though, so personally
> I'd like to focus on these as the first "baby steps":
>
> 1. How to perform end-to-end configuration via ansible (outside of
> heat, but probably still using data and possibly playbooks generated
> by heat)
>
> 2. How to deploy nodes directly via Ironic, with a mistral workflow
> (e.g no 

Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-10 Thread Michał Jastrzębski
Hey,

I'll just throw a grenade (pun intended) into your discussion - sorry!
How about kolla-kubernetes? State awareness is done by kubernetes,
it's designed for containers and we already have most of services
ready and we'll be running ansible inside containers on top of k8s,
for all the things that k8s is not natively good at. It sounds
somewhat like what you describe, just switching Heat with k8s.

Cheers,
Michal

On 10 July 2017 at 08:37, Lars Kellogg-Stedman  wrote:
> On Fri, Jul 7, 2017 at 1:50 PM, James Slagle  wrote:
>>
>> There are also some ideas forming around pulling the Ansible playbooks
>>
>> and vars out of Heat so that they can be rerun (or run initially)
>> independently from the Heat SoftwareDeployment delivery mechanism:
>
>
> I think the closer we can come to "the operator runs ansible-playbook to
> configure the overcloud" the better, but not because I think Ansible is
> inherently a great tool: rather, I think the many layers of indirection in
> our existing model make error reporting and diagnosis much more complicated
> than it needs to be.  Combined with Puppet's "fail as late as possible"
> model, this means that (a) operators waste time waiting for a deployment
> that is ultimately going to fail but hasn't yet, and (b) when it does fail,
> they need relatively intimate knowledge of our deployment tools to backtrack
> through logs and find the root cause of the failure.
>
> If we can offer a deployment mode that reduces the number of layers between
> the operator and the actions being performed on the hosts I think we would
> win on both fronts: faster failures and reporting errors as close as
> possible to the actual problem will result in less frustration across the
> board.
>
> I do like Steve's suggestion of a split model where Heat is responsible for
> instantiating OpenStack resources while Ansible is used to perform host
> configuration tasks.  Despite all the work done on Ansible's OpenStack
> modules, they feel inflexible and frustrating to work with when compared to
> Heat's state-aware, dependency-ordered deployments.  A solution that allows
> Heat to output configuration that can subsequently be consumed by Ansible --
> either running manually or perhaps via Mistral for API-driven-deployments --
> seems like an excellent goal.  Using Heat as a "front-end" to the process
> means that we get to keep the parameter validation and documentation that is
> missing in Ansible, while still following the Unix philosophy of giving you
> enough rope to hang yourself if you really want it.
>
> --
> Lars Kellogg-Stedman 
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-10 Thread Lars Kellogg-Stedman
On Fri, Jul 7, 2017 at 1:50 PM, James Slagle  wrote:

> There are also some ideas forming around pulling the Ansible playbooks
>
and vars out of Heat so that they can be rerun (or run initially)
> independently from the Heat SoftwareDeployment delivery mechanism:
>

I think the closer we can come to "the operator runs ansible-playbook to
configure the overcloud" the better, but not because I think Ansible is
inherently a great tool: rather, I think the many layers of indirection in
our existing model make error reporting and diagnosis much more complicated
than it needs to be.  Combined with Puppet's "fail as late as possible"
model, this means that (a) operators waste time waiting for a deployment
that is ultimately going to fail but hasn't yet, and (b) when it does fail,
they need relatively intimate knowledge of our deployment tools to
backtrack through logs and find the root cause of the failure.

If we can offer a deployment mode that reduces the number of layers between
the operator and the actions being performed on the hosts I think we would
win on both fronts: faster failures and reporting errors as close as
possible to the actual problem will result in less frustration across the
board.

I do like Steve's suggestion of a split model where Heat is responsible for
instantiating OpenStack resources while Ansible is used to perform host
configuration tasks.  Despite all the work done on Ansible's OpenStack
modules, they feel inflexible and frustrating to work with when compared to
Heat's state-aware, dependency-ordered deployments.  A solution that allows
Heat to output configuration that can subsequently be consumed by Ansible
-- either running manually or perhaps via Mistral for
API-driven-deployments -- seems like an excellent goal.  Using Heat as a
"front-end" to the process means that we get to keep the parameter
validation and documentation that is missing in Ansible, while still
following the Unix philosophy of giving you enough rope to hang yourself if
you really want it.

-- 
Lars Kellogg-Stedman 


Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-10 Thread Giulio Fidente
On 07/10/2017 03:19 PM, Steven Hardy wrote:
> On Fri, Jul 7, 2017 at 6:50 PM, James Slagle  wrote:

[...]

> Yeah, I think the first step is to focus on a clean "split stack"
> model where the nodes/networks etc are still deployed via heat, then
> ansible handles the configuration of the nodes.

+1

as per my previous email, if splitstack had been available we might have
been able to use it for the ceph-ansible integration: "if we had migrated
to splitstack already, it might have been possible"

splitstack, though, requires changes in how the *existing* openstack
services are deployed, and we didn't want to do that just for the purpose
of integrating ceph-ansible, so I still believe (3) to be a sensible
compromise that provides the needed functionality without breaking the
existing deployment logic

note that I know of at least another case (the swift rings building)
which would benefit from being able to trigger a workflow during the
overcloud deployment and does not need to run ansible

[...]

>> Personally, I'm pretty apprehensive about the approach taken in (3). I
>> feel that it is a lot of complexity that could be done simpler if we
>> took a step back and thought more about a longer term approach. I
>> recognize that it's mostly an experiment/POC at this stage, and I'm
>> not trying to directly knock down the approach. It's just that when I
>> start to see more patches (Kubernetes installation) using the same
>> approach, I figure it's worth discussing more broadly vs trying to
>> have a discussion by -1'ing patch reviews, etc.
> 
> I agree, I think the approach in (3) is a stopgap until we can define
> a cleaner approach with less layers.

> IMO the first step towards that is likely to be a "split stack" which
> outputs heat data, then deployment configuration is performed via
> mistral->ansible just like we already do in (1).

given option (3) allows triggering of workflows during a particular
deployment step, it seems that option (1) would need to be revisited to
implement some sort of a loop in mistral, instead of heat, to provide
that same functionality ... which in the end moves the existing logic
from heat into mistral

>> I'm interested in all feedback of course. And I plan to take a shot at
>> working on the prototype I mentioned in (5) if anyone would like to
>> collaborate around that.
> 
> I'm very happy to collaborate, and this is quite closely related to
> the investigations I've been doing around enabling minor updates for
> containers.
> 
> Lets sync up about it, but as I mentioned above I'm not yet fully sold
> on a new translation tool, vs just more t-h-t refactoring to enable
> output of data directly consumable via ansible-playbook (which can
> then be run via operators, or heat, or mistral, or whatever).
I'd be happy to revisit the requirements around the ceph-ansible
integration as well, to see how those can still be met
-- 
Giulio Fidente
GPG KEY: 08D733BA



Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-10 Thread David Moreau Simard
That sounds like a good fit for an Ansible plugin to control how
variables or host inventories are designed [1] and is the intended use
case for extending Ansible behavior.

[1]: 
http://docs.ansible.com/ansible/dev_guide/developing_plugins.html#vars-plugins
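For readers who haven't written one: a vars plugin lets you inject extra variables for hosts and groups as the inventory is parsed. A minimal sketch of the shape (the paths, file layout and variable names are all hypothetical, and the ansible base class is stubbed so the snippet stands alone) might be:

```python
# Sketch of an ansible vars plugin that could attach per-host data
# (e.g. values exported from heat stack outputs) during inventory
# parsing. Everything here is illustrative, not a real TripleO plugin.
import json
import os
import tempfile

try:
    # Real plugins subclass this; stubbed below so the sketch runs
    # standalone without ansible installed.
    from ansible.plugins.vars import BaseVarsPlugin
except ImportError:
    class BaseVarsPlugin(object):
        pass


def load_host_vars(data_dir, hostname):
    """Return vars for one host from <data_dir>/<hostname>.json, if any."""
    path = os.path.join(data_dir, "%s.json" % hostname)
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)


class VarsModule(BaseVarsPlugin):
    # Hypothetical directory where heat-derived per-host data is written.
    DATA_DIR = "/var/lib/tripleo/host_vars"

    def get_vars(self, loader, path, entities, cache=True):
        merged = {}
        for entity in entities:
            merged.update(load_host_vars(self.DATA_DIR, str(entity)))
        return merged


if __name__ == "__main__":
    # Tiny demo of the helper with a throwaway directory.
    demo_dir = tempfile.mkdtemp()
    with open(os.path.join(demo_dir, "controller-0.json"), "w") as f:
        json.dump({"deploy_step": 3}, f)
    print(load_host_vars(demo_dir, "controller-0"))
```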

David Moreau Simard
Senior Software Engineer | Openstack RDO

dmsimard = [irc, github, twitter]


On Mon, Jul 10, 2017 at 9:31 AM, Steven Hardy  wrote:
> On Sun, Jul 9, 2017 at 8:44 AM, Yolanda Robla Mota  
> wrote:
>> What I'd like to dig into more is how Ansible and Heat can live together. And
>> what features do Heat offer that are not covered by Ansible as well? Is
>> there still the need to have Heat as the main engine, or could that be
>> replaced by Ansible totally in the future?
>
> The main interface provided by Heat which AFAIK cannot currently be
> replaced by Ansible is the parameters schema, where the template
> parameters are exposed (that include description, type and constraint
>> data) in a format that is useful for e.g. building the interfaces for
> tripleo-ui
>
> Ansible has a different approach to role/playbook parameters AFAICT,
> which is more a global namespace with no type validation, no way to
> include description data or tags with variable declarations, and no
>> way to specify constraints (other than perhaps having custom modules
> or playbook patterns that perform the validations early in the
> deployment).
>
> This is kind of similar to how the global namespace for hiera works
> with our puppet model, although that at least has the advantage of
> namespacing foo::something::variable, which again doesn't have a
> direct equivalent in the ansible role model AFAIK (happy to be
> corrected here, I'm not an ansible expert :)
>
> For these reasons (as mentioned in my reply to James), I think a first
> step of a "split stack" model where heat deploys the nodes/networks
> etc, then outputs data that can be consumed by Ansible is reasonable -
> it leaves the operator interfaces alone for now, and gives us time to
> think about the interface changes that may be needed long term, while
>> still giving most of the operator-debug and usability/scalability
> benefits that I think folks pushing for Ansible are looking for.
>
> Steve
>
>
>
>
>> On Sat, Jul 8, 2017 at 12:20 AM, James Slagle 
>> wrote:
>>>
>>> On Fri, Jul 7, 2017 at 5:31 PM, David Moreau Simard 
>>> wrote:
>>> > On Fri, Jul 7, 2017 at 1:50 PM, James Slagle 
>>> > wrote:
>>> >> (0) tripleo-quickstart which follows the common and well accepted
>>> >> approach to bundling a set of Ansible playbooks/roles.
>>> >
>>> > I don't want to de-rail the thread but I really want to bring some
>>> > attention to a pattern that tripleo-quickstart has been using across
>>> > its playbooks and roles.
>>> > I sincerely hope that we can find a better implementation should we
>>> > start developing new things from scratch.
>>>
>>> Yes, just to clarify...by "well accepted" I just meant how the git
>>> repo is organized and how you are expected to interface with those
>>> playbooks and roles as opposed to what those playbooks/roles actually
>>> do.
>>>
>>> > I'll sound like a broken record for those that have heard me mention
>>> > this before but for those that haven't, here's a concrete example of
>>> > how things are done today:
>>> > (Sorry for the link overload, making sure the relevant information is
>>> > available)
>>> >
>>> > For an example tripleo-quickstart job, here's the console [1] and its
>>> > corresponding ARA report [2]:
>>> > - A bash script is created [3][4][5] from a jinja template [6]
>>> > - A task executes the bash script [7][8][9]
>>>
>>> From my limited experience, I believe the intent was that the
>>> playbooks should do what a user is expected to do so that it's as
>>> close to reproducing the user interface of TripleO 1:1.
>>>
>>> For example, we document users running commands from a shell prompt.
>>> Therefore, oooq ought to do the same thing as close as possible.
>>> Obviously there will be gaps, just as there is with tripleo.sh, but I
>>> feel that both tools (tripleo.sh/oooq) were trying to be faithful to
>>> our published docs as much as possible, and I think there's something
>>> to be commended there.
>>>
>>> Not saying it's right or wrong, just that I believe that was the intent.
>>>
>>> An alternative would be custom ansible modules that exposed tasks for
>>> interfacing with our API directly. That would also be valuable, as
>>> that code path is mostly untested now outside of the UI and CLI.
>>>
>>> I think that tripleo-quickstart is a slightly different class of
>>> "thing" from the other current Ansible uses I mentioned, in that it
>>> sits at a layer above everything else. It's meant to automate TripleO
>>> itself vs TripleO automating things. Regardless, we should certainly
>>> consider how it fits into a larger plan.
>>>
>>> --
>>> -- James Slagle
>>> --

Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-10 Thread Steven Hardy
On Sun, Jul 9, 2017 at 8:44 AM, Yolanda Robla Mota  wrote:
> What I'd like to dig into more is how Ansible and Heat can live together. And
> what features do Heat offer that are not covered by Ansible as well? Is
> there still the need to have Heat as the main engine, or could that be
> replaced by Ansible totally in the future?

The main interface provided by Heat which AFAIK cannot currently be
replaced by Ansible is the parameters schema, where the template
parameters are exposed (that include description, type and constraint
data) in a format that is useful for e.g. building the interfaces for
tripleo-ui

Ansible has a different approach to role/playbook parameters AFAICT,
which is more a global namespace with no type validation, no way to
include description data or tags with variable declarations, and no
way to specify constraints (other than perhaps having custom modules
or playbook patterns that perform the validations early in the
deployment).
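To make the contrast concrete, here is the kind of typed, documented, constrained parameter schema a heat template carries (the parameter names are invented for the example); the nearest ansible equivalent is an untyped entry in a vars file:

```yaml
# Hypothetical HOT parameters: typed, documented and constrained, so
# tools like tripleo-ui can render forms and validate input up front.
parameters:
  NtpServers:
    type: comma_delimited_list
    description: NTP servers to configure on all overcloud nodes
    default: ['pool.ntp.org']
  ControllerCount:
    type: number
    description: Number of controller nodes to deploy
    default: 1
    constraints:
      - range: {min: 1, max: 9}

# Rough ansible equivalent -- a flat, untyped, undocumented variable:
#   ntp_servers: ['pool.ntp.org']
#   controller_count: 1
```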

This is kind of similar to how the global namespace for hiera works
with our puppet model, although that at least has the advantage of
namespacing foo::something::variable, which again doesn't have a
direct equivalent in the ansible role model AFAIK (happy to be
corrected here, I'm not an ansible expert :)

For these reasons (as mentioned in my reply to James), I think a first
step of a "split stack" model where heat deploys the nodes/networks
etc, then outputs data that can be consumed by Ansible is reasonable -
it leaves the operator interfaces alone for now, and gives us time to
think about the interface changes that may be needed long term, while
still giving most of the operator-debug and usability/scalability
benefits that I think folks pushing for Ansible are looking for.

Steve




> On Sat, Jul 8, 2017 at 12:20 AM, James Slagle 
> wrote:
>>
>> On Fri, Jul 7, 2017 at 5:31 PM, David Moreau Simard 
>> wrote:
>> > On Fri, Jul 7, 2017 at 1:50 PM, James Slagle 
>> > wrote:
>> >> (0) tripleo-quickstart which follows the common and well accepted
>> >> approach to bundling a set of Ansible playbooks/roles.
>> >
>> > I don't want to de-rail the thread but I really want to bring some
>> > attention to a pattern that tripleo-quickstart has been using across
>> > its playbooks and roles.
>> > I sincerely hope that we can find a better implementation should we
>> > start developing new things from scratch.
>>
>> Yes, just to clarify...by "well accepted" I just meant how the git
>> repo is organized and how you are expected to interface with those
>> playbooks and roles as opposed to what those playbooks/roles actually
>> do.
>>
>> > I'll sound like a broken record for those that have heard me mention
>> > this before but for those that haven't, here's a concrete example of
>> > how things are done today:
>> > (Sorry for the link overload, making sure the relevant information is
>> > available)
>> >
>> > For an example tripleo-quickstart job, here's the console [1] and its
>> > corresponding ARA report [2]:
>> > - A bash script is created [3][4][5] from a jinja template [6]
>> > - A task executes the bash script [7][8][9]
>>
>> From my limited experience, I believe the intent was that the
>> playbooks should do what a user is expected to do so that it's as
>> close to reproducing the user interface of TripleO 1:1.
>>
>> For example, we document users running commands from a shell prompt.
>> Therefore, oooq ought to do the same thing as close as possible.
>> Obviously there will be gaps, just as there is with tripleo.sh, but I
>> feel that both tools (tripleo.sh/oooq) were trying to be faithful to
>> our published docs as much as possible, and I think there's something
>> to be commended there.
>>
>> Not saying it's right or wrong, just that I believe that was the intent.
>>
>> An alternative would be custom ansible modules that exposed tasks for
>> interfacing with our API directly. That would also be valuable, as
>> that code path is mostly untested now outside of the UI and CLI.
>>
>> I think that tripleo-quickstart is a slightly different class of
>> "thing" from the other current Ansible uses I mentioned, in that it
>> sits at a layer above everything else. It's meant to automate TripleO
>> itself vs TripleO automating things. Regardless, we should certainly
>> consider how it fits into a larger plan.
>>
>> --
>> -- James Slagle
>> --
>>
>
>
>
>
> --
>
> Yolanda Robla Mota
>
> Principal Software Engineer, RHCE
>
> Red Hat
>
> C/Avellana 213
>
> Urb Portugal
>
> yrobl...@redhat.comM: +34605641639
>
>

Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-10 Thread Steven Hardy
On Fri, Jul 7, 2017 at 6:50 PM, James Slagle  wrote:
> I proposed a session for the PTG
> (https://etherpad.openstack.org/p/tripleo-ptg-queens) about forming a
> common plan and vision around Ansible in TripleO.
>
> I think it's important however that we kick this discussion off more
> broadly before the PTG, so that we can hopefully have some agreement
> for deeper discussions and prototyping when we actually meet in
> person.

Thanks for starting this James, it's a topic that I've also been
giving quite a lot of thought to lately (and as you've seen, have
pushed some related patches) so it's good to get some broader
discussions going.

> Right now, we have multiple uses of Ansible in TripleO:
>
> (0) tripleo-quickstart which follows the common and well accepted
> approach to bundling a set of Ansible playbooks/roles.

FWIW I agree with Giulio that quickstart is a separate case, and while
I also agree with David that there's plenty of scope for improvement of
the oooq user experience, I'm going to focus on the TripleO deployment
aspects below.

> (1) Mistral calling Ansible. This is the approach used by
> tripleo-validations where Mistral directly executes ansible playbooks
> using a dynamic inventory. The inventory is constructed from the
> server related stack outputs of the overcloud stack.
>
> (2) Ansible running playbooks against localhost triggered by the
> heat-config Ansible hook. This approach is used by
> tripleo-heat-templates for upgrade tasks and various tasks for
> deploying containers.
>
> (3) Mistral calling Heat calling Mistral calling Ansible. In this
> approach, we have Mistral resources in tripleo-heat-templates that are
> created as part of the overcloud stack and in turn, the created
> Mistral action executions run ansible. This has been prototyped with
> using ceph-ansible to install Ceph as part of the overcloud
> deployment, and some of the work has already landed. There are also
> proposed WIP patches using this approach to install Kubernetes.
>
> There are also some ideas forming around pulling the Ansible playbooks
> and vars out of Heat so that they can be rerun (or run initially)
> independently from the Heat SoftwareDeployment delivery mechanism:
>
> (4) https://review.openstack.org/#/c/454816/
>
> (5) Another idea I'd like to prototype is a local tool that runs on
> the undercloud and pulls all of the SoftwareDeployment data out of
> Heat as the stack is being created and generates corresponding Ansible
> playbooks to apply those deployments. Once a given playbook is
> generated by the tool, the tool would signal back to Heat that the
> deployment is complete. Heat then creates the whole stack without
> actually applying a single deployment to an overcloud node. At that
> point, Ansible (or Mistral->Ansible for an API) would be used to do
> the actual deployment of the Overcloud with the Undercloud as the
> ansible runner.
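As a toy illustration of the translation step in (5) -- with hand-written stand-in data rather than anything actually pulled from heat, and the structure of a SoftwareDeployment heavily simplified -- the generation could look like:

```python
# Illustrative-only sketch: map simplified SoftwareDeployment records
# onto ansible play structures. A real tool would fetch these from
# heat and dispatch on the deployment's group (script/puppet/ansible).
import json


def deployment_to_play(deployment):
    """Turn one (simplified) deployment record into a one-task play."""
    return {
        "hosts": deployment["server_name"],
        "vars": deployment.get("input_values", {}),
        "tasks": [{
            "name": "apply %s" % deployment["name"],
            # Simplification: always deliver the config as a shell task.
            "shell": deployment["config"],
        }],
    }


def generate_playbook(deployments):
    return [deployment_to_play(d) for d in deployments]


if __name__ == "__main__":
    sample = [{
        "name": "ControllerDeployment",
        "server_name": "controller-0",
        "config": "echo configuring controller",
        "input_values": {"step": 1},
    }]
    print(json.dumps(generate_playbook(sample), indent=2))
```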

Yeah so my idea with (4), and subsequent patches such as[1] is to
gradually move the deploy steps performed to configure services (on
baremetal and in containers) to a single ansible playbook.

There's currently still heat orchestration around the host preparation
(although this is performed via ansible) and iteration over each step
(where we re-apply the same deploy-steps playbook with an incrementing
step variable, but this could be replaced by e.g an ansible or mistral
loop), but my idea was to enable end-to-end configuration of nodes via
ansible-playbook, without the need for any special tools (e.g. we
refactor t-h-t enough that deploy-steps-playbook.yaml becomes the only
method of deployment, for both baremetal and container cases)

[1] https://review.openstack.org/#/c/462211/
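The iteration mentioned above (re-applying the same playbook with an incrementing step variable) could move to the ansible side as something like this (task-file name and step count assumed for the sketch, not the merged implementation):

```yaml
# Sketch: drive the per-step service configuration from ansible
# instead of heat, by including the same task file once per step.
- hosts: overcloud
  tasks:
    - include: deploy-steps-tasks.yaml
      vars:
        step: "{{ item }}"
      with_sequence: start=1 end=5
```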

> All of this work has merit as we investigate longer term plans, and
> it's all at different stages with some being for dev/CI (0), some
> being used already in production (1 and 2), some just at the
> experimental stage (3 and 4), and some does not exist other than an
> idea (5).

I'd like to get the remaining work for (4) done so it's a supportable
option for minor updates, but there's still a bit more t-h-t
refactoring required to enable it I think, but I think we're already
pretty close to being able to run end-to-end ansible for most of the
PostDeploy steps without any special tooling.

Note this related patch from Matthieu:

https://review.openstack.org/#/c/444224/

I think we'll need to go further here but it's a starting point which
shows how we could expose ansible tasks from the heat stack outputs as
a first step to enabling standalone configuration via ansible (or
mistral->ansible)

> My intent with this mail is to start a discussion around what we've
> learned from these approaches and start discussing a consolidated plan
> around Ansible. And I'm not saying that whatever we come up with
> should only use Ansible a certain way. Just that we ought to look at
> how users/operators interact with Ansible 

Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-10 Thread Giulio Fidente
On 07/07/2017 07:50 PM, James Slagle wrote:
> I proposed a session for the PTG
> (https://etherpad.openstack.org/p/tripleo-ptg-queens) about forming a
> common plan and vision around Ansible in TripleO.
> 
> I think it's important however that we kick this discussion off more
> broadly before the PTG, so that we can hopefully have some agreement
> for deeper discussions and prototyping when we actually meet in
> person.
> 
> Right now, we have multiple uses of Ansible in TripleO:

having worked on one of the versions listed, I would like to add some
comments

> (0) tripleo-quickstart which follows the common and well accepted
> approach to bundling a set of Ansible playbooks/roles.

this approach does not consume config data from heat; I don't think it
fits in the same category as the others

> (1) Mistral calling Ansible. This is the approach used by
> tripleo-validations where Mistral directly executes ansible playbooks
> using a dynamic inventory. The inventory is constructed from the
> server related stack outputs of the overcloud stack.

this approach is actually very similar to (3), with the main difference
that ansible is executed only *after* the stack is complete to be able
to build the dynamic inventory; in fact the flow looks like this:

tripleoclient -> mistral -> heat -> tripleoclient -> mistral (<< heat)

we couldn't use this same approach for ceph-ansible because we needed
the workflow to be executed during a specific overcloud deployment step;
if we had migrated to splitstack already, it might have been possible
(not sure though, more about this later)

> (2) Ansible running playbooks against localhost triggered by the
> heat-config Ansible hook. This approach is used by
> tripleo-heat-templates for upgrade tasks and various tasks for
> deploying containers.

we couldn't use this approach either because we needed to run an
unmodified version of ceph-ansible and provide to it the list of role
hosts in one shot so that ceph-ansible could manage the task
dependencies and ordering by itself; running on localhost wouldn't fit

> (3) Mistral calling Heat calling Mistral calling Ansible. In this
> approach, we have Mistral resources in tripleo-heat-templates that are
> created as part of the overcloud stack and in turn, the created
> Mistral action executions run ansible. This has been prototyped with
> using ceph-ansible to install Ceph as part of the overcloud
> deployment, and some of the work has already landed. There are also
> proposed WIP patches using this approach to install Kubernetes.

as per my comment about (1), this allows for execution of the workflows
to happen *during* the stack creation (at one or multiple deployment steps)

workflow tasks are described on a per-service basis within the heat
templates, and executions have access to the existing roles'
config_settings which we also use for puppet

it allows interleaving of the puppet/workflow steps, which is a feature
we use for ceph-ansible for example to configure the firewall on the
nodes (using the established puppet manifests) before ceph-ansible
starts; we run ceph-ansible unmodified and users can provide arbitrary
extra vars to ceph-ansible via a heat parameter; the flow looks like this:

tripleoclient -> mistral -> heat -> mistral

also note, the workflows *can* run ansible (like it happens for
ceph-ansible) but don't need to, workflows can use any mistral action
and even define custom ones

I have proposed a topic for the ptg to discuss the above, I am sure it
can be extended and improved but IMHO it provides for a compelling set
of features (all of which we wanted/use for ceph-ansible)

> There are also some ideas forming around pulling the Ansible playbooks
> and vars out of Heat so that they can be rerun (or run initially)
> independently from the Heat SoftwareDeployment delivery mechanism:
> 
> (4) https://review.openstack.org/#/c/454816/
> 
> (5) Another idea I'd like to prototype is a local tool that runs on
> the undercloud and pulls all of the SoftwareDeployment data out of
> Heat as the stack is being created and generates corresponding Ansible
> playbooks to apply those deployments. Once a given playbook is
> generated by the tool, the tool would signal back to Heat that the
> deployment is complete. Heat then creates the whole stack without
> actually applying a single deployment to an overcloud node. At that
> point, Ansible (or Mistral->Ansible for an API) would be used to do
> the actual deployment of the Overcloud with the Undercloud as the
> ansible runner.

this seems interesting to me; do I understand correctly that if we keep
the understanding of the deployment steps in heat, then the flow would look like:

tripleoclient -> loop(mistral -> heat)

if so I think we'd need to move (or duplicate) some understanding about
the deployment steps from heat into mistral (as opposed to the approach
in (3) which keeps all the understanding in heat); I am not sure if
having this information in two tools will help in the 

Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-09 Thread Wesley Hayutin
On Fri, Jul 7, 2017 at 6:20 PM, James Slagle  wrote:

> On Fri, Jul 7, 2017 at 5:31 PM, David Moreau Simard 
> wrote:
> > On Fri, Jul 7, 2017 at 1:50 PM, James Slagle 
> wrote:
> >> (0) tripleo-quickstart which follows the common and well accepted
> >> approach to bundling a set of Ansible playbooks/roles.
> >
> > I don't want to de-rail the thread but I really want to bring some
> > attention to a pattern that tripleo-quickstart has been using across
 > > its playbooks and roles.
> > I sincerely hope that we can find a better implementation should we
> > start developing new things from scratch.
>
> Yes, just to clarify...by "well accepted" I just meant how the git
> repo is organized and how you are expected to interface with those
> playbooks and roles as opposed to what those playbooks/roles actually
> do.
>
> > I'll sound like a broken record for those that have heard me mention
> > this before but for those that haven't, here's a concrete example of
> > how things are done today:
> > (Sorry for the link overload, making sure the relevant information is
> available)
> >
 > > For an example tripleo-quickstart job, here's the console [1] and its
> > corresponding ARA report [2]:
> > - A bash script is created [3][4][5] from a jinja template [6]
> > - A task executes the bash script [7][8][9]
>
> From my limited experience, I believe the intent was that the
> playbooks should do what a user is expected to do so that it's as
> close to reproducing the user interface of TripleO 1:1.
>
> For example, we document users running commands from a shell prompt.
> Therefore, oooq ought to do the same thing as close as possible.
> Obviously there will be gaps, just as there is with tripleo.sh, but I
> feel that both tools (tripleo.sh/oooq) were trying to be faithful to
 > our published docs as much as possible, and I think there's something
> to be commended there.
>

That is exactly right, James: CI should be as close to a user-driven install
as possible, IMHO.

David, you are conflating two use cases as far as I can tell. The first use
case (a) is Ansible used in the project/product itself, launched by
openstack/project commands; the second use case (b) is Ansible as a
wrapper around commands that users are expected to execute.

Using native Ansible modules as part of the project/product (a), as James
is describing, is perfectly fine, and Ansible, ARA and other tools work
really well here.

If the CI reinterprets user-level commands (b) directly into Ansible module
calls, you basically lose the 1:1 mapping between CI, documentation and
user experience.
The *most* important function of CI is to guarantee that users can follow the
documentation and have a defect-free experience [docs]. Having to "look at
the logs" is a very small price to pay to preserve that experience. I think
we'll be able to get the logs from the templated bash into ARA, we just need
a little time to get that done.
IMHO CI is a very different topic than what James is talking about in this
thread, and hopefully this won't interrupt the conversation further.

Thanks

[docs]
https://docs.openstack.org/tripleo-quickstart/latest/design.html#problem-help-make-the-deployment-steps-easier-to-understand



 > Not saying it's right or wrong, just that I believe that was the intent.
>
> An alternative would be custom ansible modules that exposed tasks for
> interfacing with our API directly. That would also be valuable, as
> that code path is mostly untested now outside of the UI and CLI.
>
> I think that tripleo-quickstart is a slightly different class of
> "thing" from the other current Ansible uses I mentioned, in that it
> sits at a layer above everything else. It's meant to automate TripleO
> itself vs TripleO automating things. Regardless, we should certainly
> consider how it fits into a larger plan.
>
> --
> -- James Slagle
> --
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>


Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-09 Thread Yolanda Robla Mota
What i'd like to dig more is how Ansible and Heat can live together. And
what features do Heat offer that are not covered by Ansible as well? Is
there still the need to have Heat as the main engine, or could that be
replaced by Ansible totally in the future?

On Sat, Jul 8, 2017 at 12:20 AM, James Slagle 
wrote:

> On Fri, Jul 7, 2017 at 5:31 PM, David Moreau Simard 
> wrote:
> > On Fri, Jul 7, 2017 at 1:50 PM, James Slagle 
> wrote:
> >> (0) tripleo-quickstart which follows the common and well accepted
> >> approach to bundling a set of Ansible playbooks/roles.
> >
> > I don't want to de-rail the thread but I really want to bring some
> > attention to a pattern that tripleo-quickstart has been using across
 > > its playbooks and roles.
> > I sincerely hope that we can find a better implementation should we
> > start developing new things from scratch.
>
> Yes, just to clarify...by "well accepted" I just meant how the git
> repo is organized and how you are expected to interface with those
> playbooks and roles as opposed to what those playbooks/roles actually
> do.
>
> > I'll sound like a broken record for those that have heard me mention
> > this before but for those that haven't, here's a concrete example of
> > how things are done today:
> > (Sorry for the link overload, making sure the relevant information is
> available)
> >
 > > For an example tripleo-quickstart job, here's the console [1] and its
> > corresponding ARA report [2]:
> > - A bash script is created [3][4][5] from a jinja template [6]
> > - A task executes the bash script [7][8][9]
>
> From my limited experience, I believe the intent was that the
> playbooks should do what a user is expected to do so that it's as
> close to reproducing the user interface of TripleO 1:1.
>
> For example, we document users running commands from a shell prompt.
> Therefore, oooq ought to do the same thing as close as possible.
> Obviously there will be gaps, just as there is with tripleo.sh, but I
> feel that both tools (tripleo.sh/oooq) were trying to be faithful to
 > our published docs as much as possible, and I think there's something
> to be commended there.
>
 > Not saying it's right or wrong, just that I believe that was the intent.
>
> An alternative would be custom ansible modules that exposed tasks for
> interfacing with our API directly. That would also be valuable, as
> that code path is mostly untested now outside of the UI and CLI.
>
> I think that tripleo-quickstart is a slightly different class of
> "thing" from the other current Ansible uses I mentioned, in that it
> sits at a layer above everything else. It's meant to automate TripleO
> itself vs TripleO automating things. Regardless, we should certainly
> consider how it fits into a larger plan.
>
> --
> -- James Slagle
> --
>



-- 

Yolanda Robla Mota

Principal Software Engineer, RHCE

Red Hat



C/Avellana 213

Urb Portugal

yrobl...@redhat.com  M: +34605641639




Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-07 Thread James Slagle
On Fri, Jul 7, 2017 at 5:31 PM, David Moreau Simard  wrote:
> On Fri, Jul 7, 2017 at 1:50 PM, James Slagle  wrote:
>> (0) tripleo-quickstart which follows the common and well accepted
>> approach to bundling a set of Ansible playbooks/roles.
>
> I don't want to de-rail the thread but I really want to bring some
> attention to a pattern that tripleo-quickstart has been using across
> its playbooks and roles.
> I sincerely hope that we can find a better implementation should we
> start developing new things from scratch.

Yes, just to clarify...by "well accepted" I just meant how the git
repo is organized and how you are expected to interface with those
playbooks and roles as opposed to what those playbooks/roles actually
do.

> I'll sound like a broken record for those that have heard me mention
> this before but for those that haven't, here's a concrete example of
> how things are done today:
> (Sorry for the link overload, making sure the relevant information is 
> available)
>
> For an example tripleo-quickstart job, here's the console [1] and its
> corresponding ARA report [2]:
> - A bash script is created [3][4][5] from a jinja template [6]
> - A task executes the bash script [7][8][9]

From my limited experience, I believe the intent was that the
playbooks should do what a user is expected to do so that it's as
close to reproducing the user interface of TripleO 1:1.

For example, we document users running commands from a shell prompt.
Therefore, oooq ought to do the same thing as close as possible.
Obviously there will be gaps, just as there is with tripleo.sh, but I
feel that both tools (tripleo.sh/oooq) were trying to be faithful to
our published docs as much as possible, and I think there's something
to be commended there.

Not saying it's right or wrong, just that I believe that was the intent.

An alternative would be custom ansible modules that exposed tasks for
interfacing with our API directly. That would also be valuable, as
that code path is mostly untested now outside of the UI and CLI.

I think that tripleo-quickstart is a slightly different class of
"thing" from the other current Ansible uses I mentioned, in that it
sits at a layer above everything else. It's meant to automate TripleO
itself vs TripleO automating things. Regardless, we should certainly
consider how it fits into a larger plan.

-- 
-- James Slagle
--



Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-07 Thread Luke Hinds
On Fri, Jul 7, 2017 at 10:17 PM, James Slagle 
wrote:

> On Fri, Jul 7, 2017 at 5:00 PM, Luke Hinds  wrote:
> > I can't offer much in-depth feedback on the pros and cons of each
> scenario.
 > > My main point would be to try and simplify as much as we can, rather than
> > adding yet more tooling to the stack. At the moment ooo is spread across
> > multi repos and events are handed around multiple tool sets and queues.
> This
> > adds to a very steep learning curve for the folk who have to operate
> these
> > systems, as there are multiple moving parts to contend with. At the
> moment
 > > things seem 'duct-taped' together, so we should avoid adding more
> > complexity, and refactor down to a simpler architecture instead.
> >
> > With that in mind [1] sounds viable to myself, but with the caveat that
> > others might have a better view of how much of a fit that is for what we
> > need.
>
> Agreed, I think the goal ought to be a move towards simplification
> with Ansible at the core.
>
> An ideal scenario for me personally would be a situation where
> operators could just run Ansible in the typical way that they do today
> for any other project. Additionally, we'd have a way to execute the
> same Ansible playbook/roles/vars/whatever via Mistral so that we had a
> common API for our CLI and UI.
>
> Perhaps the default would be to go through the API, and more advanced
> usage could interface with Ansible directly.
>

I like the sound of this approach, as we then have an API for driving
complex deployments and upgrades, but if an operator needs to troubleshoot
or customise, they can do so with pure Ansible. Yet Mistral is there
to drive the main complexity of a full OpenStack deployment.


> Additionally, we must have a way to maintain backwards compatibility
> with our existing template interfaces, or at least offer some form of
> migration tooling.
>
> Thanks for your feedback.
>
> --
> -- James Slagle
> --
>



-- 
Luke Hinds | NFV Partner Engineering | Office of Technology | Red Hat
e: lhi...@redhat.com | irc: lhinds @freenode | m: +44 77 45 63 98 84 | t: +44
12 52 36 2483


Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-07 Thread David Moreau Simard
On Fri, Jul 7, 2017 at 1:50 PM, James Slagle  wrote:
> (0) tripleo-quickstart which follows the common and well accepted
> approach to bundling a set of Ansible playbooks/roles.

I don't want to de-rail the thread but I really want to bring some
attention to a pattern that tripleo-quickstart has been using across
its playbooks and roles.
I sincerely hope that we can find a better implementation should we
start developing new things from scratch.

I'll sound like a broken record for those that have heard me mention
this before but for those that haven't, here's a concrete example of
how things are done today:
(Sorry for the link overload, making sure the relevant information is available)

For an example tripleo-quickstart job, here's the console [1] and its
corresponding ARA report [2]:
- A bash script is created [3][4][5] from a jinja template [6]
- A task executes the bash script [7][8][9]

My understanding is that things are done this way in order to provide
automated documentation and make the builds reproducible.

One of Ansible's greatest strengths is supposed to be its simplicity:
making things readable and straightforward ("Automation for Everyone"
is its motto).
It's hard for me to put succinctly into words how complicated and
counter-intuitive the current pattern is making things so I'll provide
some examples.

1) When a task running a bash script fails, you don't know what failed
from the ansible-playbook output.
You need to find the appropriate log file and look at the output
of the bash script there.

2) There is logic, conditionals and variables inside the templated
bash scripts making it non-trivial to guess what the script actually
ends up looking like once it is "compiled".
If you happen to know that this task actually ran a templated bash
script in the first place, you need to know or remember where it is
located in the logs after the job is complete and then open it up.

3) There can be more than one operation inside a bash script so you
don't know which of those operations failed unless you look at the
logs.
This reduces granularity which makes it harder to profile,
identify and troubleshoot errors.

4) You don't know what the bash script actually did (if it did
anything at all) unless you look at the logs

5) Idempotency is handled (or not) inside the bash scripts, oblivious
to Ansible really knowing if running the bash script changed something
or not

Here's an example ARA report from openstack-ansible where you're
easily able to tell what went wrong and what happened [10].

Now, I'm not being selfish and trying to say that things should be
written in a specific way so that it can make ARA more useful.
Yes, ARA would be more useful. But this is about following Ansible
best practices and making it more intuitive to understand how things
work and what happens when tasks run.
Puppet is designed the same way: there are resources and modules to do
things. You don't template bash scripts and then use Exec resources.
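To make the contrast concrete, here is a rough, invented illustration of the two styles (the repo task, script path and URL are all made up; only the pattern matters):

```yaml
# Pattern criticized above: the real work is hidden in a templated
# script, so Ansible only reports that "a script ran".
- name: Set up repos (opaque to Ansible)
  shell: /tmp/repo_setup.sh    # rendered earlier from a .j2 template

# Native-module equivalent: each operation is visible, idempotent, and
# reported individually by ansible-playbook and ARA. Values invented.
- name: Configure the example repository
  yum_repository:
    name: example-repo
    description: Example repository (hypothetical)
    baseurl: https://example.com/repo/
    gpgcheck: no
```

With the second style, a failure points at exactly one named, visible operation instead of "the script exited non-zero".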

Documentation and reproducible builds are great things to have, but
not with this kind of tradeoff IMO.
Surely there are other means of providing documentation and reproducible builds.

TripleO is complicated enough already.
Actively making it simpler in every way we can, not just for
developers but for users and operators, should be a priority and a
theme throughout the refactor around Ansible.
We should agree on the best practices and use them.

[1]: 
http://logs.openstack.org/11/478211/5/check/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-puppet/2409a70/console.html
[2]: 
http://logs.openstack.org/11/478211/5/check/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-puppet/2409a70/logs/ara_oooq/reports/d8f79fa8-c8db-4134-8696-795d04ba6f65.html
[3]: 
http://logs.openstack.org/11/478211/5/check/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-puppet/2409a70/console.html#_2017-07-07_15_11_38_778824
[4]: 
http://logs.openstack.org/11/478211/5/check/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-puppet/2409a70/logs/ara_oooq/file/efa7400f-9f8a-4b02-b650-2060c7a3cec3/#line-1
[5]: 
http://logs.openstack.org/11/478211/5/check/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-puppet/2409a70/logs/ara_oooq/result/4b3cffd6-f252-4156-9f15-bceed6f12510/
[6]: 
https://github.com/openstack/tripleo-quickstart/blob/ec7b2d71f28efd301eafec8f53fc644c2fd8cc6e/roles/repo-setup/templates/repo_setup.sh.j2
[7]: 
http://logs.openstack.org/11/478211/5/check/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-puppet/2409a70/console.html#_2017-07-07_15_11_42_330477
[8]: 
http://logs.openstack.org/11/478211/5/check/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-puppet/2409a70/logs/ara_oooq/file/08c9-fada-49cb-b72c-9b93f8d2565b/#line-1
[9]: 
http://logs.openstack.org/11/478211/5/check/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-puppet/2409a70/logs/ara_oooq/result/49de1e7a-32f9-4e6e-9242-fb8afdb91d88/
[10]: 

Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-07 Thread James Slagle
On Fri, Jul 7, 2017 at 5:00 PM, Luke Hinds  wrote:
> I can't offer much in-depth feedback on the pros and cons of each scenario.
> My main point would be to try and simplify as much as we can, rather than
> adding yet more tooling to the stack. At the moment ooo is spread across
> multi repos and events are handed around multiple tool sets and queues. This
> adds to a very steep learning curve for the folk who have to operate these
> systems, as there are multiple moving parts to contend with. At the moment
> things seem 'duct-taped' together, so we should avoid adding more
> complexity, and refactor down to a simpler architecture instead.
>
> With that in mind [1] sounds viable to myself, but with the caveat that
> others might have a better view of how much of a fit that is for what we
> need.

Agreed, I think the goal ought to be a move towards simplification
with Ansible at the core.

An ideal scenario for me personally would be a situation where
operators could just run Ansible in the typical way that they do today
for any other project. Additionally, we'd have a way to execute the
same Ansible playbook/roles/vars/whatever via Mistral so that we had a
common API for our CLI and UI.

Perhaps the default would be to go through the API, and more advanced
usage could interface with Ansible directly.

Additionally, we must have a way to maintain backwards compatibility
with our existing template interfaces, or at least offer some form of
migration tooling.

Thanks for your feedback.

-- 
-- James Slagle
--



Re: [openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-07 Thread Luke Hinds
On Fri, Jul 7, 2017 at 6:50 PM, James Slagle  wrote:

> I proposed a session for the PTG
> (https://etherpad.openstack.org/p/tripleo-ptg-queens) about forming a
> common plan and vision around Ansible in TripleO.
>
> I think it's important however that we kick this discussion off more
> broadly before the PTG, so that we can hopefully have some agreement
> for deeper discussions and prototyping when we actually meet in
> person.
>
> Right now, we have multiple uses of Ansible in TripleO:
>
> (0) tripleo-quickstart which follows the common and well accepted
> approach to bundling a set of Ansible playbooks/roles.
>
> (1) Mistral calling Ansible. This is the approach used by
> tripleo-validations where Mistral directly executes ansible playbooks
> using a dynamic inventory. The inventory is constructed from the
> server related stack outputs of the overcloud stack.
>
> (2) Ansible running playbooks against localhost triggered by the
> heat-config Ansible hook. This approach is used by
> tripleo-heat-templates for upgrade tasks and various tasks for
> deploying containers.
>
> (3) Mistral calling Heat calling Mistral calling Ansible. In this
> approach, we have Mistral resources in tripleo-heat-templates that are
> created as part of the overcloud stack and in turn, the created
> Mistral action executions run ansible. This has been prototyped with
> using ceph-ansible to install Ceph as part of the overcloud
> deployment, and some of the work has already landed. There are also
> proposed WIP patches using this approach to install Kubernetes.
>
> There are also some ideas forming around pulling the Ansible playbooks
> and vars out of Heat so that they can be rerun (or run initially)
> independently from the Heat SoftwareDeployment delivery mechanism:
>
> (4) https://review.openstack.org/#/c/454816/
>
> (5) Another idea I'd like to prototype is a local tool that runs on
> the undercloud and pulls all of the SoftwareDeployment data out of
> Heat as the stack is being created and generates corresponding Ansible
> playbooks to apply those deployments. Once a given playbook is
> generated by the tool, the tool would signal back to Heat that the
> deployment is complete. Heat then creates the whole stack without
> actually applying a single deployment to an overcloud node. At that
> point, Ansible (or Mistral->Ansible for an API) would be used to do
> the actual deployment of the Overcloud with the Undercloud as the
> ansible runner.
>
> All of this work has merit as we investigate longer term plans, and
> it's all at different stages with some being for dev/CI (0), some
> being used already in production (1 and 2), some just at the
> experimental stage (3 and 4), and some does not exist other than an
> idea (5).
>
> My intent with this mail is to start a discussion around what we've
> learned from these approaches and start discussing a consolidated plan
> around Ansible. And I'm not saying that whatever we come up with
> should only use Ansible a certain way. Just that we ought to look at
> how users/operators interact with Ansible and TripleO today and try
> and come up with the best solution(s) going forward.
>
> I think that (1) has been pretty successful, and my idea with (5)
> would use a similar approach once the playbooks were generated.
> Further, my idea with (5) would give us a fully backwards compatible
> solution with our existing template interfaces from
> tripleo-heat-templates. Longer term (or even in parallel for some
> time), the generated playbooks could stop being generated (and just
> exist in git), and we could consider moving away from Heat more
> permanently
>
> I recognize that saying "moving away from Heat" may be quite
> controversial. While it's not 100% the same discussion as what we are
> doing with Ansible, I think it is a big part of the discussion and if
> we want to continue with Heat as the primary orchestration tool in
> TripleO.
>
> I've been hearing a lot of feedback from various operators about how
> difficult the baremetal deployment is with Heat. While feedback about
> Ironic is generally positive, a lot of the negative feedback is around
> the Heat->Nova->Ironic interaction. And, if we also move more towards
> Ansible for the service deployment, I wonder if there is still a long
> term place for Heat at all.
>
> Personally, I'm pretty apprehensive about the approach taken in (3). I
> feel that it is a lot of complexity that could be done simpler if we
> took a step back and thought more about a longer term approach. I
> recognize that it's mostly an experiment/POC at this stage, and I'm
> not trying to directly knock down the approach. It's just that when I
> start to see more patches (Kubernetes installation) using the same
> approach, I figure it's worth discussing more broadly vs trying to
> have a discussion by -1'ing patch reviews, etc.
>
> I'm interested in all feedback of course. And I plan to take a shot at
> working on the prototype I mentioned in (5) if anyone would like to
> collaborate around that.

[openstack-dev] [TripleO] Forming our plans around Ansible

2017-07-07 Thread James Slagle
I proposed a session for the PTG
(https://etherpad.openstack.org/p/tripleo-ptg-queens) about forming a
common plan and vision around Ansible in TripleO.

I think it's important however that we kick this discussion off more
broadly before the PTG, so that we can hopefully have some agreement
for deeper discussions and prototyping when we actually meet in
person.

Right now, we have multiple uses of Ansible in TripleO:

(0) tripleo-quickstart which follows the common and well accepted
approach to bundling a set of Ansible playbooks/roles.

(1) Mistral calling Ansible. This is the approach used by
tripleo-validations where Mistral directly executes ansible playbooks
using a dynamic inventory. The inventory is constructed from the
server related stack outputs of the overcloud stack.
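As a rough illustration of (1), the dynamic inventory built from those stack outputs might look something like the following (group names, hostnames and addresses are purely hypothetical):

```yaml
# Hypothetical Ansible inventory derived from overcloud stack outputs;
# every group name, hostname and address here is illustrative only.
all:
  children:
    Controller:
      hosts:
        overcloud-controller-0:
          ansible_host: 192.0.2.10
    Compute:
      hosts:
        overcloud-novacompute-0:
          ansible_host: 192.0.2.20
```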

(2) Ansible running playbooks against localhost triggered by the
heat-config Ansible hook. This approach is used by
tripleo-heat-templates for upgrade tasks and various tasks for
deploying containers.
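For context, the heat-config Ansible hook in (2) is wired up roughly like this in a template (a minimal sketch; resource names, parameters and the playbook body are illustrative):

```yaml
# Minimal sketch of the heat-config Ansible hook described above.
# Resource names and the embedded playbook are illustrative only.
resources:
  example_config:
    type: OS::Heat::SoftwareConfig
    properties:
      group: ansible        # selects the heat-config Ansible hook
      config: |
        ---
        - hosts: localhost
          tasks:
            - name: placeholder for an upgrade or container task
              debug:
                msg: "runs locally on the overcloud node"
  example_deployment:
    type: OS::Heat::SoftwareDeployment
    properties:
      config: {get_resource: example_config}
      server: {get_param: server_id}
```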

(3) Mistral calling Heat calling Mistral calling Ansible. In this
approach, we have Mistral resources in tripleo-heat-templates that are
created as part of the overcloud stack and in turn, the created
Mistral action executions run ansible. This has been prototyped with
using ceph-ansible to install Ceph as part of the overcloud
deployment, and some of the work has already landed. There are also
proposed WIP patches using this approach to install Kubernetes.

There are also some ideas forming around pulling the Ansible playbooks
and vars out of Heat so that they can be rerun (or run initially)
independently from the Heat SoftwareDeployment delivery mechanism:

(4) https://review.openstack.org/#/c/454816/

(5) Another idea I'd like to prototype is a local tool that runs on
the undercloud and pulls all of the SoftwareDeployment data out of
Heat as the stack is being created and generates corresponding Ansible
playbooks to apply those deployments. Once a given playbook is
generated by the tool, the tool would signal back to Heat that the
deployment is complete. Heat then creates the whole stack without
actually applying a single deployment to an overcloud node. At that
point, Ansible (or Mistral->Ansible for an API) would be used to do
the actual deployment of the Overcloud with the Undercloud as the
ansible runner.
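A playbook emitted by such a tool might, very roughly, look like this for a single deployment (everything below is hypothetical, since the tool does not exist yet; names, paths and vars are invented):

```yaml
# Purely hypothetical output of the proposed generator tool: one play
# per SoftwareDeployment pulled out of Heat. Names and paths invented.
- hosts: overcloud-controller-0
  gather_facts: false
  vars:
    # would be populated from the deployment's input_values in Heat
    step: 1
  tasks:
    - name: Apply the deployment config captured from Heat
      script: /var/lib/tripleo-ansible/deployments/step1-config.sh
```

Rerunning the deployment would then just be rerunning ansible-playbook against these generated files, with no Heat stack update needed.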

All of this work has merit as we investigate longer term plans, and
it's all at different stages with some being for dev/CI (0), some
being used already in production (1 and 2), some just at the
experimental stage (3 and 4), and some does not exist other than an
idea (5).

My intent with this mail is to start a discussion around what we've
learned from these approaches and start discussing a consolidated plan
around Ansible. And I'm not saying that whatever we come up with
should only use Ansible a certain way. Just that we ought to look at
how users/operators interact with Ansible and TripleO today and try
and come up with the best solution(s) going forward.

I think that (1) has been pretty successful, and my idea with (5)
would use a similar approach once the playbooks were generated.
Further, my idea with (5) would give us a fully backwards compatible
solution with our existing template interfaces from
tripleo-heat-templates. Longer term (or even in parallel for some
time), the generated playbooks could stop being generated (and just
exist in git), and we could consider moving away from Heat more
permanently

I recognize that saying "moving away from Heat" may be quite
controversial. While it's not 100% the same discussion as what we are
doing with Ansible, I think it is a big part of the discussion and if
we want to continue with Heat as the primary orchestration tool in
TripleO.

I've been hearing a lot of feedback from various operators about how
difficult the baremetal deployment is with Heat. While feedback about
Ironic is generally positive, a lot of the negative feedback is around
the Heat->Nova->Ironic interaction. And, if we also move more towards
Ansible for the service deployment, I wonder if there is still a long
term place for Heat at all.

Personally, I'm pretty apprehensive about the approach taken in (3). I
feel that it is a lot of complexity that could be done simpler if we
took a step back and thought more about a longer term approach. I
recognize that it's mostly an experiment/POC at this stage, and I'm
not trying to directly knock down the approach. It's just that when I
start to see more patches (Kubernetes installation) using the same
approach, I figure it's worth discussing more broadly vs trying to
have a discussion by -1'ing patch reviews, etc.

I'm interested in all feedback of course. And I plan to take a shot at
working on the prototype I mentioned in (5) if anyone would like to
collaborate around that.

I think if we can form some broad agreement before the PTG, we have a
chance at making some meaningful progress during Queens.


-- 
-- James Slagle
--