Re: [openstack-dev] [Infra] Generic solution for bare metal testing

2016-04-14 Thread Ben Nemec
On 04/12/2016 09:17 AM, Jim Rollenhagen wrote:
> On Thu, Apr 07, 2016 at 02:42:09AM +, Jeremy Stanley wrote:
>> On 2016-04-06 18:33:06 +0300 (+0300), Igor Belikov wrote:
>> [...]
>>> I suppose there are security issues when we talk about running
>>> custom code on bare metal slaves, but I'm not sure I understand
>>> the difference from running custom code on a virtual machine if
>>> bare metal nodes are isolated, don't contain any sensitive data
>>> and follow a regular redeployment procedure.
>> [...]
>>
>> With a virtual machine, you can delete it and create a new one.
>> Nothing remains behind.
>>
>> With a physical machine, arbitrary code running in the scope of a
>> test with root access can do _nasty_ things like backdoor your
>> server firmware with shims that even masquerade as the firmware
>> updater and persist through redeployments that include firmware
>> refreshes.
>>
>> Physical servers persist, and are therefore vulnerable in this
>> scenario in ways which virtual servers are not.
> 
> Right, it's a huge effort to run a secure bare metal cloud running
> arbitrary code. Homogeneous hardware and vendor cooperation are a must,
> and that's only part of it.
> 
> I don't foresee the infra team having the resources to take on such a
> task any time soon (but of course, I'm not well-informed on the infra
> team's workload).
> 
> Another option for baremetal in the gate is baremetal flavors in other
> public clouds - Rackspace has one (OnMetal) but doesn't yet support
> custom images, and others have launched or are working on one. Once
> there are two clouds that support baremetal with custom images, we could
> put those resources in the upstream CI pool.

Depending on exactly what you need baremetal for, we're getting very
close to OVB[1] being usable in an unmodified cloud, especially for
one-time-use CI environments.  I just merged [2] from Steve Baker which
enables PXE booting without Nova hacks, and I've done some successful
tests locally using the Neutron port-security extension to allow PXE
deployment of instances.  The port-security stuff isn't in the git repo
yet because we need to make it compatible with Kilo-based clouds, but
Steve tells me he has a way to make that work.
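
To make the port-security point concrete, here is a minimal sketch of the
Neutron request body involved. It is an illustration only, not code from the
OVB repo: the helper name `port_security_update_body` is made up, and it just
builds the JSON for a port update using the port-security extension's
`port_security_enabled` attribute. It assumes the cloud has that extension
enabled, and that security groups have to be cleared from the port when port
security is turned off.

```python
import json


def port_security_update_body(enabled: bool) -> dict:
    """Build the body for a Neutron 'update port' request.

    Hypothetical helper for illustration; not part of OVB or Neutron.
    """
    port = {"port_security_enabled": enabled}
    if not enabled:
        # Assumption: Neutron rejects disabling port security while
        # security groups are still attached, so clear them as well.
        port["security_groups"] = []
    return {"port": port}


if __name__ == "__main__":
    # Show the body that would disable port security on one port.
    print(json.dumps(port_security_update_body(False), indent=2))
```

The resulting dict would be sent as the body of a `PUT /v2.0/ports/{port_id}`
call; how OVB ends up wiring this in may well differ.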

This obviously doesn't help with the nested virt problem, if that's what
you need baremetal for, but for testing baremetal-style deployments it
works quite well in my experience.  We've started work to make use of it
for TripleO CI[3], and it's already being used for some of our
downstream testing.

I don't know that we're quite ready to just run in regular infra yet
because we do need the ability to upload our custom ipxe-boot image and
we need a cloud at least new enough for the port-security to work (and I
don't know exactly how new is new enough, other than it worked in a
Neutron build from a couple of weeks ago).  It also deploys the VMs with
Heat, so we need that in addition to all the other usual suspects.

For the moment, our plan in TripleO is to re-deploy our rack with an
OVB-friendly cloud and stay separate, but I believe eventually we'd like
to run in a regular infra environment and throw that hardware into the
infra pool (don't quote me on this, I don't have any direct control over
it, but this is my understanding of the plan).  We're way closer to
being able to do that than I had thought a month ago, so I wanted to
bring it up as part of this discussion.

1: https://github.com/cybertron/openstack-virtual-baremetal
2: https://github.com/cybertron/openstack-virtual-baremetal/commit/915269adc73475c1ee6ac722534386ef5dc0250c
3: https://review.openstack.org/#/c/295243

-Ben

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Infra] Generic solution for bare metal testing

2016-04-12 Thread Jim Rollenhagen
On Thu, Apr 07, 2016 at 02:42:09AM +, Jeremy Stanley wrote:
> On 2016-04-06 18:33:06 +0300 (+0300), Igor Belikov wrote:
> [...]
> > I suppose there are security issues when we talk about running
> > custom code on bare metal slaves, but I'm not sure I understand
> > the difference from running custom code on a virtual machine if
> > bare metal nodes are isolated, don't contain any sensitive data
> > and follow a regular redeployment procedure.
> [...]
> 
> With a virtual machine, you can delete it and create a new one.
> Nothing remains behind.
> 
> With a physical machine, arbitrary code running in the scope of a
> test with root access can do _nasty_ things like backdoor your
> server firmware with shims that even masquerade as the firmware
> updater and persist through redeployments that include firmware
> refreshes.
> 
> Physical servers persist, and are therefore vulnerable in this
> scenario in ways which virtual servers are not.

Right, it's a huge effort to run a secure bare metal cloud running
arbitrary code. Homogeneous hardware and vendor cooperation are a must,
and that's only part of it.

I don't foresee the infra team having the resources to take on such a
task any time soon (but of course, I'm not well-informed on the infra
team's workload).

Another option for baremetal in the gate is baremetal flavors in other
public clouds - Rackspace has one (OnMetal) but doesn't yet support
custom images, and others have launched or are working on one. Once
there are two clouds that support baremetal with custom images, we could
put those resources in the upstream CI pool.

// jim

> -- 
> Jeremy Stanley
> 


Re: [openstack-dev] [Infra] Generic solution for bare metal testing

2016-04-06 Thread Jeremy Stanley
On 2016-04-06 18:33:06 +0300 (+0300), Igor Belikov wrote:
[...]
> I suppose there are security issues when we talk about running
> custom code on bare metal slaves, but I'm not sure I understand
> the difference from running custom code on a virtual machine if
> bare metal nodes are isolated, don't contain any sensitive data
> and follow a regular redeployment procedure.
[...]

With a virtual machine, you can delete it and create a new one.
Nothing remains behind.

With a physical machine, arbitrary code running in the scope of a
test with root access can do _nasty_ things like backdoor your
server firmware with shims that even masquerade as the firmware
updater and persist through redeployments that include firmware
refreshes.

Physical servers persist, and are therefore vulnerable in this
scenario in ways which virtual servers are not.
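
A toy model of that difference, for illustration only. The class and function
names are hypothetical and nothing here touches real hardware; it simply
assumes that a redeploy rewrites the disk but leaves firmware alone.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualMachine:
    # Everything a test can modify lives in state that is thrown away.
    disk: set = field(default_factory=set)


@dataclass
class PhysicalServer:
    disk: set = field(default_factory=set)
    # Firmware outlives the operating system image (assumption of this model).
    firmware: set = field(default_factory=set)


def run_untrusted_test(machine) -> None:
    """Simulate a root-level test job that leaves something behind."""
    machine.disk.add("implant")
    if hasattr(machine, "firmware"):
        machine.firmware.add("implant")


def recycle_vm(vm: VirtualMachine) -> VirtualMachine:
    """Delete the VM and create a new one: no state carries over."""
    return VirtualMachine()


def redeploy_server(server: PhysicalServer) -> PhysicalServer:
    """Re-image the disk; the same physical box (and firmware) remains."""
    server.disk.clear()
    return server


if __name__ == "__main__":
    vm, server = VirtualMachine(), PhysicalServer()
    run_untrusted_test(vm)
    run_untrusted_test(server)
    print("VM after recycle:", recycle_vm(vm))
    print("Server after redeploy:", redeploy_server(server))
```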
-- 
Jeremy Stanley



Re: [openstack-dev] [Infra] Generic solution for bare metal testing

2016-04-06 Thread Paul Belanger
On Wed, Apr 06, 2016 at 06:33:06PM +0300, Igor Belikov wrote:
> Hey Stackers,
> 
> In Fuel we use bare metal testing for deployment tests. This is essentially a 
> core component of Fuel CI and as much as we like having it around we’d rather 
> spend time and resources integrating with upstream instead of growing and 
> polishing third-party testing solutions.
> 
> At one of the previous Infra team meetings we discussed the possibility of 
> bringing testing on bare metal nodes to openstack-infra[1]. This is not a new 
> topic: a similar question was brought up by Magnum some time ago[2] and there 
> might be other times this was discussed. We use bare metal testing for Fuel, I 
> assume that Magnum still wants to use it, and TripleO would probably also fit 
> in the picture in some way (though I’m not familiar with the current scheme of 
> TripleO CI). I hope this is enough to consider implementing a generic way to 
> use baremetal nodes in CI.
> 
> The most obvious way to do this seems to be using the existing OpenStack 
> service for bare metal provisioning: Ironic. Ironic fits pretty well into the 
> existing Infra workflow. Ironic usage (in the form of Rackspace's OnMetal) was 
> previously discussed in the Magnum thread[2], with the main technical issue 
> being the inability to use custom Glance images to boot instances. AFAIK the 
> situation hasn't changed much with OnMetal, but Ironic fully supports booting 
> from Glance images created by diskimage-builder, which is exactly the way 
> Nodepool currently works for virtual machines.
> 
> With the work currently going on in InfraCloud there's an opportunity to 
> properly design and implement bare metal testing; the Zuul v3 spec[3] also 
> brings a number of relevant changes to Nodepool. So, summing up some points 
> of a possible implementation:
> * Multiple pools of bare metal nodes under Ironic management are available as 
> a part of InfraCloud
> * Ironic acts as an additional hypervisor for Nova, providing the ability to 
> use bare metal nodes by booting an instance with a specific flavor
> * Nodepool manages booting bare metal instances using the images generated 
> with diskimage-builder and stored in Glance
> * Nodepool also manages redeployment of bare metal nodes - redeploying a 
> Glance image on a bare metal node takes only a few minutes, but the time may 
> depend on the set of cleaning steps used to redeploy a node
> * Bare metal instances are exposed to Jenkins (or a different worker in case 
> of Zuul v3) by Nodepool 
> 
> I suppose there are security issues when we talk about running custom code on 
> bare metal slaves, but I'm not sure I understand the difference from running 
> custom code on a virtual machine if bare metal nodes are isolated, don't 
> contain any sensitive data and follow a regular redeployment procedure.
> 
> I'd like to add that we're ready to start donating hardware from the Fuel CI 
> pool (2 pools in different locations, to be accurate) to help get this 
> initiative off the ground.
> 
> Please share your thoughts and opinions.
> 
Personally, I don't see this happening in the short term.  Currently
infracloud[4] is down (moving data centers) and zuulv3 still has work that
needs to be completed.  While baremetal is a nice-to-have, I don't see us
using infracloud to do this right now. At our recent -infra midcycle, we
talked about not wanting infracloud to become a dominant cloud for nodepool.
Meaning, we'd be bringing it up and down at specific intervals for tasks like
upgrading to the current release.

I agree that zuulv3 has a lot of potential (and I'm super excited to see it
come online) but we also need to work on ansible playbooks to make all this
happen.

TL;DR I see bare metal happening someday, not sure 2016 is in the cards.

My personal opinion.

[4] http://docs.openstack.org/infra/system-config/infra-cloud.html
> [1]http://eavesdrop.openstack.org/meetings/infra/2016/infra.2016-03-29-19.03.log.html
> [2]http://lists.openstack.org/pipermail/openstack-infra/2015-September/003138.html
> [3]http://specs.openstack.org/openstack-infra/infra-specs/specs/zuulv3.html
> --
> Igor Belikov
> Fuel CI Engineer
> ibeli...@mirantis.com
> 
> 


[openstack-dev] [Infra] Generic solution for bare metal testing

2016-04-06 Thread Igor Belikov
Hey Stackers,

In Fuel we use bare metal testing for deployment tests. This is essentially a 
core component of Fuel CI and as much as we like having it around we’d rather 
spend time and resources integrating with upstream instead of growing and 
polishing third-party testing solutions.

At one of the previous Infra team meetings we discussed the possibility of 
bringing testing on bare metal nodes to openstack-infra[1]. This is not a new 
topic: a similar question was brought up by Magnum some time ago[2] and there 
might be other times this was discussed. We use bare metal testing for Fuel, I 
assume that Magnum still wants to use it, and TripleO would probably also fit 
in the picture in some way (though I’m not familiar with the current scheme of 
TripleO CI). I hope this is enough to consider implementing a generic way to 
use baremetal nodes in CI.

The most obvious way to do this seems to be using the existing OpenStack 
service for bare metal provisioning: Ironic. Ironic fits pretty well into the 
existing Infra workflow. Ironic usage (in the form of Rackspace's OnMetal) was 
previously discussed in the Magnum thread[2], with the main technical issue 
being the inability to use custom Glance images to boot instances. AFAIK the 
situation hasn't changed much with OnMetal, but Ironic fully supports booting 
from Glance images created by diskimage-builder, which is exactly the way 
Nodepool currently works for virtual machines.

With the work currently going on in InfraCloud there's an opportunity to 
properly design and implement bare metal testing; the Zuul v3 spec[3] also 
brings a number of relevant changes to Nodepool. So, summing up some points 
of a possible implementation:
* Multiple pools of bare metal nodes under Ironic management are available as a 
part of InfraCloud
* Ironic acts as an additional hypervisor for Nova, providing the ability to 
use bare metal nodes by booting an instance with a specific flavor
* Nodepool manages booting bare metal instances using the images generated with 
diskimage-builder and stored in Glance
* Nodepool also manages redeployment of bare metal nodes - redeploying a 
Glance image on a bare metal node takes only a few minutes, but the time may 
depend on the set of cleaning steps used to redeploy a node
* Bare metal instances are exposed to Jenkins (or a different worker in case of 
Zuul v3) by Nodepool 
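
The workflow in the list above can be sketched as a small state machine. This
is a simplified illustration with made-up names (`NodeState`, `run_one_job`);
it is not Nodepool or Ironic code, and real Ironic provisioning has more
states than the four shown here.

```python
from enum import Enum


class NodeState(Enum):
    AVAILABLE = "available"   # in the pool, ready to be handed out
    DEPLOYING = "deploying"   # Glance image being written to the node
    IN_USE = "in_use"         # exposed to the CI worker, running a job
    CLEANING = "cleaning"     # cleaning steps before returning to the pool


# Each state has exactly one successor in this simplified model.
TRANSITIONS = {
    NodeState.AVAILABLE: NodeState.DEPLOYING,
    NodeState.DEPLOYING: NodeState.IN_USE,
    NodeState.IN_USE: NodeState.CLEANING,
    NodeState.CLEANING: NodeState.AVAILABLE,
}


def advance(state: NodeState) -> NodeState:
    """Move a node to the next state in the cycle."""
    return TRANSITIONS[state]


def run_one_job(state: NodeState) -> list:
    """Walk one node through a full job cycle and return the states visited."""
    if state is not NodeState.AVAILABLE:
        raise ValueError("node must be available before it can take a job")
    history = [state]
    while True:
        state = advance(state)
        history.append(state)
        if state is NodeState.AVAILABLE:
            return history


if __name__ == "__main__":
    for step in run_one_job(NodeState.AVAILABLE):
        print(step.value)
```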

I suppose there are security issues when we talk about running custom code on 
bare metal slaves, but I'm not sure I understand the difference from running 
custom code on a virtual machine if bare metal nodes are isolated, don't 
contain any sensitive data and follow a regular redeployment procedure.

I'd like to add that we're ready to start donating hardware from the Fuel CI 
pool (2 pools in different locations, to be accurate) to help get this 
initiative off the ground.

Please share your thoughts and opinions.

[1]http://eavesdrop.openstack.org/meetings/infra/2016/infra.2016-03-29-19.03.log.html
[2]http://lists.openstack.org/pipermail/openstack-infra/2015-September/003138.html
[3]http://specs.openstack.org/openstack-infra/infra-specs/specs/zuulv3.html
--
Igor Belikov
Fuel CI Engineer
ibeli...@mirantis.com

