Re: [openstack-dev] [nova][ironic] A couple feature freeze exception requests
> Multitenant networking
> ======================

I haven't reviewed this one much either, but it looks smallish, and if
other people are good with it then I think it's probably something we
should do.

> Multi-compute usage via a hash ring
> ===================================

I'm obviously +2 on this one :)

--Dan
Re: [openstack-dev] [nova][ironic] A couple feature freeze exception requests
On 8/1/2016 4:20 PM, Jim Rollenhagen wrote:
> Yes, I know this is stupid late for these. I'd like to request two
> exceptions to the non-priority feature freeze, for a couple of features
> in the Ironic driver. These were not requested at the normal time as I
> thought they were nowhere near ready.
>
> Multitenant networking
> ======================
>
> Ironic's top feature request for around 2 years now has been to make
> networking safe for multitenant use, as opposed to a flat network
> (including control plane access!) for all tenants. We've been working
> on a solution for 3 cycles now, and finally have the Ironic pieces of
> it done, after a heroic effort to finish things up this cycle. There's
> just one patch left to make it work, in the virt driver in Nova. That
> is here: https://review.openstack.org/#/c/297895/
>
> It's important to note that this actually fixes some dead code we
> pushed on before this feature was done, and is only ~50 lines, half of
> which are comments/reno.
>
> Reviewers on this unearthed a problem on the ironic side, which I
> expect to be fixed in the next couple of days:
> https://review.openstack.org/#/q/topic:bug/1608511
>
> We also have CI for this feature in ironic, and I have a depends-on
> testing all of this as a whole:
> https://review.openstack.org/#/c/347004/
>
> Per Matt's request, I'm also adding that job to Nova's experimental
> queue: https://review.openstack.org/#/c/349595/
>
> A couple folks from the ironic team have also done some manual testing
> of this feature, with the nova code in, using real switches.
>
> Merging this patch would bring a *huge* win for deployers and
> operators, and I don't think it's very risky. It'll be ready to go
> sometime this week, once that ironic chain is merged.

I've reviewed this one and it looks good to me. It's dependent on
python-ironicclient>=1.5.0, which Jim has a g-r bump up for as a
dependency. And the gate-tempest-dsvm-ironic-multitenant-network-nv job
is testing this and passing on the test patch in ironic (and that job is
in the nova experimental queue now).

The upgrade procedure had some people scratching their heads in IRC this
week, so I've stated that we need clear documentation there, which will
probably live here:

http://docs.openstack.org/developer/ironic/deploy/upgrade-guide.html

since Ironic isn't in here:

http://docs.openstack.org/ops-guide/ops_upgrades.html#update-services

The docs in the Ironic repo say that Nova should be upgraded first when
going from Juno to Kilo, so it's definitely important to get those docs
updated for upgrades from Mitaka to Newton; Jim said he'd do that this
cycle.

Given how long people have been asking for this in Ironic, that the
Ironic team has made it a priority to get it working on their side, and
that there is already CI and only a small change in Nova, I'm OK with
giving a non-priority FFE for this.

> Multi-compute usage via a hash ring
> ===================================
>
> One of the major problems with the ironic virt driver today is that we
> don't support running multiple nova-compute daemons with the ironic
> driver loaded, because each compute service manages all ironic nodes
> and they stomp on each other. There's currently a hack in the ironic
> virt driver to kind of make this work, but instance locking still
> isn't done:
> https://github.com/openstack/ironic/blob/master/ironic/nova/compute/manager.py
>
> That is also holding back removing the pluggable compute manager in
> nova:
> https://github.com/openstack/nova/blob/master/nova/conf/service.py#L64-L69
>
> And as someone that runs a deployment using this hack, I can tell you
> first-hand that it doesn't work well.
> We (the ironic and nova community) have been working on fixing this
> for 2-3 cycles now, trying to find a solution that isn't terrible and
> doesn't break existing use cases. We've been conflating it with how we
> schedule ironic instances and keep managing to find a big wedge with
> each approach. The best approach we've found involves duplicating the
> compute capabilities and affinity filters in ironic.
>
> Some of us were talking at the nova midcycle and decided we should try
> the hash ring approach (like ironic uses to shard nodes between
> conductors) and see how it works out, even though people have said in
> the past that it wouldn't work. I did a proof of concept last week,
> and started playing with five compute daemons in a devstack
> environment. Two nerd-snipey days later, I had a fully working
> solution, with unit tests, passing CI. That is here:
> https://review.openstack.org/#/c/348443/
>
> We'll need to work on CI for this with multiple compute services. That
> shouldn't be crazy difficult, but I'm not sure we'll have it done this
> cycle (and it might get interesting trying to test computes joining
> and leaving the cluster). It also needs some testing at scale, which
> is hard to do in the upstream gate, but I'll be doing my best to ship
> this downstream as soon as I can, and iterating on any problems we see
> there. It's a huge win for operators, for only a few hundred lines
> (some of which will be pulled out to oslo next cycle, as it's copied
> from ironic).
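For context on the "pluggable compute manager" link quoted above: nova
still exposes an oslo.config option that lets a deployment substitute
its own compute manager class, and the ironic hack works by pointing
that option at the ClusteredComputeManager shim in the ironic tree. A
rough sketch of what that option looks like (the option name and default
match the linked nova/conf/service.py; the help text and surrounding
code here are paraphrased, not copied):

    from oslo_config import cfg

    # Sketch of the pluggable-manager option referenced above.
    # Deployments using the ironic multi-compute hack override the
    # default to point at ironic's ClusteredComputeManager instead.
    service_opts = [
        cfg.StrOpt('compute_manager',
                   default='nova.compute.manager.ComputeManager',
                   help='Full class name for the compute manager.'),
    ]

    CONF = cfg.CONF
    CONF.register_opts(service_opts)

Landing the hash ring work is what would finally let that option be
removed.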
Re: [openstack-dev] [nova][ironic] A couple feature freeze exception requests
On 08/01/2016 05:20 PM, Jim Rollenhagen wrote:
> Yes, I know this is stupid late for these. I'd like to request two
> exceptions to the non-priority feature freeze, for a couple of features
> in the Ironic driver. These were not requested at the normal time as I
> thought they were nowhere near ready.
>
> Multitenant networking
> ======================
>
> Ironic's top feature request for around 2 years now has been to make
> networking safe for multitenant use, as opposed to a flat network
> (including control plane access!) for all tenants. We've been working
> on a solution for 3 cycles now, and finally have the Ironic pieces of
> it done, after a heroic effort to finish things up this cycle. There's
> just one patch left to make it work, in the virt driver in Nova. That
> is here: https://review.openstack.org/#/c/297895/

Reviewed. +2 from me, under the assumption that Ironic must always be
upgraded before Nova, per our discussion on IRC on the same topic today.

> It's important to note that this actually fixes some dead code we
> pushed on before this feature was done, and is only ~50 lines, half of
> which are comments/reno.
>
> Reviewers on this unearthed a problem on the ironic side, which I
> expect to be fixed in the next couple of days:
> https://review.openstack.org/#/q/topic:bug/1608511
>
> We also have CI for this feature in ironic, and I have a depends-on
> testing all of this as a whole:
> https://review.openstack.org/#/c/347004/
>
> Per Matt's request, I'm also adding that job to Nova's experimental
> queue: https://review.openstack.org/#/c/349595/
>
> A couple folks from the ironic team have also done some manual testing
> of this feature, with the nova code in, using real switches.
>
> Merging this patch would bring a *huge* win for deployers and
> operators, and I don't think it's very risky. It'll be ready to go
> sometime this week, once that ironic chain is merged.

++

> Multi-compute usage via a hash ring
> ===================================
>
> One of the major problems with the ironic virt driver today is that we
> don't support running multiple nova-compute daemons with the ironic
> driver loaded, because each compute service manages all ironic nodes
> and they stomp on each other. There's currently a hack in the ironic
> virt driver to kind of make this work, but instance locking still
> isn't done:
> https://github.com/openstack/ironic/blob/master/ironic/nova/compute/manager.py
>
> That is also holding back removing the pluggable compute manager in
> nova:
> https://github.com/openstack/nova/blob/master/nova/conf/service.py#L64-L69
>
> And as someone that runs a deployment using this hack, I can tell you
> first-hand that it doesn't work well.
>
> We (the ironic and nova community) have been working on fixing this
> for 2-3 cycles now, trying to find a solution that isn't terrible and
> doesn't break existing use cases. We've been conflating it with how we
> schedule ironic instances and keep managing to find a big wedge with
> each approach. The best approach we've found involves duplicating the
> compute capabilities and affinity filters in ironic.
>
> Some of us were talking at the nova midcycle and decided we should try
> the hash ring approach (like ironic uses to shard nodes between
> conductors) and see how it works out, even though people have said in
> the past that it wouldn't work. I did a proof of concept last week,
> and started playing with five compute daemons in a devstack
> environment. Two nerd-snipey days later, I had a fully working
> solution, with unit tests, passing CI. That is here:
> https://review.openstack.org/#/c/348443/

w00t :)

> We'll need to work on CI for this with multiple compute services.
> That shouldn't be crazy difficult, but I'm not sure we'll have it done
> this cycle (and it might get interesting trying to test computes
> joining and leaving the cluster). It also needs some testing at scale,
> which is hard to do in the upstream gate, but I'll be doing my best to
> ship this downstream as soon as I can, and iterating on any problems
> we see there. It's a huge win for operators, for only a few hundred
> lines (some of which will be pulled out to oslo next cycle, as it's
> copied from ironic). The single compute mode would still be
> recommended while we iron out any issues here, and that mode is
> well-understood (as this will behave the same in that case).
>
> We have a couple of nova cores on board with helping get this through,
> and I think it's totally doable.
>
> Thanks for hearing me out,
>
> // jim
[openstack-dev] [nova][ironic] A couple feature freeze exception requests
Yes, I know this is stupid late for these. I'd like to request two
exceptions to the non-priority feature freeze, for a couple of features
in the Ironic driver. These were not requested at the normal time as I
thought they were nowhere near ready.

Multitenant networking
======================

Ironic's top feature request for around 2 years now has been to make
networking safe for multitenant use, as opposed to a flat network
(including control plane access!) for all tenants. We've been working on
a solution for 3 cycles now, and finally have the Ironic pieces of it
done, after a heroic effort to finish things up this cycle. There's just
one patch left to make it work, in the virt driver in Nova. That is
here: https://review.openstack.org/#/c/297895/

It's important to note that this actually fixes some dead code we pushed
on before this feature was done, and is only ~50 lines, half of which
are comments/reno.

Reviewers on this unearthed a problem on the ironic side, which I expect
to be fixed in the next couple of days:
https://review.openstack.org/#/q/topic:bug/1608511

We also have CI for this feature in ironic, and I have a depends-on
testing all of this as a whole: https://review.openstack.org/#/c/347004/

Per Matt's request, I'm also adding that job to Nova's experimental
queue: https://review.openstack.org/#/c/349595/

A couple folks from the ironic team have also done some manual testing
of this feature, with the nova code in, using real switches.

Merging this patch would bring a *huge* win for deployers and operators,
and I don't think it's very risky. It'll be ready to go sometime this
week, once that ironic chain is merged.

Multi-compute usage via a hash ring
===================================

One of the major problems with the ironic virt driver today is that we
don't support running multiple nova-compute daemons with the ironic
driver loaded, because each compute service manages all ironic nodes and
they stomp on each other. There's currently a hack in the ironic virt
driver to kind of make this work, but instance locking still isn't done:
https://github.com/openstack/ironic/blob/master/ironic/nova/compute/manager.py

That is also holding back removing the pluggable compute manager in
nova:
https://github.com/openstack/nova/blob/master/nova/conf/service.py#L64-L69

And as someone that runs a deployment using this hack, I can tell you
first-hand that it doesn't work well.

We (the ironic and nova community) have been working on fixing this for
2-3 cycles now, trying to find a solution that isn't terrible and
doesn't break existing use cases. We've been conflating it with how we
schedule ironic instances and keep managing to find a big wedge with
each approach. The best approach we've found involves duplicating the
compute capabilities and affinity filters in ironic.

Some of us were talking at the nova midcycle and decided we should try
the hash ring approach (like ironic uses to shard nodes between
conductors) and see how it works out, even though people have said in
the past that it wouldn't work. I did a proof of concept last week, and
started playing with five compute daemons in a devstack environment. Two
nerd-snipey days later, I had a fully working solution, with unit tests,
passing CI. That is here: https://review.openstack.org/#/c/348443/

We'll need to work on CI for this with multiple compute services. That
shouldn't be crazy difficult, but I'm not sure we'll have it done this
cycle (and it might get interesting trying to test computes joining and
leaving the cluster).
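To make the approach concrete, here is a minimal, self-contained sketch
of the idea (a generic consistent-hash ring; this is not ironic's actual
hash ring code, and the names HashRing and nodes_for_host are invented
for illustration). Each compute service builds the same ring from the
list of live compute hostnames and claims only the ironic nodes that
hash to it, so N computes each manage roughly 1/N of the nodes without
needing to coordinate:

    import bisect
    import hashlib


    class HashRing(object):
        """Toy consistent-hash ring mapping ironic node UUIDs to
        compute service hostnames. Illustrative only."""

        def __init__(self, hosts, replicas=32):
            # Place several virtual points per host on the ring so
            # nodes spread evenly and only ~1/N of them move when
            # membership changes.
            self._ring = sorted(
                (self._hash('%s-%d' % (host, r)), host)
                for host in hosts for r in range(replicas))
            self._keys = [k for k, _ in self._ring]

        @staticmethod
        def _hash(key):
            return int(hashlib.md5(key.encode('utf-8')).hexdigest(), 16)

        def get_host(self, node_uuid):
            # Walk clockwise to the first virtual point at or after the
            # node's hash, wrapping around the end of the ring.
            idx = bisect.bisect(self._keys,
                                self._hash(node_uuid)) % len(self._keys)
            return self._ring[idx][1]


    def nodes_for_host(ring, node_uuids, hostname):
        """The subset of ironic nodes this compute should manage."""
        return [n for n in node_uuids if ring.get_host(n) == hostname]

Because every compute service computes the same mapping independently,
there is no lock manager or leader election involved; the hard part is
keeping the membership list (which computes are alive) consistent, which
is exactly why testing computes joining and leaving the cluster matters.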
It also needs some testing at scale, which is hard to do in the upstream
gate, but I'll be doing my best to ship this downstream as soon as I
can, and iterating on any problems we see there. It's a huge win for
operators, for only a few hundred lines (some of which will be pulled
out to oslo next cycle, as it's copied from ironic). The single compute
mode would still be recommended while we iron out any issues here, and
that mode is well-understood (as this will behave the same in that
case).

We have a couple of nova cores on board with helping get this through,
and I think it's totally doable.

Thanks for hearing me out,

// jim
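As a quick illustration of the joining/leaving behavior discussed above,
here is what removing one compute does to the node assignment, reusing
the toy HashRing sketch from earlier in the thread (the compute1-5
hostnames are made up):

    import uuid

    nodes = [str(uuid.uuid4()) for _ in range(1000)]
    before = HashRing(['compute1', 'compute2', 'compute3', 'compute4',
                       'compute5'])
    after = HashRing(['compute1', 'compute2', 'compute3', 'compute4'])

    moved = sum(1 for n in nodes
                if before.get_host(n) != after.get_host(n))
    # With consistent hashing, only the ~1/5 of nodes that lived on
    # compute5 get reassigned; the other hosts keep their nodes, so
    # instances aren't needlessly handed between compute services.
    print('%d of %d nodes moved' % (moved, len(nodes)))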