Re: [openstack-dev] [neutron] Neutron rolling upgrade - are we there yet?
Hi folks, I see there is significant interest in the Neutron upgrade strategy. I suggest we meet at the summit on Friday during the ‘unplugged’ track in a small group and come up with a plan for Mitaka and beyond. I'm starting to think that the work we can expect is quite enormous, and some coordination is due. Maybe we’ll need to form a small subteam to track the effort in Mitaka. I started an etherpad to track upgrade strategy discussions at: https://etherpad.openstack.org/p/neutron-upgrade-strategy Note that Artur already added the track to the unplugged etherpad: https://etherpad.openstack.org/p/mitaka-neutron-unplugged-track See you in Tokyo, Ihar > On 19 Oct 2015, at 11:03, Miguel Angel Ajo wrote: > > Rossella Sblendido wrote: >> Hello Artur, >> >> thanks for starting this thread. See inline please. >> >> On 10/15/2015 05:23 PM, Ihar Hrachyshka wrote: >>> Hi Artur, >>> >>> thanks a lot for caring about upgrades! >>> >>> There are a lot of good points below. As you noted, surprisingly, we seem >>> to have rolling upgrades working for the RPC layer. Before we go into >>> complicating the database workflow by doing the oslo.versionedobjects transition >>> heavy-lifting, I would like us to spend cycles on making sure rolling >>> upgrades work not just surprisingly, but are also covered with appropriate >>> gating (I mean grenade). >> >> +1 agreed that the first step is to have test coverage, then we can go on >> improving the process :) >> >>> >>> I also feel that upgrades are in lots of ways not only a technical issue, >>> but a cultural one too. You should have reviewers being aware of all the >>> moving parts, and how a seemingly innocent change can break the flow. >>> That’s why I plan to start on a devref page specifically about upgrades, >>> where we could lay the ground about which scenarios we should support, and >>> those we should not (f.e. 
we have plenty of compatibility code in agents >>> to handle the old-controller scenario, which should not be supported); how >>> all the pieces interact and behave in transition, and what to look for during >>> reviews. Hopefully, once such a page is up and read by folks, we will be >>> able to have a more meaningful conversation about our upgrade strategy. >>> On 14 Oct 2015, at 20:10, Korzeniewski, Artur wrote: Hi all, I would like to gather all upgrade activities in Neutron in one place, in order to summarize the current status and future activities on rolling upgrades in Mitaka. >>> >>> If you think it’s worth it, we can start up a new etherpad page to gather >>> upgrade ideas and things to do. >>> 1. RPC versioning a. It is already implemented in Neutron. b. TODO: To have rolling upgrades we have to implement RPC version pinning in conf. i. I’m not a big fan of this solution, but we can work out a better idea if needed. >>> >>> As Dan pointed out, and as I think Miguel was thinking about, we can have the >>> pin defined by agents in the cluster. Actually, we can have a per-agent pin. >> >> I am not a big fan either, mostly because the pinning is a manual task. >> Anyway, looking at the patch Dan linked, >> https://review.openstack.org/#/c/233289/ ...if we remove the manual step I >> can become a fan of this approach :) >> > Yes, the minimum implementation we could agree on initially was pinning. > Direct requests of objects from agents > to neutron-server include the requested version, so that's always OK; the > complicated part is the notification of object > changes via fanout. > > In that case, I'm thinking of including the supported object versions in agent > status reports, so the neutron server can > decide at runtime which versions to send (in some cases it may need to send > several versions in parallel). I'm > long overdue to upload the strategy to the rpc callbacks devref, but it will be > along those lines. > >>> c. 
Possible unit/functional tests to catch RPC version incompatibilities between RPC revisions. d. TODO: Multi-node Grenade job to have rolling upgrades covered in CI. >>> >>> That is not something for the unit or functional test level. >>> >>> As you mentioned, we already have the grenade project that is designed to test >>> upgrades. To validate RPC compatibility on rolling upgrade we would need a >>> so-called ‘partial’ job (where different components are running with different >>> versions; in the case of neutron it would mean a new controller and old >>> agents). The job is present in the nova gate and validates RPC compatibility. >>> >>> As far as I know, Russell Bryant was looking into introducing the job for >>> neutron, but was blocked by ongoing grenade refactoring to support partial >>> upgrades ‘the right way’ (using multinode setups). I think that we should >>> check with grenade folks on that matter; I have heard the start of Mitaka was the ETA for this work to complete.
Re: [openstack-dev] [neutron] Neutron rolling upgrade - are we there yet?
Hello Artur, thanks for starting this thread. See inline please. On 10/15/2015 05:23 PM, Ihar Hrachyshka wrote: Hi Artur, thanks a lot for caring about upgrades! There are a lot of good points below. As you noted, surprisingly, we seem to have rolling upgrades working for the RPC layer. Before we go into complicating the database workflow by doing the oslo.versionedobjects transition heavy-lifting, I would like us to spend cycles on making sure rolling upgrades work not just surprisingly, but are also covered with appropriate gating (I mean grenade). +1 agreed that the first step is to have test coverage, then we can go on improving the process :) I also feel that upgrades are in lots of ways not only a technical issue, but a cultural one too. You should have reviewers being aware of all the moving parts, and how a seemingly innocent change can break the flow. That’s why I plan to start on a devref page specifically about upgrades, where we could lay the ground about which scenarios we should support, and those we should not (f.e. we have plenty of compatibility code in agents to handle the old-controller scenario, which should not be supported); how all the pieces interact and behave in transition, and what to look for during reviews. Hopefully, once such a page is up and read by folks, we will be able to have a more meaningful conversation about our upgrade strategy. On 14 Oct 2015, at 20:10, Korzeniewski, Artur wrote: Hi all, I would like to gather all upgrade activities in Neutron in one place, in order to summarize the current status and future activities on rolling upgrades in Mitaka. If you think it’s worth it, we can start up a new etherpad page to gather upgrade ideas and things to do. 1. RPC versioning a. It is already implemented in Neutron. b. TODO: To have rolling upgrades we have to implement RPC version pinning in conf. i. I’m not a big fan of this solution, but we can work out a better idea if needed. 
As Dan pointed out, and as I think Miguel was thinking about, we can have the pin defined by agents in the cluster. Actually, we can have a per-agent pin. I am not a big fan either, mostly because the pinning is a manual task. Anyway, looking at the patch Dan linked, https://review.openstack.org/#/c/233289/ ...if we remove the manual step I can become a fan of this approach :) c. Possible unit/functional tests to catch RPC version incompatibilities between RPC revisions. d. TODO: Multi-node Grenade job to have rolling upgrades covered in CI. That is not something for the unit or functional test level. As you mentioned, we already have the grenade project that is designed to test upgrades. To validate RPC compatibility on rolling upgrade we would need a so-called ‘partial’ job (where different components are running with different versions; in the case of neutron it would mean a new controller and old agents). The job is present in the nova gate and validates RPC compatibility. As far as I know, Russell Bryant was looking into introducing the job for neutron, but was blocked by ongoing grenade refactoring to support partial upgrades ‘the right way’ (using multinode setups). I think that we should check with grenade folks on that matter; I have heard the start of Mitaka was the ETA for this work to complete. 2. Message content versioning – versioned objects a. TODO: implement oslo.versionedobjects in the Mitaka cycle. The interesting entities to be implemented: network, subnet, port, security groups… Though we haven’t touched base neutron resources in Liberty, we introduced an oslo.versionedobjects-based NeutronObject class during Liberty as part of the QoS effort. I plan to expand on that work during Mitaka. The existing code for QoS resources can be found at: https://github.com/openstack/neutron/tree/master/neutron/objects b. Will OVO have an impact on vendor plugins? 
It surely can have a significant impact, but hopefully the dict compat layer should make the transition smoother: https://github.com/openstack/neutron/blob/master/neutron/objects/base.py#L50 c. Be strict on changes in versioned objects in code review; any change in object structure should increment the minor (backward-compatible) or major (breaking change) RPC version. That’s assuming we have a clear mapping of objects onto current RPC interfaces, which is not obvious. Another problem we would need to solve is core resource extensions (currently available in ml2 only), like qos or port_security, that modify resources based on controller configuration. d. Indirection API – messages in a newer format should be translated to the older version by the neutron server. For QoS, we used a new object-agnostic subscriber mechanism to propagate changes applied to QoS objects into agents: http://docs.openstack.org/developer/neutron/devref/rpc_callbacks.html It is expected to downgrade objects based on agent version (note it’s not implemented yet, but will surely be ready during Mitaka): https://github.com/openstack/neutron/blob/master/neutron/api/rpc/handlers/resources_rpc.py#L142
Re: [openstack-dev] [neutron] Neutron rolling upgrade - are we there yet?
Rossella Sblendido wrote: Hello Artur, thanks for starting this thread. See inline please. On 10/15/2015 05:23 PM, Ihar Hrachyshka wrote: Hi Artur, thanks a lot for caring about upgrades! There are a lot of good points below. As you noted, surprisingly, we seem to have rolling upgrades working for the RPC layer. Before we go into complicating the database workflow by doing the oslo.versionedobjects transition heavy-lifting, I would like us to spend cycles on making sure rolling upgrades work not just surprisingly, but are also covered with appropriate gating (I mean grenade). +1 agreed that the first step is to have test coverage, then we can go on improving the process :) I also feel that upgrades are in lots of ways not only a technical issue, but a cultural one too. You should have reviewers being aware of all the moving parts, and how a seemingly innocent change can break the flow. That’s why I plan to start on a devref page specifically about upgrades, where we could lay the ground about which scenarios we should support, and those we should not (f.e. we have plenty of compatibility code in agents to handle the old-controller scenario, which should not be supported); how all the pieces interact and behave in transition, and what to look for during reviews. Hopefully, once such a page is up and read by folks, we will be able to have a more meaningful conversation about our upgrade strategy. On 14 Oct 2015, at 20:10, Korzeniewski, Artur wrote: Hi all, I would like to gather all upgrade activities in Neutron in one place, in order to summarize the current status and future activities on rolling upgrades in Mitaka. If you think it’s worth it, we can start up a new etherpad page to gather upgrade ideas and things to do. 1. RPC versioning a. It is already implemented in Neutron. b. TODO: To have rolling upgrades we have to implement RPC version pinning in conf. i. I’m not a big fan of this solution, but we can work out a better idea if needed. 
As Dan pointed out, and as I think Miguel was thinking about, we can have the pin defined by agents in the cluster. Actually, we can have a per-agent pin. I am not a big fan either, mostly because the pinning is a manual task. Anyway, looking at the patch Dan linked, https://review.openstack.org/#/c/233289/ ...if we remove the manual step I can become a fan of this approach :) Yes, the minimum implementation we could agree on initially was pinning. Direct requests of objects from agents to neutron-server include the requested version, so that's always OK; the complicated part is the notification of object changes via fanout. In that case, I'm thinking of including the supported object versions in agent status reports, so the neutron server can decide at runtime which versions to send (in some cases it may need to send several versions in parallel). I'm long overdue to upload the strategy to the rpc callbacks devref, but it will be along those lines. c. Possible unit/functional tests to catch RPC version incompatibilities between RPC revisions. d. TODO: Multi-node Grenade job to have rolling upgrades covered in CI. That is not something for the unit or functional test level. As you mentioned, we already have the grenade project that is designed to test upgrades. To validate RPC compatibility on rolling upgrade we would need a so-called ‘partial’ job (where different components are running with different versions; in the case of neutron it would mean a new controller and old agents). The job is present in the nova gate and validates RPC compatibility. As far as I know, Russell Bryant was looking into introducing the job for neutron, but was blocked by ongoing grenade refactoring to support partial upgrades ‘the right way’ (using multinode setups). I think that we should check with grenade folks on that matter; I have heard the start of Mitaka was the ETA for this work to complete. 2. Message content versioning – versioned objects a. TODO: implement oslo.versionedobjects in the Mitaka cycle. 
The interesting entities to be implemented: network, subnet, port, security groups… Though we haven’t touched base neutron resources in Liberty, we introduced an oslo.versionedobjects-based NeutronObject class during Liberty as part of the QoS effort. I plan to expand on that work during Mitaka. ++ The existing code for QoS resources can be found at: https://github.com/openstack/neutron/tree/master/neutron/objects b. Will OVO have an impact on vendor plugins? It surely can have a significant impact, but hopefully the dict compat layer should make the transition smoother: https://github.com/openstack/neutron/blob/master/neutron/objects/base.py#L50 Correct. c. Be strict on changes in versioned objects in code review; any change in object structure should increment the minor (backward-compatible) or major (breaking change) RPC version.
Re: [openstack-dev] [neutron] Neutron rolling upgrade - are we there yet?
Thanks a lot for bringing up this theme! I'm interested in working on online data migration in Mitaka. 3. Database migration a. Online schema migration was done in the Liberty release, any work left to do? The work here is finished. The only thing I'm aware of is some extra tests: https://review.openstack.org/#/c/220091/. But this needs some Alembic changes. All the main functionality is implemented. b. TODO: Online data migration to be introduced in the Mitaka cycle. i. Online data migration can be done during normal operation on the data. ii. There should also be a script to invoke the data migration in the background. c. Currently the contract phase is doing the data migration. But since the contract phase should be run offline, we should move the data migration to a preceding step. Also the contract phase should be blocked if there is still relevant data in removed entities. i. The contract phase can be executed online if all new code is running in the setup. d. The other strategy is to not drop tables, alter names or remove columns from the DB – what’s in, it’s in. We should put more attention on code reviews, merge only additive changes and avoid questionable DB modifications. Unfortunately we may sometimes need such changes, although we have always tried to avoid them. As the plugins were moved out of Neutron it can be easier now, but I'm still not sure we can have such a restriction. e. The Neutron server should be updated first, in order to do data translation from the old format into the new schema. When doing this, we can be sure that old data would not be inserted into old DB structures. On Wed, Oct 14, 2015 at 9:27 PM, Dan Smith wrote: > > I would like to gather all upgrade activities in Neutron in one place, > > in order to summarize the current status and future activities on > > rolling upgrades in Mitaka. > > Glad to see this work really picking up steam in other projects! > > > b. TODO: To have rolling upgrades we have to implement RPC > > version pinning in conf. 
> > > > i. I’m not a big > > fan of this solution, but we can work out a better idea if needed. > > I'll just point to this: > > https://review.openstack.org/#/c/233289/ > > and if you go check the logs for the partial-ncpu job, you'll see > something like this: > > nova.compute.rpcapi Automatically selected compute RPC version 4.5 > from minimum service version 2 > > I think that some amount of RPC pinning is probably going to be required > for most people in most places, given our current model. But I assume > the concern is around requiring this to be a manual task the operators > have to manage. The above patch is the first step towards nova removing > this as something the operators have to know anything about. > > --Dan > > __ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > -- Regards, Ann Kamyshnikova Mirantis, Inc
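Ann's point about moving data migration out of the offline contract phase matches Artur's item 3.b: translate rows in small batches during normal operation, driven by a background script, so that by the time the contract phase runs there is no data left to move. Below is a hypothetical, self-contained sketch of that loop — the table, column names, and the transform are invented for illustration, not Neutron code.

```python
# Hypothetical sketch of an online data migration loop (point 3.b).
# Rows are represented as dicts; a real implementation would issue
# bounded UPDATE queries against the database instead.

def migrate_batch(rows, batch_size=50):
    """Translate up to batch_size not-yet-migrated rows to the new
    format. Returns how many rows were migrated (0 = nothing left)."""
    done = 0
    for row in rows:
        if done >= batch_size:
            break
        if row['new_column'] is None:                      # not migrated yet
            row['new_column'] = row['old_column'].upper()  # example transform
            done += 1
    return done

def run_until_complete(rows):
    """The background script: keep migrating small batches, so normal
    API traffic is never blocked for long."""
    total = 0
    while True:
        n = migrate_batch(rows)
        if n == 0:
            return total
        total += n

rows = [{'old_column': 'a', 'new_column': None} for _ in range(120)]
migrated = run_until_complete(rows)
```

Once the loop reports zero remaining rows, a later contract migration can safely drop the old column, which is exactly the ordering point 3.c argues for.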
Re: [openstack-dev] [neutron] Neutron rolling upgrade - are we there yet?
Hi Artur, thanks a lot for caring about upgrades! There are a lot of good points below. As you noted, surprisingly, we seem to have rolling upgrades working for the RPC layer. Before we go into complicating the database workflow by doing the oslo.versionedobjects transition heavy-lifting, I would like us to spend cycles on making sure rolling upgrades work not just surprisingly, but are also covered with appropriate gating (I mean grenade). I also feel that upgrades are in lots of ways not only a technical issue, but a cultural one too. You should have reviewers being aware of all the moving parts, and how a seemingly innocent change can break the flow. That’s why I plan to start on a devref page specifically about upgrades, where we could lay the ground about which scenarios we should support, and those we should not (f.e. we have plenty of compatibility code in agents to handle the old-controller scenario, which should not be supported); how all the pieces interact and behave in transition, and what to look for during reviews. Hopefully, once such a page is up and read by folks, we will be able to have a more meaningful conversation about our upgrade strategy. > On 14 Oct 2015, at 20:10, Korzeniewski, Artur wrote: > > Hi all, > > I would like to gather all upgrade activities in Neutron in one place, in > order to summarize the current status and future activities on rolling > upgrades in Mitaka. > If you think it’s worth it, we can start up a new etherpad page to gather upgrade ideas and things to do. > > > 1. RPC versioning > > a. It is already implemented in Neutron. > > b. TODO: To have rolling upgrades we have to implement RPC > version pinning in conf. > > i. I’m not a big fan > of this solution, but we can work out a better idea if needed. As Dan pointed out, and as I think Miguel was thinking about, we can have the pin defined by agents in the cluster. Actually, we can have a per-agent pin. > > c. 
Possible unit/functional tests to catch RPC version incompatibilities > between RPC revisions. > > d. TODO: Multi-node Grenade job to have rolling upgrades covered in CI. That is not something for the unit or functional test level. As you mentioned, we already have the grenade project that is designed to test upgrades. To validate RPC compatibility on rolling upgrade we would need a so-called ‘partial’ job (where different components are running with different versions; in the case of neutron it would mean a new controller and old agents). The job is present in the nova gate and validates RPC compatibility. As far as I know, Russell Bryant was looking into introducing the job for neutron, but was blocked by ongoing grenade refactoring to support partial upgrades ‘the right way’ (using multinode setups). I think that we should check with grenade folks on that matter; I have heard the start of Mitaka was the ETA for this work to complete. > > 2. Message content versioning – versioned objects > > a. TODO: implement oslo.versionedobjects in the Mitaka cycle. The interesting > entities to be implemented: network, subnet, port, security groups… Though we haven’t touched base neutron resources in Liberty, we introduced an oslo.versionedobjects-based NeutronObject class during Liberty as part of the QoS effort. I plan to expand on that work during Mitaka. The existing code for QoS resources can be found at: https://github.com/openstack/neutron/tree/master/neutron/objects > > b. Will OVO have an impact on vendor plugins? It surely can have a significant impact, but hopefully the dict compat layer should make the transition smoother: https://github.com/openstack/neutron/blob/master/neutron/objects/base.py#L50 > > c. Be strict on changes in versioned objects in code review; any change in > object structure should increment the minor (backward-compatible) or major > (breaking change) RPC version. That’s assuming we have a clear mapping of objects onto current RPC interfaces, which is not obvious. 
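The dict compat layer linked above exists so that plugin code written against plain dicts keeps working when a versioned object shows up instead. This stdlib-only toy (not the actual oslo.versionedobjects implementation; the field names are invented) illustrates just the idea:

```python
# Toy illustration of the dict-compat idea: one object that serves
# both attribute-style access (new code) and dict-style access
# (legacy vendor plugin code).

class DictCompatObject:
    fields = ('id', 'name')

    def __init__(self, **kwargs):
        for f in self.fields:
            setattr(self, f, kwargs.get(f))

    # dict-style protocol for legacy callers
    def __getitem__(self, key):
        if key not in self.fields:
            raise KeyError(key)
        return getattr(self, key)

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default

port = DictCompatObject(id='p1', name='web')
```

With such a shim in place, a vendor plugin that still does `port['name']` keeps working while new code migrates to `port.name`, which is why the transition to OVO can be gradual rather than a flag day.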
Another problem we would need to solve is core resource extensions (currently available in ml2 only), like qos or port_security, that modify resources based on controller configuration. > > d. Indirection API – messages in a newer format should be translated to > the older version by the neutron server. For QoS, we used a new object-agnostic subscriber mechanism to propagate changes applied to QoS objects into agents: http://docs.openstack.org/developer/neutron/devref/rpc_callbacks.html It is expected to downgrade objects based on agent version (note it’s not implemented yet, but will surely be ready during Mitaka): https://github.com/openstack/neutron/blob/master/neutron/api/rpc/handlers/resources_rpc.py#L142 > > 3. Database migration > > a. Online schema migration was done in the Liberty release, any work left to > do? Nothing specific, maybe a bug or two here and there.
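The downgrade behaviour described above — the server translating an object down before sending it to an older agent — amounts to serializing only the fields that existed at the target version. A schematic example follows; the object, field lists, and version numbers are invented for illustration and are not the actual resources_rpc code:

```python
# Schematic server-side downgrade: drop fields introduced after the
# version the receiving agent understands. The version/field table
# below is invented.

FIELDS_BY_VERSION = {
    '1.0': ('id', 'name'),
    '1.1': ('id', 'name', 'shared'),   # 'shared' added in 1.1
}

def downgrade(obj, target_version):
    """Return a primitive of obj containing only the fields known at
    target_version, so an older agent can deserialize it."""
    allowed = FIELDS_BY_VERSION[target_version]
    return {k: v for k, v in obj.items() if k in allowed}

policy = {'id': 'qos-1', 'name': 'gold', 'shared': True}
old_wire = downgrade(policy, '1.0')   # what an agent pinned at 1.0 sees
new_wire = downgrade(policy, '1.1')   # unchanged for up-to-date agents
```

The real work, of course, is deciding which target version each agent needs, which is where the agent status reports discussed earlier in the thread come in.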
[openstack-dev] [neutron] Neutron rolling upgrade - are we there yet?
Hi all, I would like to gather all upgrade activities in Neutron in one place, in order to summarize the current status and future activities on rolling upgrades in Mitaka.

1. RPC versioning
a. It is already implemented in Neutron.
b. TODO: To have rolling upgrades we have to implement RPC version pinning in conf.
i. I'm not a big fan of this solution, but we can work out a better idea if needed.
c. Possible unit/functional tests to catch RPC version incompatibilities between RPC revisions.
d. TODO: Multi-node Grenade job to have rolling upgrades covered in CI.

2. Message content versioning - versioned objects
a. TODO: implement oslo.versionedobjects in the Mitaka cycle. The interesting entities to be implemented: network, subnet, port, security groups...
b. Will OVO have an impact on vendor plugins?
c. Be strict on changes in versioned objects in code review; any change in object structure should increment the minor (backward-compatible) or major (breaking change) RPC version.
d. Indirection API - messages in a newer format should be translated to the older version by the Neutron server.

3. Database migration
a. Online schema migration was done in the Liberty release; any work left to do?
b. TODO: Online data migration to be introduced in the Mitaka cycle.
i. Online data migration can be done during normal operation on the data.
ii. There should also be a script to invoke the data migration in the background.
c. Currently the contract phase is doing the data migration. But since the contract phase should be run offline, we should move the data migration to a preceding step. Also the contract phase should be blocked if there is still relevant data in removed entities.
i. The contract phase can be executed online if all new code is running in the setup.
d. The other strategy is to not drop tables, alter names or remove columns from the DB - what's in, it's in. We should put more attention on code reviews, merge only additive changes and avoid questionable DB modifications.
e. 
The Neutron server should be updated first, in order to do data translation from the old format into the new schema. When doing this, we can be sure that old data would not be inserted into old DB structures. I have performed the manual Kilo to Liberty upgrade, both in an operational manner and in code review of the RPC APIs. All is working fine. We can have some discussion in the cross-project session [7], or we can also review any issues with the Neutron upgrade in Friday's unplugged session [8]. Sources: [1] http://www.danplanet.com/blog/2015/10/05/upgrades-in-nova-rpc-apis/ [2] http://www.danplanet.com/blog/2015/10/06/upgrades-in-nova-objects/ [3] http://www.danplanet.com/blog/2015/10/07/upgrades-in-nova-database-migrations/ [4] https://github.com/openstack/neutron/blob/master/doc/source/devref/rpc_callbacks.rst [5] http://www.danplanet.com/blog/2015/06/26/upgrading-nova-to-kilo-with-minimal-downtime/ [6] https://github.com/openstack/neutron-specs/blob/master/specs/liberty/online-schema-migrations.rst [7] https://etherpad.openstack.org/p/mitaka-cross-project-session-planning [8] https://etherpad.openstack.org/p/mitaka-neutron-unplugged-track Regards, Artur Korzeniewski IRC: korzen Intel Technology Poland sp. z o.o. KRS 101882 ul. Slowackiego 173, 80-298 Gdansk
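Several replies in this thread converge on deriving the pin for point 1.b from what agents actually report rather than from a manually set conf option: agents include their supported object versions in status reports, and the server computes at runtime which versions to emit, possibly several in parallel during a rolling upgrade. A rough hypothetical sketch of that bookkeeping — none of these names are real Neutron APIs:

```python
# Hypothetical bookkeeping for agent-driven version selection: the
# server records the versions each agent reports and derives the set
# of versions a fanout notification must carry.

class VersionRegistry:
    def __init__(self):
        # agent_id -> {object_name: highest supported version}
        self._agent_versions = {}

    def report(self, agent_id, versions):
        """Called whenever an agent's status report arrives."""
        self._agent_versions[agent_id] = versions

    def versions_to_send(self, object_name):
        """Every distinct version still in use for object_name; during
        a rolling upgrade this can be more than one."""
        found = {
            v[object_name]
            for v in self._agent_versions.values()
            if object_name in v
        }
        return sorted(found)

registry = VersionRegistry()
registry.report('agent-old', {'QosPolicy': '1.0'})
registry.report('agent-new', {'QosPolicy': '1.1'})
```

As soon as the last old agent re-reports a newer version, the old wire format drops out of `versions_to_send` and the fanout overhead disappears without any operator action.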
Re: [openstack-dev] [neutron] Neutron rolling upgrade - are we there yet?
> I would like to gather all upgrade activities in Neutron in one place, > in order to summarize the current status and future activities on > rolling upgrades in Mitaka. Glad to see this work really picking up steam in other projects! > b. TODO: To have rolling upgrades we have to implement RPC > version pinning in conf. > > i. I’m not a big > fan of this solution, but we can work out a better idea if needed. I'll just point to this: https://review.openstack.org/#/c/233289/ and if you go check the logs for the partial-ncpu job, you'll see something like this: nova.compute.rpcapi Automatically selected compute RPC version 4.5 from minimum service version 2 I think that some amount of RPC pinning is probably going to be required for most people in most places, given our current model. But I assume the concern is around requiring this to be a manual task the operators have to manage. The above patch is the first step towards nova removing this as something the operators have to know anything about. --Dan
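Dan's log line ("Automatically selected compute RPC version 4.5 from minimum service version 2") captures the mechanism: every running service records a service version, the server takes the minimum across the cluster, and maps it to the newest RPC version that minimum can handle. A toy version of that selection — the mapping table here is invented, not nova's actual table:

```python
# Toy sketch of automatic RPC version selection from the minimum
# service version, in the spirit of the nova patch referenced above.
# The service-version -> RPC-version table is invented.

SERVICE_TO_RPC = {1: '4.0', 2: '4.5', 3: '4.11'}

def auto_select_rpc_version(service_versions):
    """Newest RPC version that every running service understands."""
    return SERVICE_TO_RPC[min(service_versions)]

# Mid-upgrade: one service is still at version 2, so the effective
# pin is the older RPC version.
mid_upgrade = auto_select_rpc_version([3, 2, 3])
# After all services upgrade, the pin advances with no operator action.
complete = auto_select_rpc_version([3, 3, 3])
```

The appeal over a conf-file pin is exactly what Dan notes: the cap tightens and relaxes automatically as the cluster's slowest member changes, so operators never have to manage it by hand.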