Re: [openstack-dev] [neutron][upgrades] Potential issues when performing Neutron upgrades
On 07/13/2015 04:09 AM, Kevin Benton wrote:
> because you won't have to run Neutron agents on compute nodes anymore.
>
> How will upgrades work for OVN?

We haven't written anything down yet, but here's what I expect.

Right now we're still changing the db schema however we need to, without messing with versioning. As we get to production ready, I expect we'll start being strict about only making backwards-compatible ovsdb schema changes to make upgrades easier.

There are 2 central components - ovn-northd and ovsdb-server - that would be upgraded first, which I would expect to be done at the same time as upgrading your Neutron control plane. As long as any ovsdb schema changes are backwards compatible, you could do rolling upgrades of ovn-controller on compute or network nodes.

-- 
Russell Bryant

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
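One way to read "only making backwards-compatible ovsdb schema changes" is as an additive-only rule: a new schema may add tables and columns, but may not remove anything an older ovn-controller still reads. The sketch below is only an illustration of that idea. The dict-of-sets schema representation, the function name, and the table and column names are all assumptions made up for the example, not ovsdb tooling or the real OVN schema.

```python
def is_backward_compatible(old_schema, new_schema):
    """Return True if new_schema only adds tables or columns to old_schema.

    Both arguments are hypothetical, simplified schemas of the form
    {table_name: set_of_column_names}.
    """
    for table, old_columns in old_schema.items():
        new_columns = new_schema.get(table)
        if new_columns is None:
            # A table that older clients rely on has disappeared.
            return False
        if not set(old_columns) <= set(new_columns):
            # At least one column that older clients read has disappeared.
            return False
    return True


# Illustrative schemas only; the names are invented for this sketch.
old = {"example_switch": {"name", "ports"}}
additive = {"example_switch": {"name", "ports", "acls"},
            "example_router": {"name"}}
breaking = {"example_switch": {"name"}}

print(is_backward_compatible(old, additive))
print(is_backward_compatible(old, breaking))
```

Under a rule like this, the central components can move to the new schema first while ovn-controller instances on compute or network nodes are upgraded one at a time.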
Re: [openstack-dev] [neutron][upgrades] Potential issues when performing Neutron upgrades
Thanks for the info. So the equivalent in neutron would be if we just ensure backward compatible AMQP APIs, right?

On Mon, Jul 13, 2015 at 7:33 AM, Russell Bryant rbry...@redhat.com wrote:
> As long as any ovsdb schema changes are backwards compatible, you could do rolling-upgrades of ovn-controller on compute or network nodes.

-- 
Kevin Benton
Re: [openstack-dev] [neutron][upgrades] Potential issues when performing Neutron upgrades
On 07/13/2015 05:08 PM, Kevin Benton wrote:
> Thanks for the info. So the equivalent in neutron would be if we just ensure backward compatible AMQP APIs, right?

There are a few parts:

1) Backwards compatibility with changes to the oslo.messaging APIs using API versioning (what you're referring to, I think). Neutron does this (though it is not yet tested in a mixed-version, mid-upgrade environment).

2) Compatibility of the data sent over those interfaces. This is where oslo.versionedobjects comes in. Breakage here is much easier to miss since it's not always obvious when you're modifying a data structure that's sent over the wire. There has been a ton of work in Nova to version the data sent over the wire and to let a service (nova-conductor in Nova's case) convert objects back to a version that an older service can understand. This is the most likely way Neutron will break rolling upgrades right now, especially since it's not tested.

3) DB schema. Depending on what services access the db directly and what the rolling upgrade strategy is, there may be some additional constraints on making sure the db schema is backwards compatible, too.

-- 
Russell Bryant
Re: [openstack-dev] [neutron][upgrades] Potential issues when performing Neutron upgrades
Some pedantic comments inline.

Salvatore

On 13 July 2015 at 23:29, Russell Bryant rbry...@redhat.com wrote:
> On 07/13/2015 05:08 PM, Kevin Benton wrote:
>> Thanks for the info. So the equivalent in neutron would be if we just ensure backward compatible AMQP APIs, right?
>
> There are a few parts:
>
> 1) Backwards compatibility with changes to the oslo.messaging APIs using API versioning (what you're referring to, I think). Neutron does this (though it is not yet tested in a mixed-version, mid-upgrade environment).
>
> 2) Compatibility of the data sent over those interfaces. This is where oslo.versionedobjects comes in. Breakage here is much easier to miss since it's not always obvious when you're modifying a data structure that's sent over the wire. There has been a ton of work in Nova to version the data sent over the wire and to let a service (nova-conductor in Nova's case) convert objects back to a version that an older service can understand. This is the most likely way Neutron will break rolling upgrades right now, especially since it's not tested.

It is worth noting that versioned objects are helpful in any circumstance where you have a versioned RPC API, be it AMQP or REST or whatever. Neutron currently lacks any layer between the front-end API endpoint and the plugin, which then manages DB access. The now pretty much defunct perestroika blueprint aimed to add one. These versioned objects would live in this layer, which older folks like me who studied software engineering in the late '90s would call the business logic layer. But this discussion is really out of scope for this thread, so I'll stop here.

> 3) DB schema. Depending on what services access the db directly and what the rolling upgrade strategy is, there may be some additional constraints on making sure the db schema is backwards compatible, too.

I guess if one properly uses object persistence, so that DB access is performed entirely via API objects, then #2 should imply #3 (and possibly even hide backward-incompatible DB schema changes).
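The idea that an object layer could hide a backward-incompatible DB schema change can be pictured with a small sketch. It assumes a hypothetical column rename (from network_cidr to cidr) and a hypothetical API object; neither name comes from Neutron, and real persistence code would sit on top of SQLAlchemy rather than plain dicts.

```python
class SubnetRecord:
    """Hypothetical API object; the column names below are invented."""

    def __init__(self, cidr):
        self.cidr = cidr

    @classmethod
    def from_db_row(cls, row):
        """Build the object from a row in either the old or the new layout."""
        if "cidr" in row:
            # New schema: the value lives in the renamed column.
            return cls(row["cidr"])
        # Old schema: fall back to the pre-rename column name.
        return cls(row["network_cidr"])


old_row = {"id": 1, "network_cidr": "10.0.0.0/24"}
new_row = {"id": 1, "cidr": "10.0.0.0/24"}

# Callers see the same object regardless of which schema produced the row.
print(SubnetRecord.from_db_row(old_row).cidr)
print(SubnetRecord.from_db_row(new_row).cidr)
```

Because callers only touch the object, the schema can change underneath them as long as the object layer knows how to read both layouts during the upgrade window.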
Re: [openstack-dev] [neutron][upgrades] Potential issues when performing Neutron upgrades
On Thu, Jul 9, 2015 at 2:30 PM, Russell Bryant rbry...@redhat.com wrote:
> I'll also plug a project I'm working on (OVN, part of the Open vSwitch project), which I also think simplifies this a good bit for Neutron, because you won't have to run Neutron agents on compute nodes anymore.

How will upgrades work for OVN?

-- 
Kevin Benton
Re: [openstack-dev] [neutron][upgrades] Potential issues when performing Neutron upgrades
Thanks, Ihar, for the comments!

>> Is Neutron ready to be upgradable with minimal downtime of services and no VM access downtime?
>
> As for the OVS bug you refer to above: no, at least not in the reference implementation. That's for the data plane.

My understanding is that once the OVS Neutron agent is patched with [1], the upgrade from Kilo to Liberty on the data plane should be less painful than from Juno to Kilo. At least VM access should have no connectivity downtime. If no breaking changes are merged in Liberty, the messages exchanged on the RPC channel should be compatible between releases, as well as the RPC API versions. Do you agree?

> As for other services, neutron-server online schema migration should help, and I hope to get it implemented in L (though there are some obstacles that may block us; we'll see).
>
> As for live data migration, the neutron code base is not yet ready to reasonably require live data migration in all patches that need data moves in the database. The very first obstacle is that there is no middleware layer between sqlalchemy and the rest of neutron that would allow us to hide migration details. Oslo.versionedobjects is such a middleware. See below.
>
>> Are there any guidelines on using oslo.versionedobjects and its priority in the Liberty cycle?
>
> It's not a common priority for L, but we've started on this road inside the feature/qos branch, which will hopefully get into master some time after L-2. If interested, see:
> http://git.openstack.org/cgit/openstack/neutron/tree/neutron/objects?h=feature/qos
>
> Once the feature branch is merged into master, I plan to start converting existing resources to objects. It may take time and will definitely span into M. Depending on progress in this regard, we'll see whether we will be able to consider live data migration. At the moment, I don't see it happening in M, at least not at the start of it.

Maybe we can define priorities for what should be converted to OVO first, to have a good starting point for the Liberty and M release upgrade process. I'm willing to help, and there may be other volunteers to join.

If OVO cannot be delivered fully in Liberty, we should take care that no breaking changes are merged in Liberty and take the necessary steps to mitigate any risk of upgrade incompatibility across the Kilo-Liberty-M release process.

>> Are there use-cases written down when talking about Neutron upgrades?
>
> One thing that is currently in review is partial upgrades. They are tested for nova (including nova-network) but not for neutron, so it's considered a nova-network/neutron parity issue. You should find most of the relevant patches in:
> https://review.openstack.org/#/q/owner:%22Russell+Bryant%22+status:open,n,z
>
> Ihar

Regards,
Artur Korzeniewski

[1] https://bugs.launchpad.net/neutron/+bug/1383674
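Whether "RPC API versions" stay compatible between releases is often decided by a major/minor rule: a server can handle a request only if the major version matches and the requested minor version is not newer than what the server implements. The sketch below is a standalone illustration of that convention under those stated assumptions. The function name is made up, and it is not a call into oslo.messaging.

```python
def can_handle(implemented, requested):
    """Check a 'major.minor' RPC request against what a server implements.

    Assumed rule: same major version, and the requested minor version
    must be less than or equal to the implemented minor version.
    """
    imp_major, imp_minor = (int(part) for part in implemented.split("."))
    req_major, req_minor = (int(part) for part in requested.split("."))
    return imp_major == req_major and req_minor <= imp_minor


# An upgraded server (1.3) still accepts calls from an older agent (1.1).
print(can_handle("1.3", "1.1"))
# An older server (1.1) cannot serve a call that needs 1.3 features.
print(can_handle("1.1", "1.3"))
# A major bump (2.0) is a breaking change for 1.x callers.
print(can_handle("2.0", "1.3"))
```

Under this rule, "no breaking changes merged in Liberty" would mean only minor version bumps on the server side, with newer callers capping the version they request until all services are upgraded.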
Re: [openstack-dev] [neutron][upgrades] Potential issues when performing Neutron upgrades
On 07/09/2015 10:11 AM, Ihar Hrachyshka wrote:
> On 07/09/2015 09:01 AM, Artur Korzen wrote:
>> Hi all,
>>
>> I've been researching the Neutron project as part of work on OpenStack rolling upgrades. My primary assignment included testing whether there is no VM access downtime when performing an upgrade.
>>
>> Are you aware of any issues present in the Liberty release of Neutron that may cause VM access downtime and may stop ops from picking up the newest versions of the code?
>>
>> When talking about no VM access downtime during upgrade, the sensitive operations are:
>> 1. Ensure that even after Neutron services are shut off/uninstalled, traffic can still reach the VM.
>> 2. Take care on Neutron service startup/installation that the existing configuration is preserved (routing tables, forwarding entries and security rules).
>> 3. The underlying network technologies (OVS, Linux bridges etc.) keep working without disruption while Neutron code is upgraded: no dropping of table flows and no applying of traffic rules that drop packets.
>>
>> To be compatible between different versions of services talking to each other, the important areas are:
>> 1. RPC API versioning
>> 2. Objects exchanged between services via RPC
>> 3. Database schema (bp [2]) and data migration
>>
>> I have researched the ML2 plugin with the OVS agent, including neutron-ovs-agent, dhcp-server, l3-agent and neutron-server.
>>
>> One cause of VM access downtime during upgrade, the removal of flows in OVS after the Neutron agent restarts, is addressed in [1].
>>
>> Is Neutron ready to be upgradable with minimal downtime of services and no VM access downtime?
>
> As for the OVS bug you refer to above: no, at least not in the reference implementation. That's for the data plane.
>
> As for other services, neutron-server online schema migration should help, and I hope to get it implemented in L (though there are some obstacles that may block us; we'll see).
>
> As for live data migration, the neutron code base is not yet ready to reasonably require live data migration in all patches that need data moves in the database. The very first obstacle is that there is no middleware layer between sqlalchemy and the rest of neutron that would allow us to hide migration details. Oslo.versionedobjects is such a middleware. See below.
>
>> Are there any guidelines on using oslo.versionedobjects and its priority in the Liberty cycle?
>
> It's not a common priority for L, but we've started on this road inside the feature/qos branch, which will hopefully get into master some time after L-2. If interested, see:
> http://git.openstack.org/cgit/openstack/neutron/tree/neutron/objects?h=feature/qos
>
> Once the feature branch is merged into master, I plan to start converting existing resources to objects. It may take time and will definitely span into M. Depending on progress in this regard, we'll see whether we will be able to consider live data migration. At the moment, I don't see it happening in M, at least not at the start of it.
>
>> Are there use-cases written down when talking about Neutron upgrades?
>
> One thing that is currently in review is partial upgrades. They are tested for nova (including nova-network) but not for neutron, so it's considered a nova-network/neutron parity issue. You should find most of the relevant patches in:
> https://review.openstack.org/#/q/owner:%22Russell+Bryant%22+status:open,n,z

Yes, there's a ton of work to make rolling upgrades as robust as what Nova has done. There are significant limitations to what Neutron can do without breaking it, but hopefully the grenade job would help us catch things that would break it sooner. At the moment those jobs have been blocked pending some re-work of how the partial upgrade jobs work.

As of the last release (Kilo), rolling upgrades from Juno with Neutron worked when verified manually. I'm hoping we'll have the same for Liberty, but I'm not sure if anyone has tried it out recently.

All of the things discussed here sound like good things to work on. I actually thought a group was going to work on versioned objects for Kilo, but that never materialized.

I'll also plug a project I'm working on (OVN, part of the Open vSwitch project), which I also think simplifies this a good bit for Neutron, because you won't have to run Neutron agents on compute nodes anymore.

-- 
Russell Bryant
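The data-plane concern raised in this thread, flows being removed when the agent restarts, can be pictured as the difference between "flush and reinstall" and "reconcile". The sketch below only illustrates the reconcile idea and makes no claim about how the actual OVS agent fix works. Flows are modelled as plain strings, and the function name and flow strings are invented for the example.

```python
def reconcile_flows(installed, desired):
    """Compute the minimal changes needed to reach the desired flow set.

    Flows present in both sets are left untouched, so traffic that
    matches them keeps being forwarded while the agent starts up.
    """
    installed, desired = set(installed), set(desired)
    to_add = sorted(desired - installed)
    to_remove = sorted(installed - desired)
    return to_add, to_remove


# Hypothetical flow descriptions, for illustration only.
installed = ["vm1->net", "vm2->net", "stale-rule"]
desired = ["vm1->net", "vm2->net", "vm3->net"]

to_add, to_remove = reconcile_flows(installed, desired)
print(to_add)
print(to_remove)
```

With this approach the flows for vm1 and vm2 are never deleted, which is the property that matters for keeping VM access up while Neutron code is upgraded.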