[Yahoo-eng-team] [Bug 2008943] Please test proposed package
Hello Miro, or anyone else affected,

Accepted neutron into wallaby-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:wallaby-proposed
  sudo apt-get update

Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-wallaby-needed to verification-wallaby-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-wallaby-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

** Changed in: cloud-archive/wallaby
   Status: Fix Released => Fix Committed

** Tags added: verification-wallaby-needed

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2008943

Title: OVN DB Sync utility cannot find NB DB Port Group

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive ussuri series: Fix Released
Status in Ubuntu Cloud Archive victoria series: Fix Released
Status in Ubuntu Cloud Archive wallaby series: Fix Committed
Status in Ubuntu Cloud Archive xena series: Fix Released
Status in neutron: In Progress
Status in neutron package in Ubuntu: Fix Released
Status in neutron source package in Focal: Fix Released

Bug description:
A runtime exception:

  ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Port_Group with name=pg_aa9f203b_ec51_4893_9bda_cfadbff9f800

can occur while performing a database sync between the Neutron DB and the OVN NB DB using neutron-ovn-db-sync-util. This exception occurs when the `sync_networks_ports_and_dhcp_opts()` function ends up implicitly creating a new default security group for a tenant/project id. This is normally fine, but the problem is that `sync_port_groups` was already called, and thus the port group does not exist in the NB DB. When `sync_acls()` is called later, no port group is found and the exception occurs.

Quick way to reproduce on ML2/OVN:
- openstack project create test_project
- openstack network create --project test_project test_network
- openstack port delete $(openstack port list --network test_network -c ID -f value)  # since this is an empty network, only the metadata port should get listed and subsequently deleted
- openstack security group delete test_project

Now that you have a network without a metadata port in it, and no default security group for the project/tenant that this network belongs to, run:

  neutron-ovn-db-sync-util --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini --ovn-neutron_sync_mode migrate

The exception should occur.

Here is a more realistic scenario of how we can run into this with an ML2/OVS to ML2/OVN migration. I am also including why the code runs into it.

1. ML2/OVS environment with a network but no default security group for the project/tenant associated with the network.
2. Perform the ML2/OVS to ML2/OVN migration. This migration process will run neutron-ovn-db-sync-util with --migrate.
3. During the sync, we first sync port groups [1] from the Neutron DB to the OVN DB.
4. Then we sync network ports [2]. The process will detect that the network in question is not part of OVN NB. It will create that network in the OVN NB DB and, along with that, create a metadata port for it (an OVN network requires a metadata port). The port-create call will implicitly notify _ensure_default_security_group_handler [3], which will not find a security group for that tenant/project id and will create one. Now you have a new security group with 4 new default security group rules.
5. When sync_acls [4] runs, it will pick up those 4 new rules, but the commit to the NB DB will fail since the port group (aka security group) does not exist in the NB DB.

[1] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L104
[2] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L10
[3] https://opendev.org/openstack/neutron/src/branch/master/neutron/db/securitygroups_db.py#L915
[4] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L107

= Ubuntu SRU Details =

[Impact]
See bug description.

[Test Case]
Deploy openstack with OVN. Follow steps in "Quick way to reproduce on ML2/OVN" from bug description.

[Where problems could
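The ordering problem in steps 1-5 can be sketched in a few lines of Python. This is a minimal, self-contained model of the sequence only; every class and function name below is an illustrative stand-in, not the real neutron code:

```python
# Toy model of the sync ordering bug: port groups are copied to NB first,
# then the port sync implicitly creates a default security group, and the
# later ACL sync cannot find a matching NB port group.

class FakeNBDB:
    """Stands in for the OVN Northbound DB."""
    def __init__(self):
        self.port_groups = set()

class FakeNeutronDB:
    """Stands in for the Neutron DB."""
    def __init__(self, security_groups):
        self.security_groups = set(security_groups)

def sync_port_groups(neutron_db, nb_db):
    # Step 3: copy the security groups known *at this point* into NB.
    nb_db.port_groups = {'pg_' + sg for sg in neutron_db.security_groups}

def sync_networks_ports_and_dhcp_opts(neutron_db):
    # Step 4: creating the metadata port implicitly creates a default
    # security group for the tenant -- after port groups were synced.
    neutron_db.security_groups.add('default_for_tenant')

def sync_acls(neutron_db, nb_db):
    # Step 5: every security group's ACLs must map to an NB port group.
    for sg in neutron_db.security_groups:
        if 'pg_' + sg not in nb_db.port_groups:
            raise LookupError('Cannot find Port_Group for %s' % sg)

neutron_db = FakeNeutronDB(security_groups=[])
nb_db = FakeNBDB()
sync_port_groups(neutron_db, nb_db)
sync_networks_ports_and_dhcp_opts(neutron_db)
try:
    sync_acls(neutron_db, nb_db)
except LookupError as exc:
    print(exc)  # the RowNotFound analogue
```

Running the steps in any order where the implicit group creation happens before `sync_port_groups` would not raise, which is essentially what the fix arranges for.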
[Yahoo-eng-team] [Bug 2030773] Please test proposed package
Hello Lucas, or anyone else affected,

Accepted neutron into wallaby-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:wallaby-proposed
  sudo apt-get update

Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-wallaby-needed to verification-wallaby-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-wallaby-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

** Changed in: cloud-archive/wallaby
   Status: Fix Released => Fix Committed

** Tags added: verification-wallaby-needed

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2030773

Title: OVN DB Sync always logs warning messages about updating all router ports

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive antelope series: Fix Released
Status in Ubuntu Cloud Archive ussuri series: Fix Released
Status in Ubuntu Cloud Archive victoria series: Fix Released
Status in Ubuntu Cloud Archive wallaby series: Fix Committed
Status in Ubuntu Cloud Archive xena series: Fix Released
Status in Ubuntu Cloud Archive yoga series: Fix Released
Status in Ubuntu Cloud Archive zed series: Fix Released
Status in neutron: Fix Released
Status in neutron package in Ubuntu: Fix Released
Status in neutron source package in Focal: Fix Released
Status in neutron source package in Jammy: Fix Released
Status in neutron source package in Lunar: Fix Released

Bug description:
Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=2225156

The ovn-db-sync script does not check whether the router ports are actually out of sync before adding them to the list of ports that need to be updated. This can create red-herring problems by introducing irrelevant information [0] into the sync report (especially when run in "log" mode), making the user think that the databases might be out of sync even when they are not.

Looking at the code [1], we can see that the comment talks about checking for networks and ipv6_ra_configs changes, but it does neither; instead, it adds every router port to the list of ports that need to be updated:

  # We dont have to check for the networks and
  # ipv6_ra_configs values. Lets add it to the
  # update_lrport_list. If they are in sync, then
  # update_router_port will be a no-op.
  update_lrport_list.append(db_router_ports[lrport])

This LP is about changing this behavior and checking for such differences in the router ports before marking them to be updated.
[0] 2023-07-24 11:46:31.391 952358 WARNING networking_ovn.ovn_db_sync [req-1081a8a6-82dd-431c-a2ab-f58741dc1677 - - - - -] Router Port port_id=f164c0f1-8ac8-4c45-bba9-8c723a30c701 needs to be updated for networks changed
[1] https://github.com/openstack/neutron/blob/c453813d0664259c4da0d132f224be2eebe70072/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L553-L557

= Ubuntu SRU Details =

[Impact]
See bug description above.

[Test Case]
Deploy openstack with OVN and multiple routers. Run the ovn-db-sync script and ensure router ports that are not out of sync are not marked to be updated.

[Where problems could occur]
If the _is_router_port_changed() function had a bug, there would be potential for ports that need updating to be filtered out. Presumably this is not the case, but that is a theoretical potential for where problems could occur. All of these patches have already landed in the corresponding upstream branches.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/2030773/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
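The per-port difference check this LP proposes can be sketched as follows. The helper name _is_router_port_changed comes from the SRU notes above, but the exact fields compared and the surrounding structure here are assumptions based on the quoted code comment:

```python
# Hypothetical sketch: compare only the fields the sync cares about
# ('networks' and 'ipv6_ra_configs', per the comment quoted in the bug)
# instead of unconditionally appending every router port.

def _is_router_port_changed(db_port, ovn_port):
    """Return True only if the fields ovn-db-sync cares about differ."""
    return (db_port.get('networks') != ovn_port.get('networks')
            or db_port.get('ipv6_ra_configs') != ovn_port.get('ipv6_ra_configs'))

def build_update_lrport_list(db_router_ports, ovn_router_ports):
    # Old behaviour: append every port. New behaviour: only changed ports,
    # so in-sync ports no longer produce spurious "needs update" warnings.
    return [port for name, port in db_router_ports.items()
            if _is_router_port_changed(port, ovn_router_ports.get(name, {}))]
```

With this filter in place, a fully in-sync deployment produces an empty update list and the sync report stays quiet.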
[Yahoo-eng-team] [Bug 2032770] Please test proposed package
Hello Mustafa, or anyone else affected,

Accepted neutron into wallaby-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:wallaby-proposed
  sudo apt-get update

Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-wallaby-needed to verification-wallaby-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-wallaby-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

** Changed in: cloud-archive/wallaby
   Status: Fix Released => Fix Committed

** Tags added: verification-wallaby-needed

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2032770

Title: [SRU] [OVN] port creation with --enable-uplink-status-propagation does not work with OVN mechanism driver

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive antelope series: Fix Released
Status in Ubuntu Cloud Archive ussuri series: Fix Released
Status in Ubuntu Cloud Archive victoria series: Fix Released
Status in Ubuntu Cloud Archive wallaby series: Fix Committed
Status in Ubuntu Cloud Archive xena series: Fix Released
Status in Ubuntu Cloud Archive yoga series: Fix Released
Status in Ubuntu Cloud Archive zed series: Fix Released
Status in neutron: Fix Released
Status in neutron package in Ubuntu: Fix Released
Status in neutron source package in Focal: Fix Released
Status in neutron source package in Jammy: Fix Released
Status in neutron source package in Lunar: Fix Released

Bug description:
[Impact]
This SRU is a backport of https://review.opendev.org/c/openstack/neutron/+/892895 to the respective Ubuntu and UCA releases. The patch is merged to all respective upstream branches (master & stable/[u,v,w,x,y,z,2023.1(a)]).

This SRU adds the missing 'uplink-status-propagation' extension to ML2/OVN. The extension is already present and working in ML2/OVS, and it is supported by ML2/OVN, yet it was never added to ML2/OVN's extension list. The patch simply adds the missing extension to ML2/OVN as well.

The impact is visible for deployments migrating from ML2/OVS to ML2/OVN. The following command fails on ML2/OVN:

```
openstack port create --network 8d30fb08-2c6a-42fd-98c4-223d345c8c4f --binding-profile trusted=true --enable-uplink-status-propagation --vnic-type direct aaa
# BadRequestException: 400: Client Error for url: https://mycloud.example.com:9696/v2.0/ports, Unrecognized attribute(s) 'propagate_uplink_status'
```

The fix corrects this behavior by adding the missing extension.
[Test Case]
- Deploy a Focal/Yoga cloud:
  ./generate-bundle.sh -s focal -r yoga --name test-focal-yoga-stack --run --ovn
- After the dust settles:
  ./configure
  source ./novarc
- openstack port create --network --binding-profile trusted=true --enable-uplink-status-propagation --vnic-type direct aaa
- It should fail with "BadRequestException: 400: Client Error for url: https://mycloud.example.com:9696/v2.0/ports, Unrecognized attribute(s) 'propagate_uplink_status'"
- To confirm the fix, repeat the scenario and observe that the error disappears and port creation succeeds.

[Regression Potential]
The patch is quite trivial and should not affect any deployment negatively. The extension is optional and disabled by default.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/2032770/+subscriptions
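The fix described above amounts to one entry in a list: ML2 mechanism drivers advertise the API extension aliases they support, and the OVN driver's list lacked this alias. A hypothetical sketch of that idea (the variable and function names below are illustrative, not neutron's actual identifiers):

```python
# Illustrative only: a mechanism driver keeps a list of supported API
# extension aliases; the backported patch adds the missing alias so the
# 'propagate_uplink_status' attribute is recognized on port create.
ML2_OVN_SUPPORTED_API_EXTENSIONS = [
    'security-group',
    'port-security',
    'uplink-status-propagation',  # the alias this SRU backports
]

def is_extension_supported(alias):
    """True if the driver advertises the given extension alias."""
    return alias in ML2_OVN_SUPPORTED_API_EXTENSIONS

print(is_extension_supported('uplink-status-propagation'))  # True
```

Before the patch, the equivalent lookup returned False, which is why the API rejected the attribute with a 400.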
[Yahoo-eng-team] [Bug 1955578] Re: OVN transaction could not be completed due to a race condition
This bug was fixed in the package neutron - 2:16.4.2-0ubuntu6.4~cloud0

---
neutron (2:16.4.2-0ubuntu6.4~cloud0) bionic-ussuri; urgency=medium
.
  * New update for the Ubuntu Cloud Archive.
.
neutron (2:16.4.2-0ubuntu6.4) focal; urgency=medium
.
  [ Corey Bryant ]
  * d/p/ovn-db-sync-continue-on-duplicate-normalise.patch: Cherry-picked
    from upstream to allow ovn_db_sync to continue on duplicate normalised
    CIDR (LP: #1961112).
  * d/p/ovn-db-sync-check-for-router-port-differences.patch: Cherry-picked
    from upstream to ensure router ports are marked as needing updates only
    if they have changed (LP: #2030773).
  * d/p/ovn-specify-port-type-if-router-port-when-updating.patch: Specify
    the port type if it is a router port when updating, to avoid port
    flapping (LP: #1955578).
  * d/p/fix-acl-sync-when-default-sg-group-created.patch: Cherry-picked
    from upstream to fix ACL sync when a default security group is created
    (LP: #2008943).
.
  [ Mustafa Kemal GILOR ]
  * d/p/add_uplink_status_propagation.patch: Add the
    'uplink-status-propagation' extension to ML2/OVN (LP: #2032770).

** Changed in: cloud-archive/ussuri
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.

https://bugs.launchpad.net/bugs/1955578

Title: OVN transaction could not be completed due to a race condition

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive ussuri series: Fix Released
Status in Ubuntu Cloud Archive victoria series: Fix Released
Status in neutron: Fix Released
Status in neutron package in Ubuntu: Fix Released
Status in neutron source package in Focal: Fix Released

Bug description:
When executing the test "test_connectivity_through_2_routers" it is highly likely to hit a race condition:

  networking_ovn.common.exceptions.RevisionConflict: OVN revision number for {PORT_ID} (type: ports) is equal or higher than the given resource. Skipping update.
Bugzilla reference: https://bugzilla.redhat.com/show_bug.cgi?id=1860448

= Ubuntu SRU Details =

[Impact]
See bug description.

[Test Case]
Deploy openstack with OVN. Run the test_connectivity_through_2_routers test from https://github.com/openstack/neutron-tempest-plugin. This could also be tested manually based on what that test does. Ensure the router port status is not set to DOWN at any point.

[Where problems could occur]
The existing bug could still occur if the assumption behind specifying the port type is not correct. Presumably this is not the case, but that is a theoretical potential for where problems could occur. All of these patches have already landed in the corresponding upstream branches.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1955578/+subscriptions
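The RevisionConflict guard quoted in the bug description can be sketched like this. The names below are illustrative stand-ins; the real logic lives in neutron's OVN revision-number machinery, and the exception text is taken from the traceback above:

```python
# Toy model of the revision guard: an update whose revision number is not
# strictly newer than what OVN already stores is treated as stale and
# skipped. The race in this bug makes a legitimate update look stale.

class RevisionConflict(Exception):
    """Stand-in for networking_ovn.common.exceptions.RevisionConflict."""

def maybe_update(resource_id, incoming_rev, stored_rev):
    if stored_rev >= incoming_rev:
        raise RevisionConflict(
            'OVN revision number for %s (type: ports) is equal or higher '
            'than the given resource. Skipping update.' % resource_id)
    return incoming_rev  # the new stored revision after a successful update

print(maybe_update('port-1', incoming_rev=5, stored_rev=4))  # 5
```

When two updates for the same port race and commit out of order, the second one arrives carrying a revision the guard has already seen, and the update is skipped even though its payload differs.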
[Yahoo-eng-team] [Bug 1961112] Re: [ovn] overlapping security group rules break neutron-ovn-db-sync-util
This bug was fixed in the package neutron - 2:16.4.2-0ubuntu6.4~cloud0

---
neutron (2:16.4.2-0ubuntu6.4~cloud0) bionic-ussuri; urgency=medium
.
  * New update for the Ubuntu Cloud Archive.
.
neutron (2:16.4.2-0ubuntu6.4) focal; urgency=medium
.
  [ Corey Bryant ]
  * d/p/ovn-db-sync-continue-on-duplicate-normalise.patch: Cherry-picked
    from upstream to allow ovn_db_sync to continue on duplicate normalised
    CIDR (LP: #1961112).
  * d/p/ovn-db-sync-check-for-router-port-differences.patch: Cherry-picked
    from upstream to ensure router ports are marked as needing updates only
    if they have changed (LP: #2030773).
  * d/p/ovn-specify-port-type-if-router-port-when-updating.patch: Specify
    the port type if it is a router port when updating, to avoid port
    flapping (LP: #1955578).
  * d/p/fix-acl-sync-when-default-sg-group-created.patch: Cherry-picked
    from upstream to fix ACL sync when a default security group is created
    (LP: #2008943).
.
  [ Mustafa Kemal GILOR ]
  * d/p/add_uplink_status_propagation.patch: Add the
    'uplink-status-propagation' extension to ML2/OVN (LP: #2032770).

** Changed in: cloud-archive/ussuri
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.

https://bugs.launchpad.net/bugs/1961112

Title: [ovn] overlapping security group rules break neutron-ovn-db-sync-util

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive ussuri series: Fix Released
Status in neutron: Fix Released
Status in neutron package in Ubuntu: Fix Released
Status in neutron source package in Focal: Fix Released

Bug description:
Neutron (Xena) is happy to accept equivalent rules with overlapping remote CIDR prefixes as long as the notation is different, e.g. 10.0.0.0/8 and 10.0.0.1/8. However, OVN is smarter: it normalizes the prefix and figures out that both are 10.0.0.0/8.
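Python's standard library shows the same normalisation behaviour described above; this is an analogy to illustrate why OVN treats the two rules as duplicates, not the code OVN itself runs:

```python
import ipaddress

# strict=False lets us parse a prefix with host bits set; the network is
# then normalised by masking those bits off, exactly as OVN does.
a = ipaddress.ip_network('10.0.0.0/8')
b = ipaddress.ip_network('10.0.0.1/8', strict=False)

print(b)       # 10.0.0.0/8 -- the host bit is masked away
print(a == b)  # True: the two differently written rules are the same network
```

Neutron stored both spellings as distinct rules, while OVN's normalised view contained only one ACL, so the sync utility found a rule it could not reconcile.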
This does not have any fatal effects in a running OVN deployment (creating and using such rules does not even trigger a warning), but upon running neutron-ovn-db-sync-util, it crashes and won't perform a sync. This is a blocker for upgrades (and other scenarios).

Security group's rules:

$ openstack security group rule list overlap-sgr
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+
| ID                                   | IP Protocol | Ethertype | IP Range   | Port Range | Direction | Remote Security Group | Remote Address Group |
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+
| 3c41fa80-1d23-49c9-9ec1-adf581e07e24 | tcp         | IPv4      | 10.0.0.1/8 |            | ingress   | None                  | None                 |
| 639d263e-6873-47cb-b2c4-17fc824252db | None        | IPv4      | 0.0.0.0/0  |            | egress    | None                  | None                 |
| 96e99039-cbc0-48fe-98fe-ef28d41b9d9b | tcp         | IPv4      | 10.0.0.0/8 |            | ingress   | None                  | None                 |
| bf9160a3-fc9b-467e-85d5-c889811fd6ca | None        | IPv6      | ::/0       |            | egress    | None                  | None                 |
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+

Log excerpt:

16/Feb/2022:20:55:40.568 527216 INFO neutron.cmd.ovn.neutron_ovn_db_sync_util [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Sync for Northbound db started with mode : repair
16/Feb/2022:20:55:42.105 527216 INFO neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.extensions.qos [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Starting OVNClientQosExtension
16/Feb/2022:20:55:42.380 527216 INFO neutron.db.ovn_revision_numbers_db [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Successfully bumped revision number for resource 49b3249a-7624-4711-b271-3e63c6a27658 (type: ports) to 17
16/Feb/2022:20:55:43.205 527216 WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovn_db_sync [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] ACLs-to-be-added 1 ACLs-to-be-removed 0
16/Feb/2022:20:55:43.206 527216 WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovn_db_sync [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] ACL found in Neutron but not in OVN DB for port group pg_e90b68f3_9f8d_4250_9b6a_7531e2249c99
16/Feb/2022:20:55:43.208 527216 ERROR
ovsdbapp.backend.ovs_idl.transaction [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/connection.py", line 131, in run
    txn.results.put(txn.do_commit())
  File
[Yahoo-eng-team] [Bug 2008943] Re: OVN DB Sync utility cannot find NB DB Port Group
This bug was fixed in the package neutron - 2:16.4.2-0ubuntu6.4~cloud0

---
neutron (2:16.4.2-0ubuntu6.4~cloud0) bionic-ussuri; urgency=medium
.
  * New update for the Ubuntu Cloud Archive.
.
neutron (2:16.4.2-0ubuntu6.4) focal; urgency=medium
.
  [ Corey Bryant ]
  * d/p/ovn-db-sync-continue-on-duplicate-normalise.patch: Cherry-picked
    from upstream to allow ovn_db_sync to continue on duplicate normalised
    CIDR (LP: #1961112).
  * d/p/ovn-db-sync-check-for-router-port-differences.patch: Cherry-picked
    from upstream to ensure router ports are marked as needing updates only
    if they have changed (LP: #2030773).
  * d/p/ovn-specify-port-type-if-router-port-when-updating.patch: Specify
    the port type if it is a router port when updating, to avoid port
    flapping (LP: #1955578).
  * d/p/fix-acl-sync-when-default-sg-group-created.patch: Cherry-picked
    from upstream to fix ACL sync when a default security group is created
    (LP: #2008943).
.
  [ Mustafa Kemal GILOR ]
  * d/p/add_uplink_status_propagation.patch: Add the
    'uplink-status-propagation' extension to ML2/OVN (LP: #2032770).

** Changed in: cloud-archive/ussuri
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2008943

Title: OVN DB Sync utility cannot find NB DB Port Group

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive ussuri series: Fix Released
Status in Ubuntu Cloud Archive victoria series: Fix Released
Status in Ubuntu Cloud Archive wallaby series: Fix Released
Status in Ubuntu Cloud Archive xena series: Fix Released
Status in neutron: In Progress
Status in neutron package in Ubuntu: Fix Released
Status in neutron source package in Focal: Fix Released

Bug description:
A runtime exception:

  ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Port_Group with name=pg_aa9f203b_ec51_4893_9bda_cfadbff9f800

can occur while performing a database sync between the Neutron DB and the OVN NB DB using neutron-ovn-db-sync-util. This exception occurs when the `sync_networks_ports_and_dhcp_opts()` function ends up implicitly creating a new default security group for a tenant/project id. This is normally fine, but the problem is that `sync_port_groups` was already called, and thus the port group does not exist in the NB DB. When `sync_acls()` is called later, no port group is found and the exception occurs.

Quick way to reproduce on ML2/OVN:
- openstack project create test_project
- openstack network create --project test_project test_network
- openstack port delete $(openstack port list --network test_network -c ID -f value)  # since this is an empty network, only the metadata port should get listed and subsequently deleted
- openstack security group delete test_project

Now that you have a network without a metadata port in it, and no default security group for the project/tenant that this network belongs to, run:

  neutron-ovn-db-sync-util --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini --ovn-neutron_sync_mode migrate

The exception should occur.

Here is a more realistic scenario of how we can run into this with an ML2/OVS to ML2/OVN migration. I am also including why the code runs into it.

1. ML2/OVS environment with a network but no default security group for the project/tenant associated with the network.
2. Perform the ML2/OVS to ML2/OVN migration. This migration process will run neutron-ovn-db-sync-util with --migrate.
3. During the sync, we first sync port groups [1] from the Neutron DB to the OVN DB.
4. Then we sync network ports [2]. The process will detect that the network in question is not part of OVN NB. It will create that network in the OVN NB DB and, along with that, create a metadata port for it (an OVN network requires a metadata port). The port-create call will implicitly notify _ensure_default_security_group_handler [3], which will not find a security group for that tenant/project id and will create one. Now you have a new security group with 4 new default security group rules.
5. When sync_acls [4] runs, it will pick up those 4 new rules, but the commit to the NB DB will fail since the port group (aka security group) does not exist in the NB DB.

[1] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L104
[2] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L10
[3] https://opendev.org/openstack/neutron/src/branch/master/neutron/db/securitygroups_db.py#L915
[4] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L107

= Ubuntu SRU Details =

[Impact]
See bug description.

[Test Case]
Deploy openstack with
[Yahoo-eng-team] [Bug 2030773] Re: OVN DB Sync always logs warning messages about updating all router ports
This bug was fixed in the package neutron - 2:16.4.2-0ubuntu6.4~cloud0

---
neutron (2:16.4.2-0ubuntu6.4~cloud0) bionic-ussuri; urgency=medium
.
  * New update for the Ubuntu Cloud Archive.
.
neutron (2:16.4.2-0ubuntu6.4) focal; urgency=medium
.
  [ Corey Bryant ]
  * d/p/ovn-db-sync-continue-on-duplicate-normalise.patch: Cherry-picked
    from upstream to allow ovn_db_sync to continue on duplicate normalised
    CIDR (LP: #1961112).
  * d/p/ovn-db-sync-check-for-router-port-differences.patch: Cherry-picked
    from upstream to ensure router ports are marked as needing updates only
    if they have changed (LP: #2030773).
  * d/p/ovn-specify-port-type-if-router-port-when-updating.patch: Specify
    the port type if it is a router port when updating, to avoid port
    flapping (LP: #1955578).
  * d/p/fix-acl-sync-when-default-sg-group-created.patch: Cherry-picked
    from upstream to fix ACL sync when a default security group is created
    (LP: #2008943).
.
  [ Mustafa Kemal GILOR ]
  * d/p/add_uplink_status_propagation.patch: Add the
    'uplink-status-propagation' extension to ML2/OVN (LP: #2032770).

** Changed in: cloud-archive/ussuri
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2030773

Title: OVN DB Sync always logs warning messages about updating all router ports

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive antelope series: Fix Released
Status in Ubuntu Cloud Archive ussuri series: Fix Released
Status in Ubuntu Cloud Archive victoria series: Fix Released
Status in Ubuntu Cloud Archive wallaby series: Fix Released
Status in Ubuntu Cloud Archive xena series: Fix Released
Status in Ubuntu Cloud Archive yoga series: Fix Released
Status in Ubuntu Cloud Archive zed series: Fix Released
Status in neutron: Fix Released
Status in neutron package in Ubuntu: Fix Released
Status in neutron source package in Focal: Fix Released
Status in neutron source package in Jammy: Fix Released
Status in neutron source package in Lunar: Fix Released

Bug description:
Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=2225156

The ovn-db-sync script does not check whether the router ports are actually out of sync before adding them to the list of ports that need to be updated. This can create red-herring problems by introducing irrelevant information [0] into the sync report (especially when run in "log" mode), making the user think that the databases might be out of sync even when they are not.

Looking at the code [1], we can see that the comment talks about checking for networks and ipv6_ra_configs changes, but it does neither; instead, it adds every router port to the list of ports that need to be updated:

  # We dont have to check for the networks and
  # ipv6_ra_configs values. Lets add it to the
  # update_lrport_list. If they are in sync, then
  # update_router_port will be a no-op.
  update_lrport_list.append(db_router_ports[lrport])

This LP is about changing this behavior and checking for such differences in the router ports before marking them to be updated.
[0] 2023-07-24 11:46:31.391 952358 WARNING networking_ovn.ovn_db_sync [req-1081a8a6-82dd-431c-a2ab-f58741dc1677 - - - - -] Router Port port_id=f164c0f1-8ac8-4c45-bba9-8c723a30c701 needs to be updated for networks changed
[1] https://github.com/openstack/neutron/blob/c453813d0664259c4da0d132f224be2eebe70072/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L553-L557

= Ubuntu SRU Details =

[Impact]
See bug description above.

[Test Case]
Deploy openstack with OVN and multiple routers. Run the ovn-db-sync script and ensure router ports that are not out of sync are not marked to be updated.

[Where problems could occur]
If the _is_router_port_changed() function had a bug, there would be potential for ports that need updating to be filtered out. Presumably this is not the case, but that is a theoretical potential for where problems could occur. All of these patches have already landed in the corresponding upstream branches.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/2030773/+subscriptions
[Yahoo-eng-team] [Bug 2032770] Re: [SRU] [OVN] port creation with --enable-uplink-status-propagation does not work with OVN mechanism driver
This bug was fixed in the package neutron - 2:16.4.2-0ubuntu6.4~cloud0

---
neutron (2:16.4.2-0ubuntu6.4~cloud0) bionic-ussuri; urgency=medium
.
  * New update for the Ubuntu Cloud Archive.
.
neutron (2:16.4.2-0ubuntu6.4) focal; urgency=medium
.
  [ Corey Bryant ]
  * d/p/ovn-db-sync-continue-on-duplicate-normalise.patch: Cherry-picked
    from upstream to allow ovn_db_sync to continue on duplicate normalised
    CIDR (LP: #1961112).
  * d/p/ovn-db-sync-check-for-router-port-differences.patch: Cherry-picked
    from upstream to ensure router ports are marked as needing updates only
    if they have changed (LP: #2030773).
  * d/p/ovn-specify-port-type-if-router-port-when-updating.patch: Specify
    the port type if it is a router port when updating, to avoid port
    flapping (LP: #1955578).
  * d/p/fix-acl-sync-when-default-sg-group-created.patch: Cherry-picked
    from upstream to fix ACL sync when a default security group is created
    (LP: #2008943).
.
  [ Mustafa Kemal GILOR ]
  * d/p/add_uplink_status_propagation.patch: Add the
    'uplink-status-propagation' extension to ML2/OVN (LP: #2032770).

** Changed in: cloud-archive/ussuri
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2032770

Title: [SRU] [OVN] port creation with --enable-uplink-status-propagation does not work with OVN mechanism driver

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive antelope series: Fix Released
Status in Ubuntu Cloud Archive ussuri series: Fix Released
Status in Ubuntu Cloud Archive victoria series: Fix Released
Status in Ubuntu Cloud Archive wallaby series: Fix Released
Status in Ubuntu Cloud Archive xena series: Fix Released
Status in Ubuntu Cloud Archive yoga series: Fix Released
Status in Ubuntu Cloud Archive zed series: Fix Released
Status in neutron: Fix Released
Status in neutron package in Ubuntu: Fix Released
Status in neutron source package in Focal: Fix Released
Status in neutron source package in Jammy: Fix Released
Status in neutron source package in Lunar: Fix Released

Bug description:
[Impact]
This SRU is a backport of https://review.opendev.org/c/openstack/neutron/+/892895 to the respective Ubuntu and UCA releases. The patch is merged to all respective upstream branches (master & stable/[u,v,w,x,y,z,2023.1(a)]).

This SRU adds the missing 'uplink-status-propagation' extension to ML2/OVN. The extension is already present and working in ML2/OVS, and it is supported by ML2/OVN, yet it was never added to ML2/OVN's extension list. The patch simply adds the missing extension to ML2/OVN as well.

The impact is visible for deployments migrating from ML2/OVS to ML2/OVN. The following command fails on ML2/OVN:

```
openstack port create --network 8d30fb08-2c6a-42fd-98c4-223d345c8c4f --binding-profile trusted=true --enable-uplink-status-propagation --vnic-type direct aaa
# BadRequestException: 400: Client Error for url: https://mycloud.example.com:9696/v2.0/ports, Unrecognized attribute(s) 'propagate_uplink_status'
```

The fix corrects this behavior by adding the missing extension.
[Test Case]

- Deploy a Focal/Yoga cloud:
  - ./generate-bundle.sh -s focal -r yoga --name test-focal-yoga-stack --run --ovn
- After the dust settles:
  - ./configure
  - source ./novarc
- openstack port create --network --binding-profile trusted=true --enable-uplink-status-propagation --vnic-type direct aaa
- It should fail with "BadRequestException: 400: Client Error for url: https://mycloud.example.com:9696/v2.0/ports, Unrecognized attribute(s) 'propagate_uplink_status'"

To confirm the fix, repeat the scenario and observe that the error disappears and port creation succeeds.

[Regression Potential]

The patch is quite trivial and should not affect any deployment negatively. The extension is optional and disabled by default.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/2032770/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1978489] Re: libvirt / cgroups v2: cannot boot instance with more than 16 CPUs
This bug was fixed in the package nova - 3:25.2.1-0ubuntu2~cloud0

---------------
nova (3:25.2.1-0ubuntu2~cloud0) focal-yoga; urgency=medium
.
  * New update for the Ubuntu Cloud Archive.
.
nova (3:25.2.1-0ubuntu2) jammy; urgency=medium
.
  * d/p/libvirt-remove-default-cputune-shares-value.patch: Enable launch
    of instances with more than 9 CPUs on Jammy (LP: #1978489).

** Changed in: cloud-archive/yoga
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).

https://bugs.launchpad.net/bugs/1978489

Title: libvirt / cgroups v2: cannot boot instance with more than 16 CPUs

Status in Ubuntu Cloud Archive: Invalid
Status in Ubuntu Cloud Archive yoga series: Fix Released
Status in OpenStack Compute (nova): In Progress
Status in nova package in Ubuntu: Fix Released
Status in nova source package in Jammy: Fix Released

Bug description:

Description
===========
Using the libvirt driver and a host OS that uses cgroups v2 (RHEL 9, Ubuntu Jammy), an instance with more than 16 CPUs cannot be booted.

Steps to reproduce
==================
1. Boot an instance with 10 (or more) CPUs on RHEL 9 or Ubuntu Jammy using Nova with the libvirt driver.

Expected result
===============
Instance boots.

Actual result
=============
Instance fails to boot with a 'Value specified in CPUWeight is out of range' error.

Environment
===========
Originally reported as a libvirt bug in RHEL 9 [1]

Additional information
======================
This is happening because Nova defaults to 1024 * (# of CPUs) for the value of domain/cputune/shares in the libvirt XML. This is then passed directly by libvirt to the cgroups API, but cgroups v2 has a maximum value of 10000. 10000 / 1024 ~= 9.76

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2035518

Ubuntu SRU Details:

[Impact]
See above.

[Test Case]
See above.

[Regression Potential]
We've had this change in other jammy-based versions of the nova package for a while now, including zed, antelope, bobcat.
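The arithmetic above can be checked directly: cgroups v2 accepts cpu.weight values only up to 10000, while Nova's legacy default requested 1024 shares per vCPU, so any instance with 10 or more vCPUs overflows the limit. A minimal sketch:

```sh
#!/bin/sh
# cgroups v2 accepts cpu.weight in [1, 10000]; Nova's legacy default
# requested shares = 1024 * vCPUs, which libvirt passed straight through.
max_weight=10000
for vcpus in 9 10 16; do
    shares=$((1024 * vcpus))
    if [ "$shares" -gt "$max_weight" ]; then
        echo "$vcpus vCPUs -> $shares shares: CPUWeight out of range"
    else
        echo "$vcpus vCPUs -> $shares shares: ok"
    fi
done
```

With the fix (the default cputune shares value removed), no out-of-range weight is requested, so the boot should no longer fail on this path.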
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1978489/+subscriptions
[Yahoo-eng-team] [Bug 1998789] Please test proposed package
Hello Mustafa, or anyone else affected, Accepted keystone into ussuri-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository. Please help us by testing this new package. To enable the -proposed repository: sudo add-apt-repository cloud-archive:ussuri-proposed sudo apt-get update Your feedback will aid us getting this update out to other Ubuntu users. If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ussuri-needed to verification-ussuri-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ussuri-failed. In either case, details of your testing will help us make a better decision. Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! ** Changed in: cloud-archive/ussuri Status: Fix Released => Fix Committed ** Tags added: verification-ussuri-needed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). 
https://bugs.launchpad.net/bugs/1998789

Title: [SRU] PooledLDAPHandler.result3 does not release pool connection back when an exception is raised

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive antelope series: Fix Released
Status in Ubuntu Cloud Archive ussuri series: Fix Committed
Status in Ubuntu Cloud Archive victoria series: Fix Released
Status in Ubuntu Cloud Archive wallaby series: Fix Released
Status in Ubuntu Cloud Archive xena series: Fix Released
Status in Ubuntu Cloud Archive yoga series: Fix Released
Status in Ubuntu Cloud Archive zed series: Fix Released
Status in OpenStack Identity (keystone): Fix Released
Status in keystone package in Ubuntu: Fix Released
Status in keystone source package in Focal: Fix Released
Status in keystone source package in Jammy: Fix Released
Status in keystone source package in Lunar: Fix Released

Bug description:

[Impact]

This SRU is a backport of https://review.opendev.org/c/openstack/keystone/+/866723 to the respective Ubuntu and UCA releases. The patch is merged to all the respective upstream branches (master & stable/[u,v,w,x,y,z]).

This SRU intends to fix a denial-of-service bug that happens when keystone uses pooled ldap connections. In pooled ldap connection mode, keystone borrows a connection from the pool, does the LDAP operation and releases it back to the pool. But if an exception or error happens while the LDAP connection is still borrowed, Keystone fails to release the connection back to the pool, hogging it forever. If this happens for all the pooled connections, the connection pool will be exhausted and Keystone will no longer be able to perform LDAP operations.

The fix corrects this behavior by allowing the connection to be released back to the pool even if an exception/error happens during the LDAP operation.
[Test Case]

- Deploy an LDAP server of your choice
- Fill it with enough data that a search takes more than `pool_connection_timeout` seconds
- Define a keystone domain with the LDAP driver with the following options:

  [ldap]
  use_pool = True
  page_size = 100
  pool_connection_timeout = 3
  pool_retry_max = 3
  pool_size = 10

- Point the domain to the LDAP server
- Try to login to the OpenStack dashboard, or try to do anything that uses the LDAP user
- Observe /var/log/apache2/keystone_error.log; it should contain ldap.TIMEOUT() stack traces followed by `ldappool.MaxConnectionReachedError` stack traces

To confirm the fix, repeat the scenario and observe that "/var/log/apache2/keystone_error.log" does not contain `ldappool.MaxConnectionReachedError` stack traces and the LDAP operation in motion is successful (e.g. OpenStack Dashboard login).

[Regression Potential]

The patch is quite trivial and should not affect any deployment in a negative way. The LDAP pool functionality can be disabled by setting "use_pool=False" in case of any regression.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1998789/+subscriptions
[Yahoo-eng-team] [Bug 1978489] Re: libvirt / cgroups v2: cannot boot instance with more than 16 CPUs
Re:
> The same patch should also be available on cloud archive cloud:focal-yoga

This will happen alongside the changes being made into 22.04 - the updates are in the yoga-proposed pocket at the moment.

** Also affects: cloud-archive
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/yoga
   Importance: Undecided
       Status: New

** Changed in: cloud-archive
       Status: New => Invalid

** Changed in: cloud-archive/yoga
       Status: New => Fix Committed

** Changed in: cloud-archive/yoga
   Importance: Undecided => High

** Changed in: nova (Ubuntu Jammy)
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).

https://bugs.launchpad.net/bugs/1978489

Title: libvirt / cgroups v2: cannot boot instance with more than 16 CPUs

Status in Ubuntu Cloud Archive: Invalid
Status in Ubuntu Cloud Archive yoga series: Fix Committed
Status in OpenStack Compute (nova): In Progress
Status in nova package in Ubuntu: Confirmed
Status in nova source package in Jammy: Fix Committed

Bug description:

Description
===========
Using the libvirt driver and a host OS that uses cgroups v2 (RHEL 9, Ubuntu Jammy), an instance with more than 16 CPUs cannot be booted.

Steps to reproduce
==================
1. Boot an instance with 10 (or more) CPUs on RHEL 9 or Ubuntu Jammy using Nova with the libvirt driver.

Expected result
===============
Instance boots.

Actual result
=============
Instance fails to boot with a 'Value specified in CPUWeight is out of range' error.

Environment
===========
Originally reported as a libvirt bug in RHEL 9 [1]

Additional information
======================
This is happening because Nova defaults to 1024 * (# of CPUs) for the value of domain/cputune/shares in the libvirt XML. This is then passed directly by libvirt to the cgroups API, but cgroups v2 has a maximum value of 10000. 10000 / 1024 ~= 9.76

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2035518

Ubuntu SRU Details:

[Impact]
See above.

[Test Case]
See above.
[Regression Potential]
We've had this change in other jammy-based versions of the nova package for a while now, including zed, antelope, bobcat.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1978489/+subscriptions
[Yahoo-eng-team] [Bug 2019190] Re: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)
This bug was fixed in the package cinder - 2:20.3.1-0ubuntu1.1~cloud0

---------------
cinder (2:20.3.1-0ubuntu1.1~cloud0) focal-yoga; urgency=medium
.
  * New update for the Ubuntu Cloud Archive.
.
cinder (2:20.3.1-0ubuntu1.1) jammy; urgency=medium
.
  * Revert driver assisted volume retype (LP: #2019190):
    - d/p/0001-Revert-Driver-assisted-migration-on-retype-when-it-s.patch

** Changed in: cloud-archive/yoga
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).

https://bugs.launchpad.net/bugs/2019190

Title: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)

Status in Cinder: New
Status in Cinder wallaby series: New
Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive antelope series: Fix Released
Status in Ubuntu Cloud Archive bobcat series: Fix Released
Status in Ubuntu Cloud Archive caracal series: Fix Released
Status in Ubuntu Cloud Archive yoga series: Fix Released
Status in Ubuntu Cloud Archive zed series: Fix Released
Status in OpenStack Compute (nova): Invalid
Status in cinder package in Ubuntu: Fix Released
Status in cinder source package in Jammy: Fix Released
Status in cinder source package in Lunar: Won't Fix
Status in cinder source package in Mantic: Fix Released
Status in cinder source package in Noble: Fix Released

Bug description:

[Impact]
See the bug description for full details, but the short summary is that a patch landed in the Wallaby release that introduced a regression whereby retyping an in-use volume leaves the attached volume in an inconsistent state, with potential for data corruption. The result is that a vm does not receive updated connection_info from Cinder and will keep pointing to the old volume, even after reboot.
[Test Plan]

* Deploy Openstack with two Cinder RBD storage backends (different pools)
* Create two volume types
* Boot a vm from volume:
  openstack server create --wait --image jammy --flavor m1.small --key-name testkey --nic net-id=8c74f1ef-9231-46f4-a492-eccdb7943ecd testvm --boot-from-volume 10
* Retype the volume to type B:
  openstack volume set --type typeB --retype-policy on-demand
* Go to the compute host running the vm and check that the vm is now copying data to the new location e.g. b68be47d-f526-4f98-a77b-a903bf8b6c65 which will eventually settle and change to: b68be47d-f526-4f98-a77b-a903bf8b6c65
* And lastly a reboot of the vm should be successful.

[Regression Potential]

Given that the current state is potential data corruption and the patch will fix this by successfully refreshing connection info, I do not see a regression potential. It is in fact fixing a regression.

While trying out the volume retype feature in cinder, we noticed that after an instance is rebooted it will not come back online and be stuck in an error state or, if it comes back online, its filesystem is corrupted.

## Observations

Say there are the two volume types `fast` (stored in ceph pool `volumes`) and `slow` (stored in ceph pool `volumes.hdd`). Before the retyping we can see that the volume, for example, is present in the `volumes.hdd` pool and has a watcher accessing the volume.

```sh
[ceph: root@mon0 /]# rbd ls volumes.hdd
volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
[ceph: root@mon0 /]# rbd status volumes.hdd/volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
Watchers:
        watcher=[2001:XX:XX:XX::10ad]:0/3914407456 client.365192 cookie=140370268803456
```

Starting the retyping process using the migration policy `on-demand` for that volume, either via the horizon dashboard or the CLI, causes the volume to be correctly transferred to the `volumes` pool within the ceph cluster. However, the watcher does not get transferred, so nobody is accessing the volume after it has been transferred.
```sh
[ceph: root@mon0 /]# rbd ls volumes
volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
[ceph: root@mon0 /]# rbd status volumes/volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
Watchers: none
```

Taking a look at the libvirt XML of the instance in question, one can see that the `rbd` volume path does not change after the retyping is completed. Therefore, if the instance is restarted, nova will not be able to find its volume, preventing an instance start.

Pre retype

```xml
[...]
[...]
```
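On an affected deployment, a quick way to observe the failure mode described above (pool and volume names taken from the report; substitute your own) is to compare the watcher before and after the retype:

```sh
# Before retype: the image lives in the source pool and has a watcher.
rbd status volumes.hdd/volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9

# After retype: the image has moved to the target pool, but on an
# affected version "rbd status" reports "Watchers: none" because the
# instance still points at the old volume path.
rbd status volumes/volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
```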
[Yahoo-eng-team] [Bug 2019190] Re: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)
This bug was fixed in the package cinder - 2:21.3.1-0ubuntu1.1~cloud0

---------------
cinder (2:21.3.1-0ubuntu1.1~cloud0) jammy-zed; urgency=medium
.
  * Revert driver assisted volume retype (LP: #2019190)
    - d/p/0001-Revert-Driver-assisted-migration-on-retype-when-it-s.patch

** Changed in: cloud-archive/zed
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).

https://bugs.launchpad.net/bugs/2019190

Title: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)

Status in Cinder: New
Status in Cinder wallaby series: New
Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive antelope series: Fix Released
Status in Ubuntu Cloud Archive bobcat series: Fix Released
Status in Ubuntu Cloud Archive caracal series: Fix Released
Status in Ubuntu Cloud Archive yoga series: Fix Released
Status in Ubuntu Cloud Archive zed series: Fix Released
Status in OpenStack Compute (nova): Invalid
Status in cinder package in Ubuntu: Fix Released
Status in cinder source package in Jammy: Fix Released
Status in cinder source package in Lunar: Won't Fix
Status in cinder source package in Mantic: Fix Released
Status in cinder source package in Noble: Fix Released
[Yahoo-eng-team] [Bug 2019190] Re: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)
This bug was fixed in the package cinder - 2:22.1.1-0ubuntu1.1~cloud0

---------------
cinder (2:22.1.1-0ubuntu1.1~cloud0) jammy-antelope; urgency=medium
.
  * Revert driver assisted volume retype (LP: #2019190)
    - d/p/0001-Revert-Driver-assisted-migration-on-retype-when-it-s.patch

** Changed in: cloud-archive/antelope
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).

https://bugs.launchpad.net/bugs/2019190

Title: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)

Status in Cinder: New
Status in Cinder wallaby series: New
Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive antelope series: Fix Released
Status in Ubuntu Cloud Archive bobcat series: Fix Released
Status in Ubuntu Cloud Archive caracal series: Fix Released
Status in Ubuntu Cloud Archive yoga series: Fix Committed
Status in Ubuntu Cloud Archive zed series: Fix Committed
Status in OpenStack Compute (nova): Invalid
Status in cinder package in Ubuntu: Fix Released
Status in cinder source package in Jammy: Fix Committed
Status in cinder source package in Lunar: Won't Fix
Status in cinder source package in Mantic: Fix Released
Status in cinder source package in Noble: Fix Released
[Yahoo-eng-team] [Bug 2019190] Re: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)
This bug was fixed in the package cinder - 2:23.0.0-0ubuntu1.1~cloud0

---------------
cinder (2:23.0.0-0ubuntu1.1~cloud0) jammy-bobcat; urgency=medium
.
  * New update for the Ubuntu Cloud Archive.
.
cinder (2:23.0.0-0ubuntu1.1) mantic; urgency=medium
.
  [ Corey Bryant ]
  * d/gbp.conf: Create stable/2023.2 branch.
  * d/gbp.conf, .launchpad.yaml: Sync from cloud-archive-tools for bobcat.
.
  [ Edward Hope-Morley ]
  * Revert driver assisted volume retype (LP: #2019190)
    - d/p/0001-Revert-Driver-assisted-migration-on-retype-when-it-s.patch

** Changed in: cloud-archive/bobcat
       Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).

https://bugs.launchpad.net/bugs/2019190

Title: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)

Status in Cinder: New
Status in Cinder wallaby series: New
Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive antelope series: Fix Released
Status in Ubuntu Cloud Archive bobcat series: Fix Released
Status in Ubuntu Cloud Archive caracal series: Fix Released
Status in Ubuntu Cloud Archive yoga series: Fix Committed
Status in Ubuntu Cloud Archive zed series: Fix Committed
Status in OpenStack Compute (nova): Invalid
Status in cinder package in Ubuntu: Fix Released
Status in cinder source package in Jammy: Fix Committed
Status in cinder source package in Lunar: Won't Fix
Status in cinder source package in Mantic: Fix Released
Status in cinder source package in Noble: Fix Released
[Yahoo-eng-team] [Bug 1955578] Please test proposed package
Hello Arnau, or anyone else affected, Accepted neutron into ussuri-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository. Please help us by testing this new package. To enable the -proposed repository: sudo add-apt-repository cloud-archive:ussuri-proposed sudo apt-get update Your feedback will aid us getting this update out to other Ubuntu users. If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ussuri-needed to verification-ussuri-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ussuri-failed. In either case, details of your testing will help us make a better decision. Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! ** Changed in: cloud-archive/ussuri Status: Fix Released => Fix Committed ** Tags added: verification-ussuri-needed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1955578 Title: OVN transaction could not be completed due to a race condition Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Committed Status in Ubuntu Cloud Archive victoria series: Triaged Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Bug description: When executing the test "test_connectivity_through_2_routers" it is highly possible to have a race condition: networking_ovn.common.exceptions.RevisionConflict: OVN revision number for {PORT_ID} (type: ports) is equal or higher than the given resource. Skipping update. 
Bugzilla reference: https://bugzilla.redhat.com/show_bug.cgi?id=1860448

= Ubuntu SRU Details =

[Impact]
See bug description.

[Test Case]
Deploy openstack with OVN. Run the test_connectivity_through_2_routers test from https://github.com/openstack/neutron-tempest-plugin. This could also be tested manually based on what that test does. Ensure the router port status is not set to DOWN at any point.

[Where problems could occur]
The existing bug could still occur if the assumption behind specifying the port type turns out not to be correct. Presumably this is not the case, but that is a theoretical potential for where problems could occur. All of these patches have already landed in the corresponding upstream branches.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1955578/+subscriptions
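Assuming a tempest workspace with the neutron-tempest-plugin installed (the exact invocation below is an assumption; adapt it to your setup), the single test named in the test case above can be run in isolation:

```sh
# Hypothetical invocation: run only the connectivity test referenced
# in the [Test Case] section, then check that no router port went DOWN.
tempest run --regex test_connectivity_through_2_routers
```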
[Yahoo-eng-team] [Bug 1961112] Please test proposed package
Hello Daniel, or anyone else affected, Accepted neutron into ussuri-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository. Please help us by testing this new package. To enable the -proposed repository: sudo add-apt-repository cloud-archive:ussuri-proposed sudo apt-get update Your feedback will aid us getting this update out to other Ubuntu users. If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ussuri-needed to verification-ussuri-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ussuri-failed. In either case, details of your testing will help us make a better decision. Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! ** Changed in: cloud-archive/ussuri Status: Fix Released => Fix Committed ** Tags added: verification-ussuri-needed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1961112 Title: [ovn] overlapping security group rules break neutron-ovn-db-sync-util Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Committed Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Bug description: Neutron (Xena) is happy to accept equivalent rules with overlapping remote CIDR prefix as long as the notation is different, e.g. 10.0.0.0/8 and 10.0.0.1/8. However, OVN is smarter, normalizes the prefix and figures out that they both are 10.0.0.0/8. 
This does not have any fatal effects in a running OVN deployment (creating and using such rules does not even trigger a warning), but upon running neutron-ovn-db-sync-util, it crashes and won't perform a sync. This is a blocker for upgrades (and other scenarios). Security group's rules:

$ openstack security group rule list overlap-sgr
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+
| ID                                   | IP Protocol | Ethertype | IP Range   | Port Range | Direction | Remote Security Group | Remote Address Group |
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+
| 3c41fa80-1d23-49c9-9ec1-adf581e07e24 | tcp         | IPv4      | 10.0.0.1/8 |            | ingress   | None                  | None                 |
| 639d263e-6873-47cb-b2c4-17fc824252db | None        | IPv4      | 0.0.0.0/0  |            | egress    | None                  | None                 |
| 96e99039-cbc0-48fe-98fe-ef28d41b9d9b | tcp         | IPv4      | 10.0.0.0/8 |            | ingress   | None                  | None                 |
| bf9160a3-fc9b-467e-85d5-c889811fd6ca | None        | IPv6      | ::/0       |            | egress    | None                  | None                 |
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+

Log excerpt:

16/Feb/2022:20:55:40.568 527216 INFO neutron.cmd.ovn.neutron_ovn_db_sync_util [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Sync for Northbound db started with mode : repair
16/Feb/2022:20:55:42.105 527216 INFO neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.extensions.qos [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Starting OVNClientQosExtension
16/Feb/2022:20:55:42.380 527216 INFO neutron.db.ovn_revision_numbers_db [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Successfully bumped revision number for resource 49b3249a-7624-4711-b271-3e63c6a27658 (type: ports) to 17
16/Feb/2022:20:55:43.205 527216 WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovn_db_sync [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] ACLs-to-be-added 1 ACLs-to-be-removed 0
16/Feb/2022:20:55:43.206 527216 WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovn_db_sync [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] ACL found in Neutron but not in OVN DB for port group pg_e90b68f3_9f8d_4250_9b6a_7531e2249c99
16/Feb/2022:20:55:43.208 527216 ERROR
ovsdbapp.backend.ovs_idl.transaction [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/connection.py", line 131, in run
    txn.results.put(txn.do_commit())
  File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 93, in do_commit
[Yahoo-eng-team] [Bug 2008943] Please test proposed package
Hello Miro, or anyone else affected, Accepted neutron into ussuri-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository. Please help us by testing this new package. To enable the -proposed repository: sudo add-apt-repository cloud-archive:ussuri-proposed sudo apt-get update Your feedback will aid us getting this update out to other Ubuntu users. If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ussuri-needed to verification-ussuri-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ussuri-failed. In either case, details of your testing will help us make a better decision. Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! ** Changed in: cloud-archive/ussuri Status: Fix Released => Fix Committed ** Tags added: verification-ussuri-needed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/2008943 Title: OVN DB Sync utility cannot find NB DB Port Group Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Committed Status in Ubuntu Cloud Archive victoria series: Triaged Status in Ubuntu Cloud Archive wallaby series: Triaged Status in Ubuntu Cloud Archive xena series: Triaged Status in neutron: In Progress Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Bug description: Runtime exception: ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Port_Group with name=pg_aa9f203b_ec51_4893_9bda_cfadbff9f800 can occur while performing database sync between Neutron db and OVN NB db using neutron-ovn-db-sync-util.
This exception occurs when the `sync_networks_ports_and_dhcp_opts()` function ends up implicitly creating a new default security group for a tenant/project id. This is normally OK, but the problem is that `sync_port_groups` was already called and thus the port group does not exist in the NB DB. When `sync_acls()` is called later, no port group is found and the exception occurs. Quick way to reproduce on ML2/OVN:

- openstack project create test_project
- openstack network create --project test_project test_network
- openstack port delete $(openstack port list --network test_network -c ID -f value) # since this is an empty network only the metadata port should get listed and subsequently deleted
- openstack security group delete test_project

So now that you have a network without a metadata port in it and no default security group for the project/tenant that this network belongs to, run:

neutron-ovn-db-sync-util --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini --ovn-neutron_sync_mode migrate

The exception should occur. Here is a more realistic scenario of how we can run into this with an ML2/OVS to ML2/OVN migration. I am also including why the code runs into it.

1. ML2/OVS environment with a network but no default security group for the project/tenant associated with the network
2. Perform ML2/OVS to ML2/OVN migration. This migration process will run neutron-ovn-db-sync-util with --migrate
3. During the sync we first sync port groups [1] from Neutron DB to OVN DB
4. Then we sync network ports [2]. The process will detect that the network in question is not part of OVN NB. It will create that network in the OVN NB DB and, along with that, it will create a metadata port for it (an OVN network requires a metadata port). The port-create call will implicitly notify _ensure_default_security_group_handler, which will not find a security group for that tenant/project id and will create one.
Now you have a new security group with 4 new default security group rules. 5. When sync_acls [4] runs it will pick up those 4 new rules, but the commit to the NB DB will fail since the port group (aka security group) does not exist in the NB DB [1] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L104 [2] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L10 [3] https://opendev.org/openstack/neutron/src/branch/master/neutron/db/securitygroups_db.py#L915 [4] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L107 = Ubuntu SRU Details = [Impact] See bug description. [Test Case] Deploy openstack with OVN. Follow steps in "Quick way to reproduce on ML2/OVN" from bug description. [Where problems could occur] The fix
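The ordering problem described in the steps above can be illustrated with a toy model (plain Python, not Neutron code; all names here are illustrative):

```python
# Toy model of the sync ordering bug: port groups are copied to the
# (simulated) OVN NB DB first, then the port sync implicitly creates a
# new default security group in the (simulated) Neutron DB, and the
# later ACL sync references a port group that was never copied.
neutron_security_groups = set()  # simulated Neutron DB
nb_port_groups = set()           # simulated OVN NB DB

def sync_port_groups():
    nb_port_groups.update("pg_" + sg for sg in neutron_security_groups)

def sync_networks_ports_and_dhcp_opts():
    # Side effect: creating the missing metadata port fires the handler
    # that creates a default security group for the project -- after
    # port groups were already synced.
    neutron_security_groups.add("default_for_test_project")

def sync_acls():
    for sg in neutron_security_groups:
        if "pg_" + sg not in nb_port_groups:
            raise LookupError("Cannot find Port_Group with name=pg_" + sg)

sync_port_groups()
sync_networks_ports_and_dhcp_opts()
try:
    sync_acls()
except LookupError as exc:
    print(exc)  # Cannot find Port_Group with name=pg_default_for_test_project
```

The LookupError here plays the role of ovsdbapp's RowNotFound: the ACL sync asks for a port group that only exists in the Neutron side of the model.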
[Yahoo-eng-team] [Bug 2030773] Please test proposed package
Hello Lucas, or anyone else affected, Accepted neutron into ussuri-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository. Please help us by testing this new package. To enable the -proposed repository: sudo add-apt-repository cloud-archive:ussuri-proposed sudo apt-get update Your feedback will aid us getting this update out to other Ubuntu users. If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ussuri-needed to verification-ussuri-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ussuri-failed. In either case, details of your testing will help us make a better decision. Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! ** Changed in: cloud-archive/ussuri Status: Fix Released => Fix Committed ** Tags added: verification-ussuri-needed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. 
https://bugs.launchpad.net/bugs/2030773 Title: OVN DB Sync always logs warning messages about updating all router ports Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Committed Status in Ubuntu Cloud Archive victoria series: Triaged Status in Ubuntu Cloud Archive wallaby series: Triaged Status in Ubuntu Cloud Archive xena series: Triaged Status in Ubuntu Cloud Archive yoga series: Fix Released Status in Ubuntu Cloud Archive zed series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Status in neutron source package in Jammy: Fix Released Status in neutron source package in Lunar: Fix Released Bug description: Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=2225156 The ovn-db-sync script does not check if the router ports are actually out-of-sync before adding them to the list of ports that need to be updated; this can create red-herring problems by introducing an irrelevant piece of information [0] in the sync report (especially when run in "log" mode), making the user think that the databases might be out of sync even when they are not. Looking at the code [1] we can see that the comment talks about checking the networks and ipv6_ra_configs changes but it does neither; instead, it adds every router port to the list of ports that need to be updated.

    # We dont have to check for the networks and
    # ipv6_ra_configs values. Lets add it to the
    # update_lrport_list. If they are in sync, then
    # update_router_port will be a no-op.
    update_lrport_list.append(db_router_ports[lrport])

This LP is about changing this behavior and checking for such differences in the router ports before marking them to be updated.
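A minimal sketch of the missing check, comparing only the two fields the quoted comment mentions, could look like this (illustrative only; field names and the helper name are assumptions, not the actual Neutron code):

```python
# Compare the fields the sync code's own comment says should be checked
# (networks and ipv6_ra_configs) before scheduling an update, instead of
# unconditionally appending every router port to update_lrport_list.
def is_router_port_changed(db_port, nb_lrport):
    # Order of networks is irrelevant, so compare as sets.
    if set(db_port.get("networks", [])) != set(nb_lrport.get("networks", [])):
        return True
    return db_port.get("ipv6_ra_configs", {}) != nb_lrport.get("ipv6_ra_configs", {})

in_sync = {"networks": ["10.0.0.1/24"], "ipv6_ra_configs": {}}
assert not is_router_port_changed(in_sync, dict(in_sync))

changed = {"networks": ["10.0.1.1/24"], "ipv6_ra_configs": {}}
assert is_router_port_changed(in_sync, changed)
```

Only ports for which such a check returns True would be appended to the update list, which keeps the "needs to be updated for networks changed" warning out of the report for ports that are already in sync.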
[0] 2023-07-24 11:46:31.391 952358 WARNING networking_ovn.ovn_db_sync [req-1081a8a6-82dd-431c-a2ab-f58741dc1677 - - - - -] Router Port port_id=f164c0f1-8ac8-4c45-bba9-8c723a30c701 needs to be updated for networks changed [1] https://github.com/openstack/neutron/blob/c453813d0664259c4da0d132f224be2eebe70072/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L553-L557 = Ubuntu SRU Details = [Impact] See bug description above. [Test Case] Deploy openstack with OVN and multiple routers. Run the ovn-db-sync script and ensure router ports that are not out of sync are not marked to be updated. [Where problems could occur] If the _is_router_port_changed() function had a bug, there would be potential for ports that need updating to be filtered out. Presumably this is not the case, but that is a theoretical potential for where problems could occur. All of these patches have already landed in the corresponding upstream branches. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/2030773/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 2032770] Please test proposed package
Hello Mustafa, or anyone else affected, Accepted neutron into ussuri-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository. Please help us by testing this new package. To enable the -proposed repository: sudo add-apt-repository cloud-archive:ussuri-proposed sudo apt-get update Your feedback will aid us getting this update out to other Ubuntu users. If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ussuri-needed to verification-ussuri-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ussuri-failed. In either case, details of your testing will help us make a better decision. Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! ** Changed in: cloud-archive/ussuri Status: Fix Released => Fix Committed ** Tags added: verification-ussuri-needed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. 
https://bugs.launchpad.net/bugs/2032770 Title: [SRU] [OVN] port creation with --enable-uplink-status-propagation does not work with OVN mechanism driver Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Committed Status in Ubuntu Cloud Archive victoria series: Triaged Status in Ubuntu Cloud Archive wallaby series: Triaged Status in Ubuntu Cloud Archive xena series: Triaged Status in Ubuntu Cloud Archive yoga series: Fix Released Status in Ubuntu Cloud Archive zed series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Status in neutron source package in Jammy: Fix Released Status in neutron source package in Lunar: Fix Released Bug description: [Impact] This SRU is a backport of https://review.opendev.org/c/openstack/neutron/+/892895 to the respective Ubuntu and UCA releases. The patch is merged to all respective upstream branches (master & stable/[u,v,w,x,y,z,2023.1(a)]). This SRU intends to add the missing 'uplink-status-propagation' extension to ML2/OVN. This extension is already present and working in ML2/OVS, and it is supported by ML2/OVN but the extension is somehow not added to ML2/OVN. The patch simply adds the missing extension to the ML2/OVN too. The impact of this is visible for the deployments migrating from ML2/OVS to ML2/OVN. The following command fails to work on ML2/OVN: ``` openstack port create --network 8d30fb08-2c6a-42fd-98c4-223d345c8c4f --binding-profile trusted=true --enable-uplink-status-propagation --vnic-type direct aaa # BadRequestException: 400: Client Error for url: https://mycloud.example.com:9696/v2.0/ports, Unrecognized attribute(s) 'propagate_uplink_status' ``` The fix corrects this behavior by adding the missing extension. 
[Test Case] - Deploy a Focal/Yoga cloud: - ./generate-bundle.sh -s focal -r yoga --name test-focal-yoga-stack --run --ovn # After the dust settles - ./configure - source ./novarc - openstack port create --network --binding-profile trusted=true --enable-uplink-status-propagation --vnic-type direct aaa - It should fail with "BadRequestException: 400: Client Error for url: https://mycloud.example.com:9696/v2.0/ports, Unrecognized attribute(s) 'propagate_uplink_status'" To confirm the fix, repeat the scenario and observe that the error disappears and port creation succeeds. [Regression Potential] The patch is quite trivial and should not affect any deployment negatively. The extension is optional and disabled by default. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/2032770/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 2051928] [NEW] tests - Python 3.12 - TypeError: Object of type _SentinelObject is not JSON serializable
Public bug reported: Executing unit tests with Python 3.12 results in some test failures which I think are to do with the way the unit tests mock the __json__ method in the tools module:

neutron.tests.unit.api.v2.test_base.RegistryNotificationTest.test_networks_create_bulk_registry_publish
--- Captured traceback: ~~~
Traceback (most recent call last):
  File "/home/jamespage/src/upstream/openstack/neutron/neutron/tests/base.py", line 178, in func
    return f(self, *args, **kwargs)
  File "/home/jamespage/src/upstream/openstack/neutron/neutron/tests/unit/api/v2/test_base.py", line 1300, in test_networks_create_bulk_registry_publish
    self._test_registry_publish('create', 'network', input)
  File "/home/jamespage/src/upstream/openstack/neutron/neutron/tests/unit/api/v2/test_base.py", line 1269, in _test_registry_publish
    res = self.api.post_json(
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/webtest/utils.py", line 34, in wrapper
    return self._gen_request(method, url, **kw)
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/webtest/app.py", line 749, in _gen_request
    return self.do_request(req, status=status,
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/webtest/app.py", line 646, in do_request
    self._check_status(status, res)
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/webtest/app.py", line 675, in _check_status
    raise AppError(
webtest.app.AppError: Bad response: 500 Internal Server Error (not 200 OK or 3xx redirect for http://localhost/networks)
b'{"NeutronError": {"type": "HTTPInternalServerError", "message": "Request Failed: internal server error while processing your request.", "detail": ""}}'

Captured pythonlogging: ~~~
ERROR [neutron.pecan_wsgi.hooks.translation] POST failed.
Traceback (most recent call last):
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/pecan/core.py", line 682, in __call__
    self.invoke_controller(controller, args, kwargs, state)
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/pecan/core.py", line 603, in invoke_controller
    result = self.render(template, result)
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/pecan/core.py", line 414, in render
    return renderer.render(template, namespace)
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/pecan/templating.py", line 23, in render
    return encode(namespace)
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/pecan/jsonify.py", line 154, in encode
    return _instance.encode(obj)
  File "/usr/lib/python3.12/json/encoder.py", line 200, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.12/json/encoder.py", line 258, in iterencode
    return _iterencode(o, 0)
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/pecan/jsonify.py", line 148, in default
    return jsonify(obj)
  File "/usr/lib/python3.12/functools.py", line 909, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/pecan/jsonify.py", line 143, in jsonify
    return _default.default(obj)
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/pecan/jsonify.py", line 129, in default
    return JSONEncoder.default(self, obj)
  File "/usr/lib/python3.12/json/encoder.py", line 180, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type _SentinelObject is not JSON serializable

Digging in I can see all of the plugin child calls being updated, however I don't see them actually
called under Python 3.12. This issue impacts the following unit tests:

FAIL: neutron.tests.unit.api.v2.test_base.RegistryNotificationTest.test_network_create_registry_publish
FAIL:
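The underlying failure can be reproduced outside the test suite: `mock.sentinel` attributes have no JSON representation, so the stock encoder rejects them with exactly the error the Pecan renderer hits above (a standalone sketch; the payload shape is illustrative, not taken from the tests):

```python
import json
from unittest import mock

# mock.sentinel attributes are plain _SentinelObject instances; unless
# something teaches the encoder about them (e.g. the mocked __json__
# hook mentioned above), json.dumps raises the TypeError seen in the
# captured traceback.
payload = {"network": {"id": mock.sentinel.net_id}}
try:
    json.dumps(payload)
except TypeError as exc:
    print(exc)  # Object of type _SentinelObject is not JSON serializable
```

This is why the API returns a 500: the response body containing a sentinel cannot be rendered to JSON, independent of what the test was actually asserting.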
[Yahoo-eng-team] [Bug 2019190] Re: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)
** Changed in: cinder (Ubuntu Lunar) Status: New => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/2019190 Title: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption) Status in Cinder: New Status in Cinder wallaby series: New Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: New Status in Ubuntu Cloud Archive bobcat series: In Progress Status in Ubuntu Cloud Archive caracal series: Fix Released Status in Ubuntu Cloud Archive yoga series: New Status in Ubuntu Cloud Archive zed series: New Status in OpenStack Compute (nova): Invalid Status in cinder package in Ubuntu: Fix Released Status in cinder source package in Jammy: New Status in cinder source package in Lunar: Won't Fix Status in cinder source package in Mantic: In Progress Status in cinder source package in Noble: Fix Released Bug description: [Impact] See bug description for full details but short summary is that a patch landed in Wallaby release that introduced a regression whereby retyping an in-use volume leaves the attached volume in an inconsistent state with potential for data corruption. Result is that a vm does not receive updated connection_info from Cinder and will keep pointing to the old volume, even after reboot. [Test Plan] * Deploy Openstack with two Cinder RBD storage backends (different pools) * Create two volume types * Boot a vm from volume: openstack server create --wait --image jammy --flavor m1.small --key-name testkey --nic net-id=8c74f1ef-9231-46f4-a492-eccdb7943ecd testvm --boot-from-volume 10 * Retype the volume to type B: openstack volume set --type typeB --retype-policy on-demand * Go to compute host running vm and check that the vm is now copying data to the new location e.g. 
b68be47d-f526-4f98-a77b-a903bf8b6c65 which will eventually settle and change to: b68be47d-f526-4f98-a77b-a903bf8b6c65
* And lastly, a reboot of the vm should be successful.

[Regression Potential] Given that the current state is potential data corruption and the patch will fix this by successfully refreshing connection info, I do not see a regression potential. It is in fact fixing a regression.

While trying out the volume retype feature in cinder, we noticed that after an instance is rebooted it will not come back online and be stuck in an error state, or if it comes back online, its filesystem is corrupted.

## Observations

Say there are the two volume types `fast` (stored in ceph pool `volumes`) and `slow` (stored in ceph pool `volumes.hdd`). Before the retyping we can see that the volume for example is present in the `volumes.hdd` pool and has a watcher accessing the volume.

```sh
[ceph: root@mon0 /]# rbd ls volumes.hdd
volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
[ceph: root@mon0 /]# rbd status volumes.hdd/volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
Watchers:
        watcher=[2001:XX:XX:XX::10ad]:0/3914407456 client.365192 cookie=140370268803456
```

Starting the retyping process using the migration policy `on-demand` for that volume either via the horizon dashboard or the CLI causes the volume to be correctly transferred to the `volumes` pool within the ceph cluster. However, the watcher does not get transferred, so nobody is accessing the volume after it has been transferred.

```sh
[ceph: root@mon0 /]# rbd ls volumes
volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
[ceph: root@mon0 /]# rbd status volumes/volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
Watchers: none
```

Taking a look at the libvirt XML of the instance in question, one can see that the `rbd` volume path does not change after the retyping is completed. Therefore, if the instance is restarted nova will not be able to find its volume, preventing an instance start.

Pre retype

```xml
[...] [...]
```

Post retype (no change)

```xml
[...] [...]
```

### Possible cause

While looking through the code that is responsible for the volume retype, we found a function `swap_volume` which by our understanding should be responsible for fixing the association above. As we understand it, cinder should use an internal API path to let nova perform this action. This doesn't seem to happen.
[Yahoo-eng-team] [Bug 2019190] Re: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)
** Changed in: cinder (Ubuntu Mantic) Status: New => In Progress ** Changed in: cloud-archive/caracal Status: New => Fix Released ** Changed in: cloud-archive/bobcat Status: New => In Progress -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/2019190 Title: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption) Status in Cinder: New Status in Cinder wallaby series: New Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: New Status in Ubuntu Cloud Archive bobcat series: In Progress Status in Ubuntu Cloud Archive caracal series: Fix Released Status in Ubuntu Cloud Archive yoga series: New Status in Ubuntu Cloud Archive zed series: New Status in OpenStack Compute (nova): Invalid Status in cinder package in Ubuntu: Fix Released Status in cinder source package in Jammy: New Status in cinder source package in Lunar: New Status in cinder source package in Mantic: In Progress Status in cinder source package in Noble: Fix Released Bug description: [Impact] See bug description for full details but short summary is that a patch landed in Wallaby release that introduced a regression whereby retyping an in-use volume leaves the attached volume in an inconsistent state with potential for data corruption. Result is that a vm does not receive updated connection_info from Cinder and will keep pointing to the old volume, even after reboot. 
[Test Plan] * Deploy Openstack with two Cinder RBD storage backends (different pools) * Create two volume types * Boot a vm from volume: openstack server create --wait --image jammy --flavor m1.small --key-name testkey --nic net-id=8c74f1ef-9231-46f4-a492-eccdb7943ecd testvm --boot-from-volume 10 * Retype the volume to type B: openstack volume set --type typeB --retype-policy on-demand * Go to compute host running vm and check that the vm is now copying data to the new location e.g. b68be47d-f526-4f98-a77b-a903bf8b6c65 which will eventually settle and change to: b68be47d-f526-4f98-a77b-a903bf8b6c65 * And lastly a reboot of the vm should be successful. [Regression Potential] Given that the current state is potential data corruption and the patch will fix this by successfully refreshing connection info I do not see a regression potential. It is in fact fixing a regression. - While trying out the volume retype feature in cinder, we noticed that after an instance is rebooted it will not come back online and be stuck in an error state or if it comes back online, its filesystem is corrupted. ## Observations Say there are the two volume types `fast` (stored in ceph pool `volumes`) and `slow` (stored in ceph pool `volumes.hdd`). Before the retyping we can see that the volume for example is present in the `volumes.hdd` pool and has a watcher accessing the volume. ```sh [ceph: root@mon0 /]# rbd ls volumes.hdd volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9 [ceph: root@mon0 /]# rbd status volumes.hdd/volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9 Watchers: watcher=[2001:XX:XX:XX::10ad]:0/3914407456 client.365192 cookie=140370268803456 ``` Starting the retyping process using the migration policy `on-demand` for that volume either via the horizon dashboard or the CLI causes the volume to be correctly transferred to the `volumes` pool within the ceph cluster. However, the watcher does not get transferred, so nobody is accessing the volume after it has been transferred.
```sh [ceph: root@mon0 /]# rbd ls volumes volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9 [ceph: root@mon0 /]# rbd status volumes/volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9 Watchers: none ``` Taking a look at the libvirt XML of the instance in question, one can see that the `rbd` volume path does not change after the retyping is completed. Therefore, if the instance is restarted nova will not be able to find its volume preventing an instance start. Pre retype ```xml [...] [...] ``` Post retype (no change) ```xml [...] [...] ``` ### Possible cause While looking through the code that is responsible for the volume retype we found a function `swap_volume` which by our understanding should be responsible for fixing the association
[Yahoo-eng-team] [Bug 2019190] Re: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)
Included in most recent snapshots for Caracal ** Changed in: cinder (Ubuntu Noble) Status: New => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/2019190 Title: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption) Status in Cinder: New Status in Cinder wallaby series: New Status in Ubuntu Cloud Archive: New Status in Ubuntu Cloud Archive antelope series: New Status in Ubuntu Cloud Archive bobcat series: New Status in Ubuntu Cloud Archive caracal series: New Status in Ubuntu Cloud Archive yoga series: New Status in Ubuntu Cloud Archive zed series: New Status in OpenStack Compute (nova): Invalid Status in cinder package in Ubuntu: Fix Released Status in cinder source package in Jammy: New Status in cinder source package in Lunar: New Status in cinder source package in Mantic: New Status in cinder source package in Noble: Fix Released Bug description: [Impact] See bug description for full details but short summary is that a patch landed in Wallaby release that introduced a regression whereby retyping an in-use volume leaves the attached volume in an inconsistent state with potential for data corruption. Result is that a vm does not receive updated connection_info from Cinder and will keep pointing to the old volume, even after reboot. [Test Plan] * Deploy Openstack with two Cinder RBD storage backends (different pools) * Create two volume types * Boot a vm from volume: openstack server create --wait --image jammy --flavor m1.small --key-name testkey --nic net-id=8c74f1ef-9231-46f4-a492-eccdb7943ecd testvm --boot-from-volume 10 * Retype the volume to type B: openstack volume set --type typeB --retype-policy on-demand * Go to compute host running vm and check that the vm is now copying data to the new location e.g. 
b68be47d-f526-4f98-a77b-a903bf8b6c65 which will eventually settle and change to: b68be47d-f526-4f98-a77b-a903bf8b6c65 * And lastly a reboot of the vm should be successful. [Regression Potential] Given that the current state is potential data corruption and the patch will fix this by successfully refreshing connection info I do not see a regression potential. It is in fact fixing a regression. - While trying out the volume retype feature in cinder, we noticed that after an instance is rebooted it will not come back online and be stuck in an error state or if it comes back online, its filesystem is corrupted. ## Observations Say there are the two volume types `fast` (stored in ceph pool `volumes`) and `slow` (stored in ceph pool `volumes.hdd`). Before the retyping we can see that the volume for example is present in the `volumes.hdd` pool and has a watcher accessing the volume. ```sh [ceph: root@mon0 /]# rbd ls volumes.hdd volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9 [ceph: root@mon0 /]# rbd status volumes.hdd/volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9 Watchers: watcher=[2001:XX:XX:XX::10ad]:0/3914407456 client.365192 cookie=140370268803456 ``` Starting the retyping process using the migration policy `on-demand` for that volume either via the horizon dashboard or the CLI causes the volume to be correctly transferred to the `volumes` pool within the ceph cluster. However, the watcher does not get transferred, so nobody is accessing the volume after it has been transferred. ```sh [ceph: root@mon0 /]# rbd ls volumes volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9 [ceph: root@mon0 /]# rbd status volumes/volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9 Watchers: none ``` Taking a look at the libvirt XML of the instance in question, one can see that the `rbd` volume path does not change after the retyping is completed. Therefore, if the instance is restarted nova will not be able to find its volume preventing an instance start. Pre retype ```xml [...] [...]
``` Post retype (no change) ```xml [...] [...] ``` ### Possible cause While looking through the code that is responsible for the volume retype we found a function `swap_volume` volume which by our understanding should be responsible for fixing the association above. As we understand cinder should use an internal API path to let nova perform this action. This doesn't seem to happen.
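The elided XML above originally showed the instance's disk definition. As a purely hypothetical illustration (structure typical of a libvirt RBD-backed disk; host name and attributes invented, pool/volume names taken from the rbd output above), the element whose pool prefix should change from `volumes.hdd` to `volumes` after the retype, but does not, would look roughly like:

```xml
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <!-- image path still references the old pool after the retype,
       although the volume now lives in the "volumes" pool -->
  <source protocol='rbd' name='volumes.hdd/volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9'>
    <host name='mon0' port='6789'/>
  </source>
  <target dev='vda' bus='virtio'/>
</disk>
```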
[Yahoo-eng-team] [Bug 2004031] Re: User with admin_required in a non cloud_admin domain/project can manage other domains with admin_required permissions
Please can you provide full details of your deployment; specifically which charms and channels you are using and on which base version of Ubuntu. ** Project changed: keystone => charm-keystone ** Changed in: charm-keystone Status: New => Incomplete -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/2004031 Title: User with admin_required in a non cloud_admin domain/project can manage other domains with admin_required permissions Status in OpenStack Keystone Charm: Incomplete Bug description: In a deployment of Openstack Yoga, I have the following policy.json configured in Keystone: https://paste.ubuntu.com/p/F2PMP857mG/. When I create a new domain, a project inside that domain, a user with the role:Admin, and I set the context for that user/project/domain for the CLI, I can perform actions like list and delete instances, images, networks and routers created in the cloud_admin domain domain_id:703118433996472d82713a3100b07432 and cloud_admin project project_id:16264684b58747cba04a98c128f5044f. To manage notifications about this bug go to: https://bugs.launchpad.net/charm-keystone/+bug/2004031/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1909581] Re: Install and configure for Red Hat Enterprise Linux and CentOS in horizon
** Also affects: horizon Importance: Undecided Status: New ** No longer affects: horizon (Ubuntu) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1909581 Title: Install and configure for Red Hat Enterprise Linux and CentOS in horizon Status in OpenStack Dashboard (Horizon): New Bug description: /etc/openstack-dashboard/local_settings needs WEBROOT = '/dashboard/' --- Release: 18.6.2.dev8 on 2019-12-05 11:04:48 SHA: 7806d67529b7718cac6015677b60b9b52a4f8dd7 Source: https://opendev.org/openstack/horizon/src/doc/source/install/install-rdo.rst URL: https://docs.openstack.org/horizon/victoria/install/install-rdo.html To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1909581/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1928031] Re: neutron-ovn-metadata-agent AttributeError: 'MetadataProxyHandler' object has no attribute 'sb_idl'
** Changed in: openvswitch (Ubuntu) Status: New => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1928031 Title: neutron-ovn-metadata-agent AttributeError: 'MetadataProxyHandler' object has no attribute 'sb_idl' Status in charm-ovn-chassis: Invalid Status in Ubuntu Cloud Archive: New Status in Ubuntu Cloud Archive ussuri series: New Status in Ubuntu Cloud Archive wallaby series: New Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in openvswitch package in Ubuntu: Fix Released Status in neutron source package in Focal: New Status in openvswitch source package in Focal: New Bug description: neutron-ovn-metadata-agent is not able to handle any metadata requests from the instances. Scenario:
* Initially there are some intermittent connectivity issues that are described in LP #1907686 https://bugs.launchpad.net/charm-ovn-chassis/+bug/1907686/comments/9
* The fix for the above is available in the python3-openvswitch package in the ussuri-proposed pocket. We installed the fix on all neutron-server and compute nodes and restarted neutron-ovn-metadata-agent one by one.
* neutron-ovn-metadata-agent on one of the compute nodes is not able to handle any metadata requests after the restart. (Please note the problem happened with only one ovn-metadata agent and the rest of the agents are fine on other compute nodes, so this is some race condition in the IDL.)
The stacktrace shows that both workers 69188/69189 timed out on the OVN SB IDL connection, and hence sb_idl is never initialized.
Stacktrace of Attribute error: -- 2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server Traceback (most recent call last): 2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server File "/usr/lib/python3/dist-packages/neutron/agent/ovn/metadata/server.py", line 67, in __call__ 2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server instance_id, project_id = self._get_instance_and_project_id(req) 2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server File "/usr/lib/python3/dist-packages/neutron/agent/ovn/metadata/server.py", line 84, in _get_instance_and_project_id 2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server ports = self.sb_idl.get_network_port_bindings_by_ip(network_id, 2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server AttributeError: 'MetadataProxyHandler' object has no attribute 'sb_idl' 2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server Stacktrace at the restart of neutron-ovn-metadata-agent process: 2021-04-15 22:27:03.803 69124 INFO neutron.common.config [-] /usr/bin/neutron-ovn-metadata-agent version 16.2.0 2021-04-15 22:27:03.832 69124 INFO ovsdbapp.backend.ovs_idl.vlog [-] tcp:127.0.0.1:6640: connecting... 2021-04-15 22:27:03.833 69124 INFO ovsdbapp.backend.ovs_idl.vlog [-] tcp:127.0.0.1:6640: connected 2021-04-15 22:27:03.949 69124 WARNING neutron.agent.ovn.metadata.agent [-] Can't read ovn-bridge external-id from OVSDB. Using br-int instead. 2021-04-15 22:27:03.950 69124 INFO oslo_service.service [-] Starting 2 workers 2021-04-15 22:27:03.985 69188 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connecting... 2021-04-15 22:27:03.986 69189 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connecting... 2021-04-15 22:27:04.005 69124 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connecting... 
2021-04-15 22:27:04.006 69188 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connected 2021-04-15 22:27:04.033 69189 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connected 2021-04-15 22:27:04.061 69124 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connected 2021-04-15 22:27:06.129 69124 INFO oslo.privsep.daemon [-] Running privsep helper: ['sudo', '/usr/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'privsep-helper', '--config-file', '/etc/neutron/neutron.conf', '--config-file', '/etc/neutron/neutron_ovn_metadata_agent.ini', '--privsep_context', 'neutron.privileged.default', '--privsep_sock_path', '/tmp/tmpgncr2rq7/privsep.sock'] 2021-04-15 22:27:06.757 69124 INFO oslo.privsep.daemon [-] Spawned new privsep daemon via rootwrap 2021-04-15 22:27:06.676 69211 INFO oslo.privsep.daemon [-] privsep daemon starting 2021-04-15 22:27:06.678 69211 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0 2021-04-15 22:27:06.680 69211 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh):
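The failure mode above can be sketched in a few lines. This is an illustrative stand-in, not the actual neutron code or the real fix: each forked worker only gains `sb_idl` once its own OVN southbound IDL connection succeeds, so a request served by a worker whose handshake timed out hits a bare AttributeError, as in the traceback. A hypothetical defensive handler would fail fast with a clear error instead:

```python
class MetadataProxyHandler:
    """Illustrative stand-in for neutron's handler (the guard below is
    hypothetical): sb_idl is normally assigned only after the worker's
    OVN southbound IDL connection is established."""

    def __init__(self):
        self.sb_idl = None  # IDL handshake has not completed yet

    def get_network_port_bindings_by_ip(self, network_id, ip):
        # Without a check like this, a worker whose IDL connection
        # timed out raises AttributeError on every metadata request.
        if self.sb_idl is None:
            raise RuntimeError("OVN SB IDL connection not established")
        return self.sb_idl.get_network_port_bindings_by_ip(network_id, ip)
```

With the real agent the symptom is exactly that: every metadata request fails until the affected workers are restarted and manage to reconnect to the southbound database.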
[Yahoo-eng-team] [Bug 1951261] Re: web-download doesn't work in proxied env
Whitelisting and blacklisting exist for the web-download importer, but no proxy configuration. This feels like a feature that needs to go into glance rather than being mashed in by the charm in some way that kinda works/maybe works. ** Changed in: charm-glance Status: New => Incomplete ** Changed in: charm-glance Importance: Undecided => Low ** Also affects: glance Importance: Undecided Status: New ** Changed in: charm-glance Importance: Low => Wishlist -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1951261 Title: web-download doesn't work in proxied env Status in OpenStack Glance Charm: Incomplete Status in Glance: New Bug description: I'm trying to import an image via the web-download method[0][1]. When kicking off the import process I'm getting this in the glance-api.log 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor Traceback (most recent call last): 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3/dist-packages/taskflow/engines/action_engine/executor.py", line 53, in _execute_task 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor result = task.execute(**arguments) 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3/dist-packages/glance/async_/flows/_internal_plugins/web_download.py", line 116, in execute 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor LOG.error("Task %(task_id)s failed with exception %(error)s", 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor self.force_reraise() 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise 2021-11-17
12:50:20.586 24884 ERROR glance.async_.taskflow_executor six.reraise(self.type_, self.value, self.tb) 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor raise value 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3/dist-packages/glance/async_/flows/_internal_plugins/web_download.py", line 113, in execute 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor data = script_utils.get_image_data_iter(self.uri) 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3/dist-packages/glance/common/scripts/utils.py", line 142, in get_image_data_iter 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor return urllib.request.urlopen(uri) 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3.8/urllib/request.py", line 222, in urlopen 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor return opener.open(url, data, timeout) 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3.8/urllib/request.py", line 525, in open 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor response = self._open(req, data) 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3.8/urllib/request.py", line 542, in _open 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor result = self._call_chain(self.handle_open, protocol, protocol + 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor result = func(*args) 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3.8/urllib/request.py", line 1383, in http_open 
2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor return self.do_open(http.client.HTTPConnection, req) 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3.8/urllib/request.py", line 1357, in do_open 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor raise URLError(err) 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor urllib.error.URLError: 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor The model is situated behind an HTTP proxy. I have set this model-config:
juju model-config | grep http-proxy
apt-http-proxy     controller   http://foo.proxy.host:3128
http-proxy         default      ""
juju-http-proxy    controller
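The bottom frame of the traceback is `urllib.request.urlopen(uri)`, which only uses a proxy it can discover from the process environment or from an explicitly installed handler. A sketch of both options, assuming a hypothetical proxy URL `http://squid.internal:3128`:

```python
import os
import urllib.request

PROXY = "http://squid.internal:3128"  # hypothetical proxy endpoint

# 1) Environment-based: urlopen() consults getproxies(), which reads
#    http_proxy/https_proxy. A daemon started without these variables
#    (as glance-api typically is) attempts a direct connection and
#    fails behind a mandatory proxy, producing the URLError above.
os.environ["http_proxy"] = PROXY
os.environ["https_proxy"] = PROXY
proxies = urllib.request.getproxies()

# 2) Explicit: build an opener with a ProxyHandler so the proxy does
#    not depend on the daemon's environment at all.
opener = urllib.request.build_opener(
    urllib.request.ProxyHandler({"http": PROXY, "https": PROXY})
)
```

Since the glance-api daemon is normally started by systemd without the juju model's proxy settings in its environment, the first approach requires plumbing variables into the service unit, which is why a real proxy configuration option in glance itself would be the cleaner fix.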
[Yahoo-eng-team] [Bug 1892361] Re: SRIOV instance gets type-PF interface, libvirt kvm fails
** Changed in: nova/stein Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1892361 Title: SRIOV instance gets type-PF interface, libvirt kvm fails Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in Ubuntu Cloud Archive train series: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Released Status in Ubuntu Cloud Archive victoria series: Fix Released Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) queens series: Fix Released Status in OpenStack Compute (nova) rocky series: Fix Released Status in OpenStack Compute (nova) stein series: Fix Released Status in OpenStack Compute (nova) train series: Fix Released Status in OpenStack Compute (nova) ussuri series: Fix Released Status in OpenStack Compute (nova) victoria series: Fix Released Status in nova package in Ubuntu: Fix Released Status in nova source package in Bionic: Fix Released Status in nova source package in Focal: Fix Released Status in nova source package in Groovy: Fix Released Status in nova source package in Hirsute: Fix Released Bug description: When spawning an SR-IOV enabled instance on a newly deployed host, nova attempts to spawn it with a type-PF PCI device. This fails with the stack trace below. After restarting the neutron-sriov-agent and nova-compute services on the compute node and spawning an SR-IOV instance again, a type-VF PCI device is selected, and instance spawning succeeds.
Stack trace: 2020-08-20 08:29:09.558 7624 DEBUG oslo_messaging._drivers.amqpdriver [-] received reply msg_id: 6db8011e6ecd4fd0aaa53c8f89f08b1b __call__ /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:400 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [req-e3e49d07-24c6-4c62-916e-f830f70983a2 ddcfb3640535428798aa3c8545362bd4 dd99e7950a5b46b5b924ccd1720b6257 - 015e4fd7db304665ab5378caa691bb8b 015e4fd7db304665ab5378caa691bb8b] [insta nce: 9498ea75-fe88-4020-9a9e-f4c437c6de11] Instance failed to spawn: libvirtError: unsupported configuration: Interface type hostdev is currently supported on SR-IOV Virtual Functions only 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] Traceback (most recent call last): 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2274, in _build_resources 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] yield resources 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2054, in _build_and_run_instance 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] block_device_info=block_device_info) 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 3147, in spawn 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] destroy_disks_on_failure=True) 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 5651, in 
_create_domain_and_network 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] destroy_disks_on_failure) 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] self.force_reraise() 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] six.reraise(self.type_, self.value, self.tb) 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 5620, in _create_domain_and_network 2020-08-20 08:29:09.561 7624 ERROR
[Yahoo-eng-team] [Bug 1937261] Re: python3-msgpack package broken due to outdated cython
python-msgpack promoted to Ussuri updates pocket. ** Changed in: cloud-archive/ussuri Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1937261 Title: python3-msgpack package broken due to outdated cython Status in Ubuntu Cloud Archive: Invalid Status in Ubuntu Cloud Archive ussuri series: Fix Released Status in neutron: New Status in oslo.privsep: New Bug description: After a successful upgrade of the control-plane from Train -> Ussuri on Ubuntu Bionic, we upgraded a first compute / network node and immediately ran into issues with Neutron: We noticed that Neutron is extremely slow in setting up and wiring the network ports, so slow it would never finish, and it would throw all sorts of errors (RabbitMQ connection timeouts, full sync required, ...). We were now able to reproduce the error on our Ussuri DEV cloud as well:
1) First we used strace -p $PID_OF_NEUTRON_LINUXBRIDGE_AGENT and noticed that the data exchange on the unix socket between the rootwrap-daemon and the main process is really, really slow. One could actually read, line by line, the read calls to the fd of the socket.
2) We then (after adding lots of log lines and other intensive manual debugging) used py-spy (https://github.com/benfred/py-spy) via "py-spy top --pid $PID" on the running neutron-linuxbridge-agent process and noticed that all the CPU time (the process was at 100% most of the time) was spent in msgpack/fallback.py.
3) Since the issue was not observed in Train, we compared the msgpack versions used and noticed that Train was using version 0.5.6 while Ussuri upgraded this dependency to 0.6.2.
4) We then downgraded to version 0.5.6 of msgpack (ignoring the actual dependencies):
--- cut ---
apt policy python3-msgpack
python3-msgpack:
  Installed: 0.6.2-1~cloud0
  Candidate: 0.6.2-1~cloud0
  Version table:
 *** 0.6.2-1~cloud0 500
        500 http://ubuntu-cloud.archive.canonical.com/ubuntu bionic-updates/ussuri/main amd64 Packages
     0.5.6-1 500
        500 http://de.archive.ubuntu.com/ubuntu bionic/main amd64 Packages
        100 /var/lib/dpkg/status
--- cut ---
vs.
--- cut ---
apt policy python3-msgpack
python3-msgpack:
  Installed: 0.5.6-1
  Candidate: 0.6.2-1~cloud0
  Version table:
     0.6.2-1~cloud0 500
        500 http://ubuntu-cloud.archive.canonical.com/ubuntu bionic-updates/ussuri/main amd64 Packages
 *** 0.5.6-1 500
        500 http://de.archive.ubuntu.com/ubuntu bionic/main amd64 Packages
        100 /var/lib/dpkg/status
--- cut ---
et voilà: the Neutron-Linuxbridge-Agent worked just like before (building one port every few seconds) and all network ports eventually converged to ACTIVE. I could not yet spot which commit of the msgpack changes (https://github.com/msgpack/msgpack-python/compare/0.5.6...v0.6.2) might have caused this issue, but I am really certain that this is a major issue for Ussuri on Ubuntu Bionic. There are "similar" issues with
* https://bugs.launchpad.net/oslo.privsep/+bug/1844822
* https://bugs.launchpad.net/oslo.privsep/+bug/1896734
both related to msgpack or the size of messages exchanged. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1937261/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1937261] Re: python3-msgpack package broken due to outdated cython
** Changed in: cloud-archive Status: New => Confirmed ** Also affects: cloud-archive/ussuri Importance: Undecided Status: New ** Changed in: cloud-archive Status: Confirmed => Invalid ** Changed in: cloud-archive/ussuri Status: New => Triaged ** Changed in: cloud-archive/ussuri Importance: Undecided => Medium ** No longer affects: python-msgpack (Ubuntu) ** No longer affects: python-oslo.privsep (Ubuntu) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1937261 Title: python3-msgpack package broken due to outdated cython Status in Ubuntu Cloud Archive: Invalid Status in Ubuntu Cloud Archive ussuri series: Triaged Status in neutron: New Status in oslo.privsep: New To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1937261/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1852221] Re: ovs-vswitchd needs to be forced to reconfigure after adding protocols to bridges
2.15.0 contains the fix for this issue - marking Fix Released. ** Changed in: openvswitch (Ubuntu) Status: Triaged => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1852221 Title: ovs-vswitchd needs to be forced to reconfigure after adding protocols to bridges Status in OpenStack neutron-openvswitch charm: Invalid Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive train series: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Released Status in kolla-ansible: New Status in neutron: New Status in openvswitch: New Status in neutron package in Ubuntu: Fix Released Status in openvswitch package in Ubuntu: Fix Released Status in neutron source package in Eoan: Fix Released Status in neutron source package in Focal: Fix Released Bug description: [Impact] When the neutron native ovs driver creates bridges, it will sometimes apply/modify the supported OpenFlow protocols on that bridge. The Open vSwitch versions shipped with Train and Ussuri don't support this, which results in OpenFlow protocol mismatches when neutron performs operations on that bridge. The patch we are backporting here ensures that all protocol versions are set on the bridge at the point of creation/init. [Test Case]
* deploy Openstack Train
* go to a compute host and do: sudo ovs-ofctl -O OpenFlow14 dump-flows br-int
* ensure you do not see "negotiation failed" errors
[Regression Potential]
* this patch ensures that newly created Neutron ovs bridges have OpenFlow 1.0, 1.3 and 1.4 set on them. Neutron already supports these, so no change in behaviour is expected. The patch will not impact bridges that already exist (so it will not fix them either if they are affected). -- As part of programming Open vSwitch, Neutron will add to the set of protocols a bridge supports [0].
However, the Open vSwitch `ovs-vswitchd` process does not appear to always update its perspective of which protocol versions it should support for bridges: # ovs-ofctl -O OpenFlow14 dump-flows br-int 2019-11-12T12:52:56Z|1|vconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: version negotiation failed (we support version 0x05, peer supports version 0x01) ovs-ofctl: br-int: failed to connect to socket (Broken pipe) # systemctl restart ovsdb-server # ovs-ofctl -O OpenFlow14 dump-flows br-int cookie=0x84ead4b79da3289a, duration=1.576s, table=0, n_packets=0, n_bytes=0, priority=65535,vlan_tci=0x0fff/0x1fff actions=drop cookie=0x84ead4b79da3289a, duration=1.352s, table=0, n_packets=0, n_bytes=0, priority=5,in_port="int-br-ex",dl_dst=fa:16:3f:69:2e:c6 actions=goto_table:4 ... (Success) The restart of the `ovsdb-server` process above will make `ovs- vswitchd` reassess its configuration. 0: https://github.com/openstack/neutron/blob/0fa7e74ebb386b178d36ae684ff04f03bdd6cb0d/neutron/agent/common/ovs_lib.py#L281 To manage notifications about this bug go to: https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1852221/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1749425] Re: Neutron integrated with OpenVSwitch drops packets and fails to plug/unplug interfaces from OVS on router interfaces at scale
Marking OVS task as invalid as it appears this is a neutron bug related to configuration of VRRP for HA routers. ** Changed in: openvswitch (Ubuntu) Status: Incomplete => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1749425 Title: Neutron integrated with OpenVSwitch drops packets and fails to plug/unplug interfaces from OVS on router interfaces at scale Status in neutron: New Status in openvswitch package in Ubuntu: Invalid Bug description: Description: Ubuntu 16.04.3 LTS Release: 16.04 Linux 4.4.0-96-generic on AMD64 Neutron 2:10.0.4-0ubuntu2~cloud0 from Cloud Archive xenial-updates/ocata OpenVSwitch 2.6.1-0ubuntu5.2~cloud0 from Cloud Archive xenial-updates/ocata In an environment with three bare-metal Neutron deployments, hosting upward of 300 routers, with approximately the same number of instances, typically one router per instance, packet loss on instances accessed via floating IPs, including complete connectivity loss, is experienced. The problem is exacerbated by enabling L3HA, likely due to the increase in router namespaces to be scheduled and managed, and the additional scheduling work of bringing up keepalived and monitoring the keepalived VIP. Reducing the number of routers and rescheduling routers on new hosts, causing the routers to undergo a full recreation of namespaces, iptables rules, and replugging of interfaces into OVS, will correct packet loss or connectivity loss on impacted routers. On Neutron hosts in this environment, we have used systemtap to trace calls to kfree_skb, which reveals that the majority of dropped packets occur in the openvswitch module, notably on the br-int bridge. Inspecting the state of OVS shows many qtap interfaces which are no longer present on the Neutron host but are still plugged in to OVS. Diagnostic outputs in following comments.
To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1749425/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1924776] [NEW] [ovn] use of address scopes does not automatically disable router snat
Public bug reported: OpenStack Ussuri OVN 20.03.x Ubuntu 20.04 When multiple networks/subnets that all form part of the same subnet pool and associated address scope are attached to a router, SNAT is not automatically disabled to support routing between the subnets attached to the router. Ensuring the router is created with SNAT disabled resolves this issue, but that's an extra non-obvious step for a cloud admin/end user. ** Affects: neutron Importance: Undecided Status: New ** Affects: neutron (Ubuntu) Importance: Undecided Status: New ** Also affects: neutron (Ubuntu) Importance: Undecided Status: New ** Summary changed: - [ovn] use of address scopes does not automatically disable snat + [ovn] use of address scopes does not automatically disable router snat -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1924776 Title: [ovn] use of address scopes does not automatically disable router snat Status in neutron: New Status in neutron package in Ubuntu: New Bug description: OpenStack Ussuri OVN 20.03.x Ubuntu 20.04 When multiple networks/subnets that all form part of the same subnet pool and associated address scope are attached to a router, SNAT is not automatically disabled to support routing between the subnets attached to the router. Ensuring the router is created with SNAT disabled resolves this issue, but that's an extra non-obvious step for a cloud admin/end user. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1924776/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1922089] Re: [ovn] enable_snat cannot be disabled once enabled
** Also affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1922089 Title: [ovn] enable_snat cannot be disabled once enabled Status in neutron: New Status in neutron package in Ubuntu: New Bug description: Hi, Using Openstack focal/ussuri - ovn version 20.03.1-0ubuntu1.2 and neutron 2:16.2.0-0ubuntu2. If "enable_snat" is enabled on an external gateway on a router, it's not possible to disable it without completely removing said gateway from the router. For example : I have a subnet called subnet_axino_test - 10.0.100.0/24 I run the following : $ openstack router create router_axino_test $ openstack router set --disable-snat --external-gateway net_stg-external router_axino_test $ openstack router add subnet router_axino_test subnet_axino_test And so on OVN, I get nothing : $ sudo ovn-nbctl list NAT |grep -B5 -A4 10.131.100.0/24 Now, I enable SNAT : $ openstack router set --enable-snat --external-gateway net_stg-external router_axino_test This correctly adds an OVN SNAT entry as follows : $ sudo ovn-nbctl list NAT |grep -B5 -A4 10.131.100.0/24 _uuid : a65cc4b8-14ae-4ce4-b274-10eefdcc51dc external_ids: {} external_ip : "A.B.C.D" external_mac: [] logical_ip : "10.131.100.0/24" logical_port: [] options : {} type: snat Now, I remove SNAT from the router : $ openstack router set --disable-snat --external-gateway net_stg-external router_axino_test I confirm this : $ openstack router show router_axino_test | grep enable_snat | external_gateway_info | {"network_id": "4fb8304e-7adb-4cc3-bae5-deb968263eb0", "external_fixed_ips": [{"subnet_id": "6d47-1e44-41af-8f64-dd802d5c3ddc", "ip_address": "A.B.C.D"}], "enable_snat": false} | Above, you can see that "enable_snat" is "false". So I would expect OVN to _not_ have a NAT entry. 
Yet, it does : $ sudo ovn-nbctl list NAT |grep -B5 -A4 10.131.100.0/24 _uuid : a65cc4b8-14ae-4ce4-b274-10eefdcc51dc external_ids: {} external_ip : "162.213.34.141" external_mac: [] logical_ip : "10.131.100.0/24" logical_port: [] options : {} type: snat The only way to remove SNAT is to completely remove the external gateway from the router, and to re-add it with SNAT disabled : $ openstack router unset --external-gateway router_axino_test $ openstack router set --disable-snat --external-gateway net_stg-external router_axino_test Note that this requires removing all the floating IPs from VMs behind this router, which obviously makes them unreachable - which is less than ideal in production. Thanks To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1922089/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
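One way to avoid this class of bug is to treat gateway updates as reconciliation: compute the set of OVN NAT rows the router *should* have from its current enable_snat flag, then diff against what the NB DB actually holds. A sketch over plain Python sets (hypothetical function, not neutron's real OVN client code):

```python
def reconcile_snat_rules(existing_snat_cidrs, attached_cidrs, enable_snat):
    """Return (cidrs_to_add, cidrs_to_remove) so that the NB DB's
    'snat'-type NAT rows match the router's enable_snat setting.

    Disabling SNAT yields a removal for every stale row, which is
    exactly the step missing in the behaviour reported above.
    """
    desired = set(attached_cidrs) if enable_snat else set()
    existing = set(existing_snat_cidrs)
    return sorted(desired - existing), sorted(existing - desired)
```

Reconciling against desired state is idempotent, so re-running the update after --disable-snat would clean up the leftover NAT entry instead of leaving it behind.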
[Yahoo-eng-team] [Bug 1924765] [NEW] [ovn] fip assignment to instance via router with snat disabled is broken
Public bug reported:

Ubuntu: 20.04 OpenStack: Ussuri Networking: OVN (20.03.x)

Network topology: Geneve overlay network for project networks, router has snat disabled and the project network and the external network are all in the same address scope and subnet pool. OVN routers are simply acting as L3 routers and instances on the project network can be directly accessed by the address assigned to their port (with appropriate route configuration outside of the OpenStack world).

Issue: It's possible to create and then associate a floating IP on the external network with an instance attached to the project network - however this does not work - access to the instance via the FIP is broken, as is access to its fixed IP (which worked OK before).

Thoughts: The concept of a FIP is very much NAT centric, and in the described configuration NAT is very much disabled. This idea seems to have worked way back in Icehouse, however it does not work at Ussuri. If this is not a supported network model, the association of the FIP with the instance should error with an appropriate message that NAT is not supported on the in-path router to the external network.

** Affects: neutron Importance: Undecided Status: New
** Affects: neutron (Ubuntu) Importance: Undecided Status: New
** Summary changed: - [ovn] fip assignment to router with snat disabled broken + [ovn] fip assignment to instance via router with snat disabled is broken
** Description changed: + Ubuntu: 20.04 + OpenStack: Ussuri + Networking: OVN (20.03.x) + Network topology: Geneve overlay network for project networks, router has snat disabled and the project network and the external network are all in the same address scope and subnet pool. OVN routers are simply acting as L3 routers and instances on the project network can be directly accessed by the address assigned to their port (with appropriate route configuration outside of the OpenStack world).
Issue: It's possible to create and then associate a floating IP on the external network with an instance attached to the project network - however this does not work - access to the instance via the FIP is broken, as is access to its fixed IP (which worked OK before). Thoughts: The concept of a FIP is very much NAT centric, and in the described configuration NAT is very much disabled. This idea seems to have worked way back in Icehouse, however it does not work at Ussuri. If this is not a supported network model, the association of the FIP with the instance should error with an appropriate message that NAT is not supported on the in-path router to the external network.

** Also affects: neutron Importance: Undecided Status: New

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1924765

Title: [ovn] fip assignment to instance via router with snat disabled is broken

Status in neutron: New Status in neutron package in Ubuntu: New

Bug description: Ubuntu: 20.04 OpenStack: Ussuri Networking: OVN (20.03.x) Network topology: Geneve overlay network for project networks, router has snat disabled and the project network and the external network are all in the same address scope and subnet pool. OVN routers are simply acting as L3 routers and instances on the project network can be directly accessed by the address assigned to their port (with appropriate route configuration outside of the OpenStack world). Issue: It's possible to create and then associate a floating IP on the external network with an instance attached to the project network - however this does not work - access to the instance via the FIP is broken, as is access to its fixed IP (which worked OK before). Thoughts: The concept of a FIP is very much NAT centric, and in the described configuration NAT is very much disabled. This idea seems to have worked way back in Icehouse, however it does not work at Ussuri.

If this is not a supported network model, the association of the FIP with the instance should error with an appropriate message that NAT is not supported on the in-path router to the external network. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1924765/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
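The reporter's suggestion (fail the association rather than silently break connectivity) could be an API-side precondition check along these lines. This is a sketch with hypothetical names over a plain dict, not neutron's validation code:

```python
def check_fip_association(router):
    """Return None when a floating IP association can work, else an
    error message suitable for a conflict response.

    `router` is a plain dict mirroring the API's router resource,
    used here purely for illustration.
    """
    gw = router.get("external_gateway_info")
    if not gw:
        return "Router has no external gateway"
    if not gw.get("enable_snat", True):
        return ("NAT is disabled on the in-path router; a floating IP "
                "associated through it would not be functional")
    return None
```

Rejecting the association up front would turn the silent breakage described above into an actionable error message.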
[Yahoo-eng-team] [Bug 1907686] Re: ovn: instance unable to retrieve metadata
2.13.3 uploaded for focal and groovy to: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/3690 for testing.

** No longer affects: openvswitch (Ubuntu Bionic) ** Also affects: cloud-archive Importance: Undecided Status: New ** Also affects: cloud-archive/ussuri Importance: Undecided Status: New ** Also affects: cloud-archive/victoria Importance: Undecided Status: New ** Also affects: cloud-archive/wallaby Importance: Undecided Status: New ** Changed in: cloud-archive/wallaby Status: New => Fix Released

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1907686

Title: ovn: instance unable to retrieve metadata

Status in charm-ovn-chassis: Invalid Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive ussuri series: New Status in Ubuntu Cloud Archive victoria series: New Status in Ubuntu Cloud Archive wallaby series: Fix Released Status in neutron: Invalid Status in openvswitch package in Ubuntu: Fix Released Status in openvswitch source package in Focal: Triaged Status in openvswitch source package in Groovy: Triaged Status in openvswitch source package in Hirsute: Fix Released

Bug description: Ubuntu: focal OpenStack: ussuri Instance port: hardware offloaded instance created, attempts to access metadata - metadata agent can't resolve the port/network combination:

2020-12-10 15:00:18.258 4732 INFO neutron.agent.ovn.metadata.agent [-] Port d65418a6-d0e9-47e6-84ba-3d02fe75131a in datapath 37706e4d-ce2a-4d81-8c61-3fd12437a0a7 bound to our chassis
2020-12-10 15:00:31.672 8062 ERROR neutron.agent.ovn.metadata.server [-] No port found in network 37706e4d-ce2a-4d81-8c61-3fd12437a0a7 with IP address 10.5.1.155
2020-12-10 15:00:31.673 8062 INFO eventlet.wsgi.server [-] 10.5.1.155, "GET /openstack HTTP/1.1" status: 404 len: 297 time: 0.0043790
2020-12-10 15:00:34.639 8062 ERROR neutron.agent.ovn.metadata.server [-] No port found in network 37706e4d-ce2a-4d81-8c61-3fd12437a0a7 with IP address 10.5.1.155
2020-12-10 15:00:34.639 8062 INFO eventlet.wsgi.server [-] 10.5.1.155, "GET /openstack HTTP/1.1" status: 404 len: 297 time: 0.0040138

To manage notifications about this bug go to: https://bugs.launchpad.net/charm-ovn-chassis/+bug/1907686/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
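The 404s in the log above come from the metadata server failing to map the requester's source IP back to a port in the datapath's network. The lookup is essentially this (a sketch over plain dicts, not the agent's actual implementation):

```python
def find_instance_port(ports, network_id, source_ip):
    """Resolve the (network, source IP) pair taken from a proxied
    metadata request to exactly one port.

    Returning None corresponds to the HTTP 404 ("No port found in
    network ...") seen in the log above; an ambiguous match is
    treated the same way.
    """
    matches = [p for p in ports
               if p["network_id"] == network_id
               and source_ip in p["fixed_ips"]]
    return matches[0] if len(matches) == 1 else None
```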
[Yahoo-eng-team] [Bug 1890432] Re: Create subnet is failing under high load with OVN
Backports https://review.opendev.org/c/openstack/neutron/+/774256 https://review.opendev.org/c/openstack/neutron/+/774135

** No longer affects: charm-neutron-api

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1890432

Title: Create subnet is failing under high load with OVN

Status in neutron: Fix Committed Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Triaged Status in neutron source package in Groovy: Triaged

Bug description: Under a high concurrency level create subnet is starting to fail. (12-14% failure rate) The bundle is OVN / Ussuri. neutronclient.common.exceptions.Conflict: Unable to complete operation on subnet This subnet is being modified by another concurrent operation. Stacktrace: https://pastebin.ubuntu.com/p/sQ5CqD6NyS/

Rally task:

{% set flavor_name = flavor_name or "m1.medium" %}
{% set image_name = image_name or "bionic-kvm" %}
---
NeutronNetworks.create_and_delete_subnets:
  - args:
      network_create_args: {}
      subnet_create_args: {}
      subnet_cidr_start: "1.1.0.0/30"
      subnets_per_network: 2
    runner:
      type: "constant"
      times: 100
      concurrency: 10
    context:
      network: {}
      users:
        tenants: 30
        users_per_tenant: 1
      quotas:
        neutron:
          network: -1
          subnet: -1

Concurrency level set to 1 instead of 10 is not triggering the issue. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1890432/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
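Until the backports land, callers can work around transient "being modified by another concurrent operation" conflicts with client-side retries. A sketch of that pattern, where ConflictError merely stands in for neutronclient.common.exceptions.Conflict:

```python
import random
import time


class ConflictError(Exception):
    """Stand-in for neutronclient.common.exceptions.Conflict."""


def retry_on_conflict(fn, attempts=5, base_delay=0.1):
    """Call fn(), retrying with jittered exponential backoff whenever a
    concurrent-modification conflict is raised; re-raise on the final
    attempt. Illustrative sketch, not a neutron API.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except ConflictError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) * random.random())
```

The jitter spreads out colliding Rally workers so they stop retrying in lockstep; it does not fix the underlying server-side race.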
[Yahoo-eng-team] [Bug 1890432] Re: Create subnet is failing under high load with OVN
Hirsute/Wallaby packages include the fix from: https://review.opendev.org/c/openstack/neutron/+/745330/ So marked "Fix Released" for this target. For Focal/Ussuri and Groovy/Wallaby the fix has been merged into the neutron stable branch for each release, however there are no new point releases from Neutron for these two release targets yet.

** Changed in: neutron Status: In Progress => Fix Committed ** Changed in: neutron (Ubuntu) Status: Triaged => Invalid ** Changed in: neutron (Ubuntu) Status: Invalid => Fix Released

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1890432

Title: Create subnet is failing under high load with OVN

Status in neutron: Fix Committed Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Triaged Status in neutron source package in Groovy: Triaged

Bug description: Under a high concurrency level, create subnet is starting to fail (12-14% failure rate). The bundle is OVN / Ussuri. neutronclient.common.exceptions.Conflict: Unable to complete operation on subnet. This subnet is being modified by another concurrent operation. Stacktrace: https://pastebin.ubuntu.com/p/sQ5CqD6NyS/

Rally task:

{% set flavor_name = flavor_name or "m1.medium" %}
{% set image_name = image_name or "bionic-kvm" %}
---
NeutronNetworks.create_and_delete_subnets:
  - args:
      network_create_args: {}
      subnet_create_args: {}
      subnet_cidr_start: "1.1.0.0/30"
      subnets_per_network: 2
    runner:
      type: "constant"
      times: 100
      concurrency: 10
    context:
      network: {}
      users:
        tenants: 30
        users_per_tenant: 1
      quotas:
        neutron:
          network: -1
          subnet: -1

Concurrency level set to 1 instead of 10 is not triggering the issue.
To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1890432/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1907686] Re: instance unable to retrieve metadata
Note that the fact the port/instance was hardware offloaded is not material here - I just tripped on the same issue with virtio ports.

** Also affects: neutron Importance: Undecided Status: New ** Summary changed: - instance unable to retrieve metadata + ovn: instance unable to retrieve metadata

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1907686

Title: ovn: instance unable to retrieve metadata

Status in charm-ovn-chassis: New Status in neutron: New Status in neutron package in Ubuntu: New

Bug description: Ubuntu: focal OpenStack: ussuri Instance port: hardware offloaded instance created, attempts to access metadata - metadata agent can't resolve the port/network combination:

2020-12-10 15:00:18.258 4732 INFO neutron.agent.ovn.metadata.agent [-] Port d65418a6-d0e9-47e6-84ba-3d02fe75131a in datapath 37706e4d-ce2a-4d81-8c61-3fd12437a0a7 bound to our chassis
2020-12-10 15:00:31.672 8062 ERROR neutron.agent.ovn.metadata.server [-] No port found in network 37706e4d-ce2a-4d81-8c61-3fd12437a0a7 with IP address 10.5.1.155
2020-12-10 15:00:31.673 8062 INFO eventlet.wsgi.server [-] 10.5.1.155, "GET /openstack HTTP/1.1" status: 404 len: 297 time: 0.0043790
2020-12-10 15:00:34.639 8062 ERROR neutron.agent.ovn.metadata.server [-] No port found in network 37706e4d-ce2a-4d81-8c61-3fd12437a0a7 with IP address 10.5.1.155
2020-12-10 15:00:34.639 8062 INFO eventlet.wsgi.server [-] 10.5.1.155, "GET /openstack HTTP/1.1" status: 404 len: 297 time: 0.0040138

To manage notifications about this bug go to: https://bugs.launchpad.net/charm-ovn-chassis/+bug/1907686/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1844616] Re: federated user creation creates duplicates of existing user accounts
** Project changed: charm-keystone => keystone ** Also affects: keystone (Ubuntu) Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1844616 Title: federated user creation creates duplicates of existing user accounts Status in OpenStack Identity (keystone): New Status in keystone package in Ubuntu: New Bug description: Keystone 15.0.0-0ubuntu1~cloud0 DISTRIB_CODENAME=bionic Charm cs:keystone-306 keystone-saml-mellon-3 We had a situation where two user accounts were found with the same name and user ID in both the local_user and federated_user table. This meant that running `openstack user show --domain mydomain username2` would fail with "More than one user exists with the name 'username2'". Listing users would show only one user account, and using the user uuid to 'user show' was working fine. I ended up removing the two rows from local_user to work around this. The bug however, is that federated users with the same name as one that was already located in local_user shouldn't be created like that. 
mysql> select * from local_user;
+----+----------------------------------+----------------------------------+-------------------+-------------------+----------------+
| id | user_id                          | domain_id                        | name              | failed_auth_count | failed_auth_at |
+----+----------------------------------+----------------------------------+-------------------+-------------------+----------------+
|  3 | 1e0099400dd34adeba2ed6751064227a | 87fb238ef6d0430cbda59b08e3a1ea82 | admin             |                 0 | NULL           |
|  6 | 8840d047cca346e6a00e435306f72ffc | a1effaa626284677ade0fbe3e85c59bd | cinderv2_cinderv3 |                 0 | NULL           |
|  9 | d71b70de0cdd4beba2e5f1d3842c93b1 | fa58dfa26889413e85b4855837952b74 | cinderv2_cinderv3 |                 0 | NULL           |
| 12 | d0750dcc890543918fe043eb5782e0ed | a1effaa626284677ade0fbe3e85c59bd | gnocchi           |                 0 | NULL           |
| 15 | c870e8dc427841c08fbba94b824f5765 | fa58dfa26889413e85b4855837952b74 | gnocchi           |                 0 | NULL           |
| 18 | 964d6a7b3d8d4a49ac2ef2accd5350d3 | a1effaa626284677ade0fbe3e85c59bd | neutron           |                 0 | NULL           |
| 21 | e1e77e91a9ed4dde8230d80b752d4f5c | fa58dfa26889413e85b4855837952b74 | neutron           |                 0 | NULL           |
| 24 | d090c19794dd4f27b08deab6713bd4ac | a1effaa626284677ade0fbe3e85c59bd | nova_placement    |                 0 | NULL           |
| 27 | 9fbb011ce1fc495ebf716d5cb56cd007 | fa58dfa26889413e85b4855837952b74 | nova_placement    |                 0 | NULL           |
| 30 | 1bad96de0fcd41a3b30d2c4e4ad9bb05 | a1effaa626284677ade0fbe3e85c59bd | octavia           |                 0 | NULL           |
| 33 | f4da2edc5e8f461b8d71eee67eabe4c2 | fa58dfa26889413e85b4855837952b74 | octavia           |                 0 | NULL           |
| 36 | a4d97a3a5a6644eb92848b9ea40ba71f | a1effaa626284677ade0fbe3e85c59bd | barbican          |                 0 | NULL           |
| 39 | 4d827a03abb24855b6cc37602fe346a5 | fa58dfa26889413e85b4855837952b74 | barbican          |                 0 | NULL           |
| 42 | 63b4389e35e446199b4e6a57a789e89c | a1effaa626284677ade0fbe3e85c59bd | aodh              |                 0 | NULL           |
| 45 | 3222d274dd0347a080b5371a348356b3 | fa58dfa26889413e85b4855837952b74 | aodh              |                 0 | NULL           |
| 48 | 957f4a409dec46c6b44f38a80949f7d1 | a1effaa626284677ade0fbe3e85c59bd | swift             |                 0 | NULL           |
| 51 | 8a89ed1cd1984814b544070295a2854f | fa58dfa26889413e85b4855837952b74 | swift             |                 0 | NULL           |
| 54 | 1ee61ad58f0948eab3c43fdf95790dcd | a1effaa626284677ade0fbe3e85c59bd | designate         |                 0 | NULL           |
| 57 | 32475aeb4dc0469080581f9acc9f7905 | fa58dfa26889413e85b4855837952b74 | designate         |                 0 | NULL           |
| 60 | 79b9411206524f00b0d05d3112a03840 | a1effaa626284677ade0fbe3e85c59bd | glance            |                 0 | NULL           |
| 63 | 35257eb811d84e0091381e74d4fbca21 | fa58dfa26889413e85b4855837952b74 | glance            |                 0 | NULL           |
| 66 | d07d3c3c619c4478b196bb81b8a4ced5 | a1effaa626284677ade0fbe3e85c59bd | heat_heat-cfn     |                 0 | NULL
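The invariant the reporter wants (no federated shadow user whose name collides with an existing local user in the same domain) can be checked with a simple set intersection before creating the shadow entry. This is a sketch over plain dicts, not keystone's SQL models:

```python
def colliding_identities(local_users, federated_users):
    """Return sorted (name, domain_id) pairs present in both tables.

    Any hit here is a user for which `openstack user show` would fail
    with "More than one user exists with the name ...", as described
    in the bug above.
    """
    local = {(u["name"], u["domain_id"]) for u in local_users}
    federated = {(u["name"], u["domain_id"]) for u in federated_users}
    return sorted(local & federated)
```

Running a check like this against dumps of local_user and federated_user is also a quick way to audit an existing deployment for the ambiguity before it bites.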
[Yahoo-eng-team] [Bug 1883929] Re: Upgrade from X/O -> B/Q breaks pci_devices in mysql for SR-IOV
How is the whitelist for PCI devices configured? If all of the PCI device naming changed as part of the OS upgrade (or maybe the firmware upgrade), do you also need to update the whitelist configuration for the charm?

** Also affects: charm-nova-compute Importance: Undecided Status: New

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1883929

Title: Upgrade from X/O -> B/Q breaks pci_devices in mysql for SR-IOV

Status in OpenStack nova-compute charm: New Status in OpenStack Compute (nova): New

Bug description: After upgrade from xenial/ocata to bionic/queens SR-IOV instance creation (--vnic-type direct) fails with missing devices: the pci_devices mysql table is filled with wrong PCI entries that do not exist on the server. Restarting nova-compute and nova-cloud-controller services did not fix (rediscover) the proper PCI devices.

Related errors:

2020-06-17 12:55:19.556 1182599 WARNING nova.pci.utils [req-76b21329-b364-4999-ac86-8c729cb91ac0 - - - - -] No net device was found for VF 0000:d8:05.0: PciDeviceNotFoundById: PCI device 0000:d8:05.0 not found
2020-06-17 12:55:19.603 1182599 WARNING nova.pci.utils [req-76b21329-b364-4999-ac86-8c729cb91ac0 - - - - -] No net device was found for VF 0000:d8:05.1: PciDeviceNotFoundById: PCI device 0000:d8:05.1 not found
2020-06-17 12:55:19.711 1182599 WARNING nova.pci.utils [req-76b21329-b364-4999-ac86-8c729cb91ac0 - - - - -] No net device was found for VF 0000:d8:04.7: PciDeviceNotFoundById: PCI device 0000:d8:04.7 not found

Error on instance creation: {u'message': u'Device 0000:d8:04.4 not found: could not access /sys/bus/pci/devices/0000:d8:04.4/config: No such file or directory', u'code': 500, u'details': u'Traceback (most recent call last):\n File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1863, in _do_build_and_run_instance\n filter_properties, request_spec)\n File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2143, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=six.text_type(e))\nRescheduledException: Build of instance ec163abf-9c7a-460a-9512-4915f47af6b9 was re-scheduled: Device 0000:d8:04.4 not found: could not access /sys/bus/pci/devices/0000:d8:04.4/config: No such file or directory\n', u'created': u'2020-06-17T11:46:11Z'

To manage notifications about this bug go to: https://bugs.launchpad.net/charm-nova-compute/+bug/1883929/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1784342] Re: AttributeError: 'Subnet' object has no attribute '_obj_network_id'
*** This bug is a duplicate of bug 1839658 *** https://bugs.launchpad.net/bugs/1839658

Ah - this behaviour was enforced @ train, see bug 1839658. ** This bug has been marked a duplicate of bug 1839658 "subnet" register in the DB can have network_id=NULL

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1784342

Title: AttributeError: 'Subnet' object has no attribute '_obj_network_id'

Status in neutron: Confirmed Status in neutron package in Ubuntu: Confirmed

Bug description: Running rally caused subnets to be created without a network_id causing this AttributeError. OpenStack Queens RDO packages

[root@controller1 ~]# rpm -qa | grep -i neutron
python-neutron-12.0.2-1.el7.noarch
openstack-neutron-12.0.2-1.el7.noarch
python2-neutron-dynamic-routing-12.0.1-1.el7.noarch
python2-neutron-lib-1.13.0-1.el7.noarch
openstack-neutron-dynamic-routing-common-12.0.1-1.el7.noarch
python2-neutronclient-6.7.0-1.el7.noarch
openstack-neutron-bgp-dragent-12.0.1-1.el7.noarch
openstack-neutron-common-12.0.2-1.el7.noarch
openstack-neutron-ml2-12.0.2-1.el7.noarch

MariaDB [neutron]> select project_id, id, name, network_id, cidr from subnets where network_id is null;
+----------------------------------+--------------------------------------+---------------------------+------------+-------------+
| project_id                       | id                                   | name                      | network_id | cidr        |
+----------------------------------+--------------------------------------+---------------------------+------------+-------------+
| b80468629bc5410ca2c53a7cfbf002b3 | 7a23c72b-3df8-4641-a494-af7642563c8e | s_rally_1e4bebf1_1s3IN6mo | NULL       | 1.9.13.0/24 |
| b80468629bc5410ca2c53a7cfbf002b3 | f7a57946-4814-477a-9649-cc475fb4e7b2 | s_rally_1e4bebf1_qWSFSMs9 | NULL       | 1.5.20.0/24 |
+----------------------------------+--------------------------------------+---------------------------+------------+-------------+

2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation [req-c921b9fb-499b-41c1-9103-93e71a70820c b6b96932bbef41fdbf957c2dc01776aa 050c556faa5944a8953126c867313770 - default default] GET failed.: AttributeError: 'Subnet' object has no attribute '_obj_network_id'
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation Traceback (most recent call last):
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/pecan/core.py", line 678, in __call__ 2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation self.invoke_controller(controller, args, kwargs, state) 2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/pecan/core.py", line 569, in invoke_controller 2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation result = controller(*args, **kwargs) 2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 91, in wrapped 2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation setattr(e, '_RETRY_EXCEEDED', True) 2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation self.force_reraise() 2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation six.reraise(self.type_, self.value, self.tb) 2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 87, in wrapped 2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation return f(*args, **kwargs) 2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 147, in wrapper 2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation ectxt.value = e.inner_exc 2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in 
__exit__ 2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation self.force_reraise() 2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation six.reraise(self.type_, self.value, self.tb)
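The AttributeError in the trace above is the object layer tripping over rows that violate an implicit NOT NULL assumption on subnets.network_id. A defensive accessor makes that failure mode explicit instead; this is a sketch over a plain dict, not neutron's OVO field machinery:

```python
def subnet_network_id(subnet_row):
    """Return the subnet's network_id, raising a clear error for rows
    persisted with network_id=NULL (the condition behind the
    traceback above) instead of a bare AttributeError.
    """
    network_id = subnet_row.get("network_id")
    if network_id is None:
        raise ValueError(
            "subnet %s has network_id=NULL; repair or delete the row"
            % subnet_row.get("id"))
    return network_id
```

An error that names the offending subnet row points the operator straight at the SQL cleanup, which is what the duplicate bug 1839658 ultimately enforced at the database level.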
[Yahoo-eng-team] [Bug 1826419] Re: dhcp agent configured with mismatching domain and host entries
** Changed in: cloud-archive/queens Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1826419 Title: dhcp agent configured with mismatching domain and host entries Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in Ubuntu Cloud Archive train series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Bionic: Fix Released Status in neutron source package in Cosmic: Fix Released Status in neutron source package in Disco: Fix Released Status in neutron source package in Eoan: Fix Released Bug description: Related bug 1774710 and bug 1580588 The neutron-dhcp-agent in OpenStack >= Queens makes use of the dns_domain value set on a network to configure the '--domain' parameter of the dnsmasq instance that supports it; at the same time, neutron makes use of CONF.dns_domain when creating dns_assignments for ports - this results in a hosts file for the dnsmasq instance which uses CONF.dns_domain and a --domain parameter of network.dns_domain which do not match. This results in a search path on instances booted attached to the network which is inconsistent with the internal DNS entries that dnsmasq responds with: root@bionic-045546-2:~# host 192.168.21.222 222.21.168.192.in-addr.arpa domain name pointer bionic-045546-2.jamespage.internal. root@bionic-045546-2:~# host bionic-045546-2 bionic-045546-2.designate.local has address 192.168.21.222 In the above example: CONF.dns_domain = jamespage.internal. network.dns_domain = designate.local. 
Based on previous discussion in bug 1580588 I think that the dns_domain value for a network was intended for use for external DNS integration such as that provided by Designate. The change made under commit: https://opendev.org/openstack/neutron/commit/137a6d61053 appears to break this assumption, producing somewhat inconsistent behaviour in the dnsmasq instance for the network. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1826419/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
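The inconsistency described above boils down to deriving dnsmasq's --domain flag and the generated hosts-file entries from two different sources (network.dns_domain vs CONF.dns_domain). Deriving both from one resolved value keeps them in agreement. A sketch with a hypothetical helper, not the dhcp agent's actual code; the preference order shown (network value first) is an assumption for illustration:

```python
def dnsmasq_dns_config(conf_dns_domain, network_dns_domain, hostnames):
    """Resolve a single DNS domain and build both the dnsmasq --domain
    flag and the hosts-file FQDNs from it, so the search path handed
    to instances and the answers dnsmasq returns cannot diverge.
    """
    # Pick exactly one domain; trailing dots are normalised away.
    domain = (network_dns_domain or conf_dns_domain).rstrip(".")
    flag = "--domain=%s" % domain
    fqdns = ["%s.%s" % (host, domain) for host in hostnames]
    return flag, fqdns
```

With the values from the bug report (CONF.dns_domain = jamespage.internal., network.dns_domain = designate.local.), both the flag and the host entries come out under designate.local, avoiding the mismatched forward/reverse lookups shown above.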
[Yahoo-eng-team] [Bug 1815844] Re: iscsi multipath dm-N device only used on first volume attachment
Marking charm task as invalid as this is a kernel issue with the xenial release kernel. Ubuntu/Linux bug task raised for further progression if updating to the latest HWE kernel on Xenial is not an option. ** Also affects: linux (Ubuntu) Importance: Undecided Status: New ** Changed in: charm-nova-compute Status: Triaged => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1815844 Title: iscsi multipath dm-N device only used on first volume attachment Status in OpenStack nova-compute charm: Invalid Status in OpenStack Compute (nova): Invalid Status in os-brick: Invalid Status in linux package in Ubuntu: New Bug description: With nova-compute from cloud:xenial-queens and use-multipath=true iscsi multipath is configured and the dm-N devices used on the first attachment but subsequent attachments only use a single path. The back-end storage is a Purestorage array. 
The multipath.conf is attached. The issue is easily reproduced as shown below:

jog@pnjostkinfr01:~⟫ openstack volume create pure2 --size 10 --type pure
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| attachments         | []                                   |
| availability_zone   | nova                                 |
| bootable            | false                                |
| consistencygroup_id | None                                 |
| created_at          | 2019-02-13T23:07:40.00               |
| description         | None                                 |
| encrypted           | False                                |
| id                  | e286161b-e8e8-47b0-abe3-4df411993265 |
| migration_status    | None                                 |
| multiattach         | False                                |
| name                | pure2                                |
| properties          |                                      |
| replication_status  | None                                 |
| size                | 10                                   |
| snapshot_id         | None                                 |
| source_volid        | None                                 |
| status              | creating                             |
| type                | pure                                 |
| updated_at          | None                                 |
| user_id             | c1fa4ae9a0b446f2ba64eebf92705d53     |
+---------------------+--------------------------------------+

jog@pnjostkinfr01:~⟫ openstack volume show pure2
+--------------------------------+--------------------------------------+
| Field                          | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2019-02-13T23:07:40.00               |
| description                    | None                                 |
| encrypted                      | False                                |
| id                             | e286161b-e8e8-47b0-abe3-4df411993265 |
| migration_status               | None                                 |
| multiattach                    | False                                |
| name                           | pure2                                |
| os-vol-host-attr:host          | cinder@cinder-pure#cinder-pure       |
| os-vol-mig-status-attr:migstat | None                                 |
| os-vol-mig-status-attr:name_id | None                                 |
| os-vol-tenant-attr:tenant_id   | 9be499fd1eee48dfb4dc6faf3cc0a1d7     |
| properties                     |                                      |
| replication_status             | None                                 |
| size                           | 10                                   |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | available                            |
| type                           | pure                                 |
| updated_at                     | 2019-02-13T23:07:41.00               |
| user_id                        | c1fa4ae9a0b446f2ba64eebf92705d53     |
+--------------------------------+--------------------------------------+

Add the volume to an instance:
[Yahoo-eng-team] [Bug 1734204] Re: Insufficient free host memory pages available to allocate guest RAM with Open vSwitch DPDK in Newton
Picking this back up again - I'll fold the fix for the regression introduced by this change into the same SRU so it will consist of two patches. ** Changed in: nova (Ubuntu Bionic) Status: Won't Fix => Triaged ** Changed in: cloud-archive/queens Status: Won't Fix => Triaged ** Changed in: cloud-archive/queens Assignee: (unassigned) => James Page (james-page) ** Changed in: nova (Ubuntu Bionic) Assignee: (unassigned) => James Page (james-page) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1734204 Title: Insufficient free host memory pages available to allocate guest RAM with Open vSwitch DPDK in Newton Status in Ubuntu Cloud Archive: Invalid Status in Ubuntu Cloud Archive queens series: Triaged Status in OpenStack Compute (nova): Fix Released Status in nova package in Ubuntu: Invalid Status in nova source package in Bionic: Triaged Bug description: When spawning an instance and scheduling it onto a compute node which still has sufficient pCPUs for the instance and also sufficient free huge pages for the instance memory, nova returns: Raw [stack@undercloud-4 ~]$ nova show 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc (...) | fault| {"message": "Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc. Last exception: internal error: process exited while connecting to monitor: 2017-11-23T19:53:20.311446Z qemu-kvm: -chardev pty,id=cha", "code": 500, "details": " File \"/usr/lib/python2.7/site-packages/nova/conductor/manager.py\", line 492, in build_instances | | | filter_properties, instances[0].uuid) | | | File \"/usr/lib/python2.7/site-packages/nova/scheduler/utils.py\", line 184, in populate_retry | | | raise exception.MaxRetriesExceeded(reason=msg) | | | ", "created": "2017-11-23T19:53:22Z"} (...) 
And /var/log/nova/nova-compute.log on the compute node gives the following ERROR message: Raw 2017-11-23 19:53:21.021 153615 ERROR nova.compute.manager [req-2ad59cdf-4901-4df1-8bd7-ebaea20b9361 5d1785ee87294a6fad5e2b91cc20 8c307c08d2234b339c504bfdd896c13e - - -] [instance: 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc] Instance failed to spawn 2017-11-23 19:53:21.021 153615 ERROR nova.compute.manager [instance: 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc] Traceback (most recent call last): 2017-11-23 19:53:21.021 153615 ERROR nova.compute.manager [instance: 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2087, in _build_resources 2017-11-23 19:53:21.021 153615 ERROR nova.compute.manager [instance: 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc] yield resources 2017-11-23 19:53:21.021 153615 ERROR nova.compute.manager [instance: 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1928, in _build_and_run_instance 2017-11-23 19:53:21.021 153615 ERROR nova.compute.manager [instance: 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc] block_device_info=block_device_info) 2017-11-23 19:53:21.021 153615 ERROR nova.compute.manager [instance: 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py"
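The "process exited while connecting to monitor" failure above typically surfaces when qemu cannot back the guest RAM with huge pages. Nova and libvirt account huge pages per NUMA node, so a host-wide counter can look healthy while the node the instance is pinned to is exhausted. A minimal sketch of the host-wide arithmetic only (field names follow /proc/meminfo; the sample values are invented):

```python
def free_hugepage_mem_kb(meminfo):
    """Host-wide free hugepage memory in kB from /proc/meminfo-style text.

    Simplified sketch: nova/libvirt really account huge pages per NUMA
    node (via /sys/devices/system/node/node*/hugepages/), which is why
    a host-wide figure like this can look sufficient while the single
    node the guest is pinned to has none left.
    """
    fields = {}
    for line in meminfo.splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.split()
    return int(fields["HugePages_Free"][0]) * int(fields["Hugepagesize"][0])

sample = """\
HugePages_Total:    4096
HugePages_Free:     1024
Hugepagesize:       2048 kB
"""
print(free_hugepage_mem_kb(sample))  # 2097152 kB, i.e. 2 GiB still free
```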
[Yahoo-eng-team] [Bug 1859844] Re: Impossible to rename the Default domain id to the string 'default.'
** Changed in: charm-keystone Status: Invalid => New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1859844 Title: Impossible to rename the Default domain id to the string 'default.' Status in OpenStack keystone charm: New Status in OpenStack Identity (keystone): Invalid Status in keystone package in Ubuntu: Invalid Bug description: Openstack version = Rocky When changing the 'default_domain_id' variable to the string 'default' and changing all references for this variable in the keystone database we get the following error in keystone.log: (keystone.common.wsgi): 2020-01-15 14:16:37,869 ERROR badly formed hexadecimal UUID string Traceback (most recent call last): File "/usr/lib/python3/dist-packages/keystone/common/wsgi.py", line 148, in __call__ result = method(req, **params) File "/usr/lib/python3/dist-packages/keystone/auth/controllers.py", line 102, in authenticate_for_token app_cred_id=app_cred_id, parent_audit_id=token_audit_id) File "/usr/lib/python3/dist-packages/keystone/common/manager.py", line 116, in wrapped __ret_val = __f(*args, **kwargs) File "/usr/lib/python3/dist-packages/keystone/token/provider.py", line 251, in issue_token token_id, issued_at = self.driver.generate_id_and_issued_at(token) File "/usr/lib/python3/dist-packages/keystone/token/providers/fernet/core.py", line 61, in generate_id_and_issued_at app_cred_id=token.application_credential_id File "/usr/lib/python3/dist-packages/keystone/token/token_formatters.py", line 159, in create_token protocol_id, access_token_id, app_cred_id File "/usr/lib/python3/dist-packages/keystone/token/token_formatters.py", line 444, in assemble b_domain_id = cls.convert_uuid_hex_to_bytes(domain_id) File "/usr/lib/python3/dist-packages/keystone/token/token_formatters.py", line 290, in convert_uuid_hex_to_bytes uuid_obj = uuid.UUID(uuid_string) File "/usr/lib/python3.6/uuid.py", line 
140, in __init__ raise ValueError('badly formed hexadecimal UUID string') ValueError: badly formed hexadecimal UUID string (keystone.common.wsgi): 2020-01-15 14:16:38,908 WARNING You are not authorized to perform the requested action: identity:get_domain. (keystone.common.wsgi): 2020-01-15 14:16:39,058 WARNING You are not authorized to perform the requested action: identity:get_domain. (keystone.common.wsgi): 2020-01-15 14:16:50,838 WARNING You are not authorized to perform the requested action: identity:list_projects. (keystone.common.wsgi): 2020-01-15 14:16:54,086 WARNING You are not authorized to perform the requested action: identity:list_projects. This change is needed to integrate keystone to ICO (IBM Cloud Orchestrator) To manage notifications about this bug go to: https://bugs.launchpad.net/charm-keystone/+bug/1859844/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
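The traceback makes the failure easy to reproduce in isolation: the fernet token formatter packs the domain id into the token payload as raw UUID bytes, so any domain id that is not 32 hex digits, including the literal string 'default', cannot be converted. A sketch mirroring the shape of the helper named in the traceback (not keystone's actual code):

```python
import uuid

def convert_uuid_hex_to_bytes(uuid_string):
    # Mirrors the shape of the keystone helper named in the traceback
    # above (a sketch, not keystone's actual implementation).
    return uuid.UUID(uuid_string).bytes

# A normal domain id is 32 hex digits and packs into 16 bytes:
assert len(convert_uuid_hex_to_bytes("9be499fd1eee48dfb4dc6faf3cc0a1d7")) == 16

# The literal string 'default' is not valid hex, so the fernet token
# formatter fails exactly as in the log above:
try:
    convert_uuid_hex_to_bytes("default")
except ValueError as exc:
    print(exc)  # badly formed hexadecimal UUID string
```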
[Yahoo-eng-team] [Bug 1862343] Re: Changing the language in GUI has almost no effect
Packages have PO files but not MO files - the package build does the compilation but the install step completely misses them. ** Changed in: horizon (Ubuntu) Status: New => Triaged ** Changed in: horizon (Ubuntu) Importance: Undecided => High ** Also affects: horizon (Ubuntu Eoan) Importance: Undecided Status: New ** Also affects: horizon (Ubuntu Focal) Importance: High Status: Triaged ** Also affects: cloud-archive Importance: Undecided Status: New ** Also affects: cloud-archive/rocky Importance: Undecided Status: New ** Also affects: cloud-archive/ussuri Importance: Undecided Status: New ** Also affects: cloud-archive/train Importance: Undecided Status: New ** Also affects: cloud-archive/stein Importance: Undecided Status: New ** Changed in: horizon (Ubuntu Eoan) Importance: Undecided => High ** Summary changed: - Changing the language in GUI has almost no effect + compiled messages not shipped in packaging resulting in missing translations -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1862343 Title: compiled messages not shipped in packaging resulting in missing translations Status in OpenStack openstack-dashboard charm: Invalid Status in Ubuntu Cloud Archive: New Status in Ubuntu Cloud Archive rocky series: New Status in Ubuntu Cloud Archive stein series: New Status in Ubuntu Cloud Archive train series: New Status in Ubuntu Cloud Archive ussuri series: New Status in OpenStack Dashboard (Horizon): Invalid Status in horizon package in Ubuntu: Triaged Status in horizon source package in Eoan: New Status in horizon source package in Focal: Triaged Bug description: I changed the language in GUI to French but interface stays mostly English. 
Just a few strings are displayed in French, e.g.: - "Password" ("Mot de passe") on the login screen, - units "GB", "TB" as "Gio" and "Tio" in Compute Overview, - "New password" ("Noveau mot de passe") in User Settings. All other strings are in English. See screenshots attached. This is the Stein on Ubuntu Bionic deployment. To manage notifications about this bug go to: https://bugs.launchpad.net/charm-openstack-dashboard/+bug/1862343/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
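Given the diagnosis above (PO sources shipped, compiled MO catalogs missing from the install step), a small diagnostic can confirm which catalogs were never compiled. This is a hypothetical helper, not part of horizon or the packaging:

```python
from pathlib import Path

def missing_mo(locale_root):
    """List .po catalogs that have no compiled .mo sibling.

    Diagnostic sketch for the packaging gap described above: Django's
    gettext machinery loads only compiled .mo files, so a shipped .po
    without its .mo means the language silently falls back to English.
    """
    root = Path(locale_root)
    return sorted(po for po in root.rglob("*.po")
                  if not po.with_suffix(".mo").exists())

# Example location only (the installed path may differ per package):
# missing_mo("/usr/share/openstack-dashboard/openstack_dashboard/locale")
```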
[Yahoo-eng-team] [Bug 1862343] Re: Changing the language in GUI has almost no effect
it looks like the translation compilation never happens - if you drop into /usr/share/openstack-dashboard and run: sudo python3 manage.py compilemessages and then restart apache the translations appear to be OK ** Also affects: horizon (Ubuntu) Importance: Undecided Status: New ** Changed in: charm-openstack-dashboard Status: New => Invalid ** Changed in: horizon Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1862343 Title: Changing the language in GUI has almost no effect Status in OpenStack openstack-dashboard charm: Invalid Status in OpenStack Dashboard (Horizon): Invalid Status in horizon package in Ubuntu: New Bug description: I changed the language in GUI to French but interface stays mostly English. Just a few strings are displayed in French, e.g.: - "Password" ("Mot de passe") on the login screen, - units "GB", "TB" as "Gio" and "Tio" in Compute Overview, - "New password" ("Noveau mot de passe") in User Settings. All other strings are in English. See screenshots attached. This is the Stein on Ubuntu Bionic deployment. To manage notifications about this bug go to: https://bugs.launchpad.net/charm-openstack-dashboard/+bug/1862343/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1831986] Re: fwaas_v2 - unable to associate port with firewall (PXC strict mode)
** Changed in: cloud-archive/stein Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1831986 Title: fwaas_v2 - unable to associate port with firewall (PXC strict mode) Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive rocky series: Won't Fix Status in Ubuntu Cloud Archive stein series: Fix Released Status in Ubuntu Cloud Archive train series: Fix Released Status in neutron: In Progress Status in neutron-fwaas package in Ubuntu: Fix Released Status in neutron-fwaas source package in Disco: Won't Fix Status in neutron-fwaas source package in Eoan: Fix Released Bug description: [Impact] Unable to associate ports with a firewall under FWaaS v2 [Test Case] Deploy OpenStack (stein or Later) using Charms Create firewall policy, apply to router - failure as unable to associate port with policy in underlying DB [Regression Potential] Medium; the proposed fix has not been accepted upstream as yet (discussion ongoing due to change of database migrations). [Original Bug Report] Impacts both Stein and Rocky (although rocky does not enable v2 just yet). 
542 a9761fa9124740028d0c1d70ff7aa542] DBAPIError exception wrapped from (pymysql.err.InternalError) (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER') [SQL: 'DELETE FROM firewall_group_port_associations_v2 WHERE firewall_group_port_associations_v2.firewall_group_id = %(firewall_group_id_1)s'] [parameters: {'firewall_group_id_1': '85a277d0-ebaf-4a5d-9d45-6a74b8f54372'}] (Background on this error at: http://sqlalche.me/e/2j85): pymysql.err.InternalError: (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER') 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last): 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters context) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/sqlalchemy/engine/default.py", line 509, in do_execute 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters cursor.execute(statement, parameters) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 165, in execute 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters result = self._query(query) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 321, in _query 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters conn.query(q) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 
860, in query 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters self._affected_rows = self._read_query_result(unbuffered=unbuffered) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1061, in _read_query_result 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters result.read() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1349, in read 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters first_packet = self.connection._read_packet() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1018, in _read_packet 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters packet.check_error() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 384, in check_error 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters err.raise_mysql_exception(self._data) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/err.py", line 107, in raise_mysql_exception 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters raise errorclass(errno, errval) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters pymysql.err.InternalError: (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a
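The underlying complaint is that firewall_group_port_associations_v2 has no explicit primary key, which PXC strict mode requires for DML. The fix direction is a composite primary key over the two association columns, sketched below against sqlite for illustration (this DDL is not the actual neutron-fwaas alembic migration):

```python
import sqlite3

# Give the association table an explicit composite primary key so that
# row-based replication (and PXC strict mode) can identify each row.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE firewall_group_port_associations_v2 (
        firewall_group_id VARCHAR(36) NOT NULL,
        port_id           VARCHAR(36) NOT NULL,
        PRIMARY KEY (firewall_group_id, port_id)
    )
""")
conn.execute(
    "INSERT INTO firewall_group_port_associations_v2 VALUES (?, ?)",
    ("85a277d0-ebaf-4a5d-9d45-6a74b8f54372", "port-1"))
# The DELETE that PXC rejected is unambiguous once a PK exists:
conn.execute(
    "DELETE FROM firewall_group_port_associations_v2"
    " WHERE firewall_group_id = ?",
    ("85a277d0-ebaf-4a5d-9d45-6a74b8f54372",))
print(conn.execute(
    "SELECT COUNT(*) FROM firewall_group_port_associations_v2"
).fetchone()[0])  # 0
```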
[Yahoo-eng-team] [Bug 1859844] Re: Impossible to rename the Default domain id to the string 'default.'
FTR charm has written the UUID to the configuration file for the last 3 years: https://opendev.org/openstack/charm-keystone/commit/ccf15398 ** Also affects: keystone (Ubuntu) Importance: Undecided Status: New ** Changed in: keystone Status: Incomplete => Invalid ** Also affects: charm-keystone Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1859844 Title: Impossible to rename the Default domain id to the string 'default.' Status in OpenStack keystone charm: New Status in OpenStack Identity (keystone): Invalid Status in keystone package in Ubuntu: New Bug description: Openstack version = Rocky When changing the 'default_domain_id' variable to the string 'default' and changing all references for this variable in the keystone database we get the following error in keystone.log: (keystone.common.wsgi): 2020-01-15 14:16:37,869 ERROR badly formed hexadecimal UUID string Traceback (most recent call last): File "/usr/lib/python3/dist-packages/keystone/common/wsgi.py", line 148, in __call__ result = method(req, **params) File "/usr/lib/python3/dist-packages/keystone/auth/controllers.py", line 102, in authenticate_for_token app_cred_id=app_cred_id, parent_audit_id=token_audit_id) File "/usr/lib/python3/dist-packages/keystone/common/manager.py", line 116, in wrapped __ret_val = __f(*args, **kwargs) File "/usr/lib/python3/dist-packages/keystone/token/provider.py", line 251, in issue_token token_id, issued_at = self.driver.generate_id_and_issued_at(token) File "/usr/lib/python3/dist-packages/keystone/token/providers/fernet/core.py", line 61, in generate_id_and_issued_at app_cred_id=token.application_credential_id File "/usr/lib/python3/dist-packages/keystone/token/token_formatters.py", line 159, in create_token protocol_id, access_token_id, app_cred_id File 
"/usr/lib/python3/dist-packages/keystone/token/token_formatters.py", line 444, in assemble b_domain_id = cls.convert_uuid_hex_to_bytes(domain_id) File "/usr/lib/python3/dist-packages/keystone/token/token_formatters.py", line 290, in convert_uuid_hex_to_bytes uuid_obj = uuid.UUID(uuid_string) File "/usr/lib/python3.6/uuid.py", line 140, in __init__ raise ValueError('badly formed hexadecimal UUID string') ValueError: badly formed hexadecimal UUID string (keystone.common.wsgi): 2020-01-15 14:16:38,908 WARNING You are not authorized to perform the requested action: identity:get_domain. (keystone.common.wsgi): 2020-01-15 14:16:39,058 WARNING You are not authorized to perform the requested action: identity:get_domain. (keystone.common.wsgi): 2020-01-15 14:16:50,838 WARNING You are not authorized to perform the requested action: identity:list_projects. (keystone.common.wsgi): 2020-01-15 14:16:54,086 WARNING You are not authorized to perform the requested action: identity:list_projects. This change is needed to integrate keystone to ICO (IBM Cloud Orchestrator) To manage notifications about this bug go to: https://bugs.launchpad.net/charm-keystone/+bug/1859844/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1831986] Re: fwaas_v2 - unable to associate port with firewall (PXC strict mode)
** Description changed: - Impacts both Stein and Rocky (although rocky does not enable v2 just - yet). + [Impact] + Unable to associate ports with a firewall under FWaaS v2 + + [Test Case] + Deploy OpenStack (stein or Later) using Charms + Create firewall policy, apply to router - failure as unable to associate port with policy in underlying DB + + [Regression Potential] + Medium; the proposed fix has not been accepted upstream as yet (discussion ongoing due to change of database migrations). + + [Original Bug Report] + Impacts both Stein and Rocky (although rocky does not enable v2 just yet). 542 a9761fa9124740028d0c1d70ff7aa542] DBAPIError exception wrapped from (pymysql.err.InternalError) (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER') [SQL: 'DELETE FROM firewall_group_port_associations_v2 WHERE firewall_group_port_associations_v2.firewall_group_id = %(firewall_group_id_1)s'] [parameters: {'firewall_group_id_1': '85a277d0-ebaf-4a5d-9d45-6a74b8f54372'}] (Background on this error at: http://sqlalche.me/e/2j85): pymysql.err.InternalError: (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER') 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last): 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters context) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/sqlalchemy/engine/default.py", line 509, in do_execute 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters cursor.execute(statement, parameters) 
2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 165, in execute 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters result = self._query(query) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 321, in _query 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters conn.query(q) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 860, in query 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters self._affected_rows = self._read_query_result(unbuffered=unbuffered) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1061, in _read_query_result 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters result.read() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1349, in read 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters first_packet = self.connection._read_packet() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1018, in _read_packet 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters packet.check_error() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 384, in check_error 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters err.raise_mysql_exception(self._data) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/err.py", line 107, in raise_mysql_exception 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters raise 
errorclass(errno, errval) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters pymysql.err.InternalError: (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER') 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters ProblemType: Bug DistroRelease: Ubuntu 18.04 Package: neutron-server 2:14.0.0-0ubuntu1.1~cloud0 [origin: Canonical] ProcVersionSignature: Ubuntu 4.15.0-51.55-generic 4.15.18 Uname: Linux 4.15.0-51-generic x86_64 ApportVersion: 2.20.9-0ubuntu7.6 Architecture: amd64 CrashDB: { "impl": "launchpad", "project": "cloud-archive", "bug_pattern_url":
[Yahoo-eng-team] [Bug 1831986] Re: fwaas_v2 - unable to associate port with firewall (PXC strict mode)
This bug was fixed in the package neutron-fwaas - 1:15.0.0~rc1-0ubuntu3~cloud0 --- neutron-fwaas (1:15.0.0~rc1-0ubuntu3~cloud0) bionic-train; urgency=medium . * New update for the Ubuntu Cloud Archive. . neutron-fwaas (1:15.0.0~rc1-0ubuntu3) eoan; urgency=medium . * d/p/add-missing-pk-firewall-group-associations-v2.patch: Cherry pick fix to resolve issue with missing primary key on firewall_group_associations_v2 table (LP: #1831986). ** Changed in: cloud-archive Status: Triaged => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1831986 Title: fwaas_v2 - unable to associate port with firewall (PXC strict mode) Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive rocky series: Triaged Status in Ubuntu Cloud Archive stein series: Triaged Status in Ubuntu Cloud Archive train series: Fix Released Status in neutron: In Progress Status in neutron-fwaas package in Ubuntu: Fix Released Status in neutron-fwaas source package in Disco: Triaged Status in neutron-fwaas source package in Eoan: Fix Released Bug description: Impacts both Stein and Rocky (although rocky does not enable v2 just yet). 
542 a9761fa9124740028d0c1d70ff7aa542] DBAPIError exception wrapped from (pymysql.err.InternalError) (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER') [SQL: 'DELETE FROM firewall_group_port_associations_v2 WHERE firewall_group_port_associations_v2.firewall_group_id = %(firewall_group_id_1)s'] [parameters: {'firewall_group_id_1': '85a277d0-ebaf-4a5d-9d45-6a74b8f54372'}] (Background on this error at: http://sqlalche.me/e/2j85): pymysql.err.InternalError: (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER') 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last): 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters context) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/sqlalchemy/engine/default.py", line 509, in do_execute 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters cursor.execute(statement, parameters) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 165, in execute 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters result = self._query(query) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 321, in _query 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters conn.query(q) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 
860, in query 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters self._affected_rows = self._read_query_result(unbuffered=unbuffered) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1061, in _read_query_result 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters result.read() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1349, in read 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters first_packet = self.connection._read_packet() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1018, in _read_packet 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters packet.check_error() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 384, in check_error 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters err.raise_mysql_exception(self._data) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/err.py", line 107, in raise_mysql_exception 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters raise errorclass(errno, errval) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters pymysql.err.InternalError: (1105, 'Percona-XtraDB-Cluster
[Yahoo-eng-team] [Bug 1846606] Re: [eoan] Unknown column 'public' in 'firewall_rules_v2'
I think the check constraint is automatically created by sqlalchemy to enforce the Boolean type definition. ** Changed in: neutron (Ubuntu) Assignee: (unassigned) => James Page (james-page) ** Package changed: neutron (Ubuntu) => neutron-fwaas (Ubuntu) ** Changed in: neutron-fwaas (Ubuntu) Assignee: James Page (james-page) => (unassigned) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1846606 Title: [eoan] Unknown column 'public' in 'firewall_rules_v2' Status in neutron: New Status in neutron-fwaas package in Ubuntu: Confirmed Bug description: I installed a fresh openstack test cluster in eoan today (October 3). Neutron database initialization with the command:

sudo su -s /bin/sh -c "neutron-db-manage --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini upgrade head" neutron

failed with error message:

oslo_db.exception.DBError: (pymysql.err.InternalError) (1054, "Unknown column 'public' in 'firewall_rules_v2'") [SQL: 'ALTER TABLE firewall_rules_v2 CHANGE public shared BOOL NULL'] (Background on this error at: http://sqlalche.me/e/2j85)

In mysql the table and the column exist, with a constraint on the column:

CONSTRAINT `firewall_rules_v2_chk_1` CHECK ((`public` in (0,1))),

Manually updating the column in mysql failed with the same error message:

mysql> ALTER TABLE firewall_rules_v2 CHANGE public shared BOOL NULL;
ERROR 1054 (42S22): Unknown column 'public' in 'check constraint firewall_rules_v2_chk_1 expression'

I guessed the constraint did not like it if the name of the column was changed. I removed the column 'public' and created it again, without the constraint. Then the alter table command worked fine. After doing the same for the 'public' columns in tables firewall_groups_v2 and firewall_policies_v2, Neutron could initialize the database and all was fine (I could create a network and start an instance).
neutron 2:15.0.0~rc1-0ubuntu1 mysql-server 8.0.16-0ubuntu3 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1846606/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
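MySQL 8.0.16 started enforcing CHECK constraints, and the auto-generated firewall_rules_v2_chk_1 still references the old column name, which is why the CHANGE statement fails. Besides the drop-and-recreate workaround described above, MySQL 8 also allows dropping the named CHECK constraint first; the helper below only generates that statement pair and is purely illustrative (the constraint name is whatever the server auto-generated, and the real fix belongs in the neutron-fwaas migration):

```python
def rename_column_workaround(table, old, new, constraint):
    """Build the MySQL 8 statement pair for renaming a column that a
    generated CHECK constraint still references: drop the named CHECK
    first, then run the CHANGE.

    Illustrative strings only; not the neutron-fwaas migration itself.
    """
    return [
        f"ALTER TABLE {table} DROP CHECK {constraint};",
        f"ALTER TABLE {table} CHANGE {old} {new} BOOL NULL;",
    ]

for stmt in rename_column_workaround(
        "firewall_rules_v2", "public", "shared", "firewall_rules_v2_chk_1"):
    print(stmt)
```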
[Yahoo-eng-team] [Bug 1834213] Re: After kernel upgrade, nf_conntrack_ipv4 module unloaded, no IP traffic to instances
Adding a neutron bug-task to get an upstream opinion on whether neutron should be loading these modules as the n-ovs-agent starts up. ** Also affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1834213 Title: After kernel upgrade, nf_conntrack_ipv4 module unloaded, no IP traffic to instances Status in OpenStack neutron-openvswitch charm: Fix Committed Status in neutron: New Status in linux package in Ubuntu: Confirmed Bug description: With an environment running Xenial-Queens, and having just upgraded the linux-image-generic kernel for MDS patching, a few of our hypervisor hosts that were rebooted (3 out of 100) ended up dropping IP (tcp/udp) ingress traffic. It turns out that the nf_conntrack module was loaded, but nf_conntrack_ipv4 was not loading, and the traffic was being dropped by this rule:

table=72, n_packets=214989, priority=50,ct_state=+inv+trk actions=resubmit(,93)

The ct_state "inv" means invalid conntrack state, which the manpage describes as:

The state is invalid, meaning that the connection tracker couldn’t identify the connection. This flag is a catch-all for problems in the connection or the connection tracker, such as:
• L3/L4 protocol handler is not loaded/unavailable. With the Linux kernel datapath, this may mean that the nf_conntrack_ipv4 or nf_conntrack_ipv6 modules are not loaded.
• L3/L4 protocol handler determines that the packet is malformed.
• Packets are unexpected length for protocol.

It appears that patching the OS of a hypervisor that is not running instances may fail to update initrd to load nf_conntrack_ipv4 (and/or _ipv6). I couldn't find anywhere in the charm code that this would be loaded unless the charm's "harden" option is used on the nova-compute charm (see charmhelpers contrib/host templates).
It is unset in our environment, so we are not using any special module probing. Did nf_conntrack_ipv4 get split out from nf_conntrack in recent kernel upgrades or is it possible that the charm should define a modprobe file if we have the OVS firewall driver configured? To manage notifications about this bug go to: https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1834213/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
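As a diagnostic aid for the module question above, here is a small sketch (not from the bug thread) that checks /proc/modules for a given kernel module; it is Linux-specific, and the persistence hint in the comment assumes systemd's /etc/modules-load.d mechanism, as used on Xenial and later.

```python
from pathlib import Path

def module_loaded(name: str) -> bool:
    """Return True if the kernel module `name` appears in /proc/modules."""
    try:
        text = Path("/proc/modules").read_text()
    except OSError:
        return False  # non-Linux or unreadable: treat as not loaded
    return any(line.split(maxsplit=1)[0] == name
               for line in text.splitlines() if line.strip())

# If nf_conntrack_ipv4 is missing, a one-off load is `modprobe
# nf_conntrack_ipv4`; making that persistent across reboots would be a
# one-line file such as /etc/modules-load.d/conntrack.conf (assumption:
# systemd-style module loading, not something the charm currently writes).
```

This is only a check, not a fix; whether neutron, the charm, or the kernel packaging should ensure the module is loaded is exactly the question the bug-task raises.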
[Yahoo-eng-team] [Bug 1834747] Re: Horizon is unable to show instance list if image_id is not set
Cloud Archive being resolved under bug 1837905 ** Changed in: cloud-archive Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1834747 Title: Horizon is unable to show instance list if image_id is not set Status in Ubuntu Cloud Archive: Invalid Status in OpenStack Dashboard (Horizon): Fix Released Bug description: My setup contains several instances made from an empty volume with installation from an ISO image. Thus, those instances do not have any source image. But some instances still have image_metadata to tweak the instance. As an example, these are the metadata from one of my boot volumes: volume_image_metadata | {u'hw_qemu_guest_agent': u'yes', u'hw_vif_multiqueue_enabled': u'true', u'os_require_quiesce': u'yes'} Before Stein, I was able to go to project/instance and list every instance in the project, as expected. Since the Stein Horizon release, this page crashes without much detail. After further investigation, I found that the culprit is this piece of code in /usr/horizon/openstack_dashboard/dashboards/project/instances/views.py from line 184:

    boot_volume = volume_dict[instance_volumes[0]['id']]
    if (hasattr(boot_volume, "volume_image_metadata") and
            boot_volume.volume_image_metadata['image_id'] in image_dict):
        instance.image = image_dict[
            boot_volume.volume_image_metadata['image_id']
        ]

I replaced this code with the following to take care of the case where there are image metadata but no image_id:

    boot_volume = volume_dict[instance_volumes[0]['id']]
    if (hasattr(boot_volume, "volume_image_metadata")):
        if (hasattr(boot_volume.volume_image_metadata, "image_id")):
            if (boot_volume.volume_image_metadata['image_id'] in image_dict):
                instance.image = image_dict[
                    boot_volume.volume_image_metadata['image_id']
                ]

That corrected this specific bug but I might not be the only one impacted by it... 
To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1834747/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1832210] Re: fwaas netfilter_log: incorrect decode of log prefix under python 3
This bug was fixed in the package neutron-fwaas - 1:14.0.0-0ubuntu1.1~cloud0 --- neutron-fwaas (1:14.0.0-0ubuntu1.1~cloud0) bionic-stein; urgency=medium . * New update for the Ubuntu Cloud Archive. . neutron-fwaas (1:14.0.0-0ubuntu1.1) disco; urgency=medium . [ Corey Bryant ] * d/gbp.conf: Create stable/stein branch. . [ James Page ] * d/p/netfilter_log-Correct-decode-binary-types.patch: Cherry pick fix to resolve decoding of netfilter log prefix information under Python 3 (LP: #1832210). ** Changed in: cloud-archive/stein Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1832210 Title: fwaas netfilter_log: incorrect decode of log prefix under python 3 Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Committed Status in Ubuntu Cloud Archive stein series: Fix Released Status in Ubuntu Cloud Archive train series: Fix Released Status in neutron: Fix Released Status in neutron-fwaas package in Ubuntu: Fix Released Status in neutron-fwaas source package in Cosmic: Won't Fix Status in neutron-fwaas source package in Disco: Fix Released Status in neutron-fwaas source package in Eoan: Fix Released Bug description: Under Python 3, the prefix of a firewall log message is not correctly decoded "b'10612530182266949194'": 2019-06-10 09:14:34 Unknown cookie packet_in pkt=ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120) 2019-06-10 09:14:34 
{'prefix': "b'10612530182266949194'", 'msg': "ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120)"} 2019-06-10 09:14:34 {'0bf81ded-bf94-437d-ad49-063bba9be9bb': [, ]} This results in the firewall log driver not being able to map the message to the associated port and log resources in neutron resulting in the 'unknown cookie packet_in' warning message. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1832210/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
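The quoted prefix "b'10612530182266949194'" is the classic Python 3 symptom of stringifying bytes instead of decoding them. A minimal standalone illustration of why the cookie lookup then misses (the prefix value is taken from the log above; the cookie-map shape is an assumption, not neutron-fwaas internals):

```python
raw_prefix = b"10612530182266949194"   # prefix bytes as read from the netfilter log

broken = str(raw_prefix)               # "b'10612530182266949194'"
fixed = raw_prefix.decode("utf-8")     # "10612530182266949194"

# The log driver resolves the prefix (cookie) to port/log resources;
# the stringified bytes never match, hence 'unknown cookie packet_in'.
cookie_map = {"10612530182266949194": ("port-uuid", "log-uuid")}
print(broken in cookie_map)  # False
print(fixed in cookie_map)   # True
```

The cherry-picked patch mentioned in the changelog corrects exactly this: decoding binary types before using them as lookup keys.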
[Yahoo-eng-team] [Bug 1832210] Re: fwaas netfilter_log: incorrect decode of log prefix under python 3
# apt-cache policy python3-neutron-fwaas python3-neutron-fwaas: Installed: 1:14.0.0-0ubuntu1.1~cloud0 Candidate: 1:14.0.0-0ubuntu1.1~cloud0 Version table: *** 1:14.0.0-0ubuntu1.1~cloud0 500 500 http://ubuntu-cloud.archive.canonical.com/ubuntu bionic-proposed/stein/main amd64 Packages 100 /var/lib/dpkg/status Sample log messages: Aug 5 09:17:33 juju-ccc3cd-bionic-stein-19 neutron-l3-agent: action=DROP, project_id=8d69996595bf43568a66f6e4edb551b7, log_resource_ids=['719c90e1-e6a4-49cb-a105-74cc86cff67f'], port=79704b31-91a4-42ae-b66f-e356b2811df0, pkt=ethernet(dst='fa:16:3e:97:b3:4f',ethertype=2048,src='fa:16:3e:41:6f:cc')ipv4(csum=20223,dst='192.168.21.92',flags=2,header_length=5,identification=3223,offset=0,option=None,proto=1,src='10.5.0.10',tos=0,total_length=84,ttl=63,version=4)icmp(code=0,csum=63684,data=echo(data=b'#\xf4G]\x00\x00\x00\x00\xc8\xc7\x03\x00\x00\x00\x00\x00\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./01234567',id=2227,seq=156),type=8) Aug 5 09:17:34 juju-ccc3cd-bionic-stein-19 neutron-l3-agent: action=DROP, project_id=8d69996595bf43568a66f6e4edb551b7, log_resource_ids=['719c90e1-e6a4-49cb-a105-74cc86cff67f'], port=79704b31-91a4-42ae-b66f-e356b2811df0, pkt=ethernet(dst='fa:16:3e:97:b3:4f',ethertype=2048,src='fa:16:3e:41:6f:cc')ipv4(csum=20066,dst='192.168.21.92',flags=2,header_length=5,identification=3380,offset=0,option=None,proto=1,src='10.5.0.10',tos=0,total_length=84,ttl=63,version=4)icmp(code=0,csum=8550,data=echo(data=b'$\xf4G]\x00\x00\x00\x00\x9e%\x04\x00\x00\x00\x00\x00\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./01234567',id=2227,seq=157),type=8) Aug 5 09:17:35 juju-ccc3cd-bionic-stein-19 neutron-l3-agent: action=ACCEPT, project_id=8d69996595bf43568a66f6e4edb551b7, log_resource_ids=['719c90e1-e6a4-49cb-a105-74cc86cff67f'], port=79704b31-91a4-42ae-b66f-e356b2811df0, 
pkt=ethernet(dst='fa:16:3e:97:b3:4f',ethertype=2048,src='fa:16:3e:41:6f:cc')ipv4(csum=2473,dst='192.168.21.92',flags=2,header_length=5,identification=20992,offset=0,option=None,proto=6,src='10.5.0.10',tos=0,total_length=60,ttl=63,version=4)tcp(ack=0,bits=2,csum=7323,dst_port=22,offset=10,option=[TCPOptionMaximumSegmentSize(kind=2,length=4,max_seg_size=8918), TCPOptionSACKPermitted(kind=4,length=2), TCPOptionTimestamps(kind=8,length=10,ts_ecr=0,ts_val=3542096121), TCPOptionNoOperation(kind=1,length=1), TCPOptionWindowScale(kind=3,length=3,shift_cnt=7)],seq=1338233633,src_port=46744,urgent=0,window_size=26754) ** Tags removed: verification-stein-needed ** Tags added: verification-stein-done ** Changed in: neutron-fwaas (Ubuntu Cosmic) Status: Fix Committed => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1832210 Title: fwaas netfilter_log: incorrect decode of log prefix under python 3 Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Committed Status in Ubuntu Cloud Archive stein series: Fix Released Status in Ubuntu Cloud Archive train series: Fix Released Status in neutron: Fix Released Status in neutron-fwaas package in Ubuntu: Fix Released Status in neutron-fwaas source package in Cosmic: Won't Fix Status in neutron-fwaas source package in Disco: Fix Released Status in neutron-fwaas source package in Eoan: Fix Released Bug description: Under Python 3, the prefix of a firewall log message is not correctly decoded "b'10612530182266949194'": 2019-06-10 09:14:34 Unknown cookie packet_in 
pkt=ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120) 2019-06-10 09:14:34 {'prefix': "b'10612530182266949194'", 'msg': "ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120)"} 2019-06-10 09:14:34 {'0bf81ded-bf94-437d-ad49-063bba9be9bb': [, ]} This results in the firewall log driver not being able to map the message to the associated port and log resources in neutron resulting in the 'unknown cookie
[Yahoo-eng-team] [Bug 1580588] Re: [RFE] use network's dns_domain to generate dns_assignment
** Changed in: neutron (Ubuntu) Status: New => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1580588 Title: [RFE] use network's dns_domain to generate dns_assignment Status in neutron: Confirmed Status in neutron package in Ubuntu: Won't Fix Bug description: Problem: currently, the port's dns_assignment is generated by combining the dns_name and conf.dns_domain even if the dns_domain of the port's network is given. Expectation: generate the dns_assignment according to the dns_domain of the port's network, which will scope the dns_name by network instead of by the whole neutron deployment. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1580588/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
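A sketch of the precedence this RFE asks for: prefer the network's dns_domain and fall back to the deployment-wide configured default (function names are hypothetical, not neutron's internal API):

```python
def choose_dns_domain(network_dns_domain, conf_dns_domain):
    """Prefer the network-level dns_domain; fall back to the global option."""
    return network_dns_domain or conf_dns_domain

def build_fqdn(dns_name, network_dns_domain, conf_dns_domain):
    """Combine a port's dns_name with the chosen domain into an FQDN."""
    domain = choose_dns_domain(network_dns_domain, conf_dns_domain)
    return "%s.%s" % (dns_name, domain.rstrip("."))

print(build_fqdn("vm1", "tenant-net.example.org.", "cloud.example.com."))
# vm1.tenant-net.example.org
```

Under the current behavior described in the bug, the second argument would effectively be ignored and conf.dns_domain always used.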
[Yahoo-eng-team] [Bug 1832265] Re: py3: inconsistent encoding of token fields
Ubuntu SRU information: [Impact] Due to inconsistent decode/encode of bytestrings under py3, keystone ldap integration is broken when keystone is run under Python 3. [Test Case] Deploy keystone Configure to use LDAP [Regression Potential] The proposed patch has been +2'ed by upstream and validated as resolving this issue by the original bug reporter; the change simply ensures that any encoded values are decoded before use. ** Changed in: cloud-archive/stein Status: New => Triaged ** Changed in: cloud-archive/rocky Status: New => Triaged ** Changed in: keystone (Ubuntu Cosmic) Status: Triaged => Won't Fix ** Changed in: cloud-archive/train Importance: Undecided => High ** Changed in: cloud-archive/stein Importance: Undecided => High ** Changed in: cloud-archive/rocky Importance: Undecided => High -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1832265 Title: py3: inconsistent encoding of token fields Status in OpenStack Keystone LDAP integration: Invalid Status in Ubuntu Cloud Archive: Triaged Status in Ubuntu Cloud Archive rocky series: Triaged Status in Ubuntu Cloud Archive stein series: Triaged Status in Ubuntu Cloud Archive train series: Fix Released Status in OpenStack Identity (keystone): In Progress Status in keystone package in Ubuntu: Fix Released Status in keystone source package in Cosmic: Won't Fix Status in keystone source package in Disco: Triaged Bug description: When using an LDAP domain user on a bionic-rocky cloud within horizon, we are unable to see the projects listed in the project selection drop-down, and are unable to query resources from any projects to which we are assigned the role Member. 
It appears that the following log entries in keystone may be helpful to troubleshooting this issue: (keystone.middleware.auth): 2019-06-10 19:47:02,700 DEBUG RBAC: auth_context: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} (keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG Dispatching request to legacy mapper: /v3/users (keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG SCRIPT_NAME: `/v3`, PATH_INFO: `/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects` (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Matched GET /users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Route path: '/users/{user_id}/projects', defaults: {'action': 'list_user_projects', 'controller': } (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Match dict: {'user_id': 'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'action': 'list_user_projects', 'controller': } (keystone.common.wsgi): 2019-06-10 19:47:02,700 INFO GET https://keystone.mysite:5000/v3/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects (keystone.common.controller): 2019-06-10 19:47:02,700 DEBUG RBAC: Adding query filter params () (keystone.common.authorization): 2019-06-10 19:47:02,700 DEBUG RBAC: Authorizing identity:list_user_projects(user_id=d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4) (keystone.policy.backends.rules): 2019-06-10 19:47:02,701 DEBUG enforce 
identity:list_user_projects: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} (keystone.common.wsgi): 2019-06-10 19:47:02,702 WARNING You are not authorized to perform the requested action: identity:list_user_projects. It actually appears elsewhere in the keystone.log that there is a string which has encapsulated bytecode data in it (or vice versa). (keystone.common.wsgi): 2019-06-10 19:46:59,019 INFO POST https://keystone.mysite:5000/v3/auth/tokens (sqlalchemy.orm.path_registry): 2019-06-10 19:46:59,021 DEBUG set 'memoized_setups' on path 'EntityRegistry((,))' to '{}' (sqlalchemy.pool.QueuePool): 2019-06-10
[Yahoo-eng-team] [Bug 1832265] Re: py3: inconsistent encoding of token fields
** Also affects: cloud-archive/train Importance: Undecided Status: New ** Also affects: cloud-archive/rocky Importance: Undecided Status: New ** Also affects: cloud-archive/stein Importance: Undecided Status: New ** Changed in: cloud-archive/train Status: New => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1832265 Title: py3: inconsistent encoding of token fields Status in OpenStack Keystone LDAP integration: Invalid Status in Ubuntu Cloud Archive: New Status in Ubuntu Cloud Archive rocky series: New Status in Ubuntu Cloud Archive stein series: New Status in Ubuntu Cloud Archive train series: Fix Released Status in OpenStack Identity (keystone): In Progress Status in keystone package in Ubuntu: Fix Released Status in keystone source package in Cosmic: Triaged Status in keystone source package in Disco: Triaged Bug description: When using an LDAP domain user on a bionic-rocky cloud within horizon, we are unable to see the projects listed in the project selection drop-down, and are unable to query resources from any projects to which we are assigned the role Member. 
It appears that the following log entries in keystone may be helpful to troubleshooting this issue: (keystone.middleware.auth): 2019-06-10 19:47:02,700 DEBUG RBAC: auth_context: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} (keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG Dispatching request to legacy mapper: /v3/users (keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG SCRIPT_NAME: `/v3`, PATH_INFO: `/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects` (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Matched GET /users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Route path: '/users/{user_id}/projects', defaults: {'action': 'list_user_projects', 'controller': } (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Match dict: {'user_id': 'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'action': 'list_user_projects', 'controller': } (keystone.common.wsgi): 2019-06-10 19:47:02,700 INFO GET https://keystone.mysite:5000/v3/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects (keystone.common.controller): 2019-06-10 19:47:02,700 DEBUG RBAC: Adding query filter params () (keystone.common.authorization): 2019-06-10 19:47:02,700 DEBUG RBAC: Authorizing identity:list_user_projects(user_id=d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4) (keystone.policy.backends.rules): 2019-06-10 19:47:02,701 DEBUG enforce 
identity:list_user_projects: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} (keystone.common.wsgi): 2019-06-10 19:47:02,702 WARNING You are not authorized to perform the requested action: identity:list_user_projects. It actually appears elsewhere in the keystone.log that there is a string which has encapsulated bytecode data in it (or vice versa). (keystone.common.wsgi): 2019-06-10 19:46:59,019 INFO POST https://keystone.mysite:5000/v3/auth/tokens (sqlalchemy.orm.path_registry): 2019-06-10 19:46:59,021 DEBUG set 'memoized_setups' on path 'EntityRegistry((,))' to '{}' (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,021 DEBUG Connection checked out from pool (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,024 DEBUG Connection being returned to pool (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,024 DEBUG Connection rollback-on-return, via agent (keystone.auth.core): 2019-06-10 19:46:59,025 DEBUG MFA Rules not processed for user `b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4'`. Rule list: `[]` (Enabled: `True`). (keystone.common.wsgi): 2019-06-10 19:46:59,025 ERROR a bytes-like object is required, not 'str'
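An earlier notification in this digest notes that the proposed patch "simply ensures that any encoded values are decoded before use". A minimal sketch of that normalization (helper name is hypothetical; the mixed bytes/str id values are copied from the auth_context and route-match logs above):

```python
def ensure_text(value, encoding="utf-8"):
    """Decode bytes to str so identifiers hash and compare consistently
    under Python 3; str input passes through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# Mixed types as seen in the logs: the token carries a bytes user_id while
# the route match extracts a str user_id.
token_user_id = b"d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4"
route_user_id = "d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4"

print(token_user_id == route_user_id)               # False under Python 3
print(ensure_text(token_user_id) == route_user_id)  # True
```

The bytes/str mismatch is why the RBAC check cannot match the user and rejects identity:list_user_projects even for an authorized principal.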
[Yahoo-eng-team] [Bug 1831986] Re: fwaas_v2 - unable to associate port with firewall (PXC strict mode)
As Cosmic EOLs today, not targeting it for this fix. ** Also affects: cloud-archive/stein Importance: Undecided Status: New ** Also affects: cloud-archive/rocky Importance: Undecided Status: New ** Also affects: cloud-archive/train Importance: Undecided Status: New ** Also affects: neutron-fwaas (Ubuntu Eoan) Importance: Undecided Status: New ** Also affects: neutron-fwaas (Ubuntu Disco) Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1831986 Title: fwaas_v2 - unable to associate port with firewall (PXC strict mode) Status in Ubuntu Cloud Archive: New Status in Ubuntu Cloud Archive rocky series: New Status in Ubuntu Cloud Archive stein series: New Status in Ubuntu Cloud Archive train series: New Status in neutron: In Progress Status in neutron-fwaas package in Ubuntu: New Status in neutron-fwaas source package in Disco: New Status in neutron-fwaas source package in Eoan: New Bug description: Impacts both Stein and Rocky (although Rocky does not enable v2 just yet). 
542 a9761fa9124740028d0c1d70ff7aa542] DBAPIError exception wrapped from (pymysql.err.InternalError) (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER') [SQL: 'DELETE FROM firewall_group_port_associations_v2 WHERE firewall_group_port_associations_v2.firewall_group_id = %(firewall_group_id_1)s'] [parameters: {'firewall_group_id_1': '85a277d0-ebaf-4a5d-9d45-6a74b8f54372'}] (Background on this error at: http://sqlalche.me/e/2j85): pymysql.err.InternalError: (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER') 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last): 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters context) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/sqlalchemy/engine/default.py", line 509, in do_execute 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters cursor.execute(statement, parameters) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 165, in execute 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters result = self._query(query) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 321, in _query 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters conn.query(q) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 
860, in query 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters self._affected_rows = self._read_query_result(unbuffered=unbuffered) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1061, in _read_query_result 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters result.read() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1349, in read 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters first_packet = self.connection._read_packet() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1018, in _read_packet 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters packet.check_error() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 384, in check_error 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters err.raise_mysql_exception(self._data) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/err.py", line 107, in raise_mysql_exception 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters raise errorclass(errno, errval) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters pymysql.err.InternalError: (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary
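For context on the error above: pxc_strict_mode=ENFORCING rejects DML against tables that lack an explicit primary key, and the conventional remedy for a pure association table is a composite primary key spanning its foreign-key columns. A sketch of the corresponding DDL (firewall_group_id is taken from the error message; port_id is assumed from the table's purpose, and the actual upstream migration may differ):

```python
# Association table and its two FK columns; a composite primary key over
# both makes each row uniquely addressable, which is what
# pxc_strict_mode=ENFORCING demands before it will accept DELETE/UPDATE.
table = "firewall_group_port_associations_v2"
pk_columns = ("firewall_group_id", "port_id")

ddl = "ALTER TABLE {} ADD PRIMARY KEY ({});".format(table, ", ".join(pk_columns))
print(ddl)
```

With such a key in place, the DELETE quoted in the traceback no longer trips the Percona strict-mode check.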
[Yahoo-eng-team] [Bug 1826523] Re: libvirtError exceptions during volume attach leave volume connected to host
** Changed in: cloud-archive/rocky Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1826523 Title: libvirtError exceptions during volume attach leave volume connected to host Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) queens series: Fix Committed Status in OpenStack Compute (nova) rocky series: Fix Committed Status in OpenStack Compute (nova) stein series: Fix Committed Status in nova package in Ubuntu: Fix Released Status in nova source package in Bionic: Fix Released Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Fix Released Bug description: [Impact] * This is an additional patch required for bug #1825882: when a libvirt exception prevents the volume attachment from completing, the underlying volumes should be disconnected from the host. [Test Case] * Deploy any OpenStack version up to Pike, which includes ceph-backed cinder * Create a guest VM (openstack server ...) * Create a test cinder volume $ openstack volume create test --size 10 * Force a drop on ceph traffic. Run the following command on the nova hypervisor on which the server runs. $ iptables -A OUTPUT -d ceph-mon-addr -p tcp --dport 6800 -j DROP * Attach the volume to a running instance. $ openstack server add volume 7151f507-a6b7-4f6d-a4cc-fd223d9feb5d 742ff117-21ae-4d1b-a52b-5b37955716ff * This should cause the volume attachment to fail $ virsh domblklist instance-x Target Source vda nova/7151f507-a6b7-4f6d-a4cc-fd223d9feb5d_disk No volume should be attached after this step. 
* If the behavior is fixed: * Check that openstack server show , doesn't display the volume as attached. * If the behavior isn't fixed: * openstack server show , will display the volume in the volumes_attached property. [Expected result] * Volume attach fails and the volume is disconnected from the host. [Actual result] * Volume attach fails but remains connected to the host. [Regression Potential] * The patches have been cherry-picked from upstream which helps to reduce the regression potential of these fixes. [Other Info] * N/A To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1826523/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1771506] Re: Unit test failure with OpenSSL 1.1.1
** Changed in: cloud-archive/rocky Status: Fix Committed => Fix Released ** Changed in: cloud-archive/stein Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1771506 Title: Unit test failure with OpenSSL 1.1.1 Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in OpenStack Compute (nova): In Progress Status in nova package in Ubuntu: Fix Released Status in nova source package in Bionic: Fix Released Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Fix Released Bug description: Hi, Building the Nova Queens package with OpenSSL 1.1.1 leads to unit test problems. This was reported to Debian at: https://bugs.debian.org/898807 The new openssl 1.1.1 is currently in experimental [0]. This package failed to build against this new package [1] while it built fine against the openssl version currently in unstable [2]. Could you please have a look? FAIL: nova.tests.unit.virt.xenapi.test_xenapi.XenAPIDiffieHellmanTestCase.test_encrypt_newlines_inside_message |nova.tests.unit.virt.xenapi.test_xenapi.XenAPIDiffieHellmanTestCase.test_encrypt_newlines_inside_message |-- |_StringException: pythonlogging:'': {{{2018-05-01 20:48:09,960 WARNING [oslo_config.cfg] Config option key_manager.api_class is deprecated. 
Use option key_manager.backend instead.}}} | |Traceback (most recent call last): | File "/<>/nova/tests/unit/virt/xenapi/test_xenapi.py", line 1592, in test_encrypt_newlines_inside_message |self._test_encryption('Message\nwith\ninterior\nnewlines.') | File "/<>/nova/tests/unit/virt/xenapi/test_xenapi.py", line 1577, in _test_encryption |enc = self.alice.encrypt(message) | File "/<>/nova/virt/xenapi/agent.py", line 432, in encrypt |return self._run_ssl(text).strip('\n') | File "/<>/nova/virt/xenapi/agent.py", line 428, in _run_ssl |raise RuntimeError(_('OpenSSL error: %s') % err) |RuntimeError: OpenSSL error: *** WARNING : deprecated key derivation used. |Using -iter or -pbkdf2 would be better. It looks like this failure is caused by an additional warning message on stderr. [0] https://lists.debian.org/msgid-search/20180501211400.ga21...@roeckx.be [1] https://breakpoint.cc/openssl-rebuild/2018-05-03-rebuild-openssl1.1.1-pre6/attempted/nova_17.0.0-4_amd64-2018-05-01T20%3A39%3A38Z [2] https://breakpoint.cc/openssl-rebuild/2018-05-03-rebuild-openssl1.1.1-pre6/successful/nova_17.0.0-4_amd64-2018-05-02T18%3A46%3A36Z To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1771506/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
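The traceback shows _run_ssl raising whenever anything appears on stderr, while OpenSSL 1.1.1 emits its key-derivation deprecation warning on stderr even on success. A minimal sketch (not the upstream fix; the marker strings are copied from the traceback above) of filtering known-benign warnings before deciding whether stderr indicates a real error:

```python
# Warning lines OpenSSL 1.1.1 prints on stderr even when the command succeeds.
BENIGN_STDERR_MARKERS = (
    "WARNING : deprecated key derivation used",
    "Using -iter or -pbkdf2 would be better",
)

def significant_stderr(err: str) -> str:
    """Drop OpenSSL 1.1.1's deprecation chatter; keep any other stderr output."""
    kept = [line for line in err.splitlines()
            if line.strip() and not any(m in line for m in BENIGN_STDERR_MARKERS)]
    return "\n".join(kept)
```

A caller would then raise only when significant_stderr() returns a non-empty string, so the deprecation notice alone no longer fails the test.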
[Yahoo-eng-team] [Bug 1808951] Re: python3 + Fedora + SSL + wsgi nova deployment, nova api returns RecursionError: maximum recursion depth exceeded while calling a Python object
** Changed in: cloud-archive/stein Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1808951 Title: python3 + Fedora + SSL + wsgi nova deployment, nova api returns RecursionError: maximum recursion depth exceeded while calling a Python object Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in OpenStack Compute (nova): Fix Released Status in tripleo: Triaged Status in nova package in Ubuntu: Fix Released Status in nova source package in Disco: Fix Released Status in nova source package in Eoan: Fix Released Bug description: Description:- So while testing python3 with Fedora in [1], Found an issue while running nova-api behind wsgi. It fails with below Traceback:- 2018-12-18 07:41:55.364 26870 INFO nova.api.openstack.requestlog [req-e1af4808-ecd8-47c7-9568-a5dd9691c2c9 - - - - -] 127.0.0.1 "GET /v2.1/servers/detail?all_tenants=True=True" status: 500 len: 0 microversion: - time: 0.007297 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack [req-e1af4808-ecd8-47c7-9568-a5dd9691c2c9 - - - - -] Caught error: maximum recursion depth exceeded while calling a Python object: RecursionError: maximum recursion depth exceeded while calling a Python object 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack Traceback (most recent call last): 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/nova/api/openstack/__init__.py", line 94, in __call__ 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack return req.get_response(self.application) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/webob/request.py", line 1313, in send 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack application, catch_exc_info=False) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack 
File "/usr/lib/python3.6/site-packages/webob/request.py", line 1277, in call_application 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack app_iter = application(self.environ, start_response) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/webob/dec.py", line 129, in __call__ 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack resp = self.call_func(req, *args, **kw) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/webob/dec.py", line 193, in call_func 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack return self.func(req, *args, **kwargs) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/nova/api/openstack/requestlog.py", line 92, in __call__ 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack self._log_req(req, res, start) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack self.force_reraise() 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack six.reraise(self.type_, self.value, self.tb) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack raise value 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/nova/api/openstack/requestlog.py", line 87, in __call__ 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack res = req.get_response(self.application) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/webob/request.py", line 1313, in send 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack application, 
catch_exc_info=False) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/webob/request.py", line 1277, in call_application 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack app_iter = application(self.environ, start_response) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/webob/dec.py", line 143, in __call__ 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack return resp(environ, start_response) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/webob/dec.py", line 129, in __call__ 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack resp = self.call_func(req, *args, **kw) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File
[Yahoo-eng-team] [Bug 1825882] Re: [SRU] Virsh disk attach errors silently ignored
** Changed in: cloud-archive/rocky Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1825882 Title: [SRU] Virsh disk attach errors silently ignored Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) queens series: Fix Committed Status in OpenStack Compute (nova) rocky series: Fix Committed Status in OpenStack Compute (nova) stein series: Fix Committed Status in nova package in Ubuntu: Fix Released Status in nova source package in Bionic: Fix Released Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Fix Released Bug description: [Impact] The following commit (1) is causing volume attachments which fail due to libvirt device attach errors to be silently ignored, with Nova reporting the attachment as successful. It seems that the original intention of the commit was to log a condition and re-raise the exception, but if the exception is of type libvirt.libvirtError and does not contain the searched pattern, the exception is ignored. If you unindent the raise statement, errors are reported again. In our case we had ceph/apparmor configuration problems in compute nodes which prevented virsh attaching the device; volumes appeared as successfully attached but the corresponding block device was missing in guest VMs. Other libvirt attach error conditions are also ignored, e.g. when device names are already occupied ('Target vdb already exists', device is busy, etc.)
(1) https://github.com/openstack/nova/commit/78891c2305bff6e16706339a9c5eca99a84e409c [Test Case] * Deploy any OpenStack version up to Pike, which includes ceph-backed cinder * Create a guest VM (openstack server ...) * Create a test cinder volume $ openstack volume create test --size 10 * Force a drop on ceph traffic. Run the following command on the nova hypervisor on which the server runs. $ iptables -A OUTPUT -d ceph-mon-addr -p tcp --dport 6800 -j DROP * Attach the volume to a running instance. $ openstack server add volume 7151f507-a6b7-4f6d-a4cc-fd223d9feb5d 742ff117-21ae-4d1b-a52b-5b37955716ff * This should cause the volume attachment to fail $ virsh domblklist instance-x Target Source vda nova/7151f507-a6b7-4f6d-a4cc-fd223d9feb5d_disk No volume should be attached after this step. * If the behavior is fixed: * Check that openstack server show does not display the volume as attached. * Check that proper log entries state the libvirt exception and error. * If the behavior isn't fixed: * openstack server show will display the volume in the volumes_attached property. [Expected result] * Volume attach fails and a proper exception is logged. [Actual result] * Volume attach fails but remains connected to the host and no further exception gets logged. [Regression Potential] * The patches have been cherry-picked from upstream which helps to reduce the regression potential of these fixes. [Other Info] * N/A To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1825882/+subscriptions
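The "unindent the raise statement" fix described in the report can be illustrated with a toy reproduction. All names here are invented for the sketch; this is not nova's actual code:

```python
class FakeLibvirtError(Exception):
    """Stand-in for libvirt.libvirtError."""

def attach_buggy(fail_msg):
    try:
        raise FakeLibvirtError(fail_msg)  # simulate the attach failing
    except FakeLibvirtError as exc:
        if "Incompatible" in str(exc):
            # log the known condition, then re-raise
            raise
        # any other libvirtError falls through here and is lost:
        # the caller sees a "successful" attach

def attach_fixed(fail_msg):
    try:
        raise FakeLibvirtError(fail_msg)  # simulate the attach failing
    except FakeLibvirtError as exc:
        if "Incompatible" in str(exc):
            pass  # log the known condition
        raise  # unindented: every attach failure now propagates

attach_buggy("Target vdb already exists")   # silently returns None
try:
    attach_fixed("Target vdb already exists")
except FakeLibvirtError:
    print("attach failure reported")        # prints: attach failure reported
```

Because the `raise` was nested under the pattern check, only matching errors propagated; moving it one level out restores the usual "log what you recognize, re-raise everything" shape.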
[Yahoo-eng-team] [Bug 1821594] Re: [SRU] Error in confirm_migration leaves stale allocations and 'confirming' migration state
This bug was fixed in the package nova - 2:18.2.0-0ubuntu2~cloud0 --- nova (2:18.2.0-0ubuntu2~cloud0) bionic-rocky; urgency=medium . * New upstream release for the Ubuntu Cloud Archive. . nova (2:18.2.0-0ubuntu2) cosmic; urgency=medium . * Cherry-picked from upstream to ensure no stale allocations are left over on failed cold migrations (LP: #1821594). - d/p/bug_1821594_1.patch: Fix migration record status - d/p/bug_1821594_2.patch: Delete failed allocation part 1 - d/p/bug_1821594_3.patch: Delete failed allocation part 2 - d/p/bug_1821594_4.patch: New functional test . nova (2:18.2.0-0ubuntu1) cosmic; urgency=medium . [Sahid Orentino Ferdjaoui] * New stable point release for OpenStack Rocky (LP: #1830695). * d/p/ensure-rbd-auth-fallback-uses-matching-credentials.patch: Dropped. Fixed upstream in 18.2.0. . [Corey Bryant] * d/p/skip-openssl-1.1.1-tests.patch: Dropped as this is now properly fixed by xenapi-agent-change-openssl-error-handling.patch. ** Changed in: cloud-archive/rocky Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). 
https://bugs.launchpad.net/bugs/1821594 Title: [SRU] Error in confirm_migration leaves stale allocations and 'confirming' migration state Status in Ubuntu Cloud Archive: Triaged Status in Ubuntu Cloud Archive queens series: Triaged Status in Ubuntu Cloud Archive rocky series: Fix Released Status in Ubuntu Cloud Archive stein series: Triaged Status in Ubuntu Cloud Archive train series: Fix Committed Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) pike series: Triaged Status in OpenStack Compute (nova) queens series: Fix Committed Status in OpenStack Compute (nova) rocky series: Fix Committed Status in OpenStack Compute (nova) stein series: Fix Committed Status in nova package in Ubuntu: Fix Committed Status in nova source package in Bionic: Triaged Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Triaged Status in nova source package in Eoan: Fix Committed Bug description: Description: When performing a cold migration, if an exception is raised by the driver during confirm_migration (this runs in the source node), the migration record is stuck in "confirming" state and the allocations against the source node are not removed. The instance is fine at the destination in this stage, but the source host has allocations that are not possible to clean without going to the database or invoking the Placement API via curl. After several migration attempts that fail in the same spot, the source node is filled with these allocations that prevent new instances from being created or instances from being migrated to this node. When confirm_migration fails in this stage, the migrating instance can be saved through a hard reboot or a reset state to active. Steps to reproduce: Unfortunately, I don't have logs of the real root cause of the problem inside driver.confirm_migration when running the libvirt driver.
However, the stale allocations and migration status problem can be easily reproduced by raising an exception in libvirt driver's confirm_migration method, and it would affect any driver. Expected results: Discussed this issue with efried and mriedem over #openstack-nova on March 25th, 2019. They confirmed that allocations not being cleared up is a bug. Actual results: Instance is fine at the destination after a reset-state. Source node has stale allocations that prevent new instances from being created/migrated to the source node. Migration record is stuck in "confirming" state. Environment: I verified this bug on pike, queens and stein branches. Running libvirt KVM driver. === [Impact] If users attempting to perform cold migrations face any issues when the virt driver is running the "Confirm Migration" step, the failure leaves stale allocation records in the database and migration records in "confirming" state. The stale allocations are not cleaned up by nova, consuming the user's quota indefinitely. This bug was confirmed from pike to stein release, and a fix was implemented for queens, rocky and stein. It should be backported to those releases to prevent the issue from reoccurring. This fix prevents new stale allocations from being left over, by cleaning them up immediately when the failures occur. At the moment, the users affected by this bug have to clean their previous stale allocations manually. [Test Case] 1. Reproducing the bug 1a. Inject failure The root cause for this problem may vary for each driver and environment, so to reproduce the bug, it is necessary first to inject a failure in
[Yahoo-eng-team] [Bug 1771506] Re: Unit test failure with OpenSSL 1.1.1
** Changed in: cloud-archive/queens Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1771506 Title: Unit test failure with OpenSSL 1.1.1 Status in Ubuntu Cloud Archive: Fix Committed Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Committed Status in Ubuntu Cloud Archive stein series: Fix Committed Status in OpenStack Compute (nova): In Progress Status in nova package in Ubuntu: Fix Released Status in nova source package in Bionic: Fix Released Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Fix Committed Bug description: Hi, Building the Nova Queens package with OpenSSL 1.1.1 leads to unit test problems. This was reported to Debian at: https://bugs.debian.org/898807 The new openssl 1.1.1 is currently in experimental [0]. This package failed to build against this new package [1] while it built fine against the openssl version currently in unstable [2]. Could you please have a look? FAIL: nova.tests.unit.virt.xenapi.test_xenapi.XenAPIDiffieHellmanTestCase.test_encrypt_newlines_inside_message |nova.tests.unit.virt.xenapi.test_xenapi.XenAPIDiffieHellmanTestCase.test_encrypt_newlines_inside_message |-- |_StringException: pythonlogging:'': {{{2018-05-01 20:48:09,960 WARNING [oslo_config.cfg] Config option key_manager.api_class is deprecated. 
Use option key_manager.backend instead.}}} | |Traceback (most recent call last): | File "/<>/nova/tests/unit/virt/xenapi/test_xenapi.py", line 1592, in test_encrypt_newlines_inside_message |self._test_encryption('Message\nwith\ninterior\nnewlines.') | File "/<>/nova/tests/unit/virt/xenapi/test_xenapi.py", line 1577, in _test_encryption |enc = self.alice.encrypt(message) | File "/<>/nova/virt/xenapi/agent.py", line 432, in encrypt |return self._run_ssl(text).strip('\n') | File "/<>/nova/virt/xenapi/agent.py", line 428, in _run_ssl |raise RuntimeError(_('OpenSSL error: %s') % err) |RuntimeError: OpenSSL error: *** WARNING : deprecated key derivation used. |Using -iter or -pbkdf2 would be better. It looks like this is due to an additional message on stderr. [0] https://lists.debian.org/msgid-search/20180501211400.ga21...@roeckx.be [1] https://breakpoint.cc/openssl-rebuild/2018-05-03-rebuild-openssl1.1.1-pre6/attempted/nova_17.0.0-4_amd64-2018-05-01T20%3A39%3A38Z [2] https://breakpoint.cc/openssl-rebuild/2018-05-03-rebuild-openssl1.1.1-pre6/successful/nova_17.0.0-4_amd64-2018-05-02T18%3A46%3A36Z To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1771506/+subscriptions
[Yahoo-eng-team] [Bug 1825882] Re: [SRU] Virsh disk attach errors silently ignored
This bug was fixed in the package nova - 2:19.0.0-0ubuntu2.3~cloud0 --- nova (2:19.0.0-0ubuntu2.3~cloud0) bionic-stein; urgency=medium . * New update for the Ubuntu Cloud Archive. . nova (2:19.0.0-0ubuntu2.3) disco; urgency=medium . * d/p/bug_1825882.patch: Cherry-picked from upstream to ensure virsh disk attach does not fail silently (LP: #1825882). * d/p/bug_1826523.patch: Cherry-picked from upstream to ensure always disconnect volumes after libvirt exceptions (LP: #1826523). ** Changed in: cloud-archive/stein Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1825882 Title: [SRU] Virsh disk attach errors silently ignored Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Committed Status in Ubuntu Cloud Archive stein series: Fix Released Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) queens series: Fix Committed Status in OpenStack Compute (nova) rocky series: Fix Committed Status in OpenStack Compute (nova) stein series: Fix Committed Status in nova package in Ubuntu: Fix Released Status in nova source package in Bionic: Fix Released Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Fix Committed Bug description: [Impact] The following commit (1) is causing volume attachments which fail due to libvirt device attach errors to be silently ignored, with Nova reporting the attachment as successful. It seems that the original intention of the commit was to log a condition and re-raise the exception, but if the exception is of type libvirt.libvirtError and does not contain the searched pattern, the exception is ignored. If you unindent the raise statement, errors are reported again.
In our case we had ceph/apparmor configuration problems in compute nodes which prevented virsh attaching the device; volumes appeared as successfully attached but the corresponding block device was missing in guest VMs. Other libvirt attach error conditions are also ignored, e.g. when device names are already occupied ('Target vdb already exists', device is busy, etc.) (1) https://github.com/openstack/nova/commit/78891c2305bff6e16706339a9c5eca99a84e409c [Test Case] * Deploy any OpenStack version up to Pike, which includes ceph-backed cinder * Create a guest VM (openstack server ...) * Create a test cinder volume $ openstack volume create test --size 10 * Force a drop on ceph traffic. Run the following command on the nova hypervisor on which the server runs. $ iptables -A OUTPUT -d ceph-mon-addr -p tcp --dport 6800 -j DROP * Attach the volume to a running instance. $ openstack server add volume 7151f507-a6b7-4f6d-a4cc-fd223d9feb5d 742ff117-21ae-4d1b-a52b-5b37955716ff * This should cause the volume attachment to fail $ virsh domblklist instance-x Target Source vda nova/7151f507-a6b7-4f6d-a4cc-fd223d9feb5d_disk No volume should be attached after this step. * If the behavior is fixed: * Check that openstack server show does not display the volume as attached. * Check that proper log entries state the libvirt exception and error. * If the behavior isn't fixed: * openstack server show will display the volume in the volumes_attached property. [Expected result] * Volume attach fails and a proper exception is logged. [Actual result] * Volume attach fails but remains connected to the host and no further exception gets logged. [Regression Potential] * The patches have been cherry-picked from upstream which helps to reduce the regression potential of these fixes.
[Other Info] * N/A To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1825882/+subscriptions
[Yahoo-eng-team] [Bug 1825882] Re: [SRU] Virsh disk attach errors silently ignored
This bug was fixed in the package nova - 2:17.0.9-0ubuntu3~cloud0 --- nova (2:17.0.9-0ubuntu3~cloud0) xenial-queens; urgency=medium . * New update for the Ubuntu Cloud Archive. . nova (2:17.0.9-0ubuntu3) bionic; urgency=medium . * d/p/bug_1825882.patch: Cherry-picked from upstream to ensure virsh disk attach does not fail silently (LP: #1825882). * d/p/bug_1826523.patch: Cherry-picked from upstream to ensure always disconnect volumes after libvirt exceptions (LP: #1826523). ** Changed in: cloud-archive/queens Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1825882 Title: [SRU] Virsh disk attach errors silently ignored Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Committed Status in Ubuntu Cloud Archive stein series: Fix Released Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) queens series: Fix Committed Status in OpenStack Compute (nova) rocky series: Fix Committed Status in OpenStack Compute (nova) stein series: Fix Committed Status in nova package in Ubuntu: Fix Released Status in nova source package in Bionic: Fix Released Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Fix Committed Bug description: [Impact] The following commit (1) is causing volume attachments which fail due to libvirt device attach errors to be silently ignored, with Nova reporting the attachment as successful. It seems that the original intention of the commit was to log a condition and re-raise the exception, but if the exception is of type libvirt.libvirtError and does not contain the searched pattern, the exception is ignored. If you unindent the raise statement, errors are reported again.
In our case we had ceph/apparmor configuration problems in compute nodes which prevented virsh attaching the device; volumes appeared as successfully attached but the corresponding block device was missing in guest VMs. Other libvirt attach error conditions are also ignored, e.g. when device names are already occupied ('Target vdb already exists', device is busy, etc.) (1) https://github.com/openstack/nova/commit/78891c2305bff6e16706339a9c5eca99a84e409c [Test Case] * Deploy any OpenStack version up to Pike, which includes ceph-backed cinder * Create a guest VM (openstack server ...) * Create a test cinder volume $ openstack volume create test --size 10 * Force a drop on ceph traffic. Run the following command on the nova hypervisor on which the server runs. $ iptables -A OUTPUT -d ceph-mon-addr -p tcp --dport 6800 -j DROP * Attach the volume to a running instance. $ openstack server add volume 7151f507-a6b7-4f6d-a4cc-fd223d9feb5d 742ff117-21ae-4d1b-a52b-5b37955716ff * This should cause the volume attachment to fail $ virsh domblklist instance-x Target Source vda nova/7151f507-a6b7-4f6d-a4cc-fd223d9feb5d_disk No volume should be attached after this step. * If the behavior is fixed: * Check that openstack server show does not display the volume as attached. * Check that proper log entries state the libvirt exception and error. * If the behavior isn't fixed: * openstack server show will display the volume in the volumes_attached property. [Expected result] * Volume attach fails and a proper exception is logged. [Actual result] * Volume attach fails but remains connected to the host and no further exception gets logged. [Regression Potential] * The patches have been cherry-picked from upstream which helps to reduce the regression potential of these fixes.
[Other Info] * N/A To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1825882/+subscriptions
[Yahoo-eng-team] [Bug 1826523] Re: libvirtError exceptions during volume attach leave volume connected to host
This bug was fixed in the package nova - 2:19.0.0-0ubuntu2.3~cloud0 --- nova (2:19.0.0-0ubuntu2.3~cloud0) bionic-stein; urgency=medium . * New update for the Ubuntu Cloud Archive. . nova (2:19.0.0-0ubuntu2.3) disco; urgency=medium . * d/p/bug_1825882.patch: Cherry-picked from upstream to ensure virsh disk attach does not fail silently (LP: #1825882). * d/p/bug_1826523.patch: Cherry-picked from upstream to ensure always disconnect volumes after libvirt exceptions (LP: #1826523). ** Changed in: cloud-archive/stein Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1826523 Title: libvirtError exceptions during volume attach leave volume connected to host Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Committed Status in Ubuntu Cloud Archive stein series: Fix Released Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) queens series: Fix Committed Status in OpenStack Compute (nova) rocky series: Fix Committed Status in OpenStack Compute (nova) stein series: Fix Committed Status in nova package in Ubuntu: Fix Released Status in nova source package in Bionic: Fix Released Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Fix Committed Bug description: [Impact] * This is an additional patch required for bug #1825882: when a libvirt exception prevents the volume attachment from completing, the underlying volumes should be disconnected from the host. [Test Case] * Deploy any OpenStack version up to Pike, which includes ceph-backed cinder * Create a guest VM (openstack server ...) * Create a test cinder volume $ openstack volume create test --size 10 * Force a drop on ceph traffic.
Run the following command on the nova hypervisor on which the server runs. $ iptables -A OUTPUT -d ceph-mon-addr -p tcp --dport 6800 -j DROP * Attach the volume to a running instance. $ openstack server add volume 7151f507-a6b7-4f6d-a4cc-fd223d9feb5d 742ff117-21ae-4d1b-a52b-5b37955716ff * This should cause the volume attachment to fail $ virsh domblklist instance-x Target Source vda nova/7151f507-a6b7-4f6d-a4cc-fd223d9feb5d_disk No volume should be attached after this step. * If the behavior is fixed: * Check that openstack server show does not display the volume as attached. * If the behavior isn't fixed: * openstack server show will display the volume in the volumes_attached property. [Expected result] * Volume attach fails and the volume is disconnected from the host. [Actual result] * Volume attach fails but remains connected to the host. [Regression Potential] * The patches have been cherry-picked from upstream which helps to reduce the regression potential of these fixes. [Other Info] * N/A To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1826523/+subscriptions
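The shape of the fix described in this bug can be sketched as a cleanup guard: if the guest-side (libvirt) attach raises, the host-side volume connection is rolled back before the error propagates. All class and method names below are illustrative stand-ins, not nova's actual code:

```python
class FakeHost:
    """Stand-in for the host-side volume driver (e.g. rbd/iscsi connector)."""
    def __init__(self):
        self.connected = set()
    def connect_volume(self, vol):
        self.connected.add(vol)
    def disconnect_volume(self, vol):
        self.connected.discard(vol)

class FailingGuest:
    """Stand-in for a guest whose libvirt device attach always fails."""
    def attach_device(self, vol):
        raise RuntimeError("libvirtError: device attach failed")

def attach_volume(host, guest, vol):
    host.connect_volume(vol)       # host-side connection first
    try:
        guest.attach_device(vol)   # guest-side attach may raise
    except Exception:
        # On any attach failure, undo the host connection too,
        # so no half-attached volume is left behind.
        host.disconnect_volume(vol)
        raise

host = FakeHost()
try:
    attach_volume(host, FailingGuest(), "vol-1")
except RuntimeError:
    pass
assert host.connected == set()     # nothing stays connected to the host
```

The try/except-with-rollback pattern keeps the operation atomic from the caller's perspective: either the volume is fully attached, or the host is left in its original state and the exception is visible.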
[Yahoo-eng-team] [Bug 1832265] Re: py3: inconsistent encoding of token fields
** Also affects: keystone (Ubuntu Disco) Importance: Undecided Status: New ** Also affects: keystone (Ubuntu Cosmic) Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1832265 Title: py3: inconsistent encoding of token fields Status in OpenStack Keystone LDAP integration: Invalid Status in OpenStack Identity (keystone): In Progress Status in keystone package in Ubuntu: Fix Released Status in keystone source package in Cosmic: New Status in keystone source package in Disco: New Bug description: When using an LDAP domain user on a bionic-rocky cloud within horizon, we are unable to see the projects listed in the project selection drop-down, and are unable to query resources from any projects to which we are assigned the role Member. It appears that the following log entries in keystone may be helpful to troubleshooting this issue: (keystone.middleware.auth): 2019-06-10 19:47:02,700 DEBUG RBAC: auth_context: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} (keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG Dispatching request to legacy mapper: /v3/users (keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG SCRIPT_NAME: `/v3`, PATH_INFO: `/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects` (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Matched GET 
/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Route path: '/users/{user_id}/projects', defaults: {'action': 'list_user_projects', 'controller': } (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Match dict: {'user_id': 'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'action': 'list_user_projects', 'controller': } (keystone.common.wsgi): 2019-06-10 19:47:02,700 INFO GET https://keystone.mysite:5000/v3/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects (keystone.common.controller): 2019-06-10 19:47:02,700 DEBUG RBAC: Adding query filter params () (keystone.common.authorization): 2019-06-10 19:47:02,700 DEBUG RBAC: Authorizing identity:list_user_projects(user_id=d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4) (keystone.policy.backends.rules): 2019-06-10 19:47:02,701 DEBUG enforce identity:list_user_projects: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} (keystone.common.wsgi): 2019-06-10 19:47:02,702 WARNING You are not authorized to perform the requested action: identity:list_user_projects. It actually appears elsewhere in the keystone.log that there is a string which has encapsulated bytecode data in it (or vice versa). 
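The mismatch visible in the auth_context above, where user_id is held as bytes (b'd4fb...') while neighbouring fields are str, is a common Python 3 porting hazard: under Python 2 both were the same type, but under Python 3 bytes and str never compare equal, so lookups and policy checks silently fail. A minimal illustration (the normalizing helper is hypothetical, not keystone code):

```python
# bytes and str identifiers never compare equal under Python 3.
assert (b"d4fb94cf" == "d4fb94cf") is False

def to_text(value, encoding="utf-8"):
    """Hypothetical boundary helper: normalize identifiers to str
    so that ids compare equal regardless of which layer produced them."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# After normalizing at the boundary, the ids compare equal again.
assert to_text(b"d4fb94cf") == to_text("d4fb94cf")
```

Normalizing at the point where data enters the application (token parsing, LDAP results) is the usual remedy, rather than sprinkling decode calls through every consumer.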
(keystone.common.wsgi): 2019-06-10 19:46:59,019 INFO POST https://keystone.mysite:5000/v3/auth/tokens (sqlalchemy.orm.path_registry): 2019-06-10 19:46:59,021 DEBUG set 'memoized_setups' on path 'EntityRegistry((,))' to '{}' (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,021 DEBUG Connection checked out from pool (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,024 DEBUG Connection being returned to pool (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,024 DEBUG Connection rollback-on-return, via agent (keystone.auth.core): 2019-06-10 19:46:59,025 DEBUG MFA Rules not processed for user `b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4'`. Rule list: `[]` (Enabled: `True`). (keystone.common.wsgi): 2019-06-10 19:46:59,025 ERROR a bytes-like object is required, not 'str' Traceback (most recent call last): File "/usr/lib/python3/dist-packages/keystone/common/wsgi.py", line 148, in __call__ result = method(req, **params) File "/usr/lib/python3/dist-packages/keystone/auth/controllers.py", line 102, in authenticate_for_token app_cred_id=app_cred_id, parent_audit_id=token_audit_id) File
[Yahoo-eng-team] [Bug 1832210] Re: incorrect decode of log prefix under python 3
** Also affects: cloud-archive Importance: Undecided Status: New ** Changed in: neutron-fwaas (Ubuntu Eoan) Status: New => Fix Committed ** Also affects: cloud-archive/rocky Importance: Undecided Status: New ** Also affects: cloud-archive/stein Importance: Undecided Status: New ** Also affects: cloud-archive/train Importance: Undecided Status: New ** Changed in: cloud-archive/train Status: New => Fix Committed ** Changed in: cloud-archive/stein Status: New => Triaged ** Changed in: cloud-archive/rocky Status: New => Triaged ** Changed in: neutron-fwaas (Ubuntu Disco) Status: New => Triaged ** Changed in: neutron-fwaas (Ubuntu Cosmic) Status: New => Triaged ** Changed in: neutron-fwaas (Ubuntu Cosmic) Importance: Undecided => Medium ** Changed in: neutron-fwaas (Ubuntu Disco) Importance: Undecided => Medium ** Changed in: cloud-archive/rocky Importance: Undecided => Medium ** Changed in: cloud-archive/train Importance: Undecided => Medium ** Changed in: neutron-fwaas (Ubuntu Eoan) Importance: Undecided => Medium ** Changed in: cloud-archive/stein Importance: Undecided => Medium -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. 
https://bugs.launchpad.net/bugs/1832210 Title: incorrect decode of log prefix under python 3 Status in Ubuntu Cloud Archive: Triaged Status in Ubuntu Cloud Archive rocky series: Triaged Status in Ubuntu Cloud Archive stein series: Triaged Status in Ubuntu Cloud Archive train series: Fix Committed Status in neutron: Fix Released Status in neutron-fwaas package in Ubuntu: Fix Committed Status in neutron-fwaas source package in Cosmic: Triaged Status in neutron-fwaas source package in Disco: Triaged Status in neutron-fwaas source package in Eoan: Fix Committed Bug description: Under Python 3, the prefix of a firewall log message is not correctly decoded "b'10612530182266949194'": 2019-06-10 09:14:34 Unknown cookie packet_in pkt=ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120) 2019-06-10 09:14:34 {'prefix': "b'10612530182266949194'", 'msg': "ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120)"} 2019-06-10 09:14:34 {'0bf81ded-bf94-437d-ad49-063bba9be9bb': [, ]} This results in the firewall log driver not being able to map the message to the associated port and log 
resources in neutron resulting in the 'unknown cookie packet_in' warning message. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1832210/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1832766] Re: LDAP group_members_are_ids = false fails in Rocky/Stein
** Also affects: keystone (Ubuntu) Importance: Undecided Status: New ** Changed in: keystone (Ubuntu) Assignee: (unassigned) => James Page (james-page) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1832766 Title: LDAP group_members_are_ids = false fails in Rocky/Stein Status in OpenStack Identity (keystone): New Status in keystone package in Ubuntu: New Bug description: I'm running into an interesting issue with the group_members_are_ids: false Per the documentation, this means that the group's group_member_attribute values (in my case "member") are understood to be full LDAP DNs to the user records. https://docs.openstack.org/keystone/queens/_modules/keystone/identity/backends/ldap/core.html#Identity.list_users_in_group Unfortunately, the call to self._transform_group_member_ids(group_members) is calling to self.user._dn_to_id(user_key) where user_key would be a string like "uid=dfreiberger,ou=users,dc=mysite,dc=com". This code is here: https://docs.openstack.org/keystone/queens/_modules/keystone/identity/backends/ldap/core.html#Identity.list_users_in_group This calls out to: return ldap.dn.str2dn(dn)[0][0][1] https://github.com/openstack/keystone/blob/stable/rocky/keystone/identity/backends/ldap/common.py#L1298 from: https://www.python-ldap.org/en/latest/reference/ldap- dn.html#ldap.dn.str2dn, this should spit out something like: >>> ldap.dn.str2dn('cn=Michael Str\xc3\xb6der,dc=example,dc=com',flags=ldap.DN_FORMAT_LDAPV3) [[('cn', 'Michael Str\xc3\xb6der', 4)], [('dc', 'example', 1)], [('dc', 'com', 1)]] Which would then mean the return from _dn_to_id(user_key) would be "Michael Str\xc3\xb6der" or "dfreiberger" in my example user_key above. 
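The failure mode described above can be illustrated without a live LDAP backend. The helper below is a simplified, hypothetical stand-in for keystone's _dn_to_id (the real code uses ldap.dn.str2dn(dn)[0][0][1]); it shows that the returned value is whatever the first RDN happens to be, regardless of the configured user_id_attribute or user_name_attribute:

```python
def dn_first_rdn_value(dn: str) -> str:
    """Crude stand-in for _dn_to_id: return the value of the first RDN.

    Real keystone code uses ldap.dn.str2dn(dn)[0][0][1]; this naive split
    is for illustration only and ignores escaped commas in DN values.
    """
    first_rdn = dn.split(',', 1)[0]
    _attr, _, value = first_rdn.partition('=')
    return value

# A uid-first DN happens to line up with user_name_attribute = uid:
print(dn_first_rdn_value('uid=dfreiberger,ou=users,dc=mysite,dc=com'))
# -> dfreiberger

# A cn-first DN returns the CN, not the configured attribute:
print(dn_first_rdn_value('cn=Drew Freiberger,dc=mysite,dc=com'))
# -> Drew Freiberger
```

This makes the broken assumption concrete: the code only works when the first field of the DN happens to be the attribute keystone was configured to treat as the user ID.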
Ultimately, this means that either group_members_are_ids = false will return a user_id of the first attribute value within the DN string, even if the first field of the DN is not the actual user_name_attribute or user_id_attribute. If group_members_are_ids = true, it will return uidNumbers, which works fine with the knock on calls in the identity backend. - The problem is that the _transform_group_member_ids has to be returning a user ID such as the typical hex ID of a user in the keystone database, not the username of the user. - With group_members_are_ids, uidNumber is returned by the function, but with group_members_are_ids false, usernames are returned by the function. - Also, the _dn_to_id(user_key) from the group only returns the first entry in the DN, not the actual user_id_attribute or user_name_attribute field of the object. This requires a broken assumption that the user_id_attribute field called out in the ldap client config is also the first field of the distinguished name. - This would bug out if, say, your group had a member attribute/value pair of: member="cn=Drew Freiberger,dc=mysite,dc=com", _dn_to_id would return "Drew Freiberger" as my user_id, however, I may have told ldap that the user_name_attribute is uid, and inside my ldap record of "dn=cn=Drew Freiberger,dc=mysite,dc=com", there's a uid=dfreiberger field showing my login name is dfreiberger which is what _dn_to_id should return, or perhaps _dn_to_id should return my uidNumber=12345 attribute to actually function as expected. To manage notifications about this bug go to: https://bugs.launchpad.net/keystone/+bug/1832766/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1832265] Re: keystone LDAP integration in rocky not working for RBAC rules or token auth
Raising a bug task against keystone as I think that we may need to expand the decode coverage in token_formatters. ** Changed in: keystone (Ubuntu) Assignee: (unassigned) => James Page (james-page) ** Changed in: keystone (Ubuntu) Importance: Undecided => High ** Changed in: keystone (Ubuntu) Status: New => In Progress ** Changed in: charm-keystone-ldap Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1832265 Title: keystone LDAP integration in rocky not working for RBAC rules or token auth Status in OpenStack Keystone LDAP integration: Invalid Status in OpenStack Identity (keystone): New Status in keystone package in Ubuntu: In Progress Bug description: When using an LDAP domain user on a bionic-rocky cloud within horizon, we are unable to see the projects listed in the project selection drop-down, and are unable to query resources from any projects to which we are assigned the role Member. 
It appears that the following log entries in keystone may be helpful to troubleshooting this issue: (keystone.middleware.auth): 2019-06-10 19:47:02,700 DEBUG RBAC: auth_context: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} (keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG Dispatching request to legacy mapper: /v3/users (keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG SCRIPT_NAME: `/v3`, PATH_INFO: `/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects` (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Matched GET /users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Route path: '/users/{user_id}/projects', defaults: {'action': 'list_user_projects', 'controller': } (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Match dict: {'user_id': 'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'action': 'list_user_projects', 'controller': } (keystone.common.wsgi): 2019-06-10 19:47:02,700 INFO GET https://keystone.mysite:5000/v3/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects (keystone.common.controller): 2019-06-10 19:47:02,700 DEBUG RBAC: Adding query filter params () (keystone.common.authorization): 2019-06-10 19:47:02,700 DEBUG RBAC: Authorizing identity:list_user_projects(user_id=d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4) (keystone.policy.backends.rules): 2019-06-10 19:47:02,701 DEBUG enforce 
identity:list_user_projects: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} (keystone.common.wsgi): 2019-06-10 19:47:02,702 WARNING You are not authorized to perform the requested action: identity:list_user_projects. It actually appears elsewhere in the keystone.log that there is a string which has encapsulated bytecode data in it (or vice versa). (keystone.common.wsgi): 2019-06-10 19:46:59,019 INFO POST https://keystone.mysite:5000/v3/auth/tokens (sqlalchemy.orm.path_registry): 2019-06-10 19:46:59,021 DEBUG set 'memoized_setups' on path 'EntityRegistry((,))' to '{}' (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,021 DEBUG Connection checked out from pool (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,024 DEBUG Connection being returned to pool (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,024 DEBUG Connection rollback-on-return, via agent (keystone.auth.core): 2019-06-10 19:46:59,025 DEBUG MFA Rules not processed for user `b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4'`. Rule list: `[]` (Enabled: `True`). (keystone.common.wsgi): 2019-06-10 19:46:59,025 ERROR a bytes-like object is required, not 'str' Traceback (most recent call last): File "/usr/lib/python3/dist-packages/keystone/common/wsgi.py", line 148, in __call__ result = method(req, **params) File
[Yahoo-eng-team] [Bug 1832265] Re: keystone LDAP integration in rocky not working for RBAC rules or token auth
Tl;DR - I think the disassemble functions need to deal with the encoding better under py3; if the user_id gets into keystone decoded, then it should be dealt with correctly throughout. ** Also affects: keystone Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1832265 Title: keystone LDAP integration in rocky not working for RBAC rules or token auth Status in OpenStack Keystone LDAP integration: Invalid Status in OpenStack Identity (keystone): New Status in keystone package in Ubuntu: In Progress Bug description: When using an LDAP domain user on a bionic-rocky cloud within horizon, we are unable to see the projects listed in the project selection drop-down, and are unable to query resources from any projects to which we are assigned the role Member. It appears that the following log entries in keystone may be helpful to troubleshooting this issue: (keystone.middleware.auth): 2019-06-10 19:47:02,700 DEBUG RBAC: auth_context: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} (keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG Dispatching request to legacy mapper: /v3/users (keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG SCRIPT_NAME: `/v3`, PATH_INFO: `/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects` (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Matched GET 
/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Route path: '/users/{user_id}/projects', defaults: {'action': 'list_user_projects', 'controller': } (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Match dict: {'user_id': 'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'action': 'list_user_projects', 'controller': } (keystone.common.wsgi): 2019-06-10 19:47:02,700 INFO GET https://keystone.mysite:5000/v3/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects (keystone.common.controller): 2019-06-10 19:47:02,700 DEBUG RBAC: Adding query filter params () (keystone.common.authorization): 2019-06-10 19:47:02,700 DEBUG RBAC: Authorizing identity:list_user_projects(user_id=d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4) (keystone.policy.backends.rules): 2019-06-10 19:47:02,701 DEBUG enforce identity:list_user_projects: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} (keystone.common.wsgi): 2019-06-10 19:47:02,702 WARNING You are not authorized to perform the requested action: identity:list_user_projects. It actually appears elsewhere in the keystone.log that there is a string which has encapsulated bytecode data in it (or vice versa). 
(keystone.common.wsgi): 2019-06-10 19:46:59,019 INFO POST https://keystone.mysite:5000/v3/auth/tokens (sqlalchemy.orm.path_registry): 2019-06-10 19:46:59,021 DEBUG set 'memoized_setups' on path 'EntityRegistry((,))' to '{}' (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,021 DEBUG Connection checked out from pool (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,024 DEBUG Connection being returned to pool (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,024 DEBUG Connection rollback-on-return, via agent (keystone.auth.core): 2019-06-10 19:46:59,025 DEBUG MFA Rules not processed for user `b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4'`. Rule list: `[]` (Enabled: `True`). (keystone.common.wsgi): 2019-06-10 19:46:59,025 ERROR a bytes-like object is required, not 'str' Traceback (most recent call last): File "/usr/lib/python3/dist-packages/keystone/common/wsgi.py", line 148, in __call__ result = method(req, **params) File "/usr/lib/python3/dist-packages/keystone/auth/controllers.py", line 102, in authenticate_for_token app_cred_id=app_cred_id, parent_audit_id=token_audit_id) File
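The logs above show a bytes user_id (b'd4fb...') leaking into the auth context while the rest of keystone expects str, which is exactly the shape of the "a bytes-like object is required, not 'str'" failure. A minimal sketch of the kind of normalization being suggested, using a hypothetical helper name (this is not keystone's actual API):

```python
def to_text(value, encoding='utf-8'):
    """Normalize a possibly-bytes identifier to str (hypothetical helper).

    Applied wherever the token disassemble path hands back identifiers,
    this keeps dict lookups and policy checks keyed consistently on str.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

user_id = to_text(
    b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4')
print(user_id ==
      'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4')
# -> True
```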
[Yahoo-eng-team] [Bug 1832021] Re: Checksum drop of metadata traffic on isolated provider networks
** Also affects: charm-neutron-openvswitch Importance: Undecided Status: New ** This bug is no longer a duplicate of bug 1722584 [SRU] Return traffic from metadata service may get dropped by hypervisor due to wrong checksum ** Changed in: neutron Status: New => Incomplete -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1832021 Title: Checksum drop of metadata traffic on isolated provider networks Status in OpenStack neutron-openvswitch charm: New Status in neutron: Incomplete Bug description: When an isolated network uses provider networks for tenants (meaning without virtual routers: DVR or a network node), metadata access occurs in the qdhcp ip netns rather than the qrouter netns. The following options are set in the dhcp_agent.ini file: force_metadata = True enable_isolated_metadata = True VMs on the provider tenant network are unable to access metadata as packets are dropped due to checksum errors. When we added the following in the qdhcp netns, VMs regained access to metadata: iptables -t mangle -A OUTPUT -o ns-+ -p tcp --sport 80 -j CHECKSUM --checksum-fill It seems this setting was recently removed from the qrouter netns [0] but it never existed in the qdhcp to begin with. [0] https://review.opendev.org/#/c/654645/ Related LP Bug #1831935 See https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1831935/comments/10 To manage notifications about this bug go to: https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1832021/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
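For reference, the workaround quoted in the report is run from the host inside the DHCP namespace. This is an illustrative sketch only; the namespace name (qdhcp- followed by the network UUID) is an assumption about the deployment, not taken from the bug:

```shell
# Illustrative only: apply the checksum-fill rule inside a qdhcp namespace.
# Replace <network-uuid> with the UUID of the affected provider network.
ip netns exec qdhcp-<network-uuid> \
  iptables -t mangle -A OUTPUT -o ns-+ -p tcp --sport 80 \
  -j CHECKSUM --checksum-fill
```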
[Yahoo-eng-team] [Bug 1826419] Re: dhcp agent configured with mismatching domain and host entries
This bug was fixed in the package neutron - 2:14.0.0-0ubuntu2~cloud0 --- neutron (2:14.0.0-0ubuntu2~cloud0) bionic-train; urgency=medium . * New update for the Ubuntu Cloud Archive. . neutron (2:14.0.0-0ubuntu2) eoan; urgency=medium . * d/p/bug1826419.patch: Cherry pick fix to revert incorrect changes to internal DNS behaviour (LP: #1826419). ** Changed in: cloud-archive/train Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1826419 Title: dhcp agent configured with mismatching domain and host entries Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Committed Status in Ubuntu Cloud Archive rocky series: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in Ubuntu Cloud Archive train series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Bionic: Fix Released Status in neutron source package in Cosmic: Fix Released Status in neutron source package in Disco: Fix Released Status in neutron source package in Eoan: Fix Released Bug description: Related bug 1774710 and bug 1580588 The neutron-dhcp-agent in OpenStack >= Queens makes use of the dns_domain value set on a network to configure the '--domain' parameter of the dnsmasq instance that supports it; at the same time, neutron makes use of CONF.dns_domain when creating dns_assignments for ports - this results in a hosts file for the dnsmasq instance which uses CONF.dns_domain and a --domain parameter of network.dns_domain which do not match. This results in a search path on instances booted attached to the network which is inconsistent with the internal DNS entries that dnsmasq responds with: root@bionic-045546-2:~# host 192.168.21.222 222.21.168.192.in-addr.arpa domain name pointer bionic-045546-2.jamespage.internal. 
root@bionic-045546-2:~# host bionic-045546-2 bionic-045546-2.designate.local has address 192.168.21.222 In the above example: CONF.dns_domain = jamespage.internal. network.dns_domain = designate.local. Based on previous discussion in bug 1580588 I think that the dns_domain value for a network was intended for use with external DNS integration such as that provided by Designate. The change made under commit: https://opendev.org/openstack/neutron/commit/137a6d61053 appears to break this assumption, producing somewhat inconsistent behaviour in the dnsmasq instance for the network. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1826419/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
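The mismatch can be sketched as follows; the dnsmasq invocation and hosts-file line are illustrative reconstructions from the values in the report, not quotes from an affected system:

```shell
# dnsmasq is launched with the per-network domain (network.dns_domain):
#   dnsmasq ... --domain=designate.local ...
# while its generated hosts-file entry is built from CONF.dns_domain:
#   fa:16:3e:aa:bb:cc,bionic-045546-2.jamespage.internal,192.168.21.222
# Result: forward lookups of the short name get qualified with
# designate.local, but reverse (PTR) answers come back under
# jamespage.internal, exactly the host/PTR mismatch shown in the report.
```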
[Yahoo-eng-team] [Bug 1826419] Re: dhcp agent configured with mismatching domain and host entries
This bug was fixed in the package neutron - 2:13.0.2-0ubuntu3.3~cloud0 --- neutron (2:13.0.2-0ubuntu3.3~cloud0) bionic-rocky; urgency=medium . * New update for the Ubuntu Cloud Archive. . neutron (2:13.0.2-0ubuntu3.3) cosmic; urgency=medium . * d/p/bug1826419.patch: Cherry pick fix to revert incorrect changes to internal DNS behaviour (LP: #1826419). ** Changed in: cloud-archive/rocky Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1826419 Title: dhcp agent configured with mismatching domain and host entries Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Committed Status in Ubuntu Cloud Archive rocky series: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in Ubuntu Cloud Archive train series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Bionic: Fix Released Status in neutron source package in Cosmic: Fix Released Status in neutron source package in Disco: Fix Released Status in neutron source package in Eoan: Fix Released Bug description: Related bug 1774710 and bug 1580588 The neutron-dhcp-agent in OpenStack >= Queens makes use of the dns_domain value set on a network to configure the '--domain' parameter of the dnsmasq instance that supports it; at the same time, neutron makes use of CONF.dns_domain when creating dns_assignments for ports - this results in a hosts file for the dnsmasq instance which uses CONF.dns_domain and a --domain parameter of network.dns_domain which do not match. 
This results in a search path on instances booted attached to the network which is inconsistent with the internal DNS entries that dnsmasq responds with: root@bionic-045546-2:~# host 192.168.21.222 222.21.168.192.in-addr.arpa domain name pointer bionic-045546-2.jamespage.internal. root@bionic-045546-2:~# host bionic-045546-2 bionic-045546-2.designate.local has address 192.168.21.222 In the above example: CONF.dns_domain = jamespage.internal. network.dns_domain = designate.local. Based on previous discussion in bug 1580588 I think that the dns_domain value for a network was intended for use with external DNS integration such as that provided by Designate. The change made under commit: https://opendev.org/openstack/neutron/commit/137a6d61053 appears to break this assumption, producing somewhat inconsistent behaviour in the dnsmasq instance for the network. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1826419/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1826419] Re: dhcp agent configured with mismatching domain and host entries
This bug was fixed in the package neutron - 2:14.0.0-0ubuntu2~cloud0 --- neutron (2:14.0.0-0ubuntu2~cloud0) bionic-train; urgency=medium . * New update for the Ubuntu Cloud Archive. . neutron (2:14.0.0-0ubuntu2) eoan; urgency=medium . * d/p/bug1826419.patch: Cherry pick fix to revert incorrect changes to internal DNS behaviour (LP: #1826419). ** Changed in: cloud-archive Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1826419 Title: dhcp agent configured with mismatching domain and host entries Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Committed Status in Ubuntu Cloud Archive rocky series: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in Ubuntu Cloud Archive train series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Bionic: Fix Released Status in neutron source package in Cosmic: Fix Released Status in neutron source package in Disco: Fix Released Status in neutron source package in Eoan: Fix Released Bug description: Related bug 1774710 and bug 1580588 The neutron-dhcp-agent in OpenStack >= Queens makes use of the dns_domain value set on a network to configure the '--domain' parameter of the dnsmasq instance that supports it; at the same time, neutron makes use of CONF.dns_domain when creating dns_assignments for ports - this results in a hosts file for the dnsmasq instance which uses CONF.dns_domain and a --domain parameter of network.dns_domain which do not match. This results in a search path on instances booted attached to the network which is inconsistent with the internal DNS entries that dnsmasq responds with: root@bionic-045546-2:~# host 192.168.21.222 222.21.168.192.in-addr.arpa domain name pointer bionic-045546-2.jamespage.internal. 
root@bionic-045546-2:~# host bionic-045546-2 bionic-045546-2.designate.local has address 192.168.21.222 In the above example: CONF.dns_domain = jamespage.internal. network.dns_domain = designate.local. Based on previous discussion in bug 1580588 I think that the dns_domain value for a network was intended for use with external DNS integration such as that provided by Designate. The change made under commit: https://opendev.org/openstack/neutron/commit/137a6d61053 appears to break this assumption, producing somewhat inconsistent behaviour in the dnsmasq instance for the network. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1826419/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1832210] Re: incorrect decode of log prefix under python 3
** Also affects: neutron-fwaas (Ubuntu Eoan) Importance: Undecided Status: New ** Also affects: neutron-fwaas (Ubuntu Disco) Importance: Undecided Status: New ** Also affects: neutron-fwaas (Ubuntu Cosmic) Importance: Undecided Status: New ** Description changed: Under Python 3, the prefix of a firewall log message is not correctly - decode: + decoded "b'10612530182266949194'": 2019-06-10 09:14:34 Unknown cookie packet_in pkt=ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120) 2019-06-10 09:14:34 {'prefix': "b'10612530182266949194'", 'msg': "ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120)"} 2019-06-10 09:14:34 {'0bf81ded-bf94-437d-ad49-063bba9be9bb': [, ]} This results in the firewall log driver not being able to map the message to the associated port and log resources in neutron resulting in the 'unknown cookie packet_in' warning message. -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. 
https://bugs.launchpad.net/bugs/1832210 Title: incorrect decode of log prefix under python 3 Status in neutron: In Progress Status in neutron-fwaas package in Ubuntu: New Status in neutron-fwaas source package in Cosmic: New Status in neutron-fwaas source package in Disco: New Status in neutron-fwaas source package in Eoan: New Bug description: Under Python 3, the prefix of a firewall log message is not correctly decoded "b'10612530182266949194'": 2019-06-10 09:14:34 Unknown cookie packet_in pkt=ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120) 2019-06-10 09:14:34 {'prefix': "b'10612530182266949194'", 'msg': "ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120)"} 2019-06-10 09:14:34 {'0bf81ded-bf94-437d-ad49-063bba9be9bb': [, ]} This results in the firewall log driver not being able to map the message to the associated port and log resources in neutron resulting in the 'unknown cookie packet_in' warning message. 
To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1832210/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
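The underlying Python 3 bug is a bytes prefix being stringified (str(b'...')) instead of decoded, so the resulting key can never match the str-keyed cookie map. A minimal sketch of the corrected handling; the helper and dictionary names are illustrative, not the neutron-fwaas API:

```python
def extract_prefix(raw):
    """Decode a raw log prefix to text.

    str(b'...') keeps the b'' wrapper in the result, which is why the
    driver logs 'unknown cookie packet_in' under py3.
    """
    if isinstance(raw, bytes):
        return raw.decode('utf-8')
    return raw

log_resources = {'10612530182266949194': 'port/log mapping'}

broken_key = str(b'10612530182266949194')   # "b'10612530182266949194'"
print(broken_key in log_resources)          # -> False: cookie not found

fixed_key = extract_prefix(b'10612530182266949194')
print(fixed_key in log_resources)           # -> True: cookie maps correctly
```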
[Yahoo-eng-team] [Bug 1832210] Re: incorrect decode of log prefix under python 3
Illustrated:

>>> str(b'10612530182266949194')
"b'10612530182266949194'"
>>> b'10612530182266949194'.decode('UTF-8')
'10612530182266949194'

** Also affects: neutron Importance: Undecided Status: New

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.

https://bugs.launchpad.net/bugs/1832210

Title: incorrect decode of log prefix under python 3

Status in neutron: New
Status in neutron-fwaas package in Ubuntu: New

Bug description: Under Python 3, the prefix of a firewall log message is not correctly decoded:

2019-06-10 09:14:34 Unknown cookie packet_in pkt=ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120)

2019-06-10 09:14:34 {'prefix': "b'10612530182266949194'", 'msg': "ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120)"}

2019-06-10 09:14:34 {'0bf81ded-bf94-437d-ad49-063bba9be9bb': [, ]}

This results in the firewall log driver not being able to map the message to the associated port and log resources in neutron, resulting in the 'unknown cookie packet_in' warning message.
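The str()-vs-decode() behaviour illustrated above can be reduced to a few lines. This is an illustrative sketch only, not the actual neutron-fwaas code; the helper names are invented for the example:

```python
# Minimal sketch of the Python 3 log-prefix bug and its fix.
# Under Python 2, str() on a byte string returned the text itself; under
# Python 3, str(b'...') embeds the b'' repr, so the resulting prefix no
# longer matches the cookie registered for the port/log resources.

def prefix_broken(raw: bytes) -> str:
    # what the log driver effectively did under Python 3
    return str(raw)

def prefix_fixed(raw: bytes) -> str:
    # explicitly decode the bytes before using them as a lookup key
    return raw.decode('UTF-8')

raw = b'10612530182266949194'
assert prefix_broken(raw) == "b'10612530182266949194'"  # unusable as a key
assert prefix_fixed(raw) == '10612530182266949194'      # matches the cookie
```

The broken form is why the driver logs 'Unknown cookie packet_in': the mangled prefix string never matches any registered cookie.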
[Yahoo-eng-team] [Bug 1826419] Re: dhcp agent configured with mismatching domain and host entries
Ubuntu SRU information

[Impact] Use of Neutron internal DNS resolution for instances attached to the same project network is inconsistent: the actual hostname entries in the dnsmasq hosts file are built from configuration options, while the network 'dns_domain' attribute is used to set the search path for the same dnsmasq instance.

[Test Case]
Deploy OpenStack with Neutron internal DNS support enabled.
Configure neutron with a dns_domain of 'testcase.internal'.
Set a dns_domain attribute on the project network ('designate.local').
Boot an instance attached to the network.
DNS resolution within the host will be asymmetric in terms of the actual DNS domain used:

root@bionic-045546-2:~# host 192.168.21.222
222.21.168.192.in-addr.arpa domain name pointer bionic-045546-2.testcase.internal.
root@bionic-045546-2:~# host bionic-045546-2
bionic-045546-2.designate.local has address 192.168.21.222

which should be:

root@bionic-045546-2:~# host bionic-045546-2
bionic-045546-2.testcase.internal has address 192.168.21.222

[Regression Potential] Minimal; the proposed fix actually reverts changes in Neutron which altered the behaviour of the internal DNS support in Neutron incorrectly.

[Other Info] This change in behaviour has been discussed at the upstream Neutron IRC meeting with consensus that the behaviour changes are incorrect and should be reverted.

** Also affects: cloud-archive Importance: Undecided Status: New
** Also affects: cloud-archive/queens Importance: Undecided Status: New
** Also affects: cloud-archive/train Importance: Undecided Status: New
** Also affects: cloud-archive/rocky Importance: Undecided Status: New
** Also affects: cloud-archive/stein Importance: Undecided Status: New

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1826419

Title: dhcp agent configured with mismatching domain and host entries

Status in Ubuntu Cloud Archive: New
Status in Ubuntu Cloud Archive queens series: New
Status in Ubuntu Cloud Archive rocky series: New
Status in Ubuntu Cloud Archive stein series: New
Status in Ubuntu Cloud Archive train series: New
Status in neutron: In Progress
Status in neutron package in Ubuntu: Fix Released
Status in neutron source package in Bionic: Triaged
Status in neutron source package in Cosmic: Triaged
Status in neutron source package in Disco: Triaged
Status in neutron source package in Eoan: Fix Released

Bug description: Related bug 1774710 and bug 1580588. The neutron-dhcp-agent in OpenStack >= Queens makes use of the dns_domain value set on a network to configure the '--domain' parameter of the dnsmasq instance that supports it; at the same time, neutron makes use of CONF.dns_domain when creating dns_assignments for ports. This results in a hosts file for the dnsmasq instance which uses CONF.dns_domain and a --domain parameter of network.dns_domain which do not match, and consequently in a search path on instances booted attached to the network which is inconsistent with the internal DNS entries that dnsmasq responds with:

root@bionic-045546-2:~# host 192.168.21.222
222.21.168.192.in-addr.arpa domain name pointer bionic-045546-2.jamespage.internal.
root@bionic-045546-2:~# host bionic-045546-2
bionic-045546-2.designate.local has address 192.168.21.222

In the above example: CONF.dns_domain = jamespage.internal. network.dns_domain = designate.local.

Based on previous discussion in bug 1580588 I think that the dns_domain value for a network was intended for use for external DNS integration such as that provided by Designate. The change made in commit https://opendev.org/openstack/neutron/commit/137a6d61053 appears to break this assumption, producing somewhat inconsistent behaviour in the dnsmasq instance for the network.
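The asymmetry reported above can be sketched in a few lines. This is an illustrative sketch, not neutron's actual code; the helper names and the dict-based network are invented for the example:

```python
# Illustrative sketch of why the dnsmasq configuration becomes asymmetric:
# the hosts-file FQDN is built from the deployment-wide CONF.dns_domain,
# while the --domain search path comes from the network's dns_domain
# attribute. Values follow the example in the bug report.

CONF_DNS_DOMAIN = 'jamespage.internal.'       # deployment-wide setting
network = {'dns_domain': 'designate.local.'}  # per-network attribute

def hosts_file_fqdn(hostname: str) -> str:
    # dns_assignment for the port uses the global CONF value
    return '%s.%s' % (hostname, CONF_DNS_DOMAIN.rstrip('.'))

def dnsmasq_domain_arg(net: dict) -> str:
    # the DHCP agent passes the network attribute to dnsmasq's --domain
    return '--domain=%s' % net['dns_domain'].rstrip('.')

assert hosts_file_fqdn('bionic-045546-2') == 'bionic-045546-2.jamespage.internal'
assert dnsmasq_domain_arg(network) == '--domain=designate.local'
# The mismatch between these two values is exactly the asymmetry shown by
# the forward vs reverse `host` lookups in the bug description.
```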
To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1826419/+subscriptions
[Yahoo-eng-team] [Bug 1826419] Re: dhcp agent configured with mismatching domain and host entries
** Also affects: neutron (Ubuntu Disco) Importance: Undecided Status: New
** Also affects: neutron (Ubuntu Eoan) Importance: Undecided Status: New
** Also affects: neutron (Ubuntu Bionic) Importance: Undecided Status: New
** Also affects: neutron (Ubuntu Cosmic) Importance: Undecided Status: New
** Changed in: neutron (Ubuntu Bionic) Status: New => Triaged

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.

https://bugs.launchpad.net/bugs/1826419

Title: dhcp agent configured with mismatching domain and host entries

Status in neutron: In Progress
Status in neutron package in Ubuntu: New
Status in neutron source package in Bionic: Triaged
Status in neutron source package in Cosmic: New
Status in neutron source package in Disco: New
Status in neutron source package in Eoan: New

Bug description: Related bug 1774710 and bug 1580588. The neutron-dhcp-agent in OpenStack >= Queens makes use of the dns_domain value set on a network to configure the '--domain' parameter of the dnsmasq instance that supports it; at the same time, neutron makes use of CONF.dns_domain when creating dns_assignments for ports. This results in a hosts file for the dnsmasq instance which uses CONF.dns_domain and a --domain parameter of network.dns_domain which do not match, and consequently in a search path on instances booted attached to the network which is inconsistent with the internal DNS entries that dnsmasq responds with:

root@bionic-045546-2:~# host 192.168.21.222
222.21.168.192.in-addr.arpa domain name pointer bionic-045546-2.jamespage.internal.
root@bionic-045546-2:~# host bionic-045546-2
bionic-045546-2.designate.local has address 192.168.21.222

In the above example: CONF.dns_domain = jamespage.internal. network.dns_domain = designate.local.

Based on previous discussion in bug 1580588 I think that the dns_domain value for a network was intended for use for external DNS integration such as that provided by Designate.
The change made in commit https://opendev.org/openstack/neutron/commit/137a6d61053 appears to break this assumption, producing somewhat inconsistent behaviour in the dnsmasq instance for the network.

To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1826419/+subscriptions
[Yahoo-eng-team] [Bug 1826419] [NEW] dhcp agent configured with mismatching domain and host entries
Public bug reported: Related bug 1774710 and bug 1580588. The neutron-dhcp-agent in OpenStack >= Queens makes use of the dns_domain value set on a network to configure the '--domain' parameter of the dnsmasq instance that supports it; at the same time, neutron makes use of CONF.dns_domain when creating dns_assignments for ports. This results in a hosts file for the dnsmasq instance which uses CONF.dns_domain and a --domain parameter of network.dns_domain which do not match, and consequently in a search path on instances booted attached to the network which is inconsistent with the internal DNS entries that dnsmasq responds with:

root@bionic-045546-2:~# host 192.168.21.222
222.21.168.192.in-addr.arpa domain name pointer bionic-045546-2.jamespage.internal.
root@bionic-045546-2:~# host bionic-045546-2
bionic-045546-2.designate.local has address 192.168.21.222

In the above example: CONF.dns_domain = jamespage.internal. network.dns_domain = designate.local.

Based on previous discussion in bug 1580588 I think that the dns_domain value for a network was intended for use for external DNS integration such as that provided by Designate. The change made in commit https://opendev.org/openstack/neutron/commit/137a6d61053 appears to break this assumption, producing somewhat inconsistent behaviour in the dnsmasq instance for the network.

** Affects: neutron Importance: Undecided Status: New

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1826419

Title: dhcp agent configured with mismatching domain and host entries

Status in neutron: New

Bug description: Related bug 1774710 and bug 1580588. The neutron-dhcp-agent in OpenStack >= Queens makes use of the dns_domain value set on a network to configure the '--domain' parameter of the dnsmasq instance that supports it; at the same time, neutron makes use of CONF.dns_domain when creating dns_assignments for ports. This results in a hosts file for the dnsmasq instance which uses CONF.dns_domain and a --domain parameter of network.dns_domain which do not match, and consequently in a search path on instances booted attached to the network which is inconsistent with the internal DNS entries that dnsmasq responds with:

root@bionic-045546-2:~# host 192.168.21.222
222.21.168.192.in-addr.arpa domain name pointer bionic-045546-2.jamespage.internal.
root@bionic-045546-2:~# host bionic-045546-2
bionic-045546-2.designate.local has address 192.168.21.222

In the above example: CONF.dns_domain = jamespage.internal. network.dns_domain = designate.local.

Based on previous discussion in bug 1580588 I think that the dns_domain value for a network was intended for use for external DNS integration such as that provided by Designate. The change made in commit https://opendev.org/openstack/neutron/commit/137a6d61053 appears to break this assumption, producing somewhat inconsistent behaviour in the dnsmasq instance for the network.

To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1826419/+subscriptions
[Yahoo-eng-team] [Bug 1774710] Re: DHCP agent doesn't do anything with a network's dns_domain attribute
** Changed in: neutron (Ubuntu) Status: New => Fix Released

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.

https://bugs.launchpad.net/bugs/1774710

Title: DHCP agent doesn't do anything with a network's dns_domain attribute

Status in neutron: Fix Released
Status in neutron package in Ubuntu: Fix Released

Bug description:
0) Set up Neutron with ML2/OVS or LB, or anything that uses the DHCP agent
1) Create a network with dns_domain
2) Boot a VM on it

Notice the VM doesn't have the DNS domain in its /etc/resolv.conf. In short, per-network DNS domains are not respected by the DHCP agent. The dns_domain attribute is persisted in the Neutron DB and passed on to the DHCP agent via RPC, but the agent doesn't do anything with it. Versions: Master and all previous versions. WIP fix is in https://review.openstack.org/#/c/571546.

To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1774710/+subscriptions
[Yahoo-eng-team] [Bug 1580588] Re: [RFE] use network's dns_domain to generate dns_assignment
** Also affects: neutron (Ubuntu) Importance: Undecided Status: New

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.

https://bugs.launchpad.net/bugs/1580588

Title: [RFE] use network's dns_domain to generate dns_assignment

Status in neutron: Confirmed
Status in neutron package in Ubuntu: New

Bug description: Problem: currently, the port's dns_assignment is generated by combining the dns_name and conf.dns_domain even if the dns_domain of the port's network is given. Expectation: generate the dns_assignment according to the dns_domain of the port's network, which will scope the DNS name by network instead of by each neutron deployment.

To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1580588/+subscriptions
[Yahoo-eng-team] [Bug 1744079] Re: [SRU] disk over-commit still not correctly calculated during live migration
This bug was fixed in the package nova - 2:16.1.7-0ubuntu1~cloud2

---
nova (2:16.1.7-0ubuntu1~cloud2) xenial-pike; urgency=medium
.
* d/p/(re)fix-disk-size-during-live-migration-with-disk-over-commit.patch: Cherry-picked from upstream stable/pike branch to ensure the disk size check is correctly calculated on the destination host for live migration with disk over-commit (LP: #1708572) (LP: #1744079).

** Changed in: cloud-archive/pike Status: Fix Committed => Fix Released

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).

https://bugs.launchpad.net/bugs/1744079

Title: [SRU] disk over-commit still not correctly calculated during live migration

Status in Ubuntu Cloud Archive: Fix Committed
Status in Ubuntu Cloud Archive mitaka series: New
Status in Ubuntu Cloud Archive ocata series: Fix Released
Status in Ubuntu Cloud Archive pike series: Fix Released
Status in Ubuntu Cloud Archive queens series: Fix Released
Status in Ubuntu Cloud Archive rocky series: Fix Released
Status in OpenStack Compute (nova): Fix Released
Status in OpenStack Compute (nova) queens series: In Progress
Status in OpenStack Compute (nova) rocky series: In Progress
Status in nova package in Ubuntu: Fix Released
Status in nova source package in Xenial: Triaged
Status in nova source package in Bionic: Fix Released
Status in nova source package in Cosmic: Fix Released
Status in nova source package in Disco: Fix Released

Bug description: [Impact] nova compares disk space with the disk_available_least field, which can be negative due to overcommit. So the migration may fail with a "Migration pre-check error: Unable to migrate dfcd087a-5dff-439d-8875-2f702f081539: Disk of instance is too large(available on destination host:-3221225472 < need:22806528)" when trying a migration to another compute that has plenty of free space on its disk.

[Test Case] Deploy openstack environment.
Make sure there is a negative disk_available_least and an adequate free_disk_gb on one test compute node, then migrate a VM to it with disk-overcommit (openstack server migrate --live --block-migration --disk-overcommit ). You will see the above migration pre-check error.

This is the formula to compute disk_available_least and free_disk_gb:

disk_free_gb = disk_info_dict['free']
disk_over_committed = self._get_disk_over_committed_size_total()
available_least = disk_free_gb * units.Gi - disk_over_committed
data['disk_available_least'] = available_least / units.Gi

The following command can be used to query the value of disk_available_least: nova hypervisor-show |grep disk

Steps to Reproduce:
1. set disk_allocation_ratio config option > 1.0
2. qemu-img resize cirros-0.3.0-x86_64-disk.img +40G
3. glance image-create --disk-format qcow2 ...
4. boot VMs based on resized image
5. we see disk_available_least becomes negative

[Regression Potential] Minimal - we're just changing the following line:

disk_available_gb = dst_compute_info['disk_available_least']

to the following code:

if disk_over_commit:
    disk_available_gb = dst_compute_info['free_disk_gb']
else:
    disk_available_gb = dst_compute_info['disk_available_least']

When overcommit is enabled, disk_available_least can be negative, so we should use free_disk_gb instead, by backporting the following two fixes:

https://git.openstack.org/cgit/openstack/nova/commit/?id=e097c001c8e0efe8879da57264fcb7bdfdf2
https://git.openstack.org/cgit/openstack/nova/commit/?id=e2cc275063658b23ed88824100919a6dfccb760d

This is the code path for check_can_live_migrate_destination: _migrate_live(os-migrateLive API, migrate_server.py) -> migrate_server -> _live_migrate -> _build_live_migrate_task -> _call_livem_checks_on_host -> check_can_live_migrate_destination

BTW, Red Hat also has the same bug - https://bugzilla.redhat.com/show_bug.cgi?id=1477706

[Original Bug Report] Change I8a705114d47384fcd00955d4a4f204072fed57c2 (written by me...
sigh) addressed a bug which prevented live migration to a target host with overcommitted disk when made with microversion <2.25. It achieved this, but the fix is still not correct. We now do: if disk_over_commit: disk_available_gb = dst_compute_info['local_gb'] Unfortunately local_gb is *total* disk, not available disk. We actually want free_disk_gb. Fun fact: due to the way we calculate this for filesystems, without taking into account reserved space, this can also be negative. The test we're currently running is: could we fit this guest's allocated disks on the target if the target disk was empty. This is at least better than it was before, as we don't spuriously fail early. In fact, we're effectively disabling a test which is disabled for microversion >=2.25
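The corrected pre-check described above can be sketched as a standalone function. This is an illustrative sketch, not nova's actual code; the function name is invented, while the field names (free_disk_gb, disk_available_least, local_gb) follow the bug description:

```python
# Sketch of the corrected disk pre-check: with over-commit enabled,
# disk_available_least can be negative (free space minus over-committed
# bytes), so the check should use free_disk_gb. The earlier broken fix
# used local_gb, which is *total* disk rather than available disk.

def available_disk_gb(dst_compute_info: dict, disk_over_commit: bool) -> int:
    if disk_over_commit:
        # free space on the target, ignoring over-committed allocations
        return dst_compute_info['free_disk_gb']
    # conservative value; may be negative under heavy over-commit
    return dst_compute_info['disk_available_least']

dst = {'free_disk_gb': 100, 'disk_available_least': -3, 'local_gb': 500}
assert available_disk_gb(dst, disk_over_commit=True) == 100
assert available_disk_gb(dst, disk_over_commit=False) == -3
```

With the broken variant (local_gb = 500) the check would pass against total rather than free capacity, which is why the original fix was "still not correct".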
[Yahoo-eng-team] [Bug 1744079] Re: [SRU] disk over-commit still not correctly calculated during live migration
This bug was fixed in the package nova - 2:15.1.5-0ubuntu1~cloud3

---
nova (2:15.1.5-0ubuntu1~cloud3) xenial-ocata; urgency=medium
.
* d/p/(re)fix-disk-size-during-live-migration-with-disk-over-commit.patch: Cherry-picked from upstream ocata gerrit reviews to ensure the disk size check is correctly calculated on the destination host for live migration with disk over-commit (LP: #1708572) (LP: #1744079).

** Changed in: cloud-archive/ocata Status: Fix Committed => Fix Released

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).

https://bugs.launchpad.net/bugs/1744079

Title: [SRU] disk over-commit still not correctly calculated during live migration

Status in Ubuntu Cloud Archive: Fix Committed
Status in Ubuntu Cloud Archive mitaka series: New
Status in Ubuntu Cloud Archive ocata series: Fix Released
Status in Ubuntu Cloud Archive pike series: Fix Released
Status in Ubuntu Cloud Archive queens series: Fix Released
Status in Ubuntu Cloud Archive rocky series: Fix Released
Status in OpenStack Compute (nova): Fix Released
Status in OpenStack Compute (nova) queens series: In Progress
Status in OpenStack Compute (nova) rocky series: In Progress
Status in nova package in Ubuntu: Fix Released
Status in nova source package in Xenial: Triaged
Status in nova source package in Bionic: Fix Released
Status in nova source package in Cosmic: Fix Released
Status in nova source package in Disco: Fix Released

Bug description: [Impact] nova compares disk space with the disk_available_least field, which can be negative due to overcommit. So the migration may fail with a "Migration pre-check error: Unable to migrate dfcd087a-5dff-439d-8875-2f702f081539: Disk of instance is too large(available on destination host:-3221225472 < need:22806528)" when trying a migration to another compute that has plenty of free space on its disk.

[Test Case] Deploy openstack environment.
Make sure there is a negative disk_available_least and an adequate free_disk_gb on one test compute node, then migrate a VM to it with disk-overcommit (openstack server migrate --live --block-migration --disk-overcommit ). You will see the above migration pre-check error.

This is the formula to compute disk_available_least and free_disk_gb:

disk_free_gb = disk_info_dict['free']
disk_over_committed = self._get_disk_over_committed_size_total()
available_least = disk_free_gb * units.Gi - disk_over_committed
data['disk_available_least'] = available_least / units.Gi

The following command can be used to query the value of disk_available_least: nova hypervisor-show |grep disk

Steps to Reproduce:
1. set disk_allocation_ratio config option > 1.0
2. qemu-img resize cirros-0.3.0-x86_64-disk.img +40G
3. glance image-create --disk-format qcow2 ...
4. boot VMs based on resized image
5. we see disk_available_least becomes negative

[Regression Potential] Minimal - we're just changing the following line:

disk_available_gb = dst_compute_info['disk_available_least']

to the following code:

if disk_over_commit:
    disk_available_gb = dst_compute_info['free_disk_gb']
else:
    disk_available_gb = dst_compute_info['disk_available_least']

When overcommit is enabled, disk_available_least can be negative, so we should use free_disk_gb instead, by backporting the following two fixes:

https://git.openstack.org/cgit/openstack/nova/commit/?id=e097c001c8e0efe8879da57264fcb7bdfdf2
https://git.openstack.org/cgit/openstack/nova/commit/?id=e2cc275063658b23ed88824100919a6dfccb760d

This is the code path for check_can_live_migrate_destination: _migrate_live(os-migrateLive API, migrate_server.py) -> migrate_server -> _live_migrate -> _build_live_migrate_task -> _call_livem_checks_on_host -> check_can_live_migrate_destination

BTW, Red Hat also has the same bug - https://bugzilla.redhat.com/show_bug.cgi?id=1477706

[Original Bug Report] Change I8a705114d47384fcd00955d4a4f204072fed57c2 (written by me...
sigh) addressed a bug which prevented live migration to a target host with overcommitted disk when made with microversion <2.25. It achieved this, but the fix is still not correct. We now do: if disk_over_commit: disk_available_gb = dst_compute_info['local_gb'] Unfortunately local_gb is *total* disk, not available disk. We actually want free_disk_gb. Fun fact: due to the way we calculate this for filesystems, without taking into account reserved space, this can also be negative. The test we're currently running is: could we fit this guest's allocated disks on the target if the target disk was empty. This is at least better than it was before, as we don't spuriously fail early. In fact, we're effectively disabling a test which is disabled for microversion
[Yahoo-eng-team] [Bug 1823038] Re: Neutron-keepalived-state-change fails to check initial router state
This bug was fixed in the package neutron - 2:14.0.0~rc1-0ubuntu3~cloud0 --- neutron (2:14.0.0~rc1-0ubuntu3~cloud0) bionic-stein; urgency=medium . * New update for the Ubuntu Cloud Archive. . neutron (2:14.0.0~rc1-0ubuntu3) disco; urgency=medium . * d/p/bug1823038.patch: Cherry pick fix to ensure that None is not passed as an argument when spawning the neutron-keepalived-state-change agent (LP: #1823038). ** Changed in: cloud-archive Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1823038 Title: Neutron-keepalived-state-change fails to check initial router state Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive pike series: Confirmed Status in Ubuntu Cloud Archive queens series: Confirmed Status in Ubuntu Cloud Archive rocky series: Confirmed Status in Ubuntu Cloud Archive stein series: Fix Released Status in neutron: Confirmed Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Bionic: Confirmed Status in neutron source package in Cosmic: Confirmed Status in neutron source package in Disco: Fix Released Bug description: As fix for bug https://bugs.launchpad.net/neutron/+bug/1818614 we added to neutron-keepalived-state-change monitor possibility to check initial status of router (master or slave). 
Unfortunately for some reason I see now in journal log of functional job that this check is failing with error like: Apr 03 09:19:09 ubuntu-bionic-ovh-gra1-0004666718 neutron-keepalived-state-change[1553]: 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change [-] Failed to get initial status of router cd300e6b-8222-4100-8f6a-3b5c4d5fe37b: FailedToDropPrivileges: privsep helper command exited non-zero (96) 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change Traceback (most recent call last): 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change File "/home/zuul/src/git.openstack.org/openstack/neutron/neutron/agent/l3/keepalived_state_change.py", line 98, in handle_initial_state 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change for address in ip.addr.list(): 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change File "/home/zuul/src/git.openstack.org/openstack/neutron/neutron/agent/linux/ip_lib.py", line 540, in list 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change **kwargs) 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change File "/home/zuul/src/git.openstack.org/openstack/neutron/neutron/agent/linux/ip_lib.py", line 1412, in get_devices_with_ip 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change devices = privileged.get_link_devices(namespace, **link_args) 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change File "/home/zuul/src/git.openstack.org/openstack/neutron/.tox/dsvm-functional-python27/local/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 240, in _wrap 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change self.start() 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change File 
"/home/zuul/src/git.openstack.org/openstack/neutron/.tox/dsvm-functional-python27/local/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 251, in start 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change channel = daemon.RootwrapClientChannel(context=self) 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change File
[Yahoo-eng-team] [Bug 1824017] Re: stein requires python-cinderclient >= 4.0.0
This bug was fixed in the package horizon - 3:15.0.0~rc2-0ubuntu2~cloud0 --- horizon (3:15.0.0~rc2-0ubuntu2~cloud0) bionic-stein; urgency=medium . * New update for the Ubuntu Cloud Archive. . horizon (3:15.0.0~rc2-0ubuntu2) disco; urgency=medium . * d/p/set-min-version-of-python-cinderclient-to-4.0.0.patch: Ensure python3-cinderclient is >= 4.0.0, as this is required for create/update of volumes from the horizon dashboard (LP: #1824017). ** Changed in: cloud-archive Status: Triaged => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1824017 Title: stein requires python-cinderclient >= 4.0.0 Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in OpenStack Dashboard (Horizon): Fix Released Status in horizon package in Ubuntu: Fix Released Status in horizon source package in Disco: Fix Released Bug description: Attempting to create volume from stein dashboard fails with "Error: Unable to create volume.". The log shows: [Tue Apr 09 17:40:58.263170 2019] [wsgi:error] [pid 18815:tid 140441351403264] [remote 10.5.0.17:50962] Recoverable error: Invalid input for field/attribute volume. Value: {'size': 1, 'consistencygroup_id': None, 'snapshot_id': None, 'name': 'b1', 'description': '', 'volume_type': '', 'user_id': None, 'project_id': None, 'availability_zone': 'nova', 'status': 'creating', 'attach_status': 'detached', 'metadata': {}, 'imageRef': None, 'source_volid': None, 'source_replica': None, 'multiattach': False, 'backup_id': None}. Additional properties are not allowed ('project_id', 'user_id', 'status', 'attach_status', 'source_replica' were unexpected) (HTTP 400) (Request-ID: req-e64a3589-403c-4c58-87dd-58233a70bde6) We're running with python3-cinderclient 1:3.5.0-0ubuntu1. Upgrading to python3-cinderclient 1:4.1.0-0ubuntu1 fixes this. 
Relevant python-cinderclient commit from version 4.0.0:

commit 8d566689001a442c2312e366acc167af8fd3
Author: Neha Alhat
Date: Thu Jun 7 18:22:16 2018 +0530

Remove unnecessary parameters from volume create APIs

As per Cinder code, the following parameters are not required to be passed in the request body of the create volume API:
* status
* user_id
* attach_status
* project_id
* source_replica

If you pass these parameters, previously they were ignored, but with the schema validation changes[1] we don't allow additionalProperties to be passed in the request body. If a user passes additional parameters which are not as per the API specs[2], the request is rejected with a 400 error. On patch[3], tempest tests test_volume_snapshot_create_get_list_delete and test_volume_create_get_delete are failing because of these unnecessary parameters. This patch removes these unnecessary parameters passed to the create Volume API.

[1] https://blueprints.launchpad.net/cinder/+spec/json-schema-validation
[2] https://review.openstack.org/#/c/507386/
[3] https://review.openstack.org/#/c/573093/

Change-Id: I37744bfd0b0bc59682c3e680c1200f608ad3991b

To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1824017/+subscriptions
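The failure mode here can be sketched without any OpenStack code. This is an illustrative sketch of JSON-schema-style additionalProperties rejection, not Cinder's actual validation; the allowed-key set and helper name are invented for the example:

```python
# Sketch of why the old client's request body now fails: once the server
# validates with a schema that forbids additionalProperties, any key
# outside the documented set is rejected (Cinder returns HTTP 400).

ALLOWED = {'size', 'name', 'description', 'volume_type',
           'availability_zone', 'snapshot_id', 'consistencygroup_id',
           'imageRef', 'source_volid', 'multiattach', 'backup_id',
           'metadata'}

def validate_volume_body(body: dict) -> None:
    # mirrors the "Additional properties are not allowed" 400 error
    extra = sorted(set(body) - ALLOWED)
    if extra:
        raise ValueError('Additional properties are not allowed: %s' % extra)

old_client_body = {'size': 1, 'name': 'b1', 'status': 'creating',
                   'user_id': None, 'project_id': None,
                   'attach_status': 'detached', 'source_replica': None}
try:
    validate_volume_body(old_client_body)
except ValueError as exc:
    assert 'attach_status' in str(exc)  # old cinderclient body rejected

# python-cinderclient >= 4.0.0 simply stops sending those keys:
validate_volume_body({'size': 1, 'name': 'b1'})  # passes
```

Previously the server silently ignored the extra keys, which is why upgrading python3-cinderclient (which stops sending them) fixes the dashboard error.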
[Yahoo-eng-team] [Bug 1823038] Re: Neutron-keepalived-state-change fails to check initial router state
** Also affects: neutron (Ubuntu) Importance: Undecided Status: New ** Also affects: neutron (Ubuntu Disco) Importance: Undecided Status: New ** Also affects: neutron (Ubuntu Bionic) Importance: Undecided Status: New ** Also affects: neutron (Ubuntu Cosmic) Importance: Undecided Status: New ** Also affects: cloud-archive Importance: Undecided Status: New ** Also affects: cloud-archive/rocky Importance: Undecided Status: New ** Also affects: cloud-archive/stein Importance: Undecided Status: New ** Also affects: cloud-archive/pike Importance: Undecided Status: New ** Also affects: cloud-archive/queens Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1823038 Title: Neutron-keepalived-state-change fails to check initial router state Status in Ubuntu Cloud Archive: New Status in Ubuntu Cloud Archive pike series: New Status in Ubuntu Cloud Archive queens series: New Status in Ubuntu Cloud Archive rocky series: New Status in Ubuntu Cloud Archive stein series: New Status in neutron: Confirmed Status in neutron package in Ubuntu: New Status in neutron source package in Bionic: New Status in neutron source package in Cosmic: New Status in neutron source package in Disco: New Bug description: As fix for bug https://bugs.launchpad.net/neutron/+bug/1818614 we added to neutron-keepalived-state-change monitor possibility to check initial status of router (master or slave). 
Unfortunately for some reason I see now in journal log of functional job that this check is failing with error like: Apr 03 09:19:09 ubuntu-bionic-ovh-gra1-0004666718 neutron-keepalived-state-change[1553]: 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change [-] Failed to get initial status of router cd300e6b-8222-4100-8f6a-3b5c4d5fe37b: FailedToDropPrivileges: privsep helper command exited non-zero (96) 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change Traceback (most recent call last): 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change File "/home/zuul/src/git.openstack.org/openstack/neutron/neutron/agent/l3/keepalived_state_change.py", line 98, in handle_initial_state 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change for address in ip.addr.list(): 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change File "/home/zuul/src/git.openstack.org/openstack/neutron/neutron/agent/linux/ip_lib.py", line 540, in list 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change **kwargs) 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change File "/home/zuul/src/git.openstack.org/openstack/neutron/neutron/agent/linux/ip_lib.py", line 1412, in get_devices_with_ip 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change devices = privileged.get_link_devices(namespace, **link_args) 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change File "/home/zuul/src/git.openstack.org/openstack/neutron/.tox/dsvm-functional-python27/local/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 240, in _wrap 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change self.start() 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change File 
"/home/zuul/src/git.openstack.org/openstack/neutron/.tox/dsvm-functional-python27/local/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 251, in start 2019-04-03 09:19:09.778 1553 ERROR neutron.agent.l3.keepalived_state_change channel = daemon.RootwrapClientChannel(context=self)
[Yahoo-eng-team] [Bug 1819453] Re: keystone-ldap TypeError: cannot concatenate 'str' and 'NoneType' object
** Also affects: keystone Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1819453 Title: keystone-ldap TypeError: cannot concatenate 'str' and 'NoneType' object Status in OpenStack Identity (keystone): New Status in keystone package in Ubuntu: New Bug description: Proposed action: = Key / value failed check error. Should check that the key exists, warn the user about bad entries, and continue. Bug presented by: = openstack user list --domain customerdata cannot concatenate 'str' and 'NoneType' objects (HTTP 400) (Request-ID: req-cc0e225d-d033-4dfa-aff8-7311389d4f58) Trace: == (keystone.common.wsgi): 2019-03-11 12:30:47,154 ERROR cannot concatenate 'str' and 'NoneType' objects Traceback (most recent call last): File "/usr/lib/python2.7/dist-packages/keystone/common/wsgi.py", line 228, in __call__ result = method(req, **params) File "/usr/lib/python2.7/dist-packages/keystone/common/controller.py", line 235, in wrapper return f(self, request, filters, **kwargs) File "/usr/lib/python2.7/dist-packages/keystone/identity/controllers.py", line 233, in list_users return UserV3.wrap_collection(request.context_dict, refs, hints=hints) File "/usr/lib/python2.7/dist-packages/keystone/common/controller.py", line 499, in wrap_collection cls.wrap_member(context, ref) File "/usr/lib/python2.7/dist-packages/keystone/common/controller.py", line 468, in wrap_member cls._add_self_referential_link(context, ref) File "/usr/lib/python2.7/dist-packages/keystone/common/controller.py", line 464, in _add_self_referential_link ref['links']['self'] = cls.base_url(context) + '/' + ref['id'] TypeError: cannot concatenate 'str' and 'NoneType' objects Offending Data: === At line 233 I put LOG.debug( pprint.pformat( refs ) ) grep -b 2 "'id': None," /var/log/keystone/keystone.log {'domain_id': u'8ce102de5ac644288f61838f5e0f46e7', 'email': 
u'customerd...@cusomter.com', 'id': None, -- {'domain_id': u'8ce102de5ac644288f61838f5e0f46e7', 'email': u'customerd...@cusomter.com', 'id': None, -- {'domain_id': u'8ce102de5ac644288f61838f5e0f46e7', 'email': u'customerd...@cusomter.com', 'id': None, Platform: = cat /etc/*-release DISTRIB_ID=Ubuntu DISTRIB_RELEASE=16.04 DISTRIB_CODENAME=xenial DISTRIB_DESCRIPTION="Ubuntu 16.04.5 LTS" NAME="Ubuntu" VERSION="16.04.5 LTS (Xenial Xerus)" ID=ubuntu ID_LIKE=debian PRETTY_NAME="Ubuntu 16.04.5 LTS" VERSION_ID="16.04" HOME_URL="http://www.ubuntu.com/" SUPPORT_URL="http://help.ubuntu.com/" BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/" VERSION_CODENAME=xenial UBUNTU_CODENAME=xenial versions: dpkg --list | grep keystone ii keystone 2:11.0.3-0ubuntu1~cloud0 all OpenStack identity service - Daemons ii python-keystone 2:11.0.3-0ubuntu1~cloud0 all OpenStack identity service - Python library ii python-keystoneauth1 2.18.0-0ubuntu2~cloud0 all authentication library for OpenStack Identity - Python 2.7 ii python-keystoneclient 1:3.10.0-0ubuntu1~cloud0 all client library for the OpenStack Keystone API - Python 2.x ii python-keystonemiddleware 4.14.0-0ubuntu1.2~cloud0 all Middleware for OpenStack Identity (Keystone) - Python 2.x To manage notifications about this bug go to: https://bugs.launchpad.net/keystone/+bug/1819453/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
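The proposed action (skip and warn rather than crash) could look roughly like the guard below around the link-building step. The function and parameter names here are hypothetical, not keystone's actual API; only the failing expression `cls.base_url(context) + '/' + ref['id']` is taken from the trace.

```python
import logging

LOG = logging.getLogger(__name__)

def self_link(base_url, ref):
    """Build a self-referential link, skipping entries with no id.

    Mimics keystone's `cls.base_url(context) + '/' + ref['id']`, but
    returns None and warns instead of raising TypeError when the LDAP
    mapping produced `'id': None`.
    """
    if ref.get("id") is None:
        LOG.warning("Skipping entry with missing id: %s", ref.get("email"))
        return None
    return base_url + "/" + ref["id"]
```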
[Yahoo-eng-team] [Bug 1744079] Re: [SRU] disk over-commit still not correctly calculated during live migration
This bug was fixed in the package nova - 2:18.0.3-0ubuntu1~cloud0 --- nova (2:18.0.3-0ubuntu1~cloud0) bionic-rocky; urgency=medium . * New update for the Ubuntu Cloud Archive. . nova (2:18.0.3-0ubuntu1) cosmic; urgency=medium . * d/gbp.conf: Create stable/rocky branch. * d/p/disk-size-live-migration-overcommit.patch: Cherry-picked from https://review.openstack.org/#/c/602477 to ensure proper disk calculation during live migration with over-commit (LP: #1744079). * New stable point release for OpenStack Rocky (LP: #1806049). ** Changed in: cloud-archive/rocky Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1744079 Title: [SRU] disk over-commit still not correctly calculated during live migration Status in Ubuntu Cloud Archive: Fix Committed Status in Ubuntu Cloud Archive ocata series: Triaged Status in Ubuntu Cloud Archive pike series: Triaged Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Released Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) queens series: In Progress Status in OpenStack Compute (nova) rocky series: In Progress Status in nova package in Ubuntu: Fix Released Status in nova source package in Xenial: Triaged Status in nova source package in Bionic: Fix Released Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Fix Released Bug description: [Impact] nova compares disk space with disk_available_least field, which is possible to be negative, due to overcommit. So the migration may fail because of a "Migration pre-check error: Unable to migrate dfcd087a-5dff-439d-8875-2f702f081539: Disk of instance is too large(available on destination host:-3221225472 < need:22806528)" when trying a migration to another compute that has plenty of free space in his disk. 
[Test Case] Deploy an OpenStack environment. Make sure there is a negative disk_available_least and an adequate free_disk_gb on one test compute node, then migrate a VM to it with disk overcommit (openstack server migrate --live --block-migration --disk-overcommit ). You will see the above migration pre-check error. This is the formula used to compute disk_available_least and free_disk_gb: disk_free_gb = disk_info_dict['free'] disk_over_committed = self._get_disk_over_committed_size_total() available_least = disk_free_gb * units.Gi - disk_over_committed data['disk_available_least'] = available_least / units.Gi The following command can be used to query the value of disk_available_least: nova hypervisor-show |grep disk Steps to Reproduce: 1. set disk_allocation_ratio config option > 1.0 2. qemu-img resize cirros-0.3.0-x86_64-disk.img +40G 3. glance image-create --disk-format qcow2 ... 4. boot VMs based on resized image 5. we see disk_available_least becomes negative [Regression Potential] Minimal - we're just changing from the following line: disk_available_gb = dst_compute_info['disk_available_least'] to the following code: if disk_over_commit: disk_available_gb = dst_compute_info['free_disk_gb'] else: disk_available_gb = dst_compute_info['disk_available_least'] When overcommit is enabled, disk_available_least can be negative, so we should use free_disk_gb instead, by backporting the following two fixes. 
https://git.openstack.org/cgit/openstack/nova/commit/?id=e097c001c8e0efe8879da57264fcb7bdfdf2 https://git.openstack.org/cgit/openstack/nova/commit/?id=e2cc275063658b23ed88824100919a6dfccb760d This is the code path for check_can_live_migrate_destination: _migrate_live(os-migrateLive API, migrate_server.py) -> migrate_server -> _live_migrate -> _build_live_migrate_task -> _call_livem_checks_on_host -> check_can_live_migrate_destination BTW, redhat also has a same bug - https://bugzilla.redhat.com/show_bug.cgi?id=1477706 [Original Bug Report] Change I8a705114d47384fcd00955d4a4f204072fed57c2 (written by me... sigh) addressed a bug which prevented live migration to a target host with overcommitted disk when made with microversion <2.25. It achieved this, but the fix is still not correct. We now do: if disk_over_commit: disk_available_gb = dst_compute_info['local_gb'] Unfortunately local_gb is *total* disk, not available disk. We actually want free_disk_gb. Fun fact: due to the way we calculate this for filesystems, without taking into account reserved space, this can also be negative. The test we're currently running is: could we fit this guest's allocated disks on the target if the target disk was empty. This is at least better than it was before, as we don't spuriously fail early. In fact,
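The formula and the backported pre-check fix quoted in this bug can be condensed into a small sketch (plain functions standing in for the nova code; the numbers are illustrative):

```python
GiB = 1024 ** 3

def disk_available_least(free_disk_gb, disk_over_committed_bytes):
    # The formula quoted in the test case: free space minus the total
    # size instances could grow to; negative under heavy overcommit.
    return (free_disk_gb * GiB - disk_over_committed_bytes) // GiB

def disk_available_gb(dst_compute_info, disk_over_commit):
    # The backported fix: when the caller asked for overcommit, compare
    # against real free space rather than the (possibly negative)
    # disk_available_least value.
    if disk_over_commit:
        return dst_compute_info["free_disk_gb"]
    return dst_compute_info["disk_available_least"]

dst = {"free_disk_gb": 100,
       "disk_available_least": disk_available_least(100, 140 * GiB)}
dst["disk_available_least"]        # -40: old pre-check would reject
disk_available_gb(dst, True)       # 100: migration allowed with the fix
```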
[Yahoo-eng-team] [Bug 1744079] Re: [SRU] disk over-commit still not correctly calculated during live migration
This bug was fixed in the package nova - 2:17.0.7-0ubuntu1~cloud0 --- nova (2:17.0.7-0ubuntu1~cloud0) xenial-queens; urgency=medium . * New upstream release for the Ubuntu Cloud Archive. . nova (2:17.0.7-0ubuntu1) bionic; urgency=medium . * d/p/disk-size-live-migration-overcommit.patch: Cherry-picked from https://review.openstack.org/#/c/602478 to ensure proper disk calculation during live migration with over-commit (LP: #1744079). * New stable point release for OpenStack Queens (LP: #1806043). ** Changed in: cloud-archive/queens Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1744079 Title: [SRU] disk over-commit still not correctly calculated during live migration Status in Ubuntu Cloud Archive: Fix Committed Status in Ubuntu Cloud Archive ocata series: Triaged Status in Ubuntu Cloud Archive pike series: Triaged Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Committed Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) queens series: In Progress Status in OpenStack Compute (nova) rocky series: In Progress Status in nova package in Ubuntu: Fix Released Status in nova source package in Xenial: Triaged Status in nova source package in Bionic: Fix Released Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Fix Released Bug description: [Impact] nova compares disk space with disk_available_least field, which is possible to be negative, due to overcommit. So the migration may fail because of a "Migration pre-check error: Unable to migrate dfcd087a-5dff-439d-8875-2f702f081539: Disk of instance is too large(available on destination host:-3221225472 < need:22806528)" when trying a migration to another compute that has plenty of free space in his disk. 
[Test Case] Deploy an OpenStack environment. Make sure there is a negative disk_available_least and an adequate free_disk_gb on one test compute node, then migrate a VM to it with disk overcommit (openstack server migrate --live --block-migration --disk-overcommit ). You will see the above migration pre-check error. This is the formula used to compute disk_available_least and free_disk_gb: disk_free_gb = disk_info_dict['free'] disk_over_committed = self._get_disk_over_committed_size_total() available_least = disk_free_gb * units.Gi - disk_over_committed data['disk_available_least'] = available_least / units.Gi The following command can be used to query the value of disk_available_least: nova hypervisor-show |grep disk Steps to Reproduce: 1. set disk_allocation_ratio config option > 1.0 2. qemu-img resize cirros-0.3.0-x86_64-disk.img +40G 3. glance image-create --disk-format qcow2 ... 4. boot VMs based on resized image 5. we see disk_available_least becomes negative [Regression Potential] Minimal - we're just changing from the following line: disk_available_gb = dst_compute_info['disk_available_least'] to the following code: if disk_over_commit: disk_available_gb = dst_compute_info['free_disk_gb'] else: disk_available_gb = dst_compute_info['disk_available_least'] When overcommit is enabled, disk_available_least can be negative, so we should use free_disk_gb instead, by backporting the following two fixes. 
https://git.openstack.org/cgit/openstack/nova/commit/?id=e097c001c8e0efe8879da57264fcb7bdfdf2 https://git.openstack.org/cgit/openstack/nova/commit/?id=e2cc275063658b23ed88824100919a6dfccb760d This is the code path for check_can_live_migrate_destination: _migrate_live(os-migrateLive API, migrate_server.py) -> migrate_server -> _live_migrate -> _build_live_migrate_task -> _call_livem_checks_on_host -> check_can_live_migrate_destination BTW, redhat also has a same bug - https://bugzilla.redhat.com/show_bug.cgi?id=1477706 [Original Bug Report] Change I8a705114d47384fcd00955d4a4f204072fed57c2 (written by me... sigh) addressed a bug which prevented live migration to a target host with overcommitted disk when made with microversion <2.25. It achieved this, but the fix is still not correct. We now do: if disk_over_commit: disk_available_gb = dst_compute_info['local_gb'] Unfortunately local_gb is *total* disk, not available disk. We actually want free_disk_gb. Fun fact: due to the way we calculate this for filesystems, without taking into account reserved space, this can also be negative. The test we're currently running is: could we fit this guest's allocated disks on the target if the target disk was empty. This is at least better than it was before, as we don't spuriously fail early. In fact, we're effectively disabling a
[Yahoo-eng-team] [Bug 1758868] Re: ovs restart can lead to critical ovs flows missing
*** This bug is a duplicate of bug 1584647 *** https://bugs.launchpad.net/bugs/1584647 On the assumption that bug 1584647 resolved this issue marking as a dupe - please comment if this is not the case or the issue remains. ** This bug has been marked a duplicate of bug 1584647 [SRU] "Interface monitor is not active" can be observed at ovs-agent start -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1758868 Title: ovs restart can lead to critical ovs flows missing Status in neutron: New Status in neutron package in Ubuntu: New Bug description: Hi, Running mitaka on xenial (neutron 2:8.4.0-0ubuntu6). We have l2pop and no l3ha. Using ovs with GRE tunnels. The cloud has around 30 compute nodes (mostly arm64). Last week, ovs got restarted during a package upgrade : 2018-03-21 17:17:25 upgrade openvswitch-common:arm64 2.5.2-0ubuntu0.16.04.3 2.5.4-0ubuntu0.16.04.1 This led to instances on 2 arm64 compute nodes lose networking completely. Upon closer inspection, I realized that a flow was missing in br-tun table 3 : https://pastebin.ubuntu.com/p/VXRJJX8J3k/ I believe this is due to a race in ovs_neutron_agent.py. 
These flows in table 3 are set up in provision_local_vlan() : https://github.com/openstack/neutron/blob/mitaka-eol/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L675 which is called by port_bound() : https://github.com/openstack/neutron/blob/mitaka-eol/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L789-L791 which is called by treat_vif_port() : https://github.com/openstack/neutron/blob/mitaka-eol/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L1405-L1410 which is called by treat_devices_added_or_updated() : https://github.com/openstack/neutron/blob/mitaka-eol/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L1517-L1525 which is called by process_network_ports() : https://github.com/openstack/neutron/blob/mitaka-eol/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L1618-L1623 which is called by the big rpc_loop() : https://github.com/openstack/neutron/blob/mitaka-eol/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L2023-L2029 So how does the agent know when to create these table 3 flows? Well, in rpc_loop(), it checks for OVS restarts (https://github.com/openstack/neutron/blob/mitaka-eol/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L1947-L1948), and if OVS did restart, it does some basic ovs setup (default flows, etc), and (very important for later) it restarts the OVS polling manager. Later (still in rpc_loop()), it sets "ovs_restarted" to True, and processes the ports as usual. The expected behaviour here is that since the polling manager got restarted, any port that is up will be marked as "added" and processed as such, in port_bound() (see call stack above). If this function is called on a port when ovs_restarted is True, then provision_local_vlan() will get called and will set up the table 3 flows. 
This is all working great under the assumption that the polling manager (which is an async process) will raise the "I got new port !" event before the rpc_loop() checks it (in process_port_events(), called by process_port_info()). However, if for example the node is under load, this may not always be the case. What happens then is that the rpc_loop in which OVS is detected as restarted doesn't see any change on the ports, and so does nothing. The next run of the rpc_loop will process the "I got new port !" events, but that loop will not be running with ovs_restarted set to True, so the ports won't be brought up properly - more specifically, the table 3 flows in br-tun will be missing. This is shown in the debug logs : https://pastebin.ubuntu.com/p/M8yYn3YnQ6/ - you can see the loop in which "OVS is restarted" is detected (loop iteration 320773) doesn't process any port ("iteration:320773 completed. Processed ports statistics: {'regular': {'updated': 0, 'added': 0, 'removed': 0}}.), but the next iteration does process 3 "added" ports. You can see that the "output received" is logged in the first loop, 49ms after "starting polling" is logged, which is presumably the problem. On all the non-failing nodes, the output is received before "starting polling". I believe the proper thing to do is to set "sync" to True (in rpc_loop()) if an ovs restart is detected, forcing process_port_info() to not use async events and scan the ports itself using scan_ports(). Thanks To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1758868/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe :
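The race and the reporter's suggested remedy reduce to a small decision function (a sketch only; the real logic in ovs_neutron_agent.py's rpc_loop() is far more involved, and the function name below is ours):

```python
def port_info_source(ovs_restarted, async_events_arrived):
    """Decide how an rpc_loop iteration should learn about ports.

    Without the fix, an iteration that detects an OVS restart but whose
    async polling events have not yet arrived sees "no changes", and the
    br-tun table 3 flows are never reprovisioned.  The proposal: treat
    a detected restart as sync=True and scan the ports directly.
    """
    if ovs_restarted:
        return "scan_ports"        # authoritative full scan, not event-driven
    return "process_port_events" if async_events_arrived else "no_op"

# The failure mode from the logs: restart detected, events late.
# With the fix this returns "scan_ports"; without it, nothing happened.
port_info_source(True, False)
```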
[Yahoo-eng-team] [Bug 1797309] Re: Every item in navigation bar of workflow form should be hidden if the parameter ready is false
Marking Ubuntu bug task as invalid - it's fixed upstream in the Horizon project and will be picked up with the next snapshot upload to Ubuntu development. ** Changed in: horizon (Ubuntu) Status: New => Triaged ** Changed in: horizon (Ubuntu) Status: Triaged => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1797309 Title: Every item in navigation bar of workflow form should be hidden if the parameter ready is false Status in OpenStack Dashboard (Horizon): Fix Released Status in horizon package in Ubuntu: Invalid Bug description: In the workflow wizard, every navigation item uses the parameter 'ng-show="viewModel.ready"' to determine whether it should be displayed. It should instead use each item's own 'ready' parameter, like this: 'ng-show="step.ready"'. I think that makes sense. To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1797309/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1803745] [NEW] neutron-dynamic-routing: unit test failures with master branch of neutron
Public bug reported: neutron-dynamic-routing unit tests currently fail with the tip of the master branch of neutron; the project has neutron in its requirements.txt, however the latest release version on PyPI is from the rocky release. == Failed 3 tests - output below: == neutron_dynamic_routing.tests.unit.services.bgp.scheduler.test_bgp_dragent_scheduler.TestRescheduleBgpSpeaker.test_no_schedule_with_non_available_dragent - Captured traceback: ~~~ b'Traceback (most recent call last):' b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/.tox/py37/lib/python3.7/site-packages/neutron/tests/base.py", line 151, in func' b'return f(self, *args, **kwargs)' b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/neutron_dynamic_routing/tests/unit/services/bgp/scheduler/test_bgp_dragent_scheduler.py", line 341, in test_no_schedule_with_non_available_dragent' b'self.assertEqual(binds, [])' b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/.tox/py37/lib/python3.7/site-packages/testtools/testcase.py", line 411, in assertEqual' b'self.assertThat(observed, matcher, message)' b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/.tox/py37/lib/python3.7/site-packages/testtools/testcase.py", line 498, in assertThat' b'raise mismatch_error' b'testtools.matchers._impl.MismatchError: !=:' b"reference = []" b'actual= []' b'' b'' neutron_dynamic_routing.tests.unit.services.bgp.scheduler.test_bgp_dragent_scheduler.TestRescheduleBgpSpeaker.test_schedule_unbind_bgp_speaker -- Captured traceback: ~~~ b'Traceback (most recent call last):' b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/.tox/py37/lib/python3.7/site-packages/neutron/tests/base.py", line 151, in func' b'return f(self, *args, **kwargs)' b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/neutron_dynamic_routing/tests/unit/services/bgp/scheduler/test_bgp_dragent_scheduler.py", line 349, in test_schedule_unbind_bgp_speaker' b'self.assertEqual(binds, [])' b' File 
"/home/jamespage/src/openstack/neutron-dynamic-routing/.tox/py37/lib/python3.7/site-packages/testtools/testcase.py", line 411, in assertEqual' b'self.assertThat(observed, matcher, message)' b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/.tox/py37/lib/python3.7/site-packages/testtools/testcase.py", line 498, in assertThat' b'raise mismatch_error' b'testtools.matchers._impl.MismatchError: !=:' b"reference = []" b'actual= []' b'' b'' neutron_dynamic_routing.tests.unit.services.bgp.scheduler.test_bgp_dragent_scheduler.TestRescheduleBgpSpeaker.test_reschedule_bgp_speaker_bound_to_down_dragent --- Captured traceback: ~~~ b'Traceback (most recent call last):' b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/.tox/py37/lib/python3.7/site-packages/neutron/tests/base.py", line 151, in func' b'return f(self, *args, **kwargs)' b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/neutron_dynamic_routing/tests/unit/services/bgp/scheduler/test_bgp_dragent_scheduler.py", line 333, in test_reschedule_bgp_speaker_bound_to_down_dragent' b'self.assertEqual(binds[0].agent_id, agents[1].id)' b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/.tox/py37/lib/python3.7/site-packages/testtools/testcase.py", line 411, in assertEqual' b'self.assertThat(observed, matcher, message)' b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/.tox/py37/lib/python3.7/site-packages/testtools/testcase.py", line 498, in assertThat' b'raise mismatch_error' b'testtools.matchers._impl.MismatchError: !=:' b"reference = '1129a824-aa3f-4a8a-aba3-62fccf1b4d12'" b"actual= 'ed467789-07e8-41fd-9cf2-fbcb18e1fdd7'" b'' b'' This is reproducable by directly installing the neutron from git into the virtualenv: pip install --upgrade git+https://github.com/openstack/neutron.git#egg=neutron ** Affects: neutron Importance: Undecided Status: New ** Affects: neutron-dynamic-routing (Ubuntu) Importance: Medium Status: Triaged ** Also affects: neutron Importance: 
Undecided Status: New ** Summary changed: - Unit test failures with master branch of neutron + neutron-dynamic-routing: unit test failures
[Yahoo-eng-team] [Bug 1794564] Re: Apparmor denies /usr/bin/nova-compute access to /proc/loadavg on openstack hypervisor show
** Changed in: charm-nova-compute Status: New => Triaged ** Changed in: nova Status: New => Invalid ** Changed in: charm-nova-compute Importance: Undecided => Medium ** Changed in: charm-nova-compute Assignee: (unassigned) => James Page (james-page) ** Changed in: charm-nova-compute Status: Triaged => In Progress -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1794564 Title: Apparmor denies /usr/bin/nova-compute access to /proc/loadavg on openstack hypervisor show Status in OpenStack nova-compute charm: In Progress Status in OpenStack Compute (nova): Invalid Bug description: On Xenial-Queens cloud, I'm seeing failure with nova-compute 17.0.5-0ubuntu1~cloud0 package unable to run uptime due to a failure to read /proc/loadavg. Kernel log entries: [4726259.738185] audit: type=1400 audit(1537977315.312:59959): apparmor="DENIED" operation="open" profile="/usr/bin/nova-compute" name="/proc/loadavg" pid=1958757 comm="uptime" requested_mask="r" denied_mask="r" fsuid=64060 ouid=0 [4726265.862186] audit: type=1400 audit(1537977321.436:59960): apparmor="DENIED" operation="open" profile="/usr/bin/nova-compute" name="/proc/loadavg" pid=1959961 comm="uptime" requested_mask="r" denied_mask="r" fsuid=64060 ouid=0 This happens when running "openstack hypervisor show " with AppArmor in enforce mode. this read access to /proc/loadavg should be added to apparmor profiles for the nova-compute package. To manage notifications about this bug go to: https://bugs.launchpad.net/charm-nova-compute/+bug/1794564/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
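The requested change is a one-line profile addition, roughly as below. The profile path and surrounding context are assumptions; the profile shipped by the package/charm may organize its rules differently.

```
# in the AppArmor profile for /usr/bin/nova-compute
# (e.g. under /etc/apparmor.d/): allow the uptime child
# process to read the system load
/proc/loadavg r,
```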
[Yahoo-eng-team] [Bug 1751396] Re: DVR: Inter Tenant Traffic between two networks and connected through a shared network not reachable with DVR routers
This bug was fixed in the package neutron - 2:13.0.1-0ubuntu1~cloud0 --- neutron (2:13.0.1-0ubuntu1~cloud0) bionic-rocky; urgency=medium . * New upstream release for the Ubuntu Cloud Archive. . neutron (2:13.0.1-0ubuntu1) cosmic; urgency=medium . * New stable point release for OpenStack Rocky. * d/p/revert-dvr-add-error-handling.patch: Cherry-picked from upstream to revert DVR regressions (LP: #1751396) * d/p/revert-dvr-inter-tenant.patch: Cherry-picked from upstream to revert DVR regression (LP: #1783654). ** Changed in: cloud-archive Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1751396 Title: DVR: Inter Tenant Traffic between two networks and connected through a shared network not reachable with DVR routers Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive pike series: Invalid Status in Ubuntu Cloud Archive queens series: Fix Committed Status in Ubuntu Cloud Archive rocky series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Artful: Invalid Status in neutron source package in Bionic: Fix Committed Status in neutron source package in Cosmic: Fix Released Bug description: Inter Tenant Traffic between Two Tenants on two different private networks connected through a common shared network (created by Admin) is not route able through DVR routers Steps to reproduce it: (NOTE: No external, just shared network) This is only reproducable in Multinode scenario. ( 1 Controller - 2 compute ). Make sure that the two VMs are isolated in two different computes. 
openstack network create --share shared_net
openstack subnet create shared_net_sn --network shared_net --subnet-range 172.168.10.0/24
openstack network create net_A
openstack subnet create net_A_sn --network net_A --subnet-range 10.1.0.0/24
openstack network create net_B
openstack subnet create net_B_sn --network net_B --subnet-range 10.2.0.0/24
openstack router create router_A
openstack port create --network=shared_net --fixed-ip subnet=shared_net_sn,ip-address=172.168.10.20 port_router_A_shared_net
openstack router add port router_A port_router_A_shared_net
openstack router add subnet router_A net_A_sn
openstack router create router_B
openstack port create --network=shared_net --fixed-ip subnet=shared_net_sn,ip-address=172.168.10.30 port_router_B_shared_net
openstack router add port router_B port_router_B_shared_net
openstack router add subnet router_B net_B_sn
openstack server create server_A --flavor m1.tiny --image cirros --nic net-id=net_A
openstack server create server_B --flavor m1.tiny --image cirros --nic net-id=net_B
Add static routes to the routers:
openstack router set router_A --route destination=10.1.0.0/24,gateway=172.168.10.20
openstack router set router_B --route destination=10.2.0.0/24,gateway=172.168.10.30
A ping from one instance to the other times out. Ubuntu SRU details: --- [Impact] See above [Test Case] Deploy OpenStack with DVR enabled and then follow the steps above. [Regression Potential] The patches that are backported have already landed upstream in the corresponding stable branches, helping to minimize any regression potential. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1751396/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp