[Yahoo-eng-team] [Bug 1866160] [NEW] Update security group failed with the same stateful data
Public bug reported:

After the stateless security group support patchset (https://review.opendev.org/#/c/572767/48) merged, we found a failing scenario: a security group that is in use cannot be updated when the update carries the same stateful value it already has.

To reproduce:
- Create a security group with stateful set to False (or True)
- Add this security group to a port
- Update the security group with the same stateful value, False (or True)

The update fails because the security group is used by the port. A security group should be updatable when the stateful value does not change.

** Affects: neutron
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1866160

Title: Update security group failed with the same stateful data
Status in neutron: New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1866160/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
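
The behaviour requested in this report could be sketched roughly as follows. This is only an illustration, not neutron's code: the function name, the dict layout, and the SecurityGroupInUse exception are all hypothetical stand-ins.

```python
class SecurityGroupInUse(Exception):
    """Hypothetical error: a stateful change was requested on an in-use group."""


def update_stateful(group, requested_stateful, ports_using_group):
    """Return the group with the requested stateful value applied.

    `group` is a plain dict standing in for the database object.
    """
    # No-op update: the value does not change, so the in-use check is skipped.
    if group["stateful"] == requested_stateful:
        return group
    # A real change is still refused while ports reference the group.
    if ports_using_group:
        raise SecurityGroupInUse(group["id"])
    return {**group, "stateful": requested_stateful}
```

With this ordering, re-sending the current value succeeds even when ports are bound, while an actual flip is still rejected.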
[Yahoo-eng-team] [Bug 1866139] [NEW] GARP not sent on provider network after live migration
Public bug reported:

Using Rocky with OVS. I live-migrated a VM on a regular VLAN-based provider network. Network connectivity stopped, and no GARP packets were observed in tcpdump. Connectivity resumed only after the VM initiated traffic, causing its MAC address to be relearned.

Looking at the code, send_ip_addr_adv_notif() in ip_lib.py is responsible for using the arping utility to send out a GARP, but it is only referenced in l3-agent code. This is a provider network: no routers, no floating IPs.

I see this very old bug in OVN: https://bugs.launchpad.net/networking-ovn/+bug/1545897
But we are not using OVN, and that bug was fixed in the OVN code itself. This is OpenStack with the OVS agent.

How are live migration and GARP handled for fixed IPs?

** Affects: neutron
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1866139

Title: GARP not sent on provider network after live migration
Status in neutron: New
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1866139/+subscriptions
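
For readers unfamiliar with what the reporter expected to see in tcpdump, here is a minimal sketch of a gratuitous ARP frame built with the standard library. It is not how neutron or arping builds the packet; build_garp is a hypothetical helper, and the request-style encoding (opcode 1, zeroed target hardware address) is one common form, assumed here for illustration.

```python
import socket
import struct


def build_garp(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request."""
    hw = bytes.fromhex(mac.replace(":", ""))
    addr = socket.inet_aton(ip)
    # Ethernet header: broadcast destination, our MAC, EtherType 0x0806 (ARP).
    eth = b"\xff" * 6 + hw + struct.pack("!H", 0x0806)
    # ARP header: Ethernet/IPv4, address lengths 6 and 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # Sender and target protocol addresses are the same IP: that is what
    # makes the announcement "gratuitous".
    arp += hw + addr + b"\x00" * 6 + addr
    return eth + arp
```

Switches that see this broadcast from the destination host relearn the MAC on the new port, which is the step the report says did not happen until the VM sent traffic on its own.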
[Yahoo-eng-team] [Bug 1866072] Re: TypeError: 'TestOpenStackClient' object is not subscriptable
Reviewed: https://review.opendev.org/711211
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=cbffac7df1deb106876b4fba1d481ce70bda9302
Submitter: Zuul
Branch: master

commit cbffac7df1deb106876b4fba1d481ce70bda9302
Author: Lee Yarwood
Date: Wed Mar 4 11:56:53 2020 +

    functional: Avoid race and fix use of self.api within test_bug_1831771

    This test would previously only attempt to invoke a race between instance.save(expected_task_state=task_states.SPAWNING) and a parallel attempt to delete an instance when the instance also has a vm_state of ACTIVE and task_state of None. However vm_state and task_state would often be different within the test resulting in no attempt to invoke the test being made.

    As instance.save is only called with expected_task_state set to task_states.SPAWNING by _unshelve_instance and _build_and_run_instance we should just check for this and avoid any state races within the test.

    Additionally when attempting to invoke the race this test would call _wait_for_server_parameter and provide self.api. This change removes this argument as since I8c96b337f32148f8f5899c9b87af331b1fa41424 this is no longer required and will result in a `TypeError: 'TestOpenStackClient' object is not subscriptable` error.

    Closes-Bug: #1866072
    Change-Id: I36da36cc5b099174eece0dfba29485fc20b2867b

** Changed in: nova
   Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1866072

Title: TypeError: 'TestOpenStackClient' object is not subscriptable
Status in OpenStack Compute (nova): Fix Released

Bug description:

Description
===========
nova.tests.functional.regressions.test_bug_1831771.TestDelete.test_delete_during_create is often failing with the following trace:

2020-03-03 09:32:17.177641 | ubuntu-bionic | Traceback (most recent call last):
2020-03-03 09:32:17.177668 | ubuntu-bionic | File "/home/zuul/src/opendev.org/openstack/nova/nova/compute/manager.py", line 2196, in _do_build_and_run_instance
2020-03-03 09:32:17.177688 | ubuntu-bionic | filter_properties, request_spec)
2020-03-03 09:32:17.177707 | ubuntu-bionic | File "/home/zuul/src/opendev.org/openstack/nova/nova/compute/manager.py", line 2496, in _build_and_run_instance
2020-03-03 09:32:17.177726 | ubuntu-bionic | instance.save(expected_task_state=task_states.SPAWNING)
2020-03-03 09:32:17.177745 | ubuntu-bionic | File "/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/regressions/test_bug_1831771.py", line 67, in wrap_save
2020-03-03 09:32:17.16 | ubuntu-bionic | delete_race(instance)
2020-03-03 09:32:17.177797 | ubuntu-bionic | File "/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/regressions/test_bug_1831771.py", line 37, in delete_race
2020-03-03 09:32:17.177828 | ubuntu-bionic | {'OS-EXT-STS:task_state': task_states.DELETING},
2020-03-03 09:32:17.177864 | ubuntu-bionic | File "/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/integrated_helpers.py", line 83, in _wait_for_server_parameter
2020-03-03 09:32:17.177895 | ubuntu-bionic | server = api.get_server(server['id'])
2020-03-03 09:32:17.177929 | ubuntu-bionic | TypeError: 'TestOpenStackClient' object is not subscriptable

This appears to be the result of https://review.opendev.org/#/c/697694/ *and* a race within the test, which means the delete_race method above is not always used in each run of the test.

Steps to reproduce
==================
* $ tox -e functional-py36 nova.tests.functional.regressions.test_bug_1831771.TestDelete.test_delete_during_create

Expected result
===============
delete_race should always run and the test should pass.

Actual result
=============
Either delete_race isn't called and the test passes, or it is called and the test fails.

Environment
===========
1. Exact version of OpenStack you are running. See the following list for all releases: http://docs.openstack.org/releases/

   master - b3e14931d6aac6ee5776ce1e6974c75a5a6b1823

2. Which hypervisor did you use? (For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...) What's the version of that?

   N/A

3. Which storage type did you use? (For example: Ceph, LVM, GPFS, ...) What's the version of that?

   N/A

4. Which networking type did you use? (For example: nova-network, Neutron with OpenVSwitch, ...)

   N/A

Logs & Configs
==============
https://zuul.opendev.org/t/openstack/build/b781ccf934894bf89cdeb13f58de1c5f/log/job-output.txt#4056

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1866072/+subscriptions

--
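
The TypeError in this report can be reproduced in isolation. The sketch below is a stand-in, not nova's test helper: FakeTestCase and both call_* functions are hypothetical, and the helper signature is only modelled on the traceback above (the client is taken from self.api unless passed by keyword).

```python
class TestOpenStackClient:
    """Stand-in for the functional-test API client."""

    def get_server(self, server_id):
        return {"id": server_id, "OS-EXT-STS:task_state": "deleting"}


class FakeTestCase:
    def __init__(self):
        self.api = TestOpenStackClient()

    def _wait_for_server_parameter(self, server, expected_params,
                                   max_retries=10, api=None):
        api = api or self.api
        # Fails here if `server` is actually the client object.
        found = api.get_server(server["id"])
        for key, value in expected_params.items():
            assert found[key] == value
        return found


def call_old_style(case, server, expected):
    # Old call shape: the client is passed first, so it binds to `server`.
    return case._wait_for_server_parameter(case.api, server, expected)


def call_new_style(case, server, expected):
    # New call shape: no client argument, the helper uses case.api.
    return case._wait_for_server_parameter(server, expected)
```

Calling call_old_style raises the same "object is not subscriptable" error because the client object ends up where the server dict is expected; call_new_style returns the looked-up server.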
[Yahoo-eng-team] [Bug 1866129] [NEW] dnsmasq_version_supported() sometimes throws an exception
Public bug reported:

I saw this traceback recently in https://review.opendev.org/#/c/710460/ but it seems related to the recent change https://review.opendev.org/#/c/704436/ (DHCPv6 - Use addr6_list in dnsmasq)

Traceback (most recent call last):
  File "/home/zuul/src/opendev.org/openstack/neutron/neutron/tests/base.py", line 182, in func
    return f(self, *args, **kwargs)
  File "/home/zuul/src/opendev.org/openstack/neutron/neutron/tests/functional/sanity/test_sanity.py", line 32, in test_dnsmasq_version
    checks.dnsmasq_version_supported()
  File "/home/zuul/src/opendev.org/openstack/neutron/neutron/cmd/sanity/checks.py", line 238, in dnsmasq_version_supported
    if (cfg.CONF.dnsmasq_enable_addr6_list is True and
  File "/home/zuul/src/opendev.org/openstack/neutron/.tox/dsvm-functional/lib/python3.6/site-packages/oslo_config/cfg.py", line 2209, in __getattr__
    raise NoSuchOptError(name)
oslo_config.cfg.NoSuchOptError: no such option dnsmasq_enable_addr6_list in group [DEFAULT]

I think the SanityTestCase class needs to register the DNSMASQ_OPTS. I don't know why it's not completely breaking the gate, but I think it's an easy fix.

** Affects: neutron
   Importance: Medium
   Assignee: Brian Haley (brian-haley)
   Status: New

** Tags: l3-ipam-dhcp

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1866129

Title: dnsmasq_version_supported() sometimes throws an exception
Status in neutron: New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1866129/+subscriptions
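
The failure mode in this report (an option read before anything registered it) can be imitated with a small stand-in. The sketch below deliberately does not use oslo.config; MiniConf and its NoSuchOptError are hypothetical classes that only mimic the lookup behaviour visible in the traceback.

```python
class NoSuchOptError(AttributeError):
    """Stand-in for the error raised when an option was never registered."""


class MiniConf:
    """Tiny imitation of a config object with explicit option registration."""

    def __init__(self):
        self._opts = {}

    def register_opt(self, name, default):
        self._opts[name] = default

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        opts = self.__dict__.get("_opts", {})
        if name not in opts:
            raise NoSuchOptError(
                "no such option %s in group [DEFAULT]" % name)
        return opts[name]


def dnsmasq_check(conf):
    # Mirrors the shape of the failing check: it reads the option directly.
    return conf.dnsmasq_enable_addr6_list is True
```

A test case that calls dnsmasq_check on a fresh MiniConf fails, while one that calls register_opt first passes. That would also explain a "sometimes" failure if registration happens only as a side effect of some other import or test, though the report does not confirm that this is the cause.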
[Yahoo-eng-team] [Bug 1863009] Re: os-deferred-delete restore server API policy is allowed for everyone even policy defaults is admin_or_owner
Reviewed: https://review.opendev.org/707457
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=f83c591e30e5283af3bd6b05b7bd041c83d5c20f
Submitter: Zuul
Branch: master

commit f83c591e30e5283af3bd6b05b7bd041c83d5c20f
Author: Ghanshyam Mann
Date: Wed Feb 12 13:28:40 2020 -0600

    Fix os-os-deferred-delete policy to be admin_or_owner

    os-deferred-delete restore server API policy is default to admin_or_owner[1] but API is allowed for everyone.

    We can see the test trying with other project context can access the API - https://review.opendev.org/#/c/707455/

    This is because API does not pass the server project_id in policy target[2] and if no target is passed then, policy.py add the default targets which is nothing but context.project_id (allow for everyone who try to access)[3]

    This commit fix this policy by passing the server's project_id in policy target.

    Closes-bug: #1863009

    [1] https://github.com/openstack/nova/blob/1fcd74730d343b7cee12a0a50ea537dc4ff87f65/nova/policies/deferred_delete.py#L27
    [2] https://github.com/openstack/nova/blob/1fcd74730d343b7cee12a0a50ea537dc4ff87f65/nova/api/openstack/compute/deferred_delete.py#L38
    [3] https://github.com/openstack/nova/blob/c16315165ce307c605cf4b608b2df3aa06f46982/nova/policy.py#L191

    Change-Id: Ib05501b678d0b58bbd9e77cd5d79a9b6ef661497

** Changed in: nova
   Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1863009

Title: os-deferred-delete restore server API policy is allowed for everyone even policy defaults is admin_or_owner
Status in OpenStack Compute (nova): Fix Released

Bug description:

The os-deferred-delete restore server API policy defaults to admin_or_owner[1], but the API is allowed for everyone.

A test using another project's context can access the API - https://review.opendev.org/#/c/707455/

This is because the API does not pass the server's project_id in the policy target - https://github.com/openstack/nova/blob/1fcd74730d343b7cee12a0a50ea537dc4ff87f65/nova/api/openstack/compute/deferred_delete.py#L38

If no target is passed, policy.py adds the default target, which is just context.project_id, so the check passes for everyone who tries to access the API - https://github.com/openstack/nova/blob/c16315165ce307c605cf4b608b2df3aa06f46982/nova/policy.py#L191

[1] - https://github.com/openstack/nova/blob/1fcd74730d343b7cee12a0a50ea537dc4ff87f65/nova/policies/deferred_delete.py#L27

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1863009/+subscriptions
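
The reasoning in this report can be shown with a few lines of standalone Python. This is a simplified model rather than nova's policy code: Context, admin_or_owner, and authorize are hypothetical names, and the rule is reduced to "admin, or the caller's project matches the target's project".

```python
from dataclasses import dataclass


@dataclass
class Context:
    project_id: str
    is_admin: bool = False


def admin_or_owner(context, target):
    # The rule compares the caller's project with the target's project.
    return context.is_admin or context.project_id == target["project_id"]


def authorize(context, target=None):
    if target is None:
        # Default target built from the caller itself: the owner comparison
        # then always succeeds, whoever the caller is.
        target = {"project_id": context.project_id}
    return admin_or_owner(context, target)
```

With a server owned by project "A", authorize(Context("B")) passes because the caller is compared with itself, whereas authorize(Context("B"), {"project_id": "A"}) is refused. Passing the server's project_id as the target is the fix the commit describes.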
[Yahoo-eng-team] [Bug 1866106] Re: Can't set "pointer_model = None" in nova.conf
** Changed in: oslo.config
   Status: New => Won't Fix

** Tags added: config libvirt

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1866106

Title: Can't set "pointer_model = None" in nova.conf
Status in OpenStack Compute (nova): In Progress
Status in oslo.config: Won't Fix
[Yahoo-eng-team] [Bug 1866106] Re: Can't set "pointer_model = None" in nova.conf
I think the problem is in oslo.config, right here:

https://github.com/openstack/oslo.config/blob/20a7cee3e3019d60c4b367bb76922a1db41d1750/oslo_config/types.py#L142

That coerces the value None to the string 'None', so the check fails. According to Ben Nemec:

(02:13:48 PM) bnemec: The only way for a config opt to have a None value is for that to be the default and for the opt to be unset.
(02:14:14 PM) bnemec: So completely absent from the file, not something like "opt="

But that seems like a bug: I would think that code could be smarter about not coercing the value when the value is None, None is a valid choice, and a default is set (so you need to override the default).

** Also affects: oslo.config
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1866106

Title: Can't set "pointer_model = None" in nova.conf
Status in OpenStack Compute (nova): Confirmed
Status in oslo.config: New
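
The coercion problem described in this comment can be illustrated without oslo.config. The class below is a hypothetical stand-in, not the library's String type: it only assumes, as the comment says, that the incoming value is converted with str() before being compared with the list of choices.

```python
class ChoiceString:
    """Stand-in string type that validates against a list of choices."""

    def __init__(self, choices):
        self.choices = list(choices)

    def __call__(self, value):
        # Everything read from a config file is text, and the value is
        # coerced to str before the membership test.
        value = str(value)
        if value in self.choices:
            return value
        raise ValueError(
            "Valid values are %s, but found %r" % (self.choices, value))


pointer_model = ChoiceString([None, "ps2mouse", "usbtablet"])
```

Under that assumption the choice None can never match: the text 'None' and the empty string are both different from the Python object None, so the only way to end up with None is to leave the option unset, which is what the quoted chat lines say.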
[Yahoo-eng-team] [Bug 1866106] [NEW] Can't set "pointer_model = None" in nova.conf
Public bug reported:

Description
===========
nova.conf includes option pointer_model. The help text in the config has 2 "possible values" sections (copied below) specifying either "None" or "" as correct values. Neither of these is accepted by Nova. Here are the error messages from nova-compute.log:

2020-03-03 11:05:17.233 228915 ERROR nova ConfigFileValueError: Value for option pointer_model is not valid: Valid values are [None, ps2mouse, usbtablet], but found ''
2020-03-03 11:06:24.761 229290 ERROR nova ConfigFileValueError: Value for option pointer_model is not valid: Valid values are [None, ps2mouse, usbtablet], but found 'None'

#
# Generic property to specify the pointer type.
#
# Input devices allow interaction with a graphical framebuffer. For
# example to provide a graphic tablet for absolute cursor movement.
#
# If set, the 'hw_pointer_model' image property takes precedence over
# this configuration option.
#
# Possible values:
#
# * None: Uses default behavior provided by drivers (mouse on PS2 for
# libvirt x86)
# * ps2mouse: Uses relative movement. Mouse connected by PS2
# * usbtablet: Uses absolute movement. Tablet connect by USB
#
# Related options:
#
# * usbtablet must be configured with VNC enabled or SPICE enabled and SPICE
# agent disabled. When used with libvirt the instance mode should be
# configured as HVM.
# (string value)
# Possible values:
# -
# ps2mouse -
# usbtablet -
#pointer_model = usbtablet

Steps to reproduce
==================
On an openstack hypervisor:
1. Edit nova.conf and change line "#pointer_model = usbtablet" to either "pointer_model = None" or "pointer_model = "
2. Restart nova-compute service
3. Tail nova-compute.log

Expected result
===============
Nova runs without errors and does not load the USB driver.

Actual result
=============
Nova throws the error described above.

Environment
===========
1. Openstack version is Rocky:

root@us01odc-p01-hv227:~# dpkg -l | grep nova
ii nova-common 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute - common files
ii nova-compute 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute - compute node base
ii nova-compute-kvm 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute - compute node (KVM)
ii nova-compute-libvirt 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute - compute node libvirt support
ii python-nova 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute Python 2 libraries
ii python-novaclient 2:11.0.0-0ubuntu1~cloud0 all client library for OpenStack Compute API - Python 2.7

2. Hypervisor: libvirt+KVM

root@us01odc-p01-hv227:~# libvirtd --version
libvirtd (libvirt) 4.0.0
root@us01odc-p01-hv227:~# kvm --version
QEMU emulator version 2.11.1(Debian 1:2.11+dfsg-1ubuntu7.21)
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers

3. Storage type: local LVM

root@us01odc-p01-hv227:~# lvm version
LVM version: 2.02.176(2) (2017-11-03)
Library version: 1.02.145 (2017-11-03)
Driver version: 4.39.0
Configuration: ./configure --build=x86_64-linux-gnu --prefix=/usr --includedir=${prefix}/include --mandir=${prefix}/share/man --infodir=${prefix}/share/info --sysconfdir=/etc --localstatedir=/var --disable-silent-rules --libdir=${prefix}/lib/x86_64-linux-gnu --libexecdir=${prefix}/lib/x86_64-linux-gnu --runstatedir=/run --disable-maintainer-mode --disable-dependency-tracking --exec-prefix= --bindir=/bin --libdir=/lib/x86_64-linux-gnu --sbindir=/sbin --with-usrlibdir=/usr/lib/x86_64-linux-gnu --with-optimisation=-O2 --with-cache=internal --with-clvmd=corosync --with-cluster=internal --with-device-uid=0 --with-device-gid=6 --with-device-mode=0660 --with-default-pid-dir=/run --with-default-run-dir=/run/lvm --with-default-locking-dir=/run/lock/lvm --with-thin=internal --with-thin-check=/usr/sbin/thin_check --with-thin-dump=/usr/sbin/thin_dump --with-thin-repair=/usr/sbin/thin_repair --enable-applib --enable-blkid_wiping --enable-cmdlib --enable-cmirrord --enable-dmeventd --enable-dbus-service --enable-lvmetad --enable-lvmlockd-dlm --enable-lvmlockd-sanlock --enable-lvmpolld --enable-notify-dbus --enable-pkgconfig --enable-readline --enable-udev_rules --enable-udev_sync

4. Networking: neutron

Logs & Configs
==============
nova-compute.log:

2020-03-04 10:48:34.774 116036 ERROR nova
2020-03-04 10:48:36.616 116046 INFO os_vif [-] Loaded VIF plugins: ovs, linux_bridge
2020-03-04 10:48:36.688 116046 WARNING oslo_config.cfg [req-20c7f0dd-706b-41fa-b374-0b264184c2c4 - - - - -] Deprecated: Option "use_neutron" from group "DEFAULT" is deprecated for removal ( nova-network is deprecated, as are any related configuration options. ). Its value may be
[Yahoo-eng-team] [Bug 1865281] Re: Remove deprecation warning when mixing new and old engine facade
Reviewed: https://review.opendev.org/710557
Committed: https://git.openstack.org/cgit/openstack/neutron-lib/commit/?id=77b761dfe438f7a2d61dac3337b65d0f9d85c20b
Submitter: Zuul
Branch: master

commit 77b761dfe438f7a2d61dac3337b65d0f9d85c20b
Author: Rodolfo Alonso Hernandez
Date: Sat Feb 29 12:33:08 2020 +

    Remove warning message when using old and new engine facade

    The aim of this patch is to reduce the log pollution, especially in testing. The warning message is now a LOG.debug message.

    Change-Id: Ief9df3d23a192a9e374819ffc8869c4f1966bfd4
    Closes-Bug: #1865281

** Changed in: neutron
   Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1865281

Title: Remove deprecation warning when mixing new and old engine facade
Status in neutron: Fix Released

Bug description:

In the neutron-lib context, a warning message is written if both the new and the old engine facade are used. To avoid log pollution (especially in tests), I suggest removing this warning.

BTW, we still need to finish this BP.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1865281/+subscriptions
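
The effect of turning a warning into a LOG.debug call can be seen with the standard logging module alone. This sketch is not the neutron-lib code: the logger name, the message text, and the ListHandler class are assumptions made for the demonstration.

```python
import logging

LOG = logging.getLogger("facade_demo")  # hypothetical logger name


class ListHandler(logging.Handler):
    """Collects emitted records so the effect of log levels is visible."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def note_mixed_facade_usage():
    # Formerly a warning; at debug level it is dropped unless the logger
    # is configured for DEBUG.
    LOG.debug("Both the old and the new engine facade are in use")
```

With the logger at WARNING (a typical level for test runs) the message never reaches any handler, and it reappears only when the level is lowered to DEBUG, so the information is still available when someone goes looking for it.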
[Yahoo-eng-team] [Bug 1803745] Re: neutron-dynamic-routing: unit test failures with master branch of neutron
Updating status on old gate-failure bug

** Changed in: neutron
   Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1803745

Title: neutron-dynamic-routing: unit test failures with master branch of neutron
Status in neutron: Fix Released
Status in neutron-dynamic-routing package in Ubuntu: Triaged

Bug description:

neutron-dynamic-routing unit tests currently fail with the tip of the master branch of neutron; the project has neutron in its requirements.txt, however the latest released version on PyPI is from the Rocky release.

==============================
Failed 3 tests - output below:
==============================

neutron_dynamic_routing.tests.unit.services.bgp.scheduler.test_bgp_dragent_scheduler.TestRescheduleBgpSpeaker.test_no_schedule_with_non_available_dragent

Captured traceback:
~~~~~~~~~~~~~~~~~~~
b'Traceback (most recent call last):'
b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/.tox/py37/lib/python3.7/site-packages/neutron/tests/base.py", line 151, in func'
b'return f(self, *args, **kwargs)'
b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/neutron_dynamic_routing/tests/unit/services/bgp/scheduler/test_bgp_dragent_scheduler.py", line 341, in test_no_schedule_with_non_available_dragent'
b'self.assertEqual(binds, [])'
b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/.tox/py37/lib/python3.7/site-packages/testtools/testcase.py", line 411, in assertEqual'
b'self.assertThat(observed, matcher, message)'
b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/.tox/py37/lib/python3.7/site-packages/testtools/testcase.py", line 498, in assertThat'
b'raise mismatch_error'
b'testtools.matchers._impl.MismatchError: !=:'
b"reference = []"
b'actual= []'
b''
b''

neutron_dynamic_routing.tests.unit.services.bgp.scheduler.test_bgp_dragent_scheduler.TestRescheduleBgpSpeaker.test_schedule_unbind_bgp_speaker

Captured traceback:
~~~~~~~~~~~~~~~~~~~
b'Traceback (most recent call last):'
b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/.tox/py37/lib/python3.7/site-packages/neutron/tests/base.py", line 151, in func'
b'return f(self, *args, **kwargs)'
b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/neutron_dynamic_routing/tests/unit/services/bgp/scheduler/test_bgp_dragent_scheduler.py", line 349, in test_schedule_unbind_bgp_speaker'
b'self.assertEqual(binds, [])'
b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/.tox/py37/lib/python3.7/site-packages/testtools/testcase.py", line 411, in assertEqual'
b'self.assertThat(observed, matcher, message)'
b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/.tox/py37/lib/python3.7/site-packages/testtools/testcase.py", line 498, in assertThat'
b'raise mismatch_error'
b'testtools.matchers._impl.MismatchError: !=:'
b"reference = []"
b'actual= []'
b''
b''

neutron_dynamic_routing.tests.unit.services.bgp.scheduler.test_bgp_dragent_scheduler.TestRescheduleBgpSpeaker.test_reschedule_bgp_speaker_bound_to_down_dragent

Captured traceback:
~~~~~~~~~~~~~~~~~~~
b'Traceback (most recent call last):'
b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/.tox/py37/lib/python3.7/site-packages/neutron/tests/base.py", line 151, in func'
b'return f(self, *args, **kwargs)'
b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/neutron_dynamic_routing/tests/unit/services/bgp/scheduler/test_bgp_dragent_scheduler.py", line 333, in test_reschedule_bgp_speaker_bound_to_down_dragent'
b'self.assertEqual(binds[0].agent_id, agents[1].id)'
b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/.tox/py37/lib/python3.7/site-packages/testtools/testcase.py", line 411, in assertEqual'
b'self.assertThat(observed, matcher, message)'
b' File "/home/jamespage/src/openstack/neutron-dynamic-routing/.tox/py37/lib/python3.7/site-packages/testtools/testcase.py", line 498, in assertThat'
b'raise mismatch_error'
b'testtools.matchers._impl.MismatchError: !=:'
b"reference = '1129a824-aa3f-4a8a-aba3-62fccf1b4d12'"
b"actual=
[Yahoo-eng-team] [Bug 1843413] Re: neutron-tempest-iptables_hybrid-fedora job is failing with RETRY_LIMIT constantly
That was fixed some time ago, updating status

** Changed in: neutron Status: Confirmed => Fix Released

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1843413
Title: neutron-tempest-iptables_hybrid-fedora job is failing with RETRY_LIMIT constantly
Status in neutron: Fix Released
Bug description: This has been happening for at least a couple of days. Example of a failure:
https://db6e53ffb305ec0848f1-fa8c367d29960a6ba7cc5d4b52d5b2a7.ssl.cf2.rackcdn.com/638641/13/check/neutron-tempest-iptables_hybrid-fedora/2d1ff9a/job-output.txt
It is failing on:
2019-09-10 08:35:42.868413 |
2019-09-10 08:35:42.868692 | PLAY [tempest]
2019-09-10 08:35:42.972968 |
2019-09-10 08:35:42.973254 | TASK [fetch-subunit-output : Find stestr or testr executable]
2019-09-10 08:35:44.218559 | controller | ERROR
2019-09-10 08:35:44.418916 | controller | {
2019-09-10 08:35:44.419116 | controller |   "msg": "non-zero return code",
2019-09-10 08:35:44.419227 | controller |   "rc": 1
2019-09-10 08:35:44.419348 | controller | }
[Yahoo-eng-team] [Bug 1866087] [NEW] [OVN Octavia Provider] Deleting of listener fails
Public bug reported:

Sometimes, while removing a listener, the command fails with the log below. The problem was recently found on the OVN octavia provider gate.

Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): DbRemoveCommand(table=Load_Balancer, record=86c3b5dc-5ec7-48c0-9fe7-d67fc78ef084, co
Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=1): LbDelCommand(lb=86c3b5dc-5ec7-48c0-9fe7-d67fc78ef084, vip=None, if_exists=False) {{(
Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=2): DbClearCommand(table=Load_Balancer, record=86c3b5dc-5ec7-48c0-9fe7-d67fc78ef084, col
Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: ERROR ovsdbapp.backend.ovs_idl.transaction [-] Traceback (most recent call last):
Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: File "/usr/local/lib/python3.6/dist-packages/ovsdbapp/backend/ovs_idl/connection.py", line 122, in run
Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: txn.results.put(txn.do_commit())
Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: File "/usr/local/lib/python3.6/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 86, in do_commit
Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: command.run_idl(txn)
Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: File "/usr/local/lib/python3.6/dist-packages/ovsdbapp/backend/ovs_idl/command.py", line 182, in run_idl
Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: record = self.api.lookup(self.table, self.record)
Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: File "/usr/local/lib/python3.6/dist-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 107, in lookup
Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: return self._lookup(table, record)
Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: File "/usr/local/lib/python3.6/dist-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 151, in _lookup
Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: row = idlutils.row_by_value(self, rl.table, rl.column, record)
Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: File "/usr/local/lib/python3.6/dist-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 65, in row_by_value
Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: raise RowNotFound(table=table, col=column, match=match)
Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Load_Balancer with name=86c3b5dc-5ec7-48c0-9fe7-d67fc78ef084
Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]

It looks like in this situation the LB had multiple protocols configured (TCP and UDP). While removing the first listener from the LB, one of the created OVN LB rows needs to be deleted, but the driver then tries to update the vip entries on it. That is not needed.

** Affects: neutron Importance: High Assignee: Maciej Jozefczyk (maciej.jozefczyk) Status: In Progress
** Tags: ovn-octavia-provider
** Changed in: neutron Assignee: (unassigned) => Maciej Jozefczyk (maciej.jozefczyk)
** Changed in: neutron Importance: Undecided => High
** Changed in: neutron Status: New => Confirmed
** Tags added: ovn-octavia-provider

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1866087 Title: [OVN Octavia Provider] Deleting of listener fails Status in neutron: In Progress Bug description: Sometimes, while removing a listener the command fails with log below. The problem has been recently found on OVN octavia provider gate. Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): DbRemoveCommand(table=Load_Balancer, record=86c3b5dc-5ec7-48c0-9fe7-d67fc78ef084, co Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=1): LbDelCommand(lb=86c3b5dc-5ec7-48c0-9fe7-d67fc78ef084, vip=None, if_exists=False) {{( Mar 04 14:44:18 mjozefcz-ovn-provider-master devstack@o-api.service[30146]: DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=2): DbClearCommand(table=Load_Balancer, record=86c3b5dc-5ec7-48c0-9fe7-d67fc78ef084,
[Yahoo-eng-team] [Bug 1613423] Re: Mitaka + Trusty (kernel 3.13) not using apparmor capability by default, when it does, live migration doesn't work (/tmp/memfd-XXX can't be created)
QEMU BUG: https://bugs.launchpad.net/qemu/+bug/1626972 ** Changed in: libvirt (Ubuntu) Status: Fix Released => Incomplete -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1613423 Title: Mitaka + Trusty (kernel 3.13) not using apparmor capability by default, when it does, live migration doesn't work (/tmp/memfd-XXX can't be created) Status in OpenStack Compute (nova): Invalid Status in OpenStack Security Advisory: Invalid Status in libvirt package in Ubuntu: Incomplete Bug description: In my environment: Trusty (3.13) + JuJu (1.25) w/ latest charms + Kilo upgraded to Mitaka (already using non-tunnelled live migrations, after latest SRU to disable tunnelled live migrations) BUG #1 My compute nodes are NOT loading "apparmor" libvirt capability by default: inaddy@tkcompute01:~$ virsh capabilities | grep apparmor | echo $? 1 inaddy@tkcompute02:~$ virsh capabilities | grep apparmor | echo $? 1 inaddy@tkcompute03:~$ virsh capabilities | grep apparmor | echo $? 1 Because "libvirt" is loaded before apparmor profile is loaded and qemu.conf doesn't specify 'security_driver = "apparmor' in its file. If you try to add the security driver to the file, libvirt and nova- compute won't start because apparmor isn't started when they start. For trusty, apparmor is started as a legacy SYS-V init script, at the end of initialisation, causing this problem. After re-starting libvirt-bin service, apparmor starts being used: inaddy@tkcompute01:~$ sudo service libvirt-bin restart libvirt-bin stop/waiting libvirt-bin start/running, process 7031 inaddy@tkcompute01:~$ virsh capabilities | grep apparmor | echo $? 0 inaddy@tkcompute02:~$ sudo service libvirt-bin restart libvirt-bin stop/waiting libvirt-bin start/running, process 7031 inaddy@tkcompute02:~$ virsh capabilities | grep apparmor | echo $? 
0 inaddy@tkcompute03:~$ sudo service libvirt-bin restart libvirt-bin stop/waiting libvirt-bin start/running, process 7031 inaddy@tkcompute03:~$ virsh capabilities | grep apparmor | echo $? 0 BUG #2 (after fixing BUG #1) And, when libvirt starts using apparmor, and creating apparmor profiles for every virtual machine created in the compute nodes, mitaka qemu (2.5) uses a fallback mechanism for creating shared memory for live-migrations. This fall back mechanism, on kernels 3.13 - that don't have memfd_create() system-call, try to create files on /tmp/ directory and fails.. causing live-migration not to work. Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability = can't live migrate. From qemu 2.5, logic is on : void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals, int *fd) { if (memfd_create)... ### only works with HWE kernels else ### 3.13 kernels, gets blocked by apparmor tmpdir = g_get_tmp_dir ... mfd = mkstemp(fname) } And you can see the errors: From the host trying to send the virtual machine: 2016-08-15 16:36:26.160 1974 ERROR nova.virt.libvirt.driver [req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 133ebc3585c041aebaead8c062cd6511 - - -] [instance: 2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Migration operation has aborted 2016-08-15 16:36:26.248 1974 ERROR nova.virt.libvirt.driver [req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 133ebc3585c041aebaead8c062cd6511 - - -] [instance: 2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Live Migration failure: internal error: unable to execute QEMU command 'migrate': Migration disabled: failed to allocate shared memory From the host trying to receive the virtual machine: Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 audit(1471289779.791:72): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" pid=12565 comm="apparmor_parser" Aug 15 16:36:19 tkcompute01 kernel: [ 
1194.357048] type=1400 audit(1471289779.791:73): apparmor="STATUS" operation="profile_load" profile="unconfined" name="qemu_bridge_helper" pid=12565 comm="apparmor_parser" Aug 15 16:36:20 tkcompute01 kernel: [ 1194.877027] type=1400 audit(1471289780.311:74): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" pid=12613 comm="apparmor_parser" Aug 15 16:36:20 tkcompute01 kernel: [ 1194.904407] type=1400 audit(1471289780.343:75): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="qemu_bridge_helper" pid=12613 comm="apparmor_parser" Aug 15 16:36:20 tkcompute01 kernel: [ 1194.973064] type=1400 audit(1471289780.407:76): apparmor="DENIED" operation="mknod" profile="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" name="/tmp/memfd-tNpKSj" pid=12625
[Yahoo-eng-team] [Bug 1613423] Re: Mitaka + Trusty (kernel 3.13) not using apparmor capability by default, when it does, live migration doesn't work (/tmp/memfd-XXX can't be created)
Preferred way for this fix was: commit 0d34fbabc13891da41582b0823867dc5733fffef Author: Rafael David Tinoco Date: Mon Oct 24 15:35:03 2016 vhost: migration blocker only if shared log is used Commit 31190ed7 added a migration blocker in vhost_dev_init() to check if memfd would succeed. It is better if this blocker first checks if vhost backend requires shared log. This will avoid a situation where a blocker is added inappropriately (e.g. shared log allocation fails when vhost backend doesn't support it). Signed-off-by: Rafael David Tinoco Reviewed-by: Marc-André Lureau Reviewed-by: Michael S. Tsirkin Signed-off-by: Michael S. Tsirkin And accepted upstream. I'm closing this bug. ** Changed in: libvirt (Ubuntu) Status: New => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1613423 Title: Mitaka + Trusty (kernel 3.13) not using apparmor capability by default, when it does, live migration doesn't work (/tmp/memfd-XXX can't be created) Status in OpenStack Compute (nova): Invalid Status in OpenStack Security Advisory: Invalid Status in libvirt package in Ubuntu: Incomplete Bug description: In my environment: Trusty (3.13) + JuJu (1.25) w/ latest charms + Kilo upgraded to Mitaka (already using non-tunnelled live migrations, after latest SRU to disable tunnelled live migrations) BUG #1 My compute nodes are NOT loading "apparmor" libvirt capability by default: inaddy@tkcompute01:~$ virsh capabilities | grep apparmor | echo $? 1 inaddy@tkcompute02:~$ virsh capabilities | grep apparmor | echo $? 1 inaddy@tkcompute03:~$ virsh capabilities | grep apparmor | echo $? 1 Because "libvirt" is loaded before apparmor profile is loaded and qemu.conf doesn't specify 'security_driver = "apparmor' in its file. If you try to add the security driver to the file, libvirt and nova- compute won't start because apparmor isn't started when they start. 
For trusty, apparmor is started as a legacy SYS-V init script, at the end of initialisation, causing this problem. After re-starting libvirt-bin service, apparmor starts being used: inaddy@tkcompute01:~$ sudo service libvirt-bin restart libvirt-bin stop/waiting libvirt-bin start/running, process 7031 inaddy@tkcompute01:~$ virsh capabilities | grep apparmor | echo $? 0 inaddy@tkcompute02:~$ sudo service libvirt-bin restart libvirt-bin stop/waiting libvirt-bin start/running, process 7031 inaddy@tkcompute02:~$ virsh capabilities | grep apparmor | echo $? 0 inaddy@tkcompute03:~$ sudo service libvirt-bin restart libvirt-bin stop/waiting libvirt-bin start/running, process 7031 inaddy@tkcompute03:~$ virsh capabilities | grep apparmor | echo $? 0 BUG #2 (after fixing BUG #1) And, when libvirt starts using apparmor, and creating apparmor profiles for every virtual machine created in the compute nodes, mitaka qemu (2.5) uses a fallback mechanism for creating shared memory for live-migrations. This fall back mechanism, on kernels 3.13 - that don't have memfd_create() system-call, try to create files on /tmp/ directory and fails.. causing live-migration not to work. Trusty with kernel 3.13 + Mitaka with qemu 2.5 + apparmor capability = can't live migrate. From qemu 2.5, logic is on : void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals, int *fd) { if (memfd_create)... ### only works with HWE kernels else ### 3.13 kernels, gets blocked by apparmor tmpdir = g_get_tmp_dir ... 
mfd = mkstemp(fname) } And you can see the errors: From the host trying to send the virtual machine: 2016-08-15 16:36:26.160 1974 ERROR nova.virt.libvirt.driver [req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 133ebc3585c041aebaead8c062cd6511 - - -] [instance: 2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Migration operation has aborted 2016-08-15 16:36:26.248 1974 ERROR nova.virt.libvirt.driver [req-0cac612b-8d53-4610-b773-d07ad6bacb91 691a581cfa7046278380ce82b1c38ddd 133ebc3585c041aebaead8c062cd6511 - - -] [instance: 2afa1131-bc8c-43d2-9c4a-962c1bf7723e] Live Migration failure: internal error: unable to execute QEMU command 'migrate': Migration disabled: failed to allocate shared memory From the host trying to receive the virtual machine: Aug 15 16:36:19 tkcompute01 kernel: [ 1194.356794] type=1400 audit(1471289779.791:72): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libvirt-2afa1131-bc8c-43d2-9c4a-962c1bf7723e" pid=12565 comm="apparmor_parser" Aug 15 16:36:19 tkcompute01 kernel: [ 1194.357048] type=1400 audit(1471289779.791:73): apparmor="STATUS" operation="profile_load" profile="unconfined" name="qemu_bridge_helper" pid=12565 comm="apparmor_parser" Aug 15
[Yahoo-eng-team] [Bug 1866077] [NEW] [L3][IPv6] IPv6 traffic with DVR in compute host
Public bug reported: One question: how can IPv6 traffic to the outside world be routed directly on the compute host? There was a blueprint for this earlier: https://blueprints.launchpad.net/neutron/+spec/ipv6-router-and-dvr And one spec for it: https://review.opendev.org/#/c/136878/ ** Affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1866077 Title: [L3][IPv6] IPv6 traffic with DVR in compute host Status in neutron: New Bug description: One question: how can IPv6 traffic to the outside world be routed directly on the compute host? There was a blueprint for this earlier: https://blueprints.launchpad.net/neutron/+spec/ipv6-router-and-dvr And one spec for it: https://review.opendev.org/#/c/136878/
[Yahoo-eng-team] [Bug 1866072] [NEW] TypeError: 'TestOpenStackClient' object is not subscriptable
Public bug reported: Description === nova.tests.functional.regressions.test_bug_1831771.TestDelete.test_delete_during_create is often failing with the following trace: 2020-03-03 09:32:17.177641 | ubuntu-bionic | Traceback (most recent call last): 2020-03-03 09:32:17.177668 | ubuntu-bionic | File "/home/zuul/src/opendev.org/openstack/nova/nova/compute/manager.py", line 2196, in _do_build_and_run_instance 2020-03-03 09:32:17.177688 | ubuntu-bionic | filter_properties, request_spec) 2020-03-03 09:32:17.177707 | ubuntu-bionic | File "/home/zuul/src/opendev.org/openstack/nova/nova/compute/manager.py", line 2496, in _build_and_run_instance 2020-03-03 09:32:17.177726 | ubuntu-bionic | instance.save(expected_task_state=task_states.SPAWNING) 2020-03-03 09:32:17.177745 | ubuntu-bionic | File "/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/regressions/test_bug_1831771.py", line 67, in wrap_save 2020-03-03 09:32:17.16 | ubuntu-bionic | delete_race(instance) 2020-03-03 09:32:17.177797 | ubuntu-bionic | File "/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/regressions/test_bug_1831771.py", line 37, in delete_race 2020-03-03 09:32:17.177828 | ubuntu-bionic | {'OS-EXT-STS:task_state': task_states.DELETING}, 2020-03-03 09:32:17.177864 | ubuntu-bionic | File "/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/integrated_helpers.py", line 83, in _wait_for_server_parameter 2020-03-03 09:32:17.177895 | ubuntu-bionic | server = api.get_server(server['id']) 2020-03-03 09:32:17.177929 | ubuntu-bionic | TypeError: 'TestOpenStackClient' object is not subscriptable This appears to be as a result of https://review.opendev.org/#/c/697694/ *and* a race within the test resulting in the delete_race method above not always being used with each run of the test. 
Steps to reproduce
==================
* $ tox -e functional-py36 nova.tests.functional.regressions.test_bug_1831771.TestDelete.test_delete_during_create

Expected result
===============
delete_race should always run and the test should pass.

Actual result
=============
delete_race is either not called, in which case the test passes, or it is called and the test fails.

Environment
===========
1. Exact version of OpenStack you are running. See the following list for all releases: http://docs.openstack.org/releases/
   master - b3e14931d6aac6ee5776ce1e6974c75a5a6b1823
2. Which hypervisor did you use? (For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...) What's the version of that?
   N/A
3. Which storage type did you use? (For example: Ceph, LVM, GPFS, ...) What's the version of that?
   N/A
4. Which networking type did you use? (For example: nova-network, Neutron with OpenVSwitch, ...)
   N/A

Logs & Configs
==============
https://zuul.opendev.org/t/openstack/build/b781ccf934894bf89cdeb13f58de1c5f/log/job-output.txt#4056

** Affects: nova Importance: Undecided Assignee: Lee Yarwood (lyarwood) Status: In Progress

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1866072 Title: TypeError: 'TestOpenStackClient' object is not subscriptable Status in OpenStack Compute (nova): In Progress Bug description: Description === nova.tests.functional.regressions.test_bug_1831771.TestDelete.test_delete_during_create is often failing with the following trace: 2020-03-03 09:32:17.177641 | ubuntu-bionic | Traceback (most recent call last): 2020-03-03 09:32:17.177668 | ubuntu-bionic | File "/home/zuul/src/opendev.org/openstack/nova/nova/compute/manager.py", line 2196, in _do_build_and_run_instance 2020-03-03 09:32:17.177688 | ubuntu-bionic | filter_properties, request_spec) 2020-03-03 09:32:17.177707 | ubuntu-bionic | File "/home/zuul/src/opendev.org/openstack/nova/nova/compute/manager.py", line 2496, in _build_and_run_instance 2020-03-03 09:32:17.177726 | ubuntu-bionic | instance.save(expected_task_state=task_states.SPAWNING) 2020-03-03 09:32:17.177745 | ubuntu-bionic | File "/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/regressions/test_bug_1831771.py", line 67, in wrap_save 2020-03-03 09:32:17.16 | ubuntu-bionic | delete_race(instance) 2020-03-03 09:32:17.177797 | ubuntu-bionic | File "/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/regressions/test_bug_1831771.py", line 37, in delete_race 2020-03-03 09:32:17.177828 | ubuntu-bionic | {'OS-EXT-STS:task_state': task_states.DELETING}, 2020-03-03 09:32:17.177864 | ubuntu-bionic | File "/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/integrated_helpers.py", line 83, in _wait_for_server_parameter 2020-03-03 09:32:17.177895 | ubuntu-bionic |
[Yahoo-eng-team] [Bug 1866068] [NEW] [OVN] neutron_pg_drop port group table creation race condition
Public bug reported: With HA controllers, when the first two ports are created simultaneously and each request is picked up by a different neutron-server, one port creation can fail because it fails to create the neutron_pg_drop port group entry in OVN. This is because the neutron_pg_drop entry is unique in the whole cloud and is created on the first port creation attempt if it doesn't exist. A possible solution is to create the entry during server start. ** Affects: neutron Importance: Undecided Assignee: Jakub Libosvar (libosvar) Status: New ** Tags: ovn -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1866068 Title: [OVN] neutron_pg_drop port group table creation race condition Status in neutron: New Bug description: With HA controllers, when the first two ports are created simultaneously and each request is picked up by a different neutron-server, one port creation can fail because it fails to create the neutron_pg_drop port group entry in OVN. This is because the neutron_pg_drop entry is unique in the whole cloud and is created on the first port creation attempt if it doesn't exist. A possible solution is to create the entry during server start.
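The race described in this report, and the proposed fix of creating the entry at server start, can be sketched with a small in-memory simulation. This is only an illustration under stated assumptions: FakeNorthbound, DuplicateEntry and ensure_port_group are hypothetical names, not neutron or ovsdbapp APIs, and threads stand in for the separate neutron-server processes.

```python
import threading


class DuplicateEntry(Exception):
    """Hypothetical error raised when a uniquely named row already exists."""


class FakeNorthbound:
    """Hypothetical in-memory stand-in for a table with a unique name."""

    def __init__(self):
        self._groups = set()
        self._lock = threading.Lock()

    def add_port_group(self, name):
        # The uniqueness check and the insert happen atomically,
        # as a database constraint would enforce.
        with self._lock:
            if name in self._groups:
                raise DuplicateEntry(name)
            self._groups.add(name)

    def has_port_group(self, name):
        with self._lock:
            return name in self._groups


def ensure_port_group(nb, name):
    """Create the group if missing; treat 'already exists' as success."""
    try:
        nb.add_port_group(name)
        return True   # this caller created the entry
    except DuplicateEntry:
        return False  # another caller created it first


def create_port(nb, results, idx):
    # Port creation no longer fails when the group already exists.
    ensure_port_group(nb, "neutron_pg_drop")
    results[idx] = "created"


nb = FakeNorthbound()
ensure_port_group(nb, "neutron_pg_drop")  # done once, at "server start"

results = [None, None]
threads = [
    threading.Thread(target=create_port, args=(nb, results, i))
    for i in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```

The design point is that the creation step becomes idempotent: whichever server gets there first creates the entry, and every later attempt is a no-op instead of an error.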
[Yahoo-eng-team] [Bug 1861032] Re: [RFE] Add support for configuring dnsmasq with multiple IPv6 addresses in same subnet on same port
Reviewed: https://review.opendev.org/704436
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=592c2f8d91c3172c75cc5a2464350891b0a303f1
Submitter: Zuul
Branch: master

commit 592c2f8d91c3172c75cc5a2464350891b0a303f1
Author: Harald Jensås
Date: Fri Jan 17 12:29:10 2020 +0100

DHCPv6 - Use addr6_list in dnsmasq

Adds a new bool option dnsmasq_enable_addr6_list. When enabled, the configuration for dnsmasq will be created with a single dhcp-host entry specifying a list of IP addresses allocated for a port.

Previously the dnsmasq dhcp-agent driver would write a separate dhcp-host entry for each fixed-ip of a port in the dnsmasq hosts file. The result of the previous behaviour is that dnsmasq will only use one of the config entries, i.e. the first one matching the MAC identifier.

The trade-off is that only a single dns_assignment will be used for IPv6 addresses within the same subnet. (But in practice, this was always the case since only the first config entry would be used by dnsmasq.)

Why is this necessary: This is done to enable ironic provisioning over IPv6 using DHCPv6-stateful. For background info, please read the dnsmasq-discuss thread: http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2020q1/thread.html#13671

Closes-Bug: #1861032
Change-Id: I833840e7daed2efa7efaece27cfd1ba28e0feb90

** Changed in: neutron Status: In Progress => Fix Released

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1861032
Title: [RFE] Add support for configuring dnsmasq with multiple IPv6 addresses in same subnet on same port
Status in neutron: Fix Released
Bug description: To enable network boot and Ironic provisioning, a patch has been proposed to dnsmasq. The patch adds the possibility to provide lists of individual addresses as well as prefixed ranges of IPv6 addresses for a dhcp-host reservation.
When dnsmasq receives a request matching the clid or MAC address, the server will iterate over all candidate addresses until it finds one that is not already leased to a different clid/iaid, and advertise this address. Using multiple reservations for a single host makes it possible to maintain a static-leases-only configuration which supports network booting systems with UEFI firmware that request a new address (a new SOLICIT with a new IA_NA option using a new IAID) for different boot modes, for instance 'PXE over IPv6' and 'HTTP-Boot over IPv6'. Open Virtual Machine Firmware (OVMF) and most UEFI firmware built on the EDK2 code base exhibit this behaviour.

A new configuration syntax is introduced in dnsmasq in this patch: http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2020q1/013743.html

For example:

--dhcp-host=52:54:00:3f:5c:c0,[fd12:3456::aa02][fd12:3456::aa04],host1

The above will make the two addresses fd12:3456::aa02 and fd12:3456::aa04 available to the host with hardware address 52:54:00:3f:5c:c0.

This RFE is to add functionality to the dnsmasq dhcp-agent implementation to write the new configuration format in the dnsmasq hosts file.

Given a neutron port:

{
  "ports": [
    {
      "dns_assignment": [
        { "hostname": "myport02", "ip_address": "fd12:3456::aa02", "fqdn": "myport02.my-domain.org" },
        { "hostname": "myport04", "ip_address": "fd12:3456::aa04", "fqdn": "myport04.my-domain.org" }
      ],
      "fixed_ips": [
        { "ip_address": "fd12:3456::aa02", "subnet_id": "008ba151-0b8c-4a67-98b5-0d2b87666062" },
        { "ip_address": "fd12:3456::aa04", "subnet_id": "008ba151-0b8c-4a67-98b5-0d2b87666062" }
      ],
      "id": "d80b1a3b-4fc1-49f3-952e-1e2ab7081d8b",
      "mac_address": "fa:16:3e:58:42:ed",
      "network_id": "70c1db1f-b701-45bd-96e0-a313ee3430b3"
    }
  ]
}

Current behaviour
-----------------
dhcp-host=fa:16:3e:58:42:ed,myport02.my-domain.org,[fd12:3456::aa02]
dhcp-host=fa:16:3e:58:42:ed,myport04.my-domain.org,[fd12:3456::aa04]

NOTE: this configuration means dnsmasq will only ever lease fd12:3456::aa04, as it will always find that as the first valid configuration for MAC fa:16:3e:58:42:ed. In other words, the _current behaviour is broken_.

New behaviour
-------------
dhcp-host=fa:16:3e:58:42:ed,myport02.my-domain.org,[fd12:3456::aa02][fd12:3456::aa04]

This will allow dnsmasq to lease both addresses when requests
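The new hosts-file format described in this report can be illustrated with a short sketch that builds one dhcp-host line from the example port. This is not the neutron dhcp-agent code: build_dhcp_host_entry is a hypothetical helper, and it assumes all fixed IPs belong to the same IPv6 subnet and that the first dns_assignment supplies the host name.

```python
def build_dhcp_host_entry(mac, fqdn, ipv6_addresses):
    """Return one dhcp-host line listing every IPv6 address for a MAC.

    Hypothetical helper: each address is wrapped in brackets and the
    bracketed addresses are concatenated, as in the example in the report.
    """
    if not ipv6_addresses:
        raise ValueError("at least one address is required")
    addr_list = "".join("[%s]" % addr for addr in ipv6_addresses)
    return "dhcp-host=%s,%s,%s" % (mac, fqdn, addr_list)


# Port data copied from the bug description (only the fields used here).
port = {
    "dns_assignment": [
        {"hostname": "myport02", "ip_address": "fd12:3456::aa02",
         "fqdn": "myport02.my-domain.org"},
        {"hostname": "myport04", "ip_address": "fd12:3456::aa04",
         "fqdn": "myport04.my-domain.org"},
    ],
    "fixed_ips": [
        {"ip_address": "fd12:3456::aa02"},
        {"ip_address": "fd12:3456::aa04"},
    ],
    "mac_address": "fa:16:3e:58:42:ed",
}

entry = build_dhcp_host_entry(
    port["mac_address"],
    port["dns_assignment"][0]["fqdn"],  # only one name can be used
    [ip["ip_address"] for ip in port["fixed_ips"]],
)
print(entry)
```

Using only the first fqdn mirrors the trade-off named in the commit message: a single dns_assignment is used for all IPv6 addresses within the same subnet.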
[Yahoo-eng-team] [Bug 1866039] [NEW] [OVN] QoS gives different bandwidth limit measures than ml2/ovs
Public bug reported:

There is a difference in QoS tempest test results between ml2/ovs and ml2/ovn. In the change [1] that enables QoS tempest tests for OVN, the test neutron_tempest_plugin.scenario.test_qos.QoSTest.test_qos_basic_and_update fails on the last check [2], after the policy is updated with the values:

max_kbps=constants.LIMIT_KILO_BITS_PER_SECOND * 3
max_burst_kbps=constants.LIMIT_KILO_BITS_PER_SECOND * 3,

which means:

max_kbps = 3000
max_burst_kbps = 3000

The previous QoS validations in this test pass with the values (max_kbps, max_burst_kbps): (1000, 1000) and (2000, 2000).

I added some more debug logging to the tempest test here [3], so that we can compare the expected and measured values. Those are taken from test runs in the gates.

---
Expected is calculated as:

TOLERANCE_FACTOR = 1.5
constants.LIMIT_KILO_BITS_PER_SECOND = 1000
MULTIPLEXING_FACTOR = 1, 2 or 3, depending on the stage of the test
LIMIT_BYTES_SEC = (constants.LIMIT_KILO_BITS_PER_SECOND * 1024 * TOLERANCE_FACTOR / 8.0) * MULTIPLEXING_FACTOR

---
Results:

The test passes if measured <= expected.

| max_kbps/max_burst_kbps | expected(bps) | ovs(bps) | ovn(bps) | linux_bridge(bps) |
| (1000, 1000) | 192000 | 112613 | 141250 | 129124 |
| (2000, 2000) | 384000 | 311978 | 408886, 411005, 385152, 422114, 352903 | 300163 |
| (3000, 3000) | 576000 | 523677 | 820522,. failed | 459569 |

As we can see, the OVN test failed only for (3000, 3000). For (2000, 2000) it passed after 5 retries.

---
So let's see how QoS is currently configured on OVN:

stack@mjozefcz-devstack-qos-2:~/logs$ neutron qos-bandwidth-limit-rule-list 047f7a8c-e143-471f-979c-4a4d95cefa5e
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+-----------+--------------------------------------+----------------+----------+
| direction | id                                   | max_burst_kbps | max_kbps |
+-----------+--------------------------------------+----------------+----------+
| egress    | 9dd84dc7-f216-432f-b1aa-ec17eb488720 | 3000           | 3000     |
+-----------+--------------------------------------+----------------+----------+

Configured OVN NBDB:

stack@mjozefcz-devstack-qos-2:~/logs$ ovn-nbctl list qos
_uuid               : 1176fe8f-695d-4f79-a99f-f0df8a7b8652
action              : {}
bandwidth           : {burst=3000, rate=3000}
direction           : from-lport
external_ids        : {}
match               : "inport == \"4521ef05-d139-4d84-a100-efb83fde2b47\""
priority            : 2002

Configured meter on bridge:

stack@mjozefcz-devstack-qos-2:~/logs$ sudo ovs-ofctl -O OpenFlow13 dump-meters br-int
OFPST_METER_CONFIG reply (OF1.3) (xid=0x2):
meter=1 kbps burst stats bands=
type=drop rate=3000 burst_size=3000

Flow in bridge:

stack@mjozefcz-devstack-qos-2:~/logs$ sudo ovs-ofctl -O OpenFlow13 dump-flows br-int | grep meter
cookie=0x398f0e17, duration=71156.273s, table=16, n_packets=136127, n_bytes=41572857, priority=2002,reg14=0x4,metadata=0x1 actions=meter:1,resubmit(,17)

--
Questions:
* Why are the test results different compared to ml2/OVS?
* Maybe the burst values should be configured differently?

[1] https://review.opendev.org/#/c/704833/
[2] https://github.com/openstack/neutron-tempest-plugin/blob/328edc882a3debf4f1b39687dfb559d7c5c385f3/neutron_tempest_plugin/scenario/test_qos.py#L271
[3] https://review.opendev.org/#/c/711048/

** Affects: neutron Importance: Undecided Assignee: Maciej Jozefczyk (maciej.jozefczyk) Status: New
** Tags: ovn qos
** Changed in: neutron Assignee: (unassigned) => Maciej Jozefczyk (maciej.jozefczyk)
** Summary changed: - [OVN] QoS gives different burst limit values + [OVN] QoS gives different bandwidth limit values
** Summary changed: - [OVN] QoS gives different bandwidth limit values + [OVN] QoS gives different bandwidth limit measures than ml2/ovs

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
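The expected threshold used in this report can be reproduced with a few lines of arithmetic. The constants and the formula are taken from the report; the function names are hypothetical, and the pass condition (measured at or below expected) is an assumption inferred from the results table, where the lower ml2/ovs figures pass and the higher OVN figures fail.

```python
TOLERANCE_FACTOR = 1.5
LIMIT_KILO_BITS_PER_SECOND = 1000


def expected_limit_bytes_per_sec(multiplexing_factor):
    """LIMIT_BYTES_SEC as given in the report, for one test stage."""
    return (LIMIT_KILO_BITS_PER_SECOND * 1024
            * TOLERANCE_FACTOR / 8.0) * multiplexing_factor


def within_limit(measured, multiplexing_factor):
    """Assumed pass condition: measured throughput stays under the limit."""
    return measured <= expected_limit_bytes_per_sec(multiplexing_factor)


# Stage 3 corresponds to max_kbps = max_burst_kbps = 3000.
for stage in (1, 2, 3):
    print(stage, expected_limit_bytes_per_sec(stage))

print(within_limit(523677, 3))  # ml2/ovs measurement from the table
print(within_limit(820522, 3))  # ml2/ovn measurement from the table
```

The division by 8.0 converts bits to bytes, so the thresholds of 192000, 384000 and 576000 are byte rates for the three stages.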
https://bugs.launchpad.net/bugs/1866039 Title: [OVN] QoS gives different bandwidth limit measures than ml2/ovs Status in neutron: New Bug description: There is a difference in QoS tempest tests results between ml2/ovs and ml2/ovn. In the change [1] that enables QoS tempest tests for OVN the test neutron_tempest_plugin.scenario.test_qos.QoSTest.test_qos_basic_and_update fails on the last check [2], after the policy is updated to be configured with values: max_kbps=constants.LIMIT_KILO_BITS_PER_SECOND * 3 max_burst_kbps=constants.LIMIT_KILO_BITS_PER_SECOND * 3, Which means: max_kbps = 3000 max_burst_kbps = 3000 Previous QoS validations in this test passes with values (max_kbps,