[Yahoo-eng-team] [Bug 1905701] Re: Do not recreate libvirt secret when one already exists on the host during a host reboot
https://review.opendev.org/c/openstack/nova/+/765769 proposed to stable/victoria

** Also affects: nova/queens
   Importance: Undecided
       Status: New

** Also affects: nova/rocky
   Importance: Undecided
       Status: New

** Also affects: nova/trunk
   Importance: Undecided
       Status: New

** Also affects: nova/train
   Importance: Undecided
       Status: New

** Also affects: nova/stein
   Importance: Undecided
       Status: New

** Also affects: nova/ussuri
   Importance: Undecided
       Status: New

** Also affects: nova/victoria
   Importance: Undecided
       Status: New

** No longer affects: nova/trunk

** Changed in: nova/victoria
       Status: New => In Progress

** Changed in: nova/victoria
     Assignee: (unassigned) => Lee Yarwood (lyarwood)

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1905701

Title:
  Do not recreate libvirt secret when one already exists on the host
  during a host reboot

Status in OpenStack Compute (nova):
  In Progress
Status in OpenStack Compute (nova) queens series:
  In Progress
Status in OpenStack Compute (nova) rocky series:
  In Progress
Status in OpenStack Compute (nova) stein series:
  In Progress
Status in OpenStack Compute (nova) train series:
  In Progress
Status in OpenStack Compute (nova) ussuri series:
  In Progress
Status in OpenStack Compute (nova) victoria series:
  In Progress

Bug description:
  Description
  ===========
  When [compute]/resume_guests_state_on_host_boot is enabled, the
  compute manager attempts to restart instances on start-up. When using
  the libvirt driver and instances with attached LUKSv1 encrypted
  volumes, a call is made to _attach_encryptor that currently assumes
  the volumes' libvirt secrets don't already exist on the host. As a
  result, this call leads to an attempt to look up encryption metadata
  that fails, because the compute service is using a bare-bones,
  local-only admin context to drive the restart of the instances.

  The libvirt secrets associated with LUKSv1 encrypted volumes actually
  persist across a host reboot, so the calls to fetch the encryption
  metadata, fetch the symmetric key, etc. are not required. Removing
  these calls in this context should allow the compute service to start
  instances with these volumes attached.

  Steps to reproduce
  ==================
  * Enable [compute]/resume_guests_state_on_host_boot
  * Launch instances with encrypted LUKSv1 volumes attached
  * Reboot the underlying host

  Expected result
  ===============
  * The instances are restarted successfully by Nova, as no external
    calls are made and the existing libvirt secret for any encrypted
    LUKSv1 volume is reused.

  Actual result
  =============
  * The instances fail to restart, as the initial calls made by the
    Nova service use an empty admin context without a service catalog
    etc.

  Environment
  ===========
  1. Exact version of OpenStack you are running. See the following
     list for all releases: http://docs.openstack.org/releases/
     master
  2. Which hypervisor did you use? (For example: Libvirt + KVM,
     Libvirt + XEN, Hyper-V, PowerKVM, ...) What's the version of that?
     libvirt + QEMU/KVM
  3. Which storage type did you use? (For example: Ceph, LVM, GPFS, ...)
     What's the version of that?
     N/A
  4. Which networking type did you use? (For example: nova-network,
     Neutron with OpenVSwitch, ...)
     N/A

  Logs & Configs
  ==============
  2020-08-20 11:30:12.273 7 ERROR nova.virt.libvirt.driver [instance: c5b3e7d4-99ea-409c-aba6-d32751f93ccf]   File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 1641, in _connect_volume
  2020-08-20 11:30:12.273 7 ERROR nova.virt.libvirt.driver [instance: c5b3e7d4-99ea-409c-aba6-d32751f93ccf]     self._attach_encryptor(context, connection_info, encryption)
  2020-08-20 11:30:12.273 7 ERROR nova.virt.libvirt.driver [instance: c5b3e7d4-99ea-409c-aba6-d32751f93ccf]   File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/driver.py", line 1760, in _attach_encryptor
  2020-08-20 11:30:12.273 7 ERROR nova.virt.libvirt.driver [instance: c5b3e7d4-99ea-409c-aba6-d32751f93ccf]     key = keymgr.get(context, encryption['encryption_key_id'])
  2020-08-20 11:30:12.273 7 ERROR nova.virt.libvirt.driver [instance: c5b3e7d4-99ea-409c-aba6-d32751f93ccf]   File "/usr/lib/python3.6/site-packages/castellan/key_manager/barbican_key_manager.py", line 575, in get
  2020-08-20 11:30:12.273 7 ERROR nova.virt.libvirt.driver [instance: c5b3e7d4-99ea-409c-aba6-d32751f93ccf]     secret = self._get_secret(context, managed_object_id)
  2020-08-20 11:30:12.273 7 ERROR nova.virt.libvirt.driver [instance: c5b3e7d4-99ea-409c-aba6-d32751f93ccf]   File "/usr/lib/python3.6/site-packages/castellan/key_manager/barbican_key_manager.py", line 545, in _get_secret
  2020-08-20 11:30:12.273 7 ERROR nova.virt.libvirt.driver [instance:
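[Editorial sketch] The fix described above amounts to "look up the secret before recreating it". A minimal sketch of that flow, assuming a hypothetical connection wrapper — `SecretNotFound`, `secret_lookup_by_usage` and `secret_define` are illustrative stand-ins, not nova's or libvirt-python's real API:

```python
class SecretNotFound(Exception):
    """Raised by the illustrative lookup below when no secret exists."""


def ensure_volume_secret(conn, usage_id, fetch_key):
    """Reuse a libvirt secret that persisted across the host reboot.

    Only fall back to fetching the symmetric key (the Barbican call
    that fails under the bare admin context used while resuming guests
    at service start-up) when no secret is defined on the host.
    """
    try:
        return conn.secret_lookup_by_usage(usage_id)
    except SecretNotFound:
        # First attach on this host: fetch the key and define a secret.
        return conn.secret_define(usage_id, fetch_key())
```

With a connection that already holds a secret for the volume, `fetch_key` is never called, which is what lets the host-reboot resume path avoid touching the key manager at all.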
[Yahoo-eng-team] [Bug 1905493] Re: cloud-init status --wait hangs indefinitely in a nested lxd container
It's interesting that apparmor appears to work OK in the first-level
container, but fails in the nested container, e.g.:

$ lxc shell lp1905493-f
root@lp1905493-f:~# systemctl status apparmor
● apparmor.service - Load AppArmor profiles
     Loaded: loaded (/lib/systemd/system/apparmor.service; enabled; vendor preset: enabled)
     Active: active (exited) since Wed 2021-03-17 18:17:44 UTC; 2h 53min ago
       Docs: man:apparmor(7)
             https://gitlab.com/apparmor/apparmor/wikis/home/
    Process: 118 ExecStart=/lib/apparmor/apparmor.systemd reload (code=exited, status=0/SUCCESS)
   Main PID: 118 (code=exited, status=0/SUCCESS)

Mar 17 18:17:44 lp1905493-f systemd[1]: Starting Load AppArmor profiles...
Mar 17 18:17:44 lp1905493-f apparmor.systemd[118]: Restarting AppArmor
Mar 17 18:17:44 lp1905493-f apparmor.systemd[118]: Reloading AppArmor profiles
Mar 17 18:17:44 lp1905493-f apparmor.systemd[129]: Skipping profile in /etc/apparmor.d/disable: usr.sbin.rsyslogd
Mar 17 18:17:44 lp1905493-f systemd[1]: Finished Load AppArmor profiles.

root@lp1905493-f:~# lxc shell layer2
root@layer2:~# systemctl status apparmor
● apparmor.service - Load AppArmor profiles
     Loaded: loaded (/lib/systemd/system/apparmor.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Wed 2021-03-17 18:40:16 UTC; 2h 31min ago
       Docs: man:apparmor(7)
             https://gitlab.com/apparmor/apparmor/wikis/home/
   Main PID: 105 (code=exited, status=1/FAILURE)

Mar 17 18:40:15 layer2 apparmor.systemd[147]: /sbin/apparmor_parser: Unable to replace "nvidia_modprobe". Permission denied; attempted to load a profile while confined?
Mar 17 18:40:15 layer2 apparmor.systemd[157]: /sbin/apparmor_parser: Unable to replace "/usr/bin/man". Permission denied; attempted to load a profile while confined?
Mar 17 18:40:15 layer2 apparmor.systemd[164]: /sbin/apparmor_parser: Unable to replace "/usr/sbin/tcpdump". Permission denied; attempted to load a profile while confined?
Mar 17 18:40:16 layer2 apparmor.systemd[150]: /sbin/apparmor_parser: Unable to replace "/usr/lib/NetworkManager/nm-dhcp-client.action". Permission denied; attempted to load a profile while confined?
Mar 17 18:40:16 layer2 apparmor.systemd[161]: /sbin/apparmor_parser: Unable to replace "mount-namespace-capture-helper". Permission denied; attempted to load a profile while confined?
Mar 17 18:40:16 layer2 apparmor.systemd[161]: /sbin/apparmor_parser: Unable to replace "/usr/lib/snapd/snap-confine". Permission denied; attempted to load a profile while confined?
Mar 17 18:40:16 layer2 apparmor.systemd[105]: Error: At least one profile failed to load
Mar 17 18:40:16 layer2 systemd[1]: apparmor.service: Main process exited, code=exited, status=1/FAILURE
Mar 17 18:40:16 layer2 systemd[1]: apparmor.service: Failed with result 'exit-code'.
Mar 17 18:40:16 layer2 systemd[1]: Failed to start Load AppArmor profiles.

** Also affects: apparmor
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1905493

Title:
  cloud-init status --wait hangs indefinitely in a nested lxd container

Status in AppArmor:
  New
Status in cloud-init:
  Invalid
Status in snapd:
  Confirmed
Status in dbus package in Ubuntu:
  New
Status in systemd package in Ubuntu:
  Invalid

Bug description:
  When booting a nested lxd container inside another lxd container
  (just a normal container, not a VM, i.e. just L2), running
  `cloud-init status --wait` just prints "." indefinitely and never
  returns.

To manage notifications about this bug go to:
https://bugs.launchpad.net/apparmor/+bug/1905493/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1905493] Re: cloud-init status --wait hangs indefinitely in a nested lxd container
Yep, that's what I've found; cloud-init is just waiting for its later
stages to run, which are blocked on snapd.seeded.service exiting.

** Changed in: cloud-init
       Status: New => Invalid

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1905493

Status in cloud-init:
  Invalid
Status in snapd:
  Confirmed
Status in dbus package in Ubuntu:
  New
Status in systemd package in Ubuntu:
  Invalid
[Yahoo-eng-team] [Bug 1905493] Re: cloud-init status --wait hangs indefinitely in a nested lxd container
FWIW I know what the snapd issue is: snapd does not and will not work
in a nested LXD container, so we need to add code to make
snapd.seeded.service die/exit gracefully in this situation.

** Also affects: snapd
   Importance: Undecided
       Status: New

** Changed in: snapd
       Status: New => Confirmed

** Changed in: snapd
   Importance: Undecided => Low

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1905493

Status in cloud-init:
  Invalid
Status in snapd:
  Confirmed
Status in dbus package in Ubuntu:
  New
Status in systemd package in Ubuntu:
  Invalid
[Yahoo-eng-team] [Bug 1552042] Re: Host data corruption through nova inject_key feature
Thanks for following up on this longstanding report. Given the fix is
unlikely to be backported to supported stable branches, the VMT
considers such reports class B1
(https://security.openstack.org/vmt-process.html#incident-report-taxonomy),
so there's no call for issuing an advisory.

** Changed in: ossa
       Status: Incomplete => Won't Fix

** Information type changed from Public Security to Public

** Tags added: security

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1552042

Title:
  Host data corruption through nova inject_key feature

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Security Advisory:
  Won't Fix

Bug description:
  Reported by Garth Mollett from Red Hat.

  The nova.virt.disk.vfs.VFSLocalFS has measures to prevent symlink
  traversal outside of the root of the images directory, but it does
  not prevent access to device nodes inside the image itself. A simple
  fix should be to mount with the 'nodev' option.

  Under certain circumstances, the boot process will fall back to
  VFSLocalFS when trying to inject the public key, for libvirt:

  * when libguestfs is not installed or can't be loaded.
  * use_cow_images=false and inject_partition for non-nbd
  * for loopback mount at least, there is a race condition to win in
    virt/disk/mount/api.py between kpartx and a /dev/mapper/ file
    creation: os.path.exists can run before the path exists even
    though it's there half a second later.

  The xenapi driver is also likely vulnerable, though untested.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1552042/+subscriptions
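[Editorial sketch] The 'nodev' mitigation suggested in the report is essentially a one-line change to the mount options. A sketch of the idea, where `with_nodev` is a hypothetical helper (nova's actual mount handling differs):

```python
def with_nodev(mount_opts):
    """Append 'nodev' to a comma-separated mount-option string.

    With 'nodev', device nodes present inside the guest image are
    inert when the image is mounted on the host for file injection,
    closing the host-corruption path described in the report.
    """
    opts = [o for o in mount_opts.split(",") if o]
    if "nodev" not in opts:
        opts.append("nodev")
    return ",".join(opts)
```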
[Yahoo-eng-team] [Bug 1919386] Re: Project administrators are allowed to view networks across projects
Fix merged in neutron-lib.

** Changed in: neutron
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1919386

Title:
  Project administrators are allowed to view networks across projects

Status in neutron:
  Fix Released

Bug description:
  The new default policies in neutron help fix tenancy issues where
  users of one project are not allowed to view, create, modify, or
  delete resources within another project (enforcing hard tenancy).
  With the new policies enabled by default, I'm able to view networks
  for other projects as an administrator of another project.

  ╭─ubuntu@neutron-devstack /opt/stack/neutron ‹master›
  ╰─➤ $ openstack --os-cloud devstack-alt-admin network create alt-network
  /usr/lib/python3/dist-packages/secretstorage/dhcrypto.py:15: CryptographyDeprecationWarning: int_from_bytes is deprecated, use int.from_bytes instead
    from cryptography.utils import int_from_bytes
  /usr/lib/python3/dist-packages/secretstorage/util.py:19: CryptographyDeprecationWarning: int_from_bytes is deprecated, use int.from_bytes instead
    from cryptography.utils import int_from_bytes
  +---------------------------+--------------------------------------+
  | Field                     | Value                                |
  +---------------------------+--------------------------------------+
  | admin_state_up            | UP                                   |
  | availability_zone_hints   |                                      |
  | availability_zones        |                                      |
  | created_at                | 2021-03-16T21:27:28Z                 |
  | description               |                                      |
  | dns_domain                | None                                 |
  | id                        | 84c7464b-3351-4a47-88d1-3b6615967e87 |
  | ipv4_address_scope        | None                                 |
  | ipv6_address_scope        | None                                 |
  | is_default                | False                                |
  | is_vlan_transparent       | None                                 |
  | mtu                       | 1450                                 |
  | name                      | alt-network                          |
  | port_security_enabled     | True                                 |
  | project_id                | 13bde21b76fe4744904785a9a61512b7     |
  | provider:network_type     | vxlan                                |
  | provider:physical_network | None                                 |
  | provider:segmentation_id  | 3                                    |
  | qos_policy_id             | None                                 |
  | revision_number           | 1                                    |
  | router:external           | Internal                             |
  | segments                  | None                                 |
  | shared                    | False                                |
  | status                    | ACTIVE                               |
  | subnets                   |                                      |
  | tags                      |                                      |
  | updated_at                | 2021-03-16T21:27:28Z                 |
  +---------------------------+--------------------------------------+

  ╭─ubuntu@neutron-devstack /opt/stack/neutron ‹master›
  ╰─➤ $ openstack --os-cloud devstack-admin-admin network show alt-network
  /usr/lib/python3/dist-packages/secretstorage/dhcrypto.py:15: CryptographyDeprecationWarning: int_from_bytes is deprecated, use int.from_bytes instead
    from cryptography.utils import int_from_bytes
  /usr/lib/python3/dist-packages/secretstorage/util.py:19: CryptographyDeprecationWarning: int_from_bytes is deprecated, use int.from_bytes instead
    from cryptography.utils import int_from_bytes
  +---------------------------+--------------------------------------+
  | Field                     | Value                                |
  +---------------------------+--------------------------------------+
  | admin_state_up            | UP                                   |
  | availability_zone_hints   |                                      |
  | availability_zones        |                                      |
  | created_at                | 2021-03-16T21:27:28Z                 |
  | description               |                                      |
  | dns_domain                | None                                 |
  | id                        | 84c7464b-3351-4a47-88d1-3b6615967e87 |
  | ipv4_address_scope        | None                                 |
  | ipv6_address_scope        | None                                 |
  | is_default                | None                                 |
  | is_vlan_transparent       | None                                 |
  |
[Yahoo-eng-team] [Bug 1905493] Re: cloud-init status --wait hangs indefinitely in a nested lxd container
The systemd-logind problem is due to dbus defaulting to apparmor mode
'enabled', but apparmor can't do much of anything inside a container,
so it fails to start and dbus can't contact it.

In the 2nd-level container, create a file like
'/etc/dbus-1/system.d/no-apparmor.conf' with content along these lines
(the snippet was mangled in this archive; reconstructed from the dbus
busconfig DTD it referenced):

<!DOCTYPE busconfig PUBLIC "-//freedesktop//DTD D-Bus Bus Configuration 1.0//EN"
 "http://www.freedesktop.org/standards/dbus/1.0/busconfig.dtd">
<busconfig>
  <apparmor mode="disabled"/>
</busconfig>

Then restart the 2nd-level container and recheck systemd-logind, which
should now work.

Of course, a proper fix would make dbus a bit smarter about disabling
its use of apparmor only when it's inside a container.

However, cloud-init status --wait still hangs after systemd-logind
starts up, so that wasn't the original problem (or at least wasn't the
only problem).

** Also affects: dbus (Ubuntu)
   Importance: Undecided
       Status: New

** Changed in: systemd (Ubuntu)
       Status: New => Invalid

** Changed in: cloud-init
       Status: Invalid => New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1905493

Status in cloud-init:
  New
Status in dbus package in Ubuntu:
  New
Status in systemd package in Ubuntu:
  Invalid
[Yahoo-eng-team] [Bug 1919487] [NEW] virDomainBlockCommit called when deleting a snapshot via os-assisted-volume-snapshots even when instance is shutoff
Public bug reported:

Description
===========
Attempting to delete an NFS volume snapshot (via c-api and the
os-assisted-volume-snapshots n-api) of a volume attached to a SHUTOFF
instance currently results in n-cpu attempting to fire off a
virDomainBlockCommit command even though the instance isn't running.

Steps to reproduce
==================
1. Create multiple volume snapshots against a volume.
2. Attach the volume to an ACTIVE instance.
3. Stop the instance and ensure it is SHUTOFF.
4. Attempt to delete the latest snapshot.

Expected result
===============
qemu-img commit or qemu-img rebase should be used to handle this
offline.

Actual result
=============
virDomainBlockCommit is called even though the domain isn't running.

Environment
===========
1. Exact version of OpenStack you are running. See the following
   list for all releases: http://docs.openstack.org/releases/
   master
2. Which hypervisor did you use? (For example: Libvirt + KVM,
   Libvirt + XEN, Hyper-V, PowerKVM, ...) What's the version of that?
   libvirt + KVM
3. Which storage type did you use? (For example: Ceph, LVM, GPFS, ...)
   What's the version of that?
   NFS c-vol
4. Which networking type did you use? (For example: nova-network,
   Neutron with OpenVSwitch, ...)
   N/A

Logs & Configs
==============
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server [req-570281c6-566e-44a3-9953-eeb634513778 req-0fbbe87f-fd1d-4861-9fb3-21b8eb011e55 service nova] Exception during message handling: libvirt.libvirtError: Requested operation is not valid: domain is not >
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server Traceback (most recent call last):
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.7/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 273, in dispatch
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 193, in _do_dispatch
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.7/site-packages/oslo_messaging/rpc/server.py", line 241, in inner
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server     return func(*args, **kwargs)
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server   File "/opt/stack/nova/nova/exception_wrapper.py", line 78, in wrapped
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server     function_name, call_dict, binary)
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server     self.force_reraise()
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python3.7/site-packages/six.py", line 703, in reraise
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server     raise value
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server   File "/opt/stack/nova/nova/exception_wrapper.py", line 69, in wrapped
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server     return f(self, context, *args, **kw)
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server   File "/opt/stack/nova/nova/compute/manager.py", line 3916, in volume_snapshot_delete
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR oslo_messaging.rpc.server     snapshot_id, delete_info)
Jul 03 09:37:57 localhost.localdomain nova-compute[127223]: ERROR
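[Editorial sketch] The expected behaviour above splits on domain state. A sketch of that dispatch, where the function and file names are illustrative rather than nova's actual code:

```python
def snapshot_delete_plan(domain_running, merge, active_file, rebase_base=None):
    """Pick how to remove a snapshot from a qcow2 chain.

    A running domain can use libvirt's online block job; a SHUTOFF
    domain has no QEMU process to drive it, so the chain must be
    rewritten offline with qemu-img instead of virDomainBlockCommit.
    """
    if domain_running:
        return ["virDomainBlockCommit", active_file]
    if merge:
        # Fold the overlay into its immediate backing file.
        return ["qemu-img", "commit", active_file]
    # Re-point the overlay at an earlier (or empty) backing file.
    return ["qemu-img", "rebase", "-b", rebase_base or "", active_file]
```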
[Yahoo-eng-team] [Bug 1896621] Re: instance corrupted after volume retype
https://review.opendev.org/c/openstack/nova/+/758732 has been released
in ussuri 21.2.0.

** Changed in: nova/ussuri
       Status: New => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1896621

Title:
  instance corrupted after volume retype

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) queens series:
  In Progress
Status in OpenStack Compute (nova) rocky series:
  In Progress
Status in OpenStack Compute (nova) stein series:
  In Progress
Status in OpenStack Compute (nova) train series:
  In Progress
Status in OpenStack Compute (nova) ussuri series:
  Fix Released
Status in OpenStack Compute (nova) victoria series:
  Fix Released

Bug description:
  Description
  ===========
  Following a cinder volume retype on a volume attached to a running
  instance, the instance became corrupt and could no longer boot into
  the guest operating system. On further investigation it seems the
  retype operation failed. The nova-compute logs registered the
  following error:

    Exception during message handling: libvirtError: block copy still
    active: domain has active block job

  See log extract: http://paste.openstack.org/show/798201/

  Steps to reproduce
  ==================
  I'm not sure how easy it would be to replicate the exact problem.
  As an admin user within the project, in Horizon go to
  Project | Volume | Volume, then from the context menu of the
  required volume select "change volume type". Select the new type and
  migration policy 'on-demand'. Following this it was reported that
  the instance was non-responsive; when checking in the console the
  instance was unable to boot from the volume.

  Environment
  ===========
  DISTRIB_ID="OSA"
  DISTRIB_RELEASE="18.1.5"
  DISTRIB_CODENAME="Rocky"
  DISTRIB_DESCRIPTION="OpenStack-Ansible"

  # nova-manage --version
  18.1.1

  # virsh version
  Compiled against library: libvirt 4.0.0
  Using library: libvirt 4.0.0
  Using API: QEMU 4.0.0
  Running hypervisor: QEMU 2.11.1

  Cinder v13.0.3 backed volumes using the Zadara VPSA driver

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1896621/+subscriptions
[Yahoo-eng-team] [Bug 1885528] Re: snapshot delete fails on shutdown VM
** Also affects: nova/rocky
   Importance: Undecided
       Status: New

** Also affects: nova/queens
   Importance: Undecided
       Status: New

** Also affects: nova/ussuri
   Importance: Undecided
       Status: New

** Also affects: nova/victoria
   Importance: Undecided
       Status: New

** Also affects: nova/stein
   Importance: Undecided
       Status: New

** Also affects: nova/trunk
   Importance: Undecided
       Status: New

** Changed in: nova/ussuri
     Assignee: (unassigned) => Lee Yarwood (lyarwood)

** Changed in: nova/victoria
     Assignee: (unassigned) => Lee Yarwood (lyarwood)

** Changed in: nova/rocky
       Status: New => In Progress

** Changed in: nova/trunk
     Assignee: (unassigned) => Lee Yarwood (lyarwood)

** Changed in: nova/ussuri
       Status: New => In Progress

** Changed in: nova/victoria
       Status: New => In Progress

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1885528

Title:
  snapshot delete fails on shutdown VM

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) queens series:
  New
Status in OpenStack Compute (nova) rocky series:
  In Progress
Status in OpenStack Compute (nova) stein series:
  New
Status in OpenStack Compute (nova) trunk series:
  New
Status in OpenStack Compute (nova) ussuri series:
  In Progress
Status in OpenStack Compute (nova) victoria series:
  In Progress

Bug description:
  Description:
  When we try to delete the last snapshot of a VM in shutdown state,
  the snapshot delete fails (and the snapshot is stuck in state
  error-deleting). After setting state==available and re-deleting the
  snapshot, the volume is corrupted and the VM will never start again.
  Volumes are stored on NFS.
  (for root cause and fix, see the bottom of this post)

  To reproduce:
  - storage on NFS
  - create a VM and some snapshots
  - shut down the VM (i.e. the volume is still considered "attached"
    but the vm is no longer "active")
  - delete the last snapshot

  Expected result:
  snapshot is deleted, vm still works

  Actual result:
  The snapshot is stuck on error-deleting. After setting the snapshot
  state==available and deleting the snapshot again, the volume will be
  corrupted and the VM will never start again. (non-existing
  backing_file in qcow on disk)

  Environment:
  - openstack version: stein, deployed via kolla-ansible. I suspect
    this downloads from git but I don't know the exact version.
  - hypervisor: Libvirt + KVM
  - storage: NFS
  - networking: Neutron with OpenVSwitch

  Nova debug logs:

  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [req-d38b5ec8-afdb-4dfe-af12-0c47598c6a47 6dd1c995b2ea4ddfbeb0685bc52e5fbf 6bebb564667d4a75b9281fd826b32ecf - default default] [instance: 711651a3-8440-42dd-a210-e7e550a8624e] Error occurred during volume_snapshot_delete, sending error status to Cinder.: DiskNotFound: No disk at volume-86c06b12-699c-4b54-8bca-fb92c99a2bf0.63d1585e-eb76-4e8f-bc96-93960e9c9692
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e] Traceback (most recent call last):
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2726, in volume_snapshot_delete
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e]     snapshot_id, delete_info=delete_info)
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2686, in _volume_snapshot_delete
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e]     rebase_base)
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2519, in _rebase_with_qemu_img
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e]     b_file_fmt = images.qemu_img_info(backing_file).file_format
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e]   File "/usr/lib/python2.7/site-packages/nova/virt/images.py", line 58, in qemu_img_info
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e]     raise exception.DiskNotFound(location=path)
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance: 711651a3-8440-42dd-a210-e7e550a8624e] DiskNotFound: No disk at volume-86c06b12-699c-4b54-8bca-fb92c99a2bf0.63d1585e-eb76-4e8f-bc96-93960e9c9692
  2020-02-06 12:20:10.713 6 ERROR nova.virt.libvirt.driver [instance:
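[Editorial sketch] The DiskNotFound above fires because the rebase helper probes the qcow2 backing file by the bare name recorded in the image header, which does not resolve from nova's working directory on an NFS backend. A sketch of the kind of guard that avoids it — `resolve_backing_file` is a hypothetical helper, not the actual nova fix:

```python
import os

def resolve_backing_file(volume_dir, backing_file):
    """Resolve a qcow2 backing-file reference before probing it.

    qcow2 headers frequently record the backing file as a bare
    filename relative to the overlay's own directory; joining it with
    the volume directory lets qemu-img find the file instead of the
    probe raising DiskNotFound.
    """
    if os.path.isabs(backing_file):
        return backing_file
    return os.path.join(volume_dir, backing_file)
```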
[Yahoo-eng-team] [Bug 1886855] Re: Insufficient error handling when parsing iscsiadm -m node output with blank iscsi target
** No longer affects: nova -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1886855 Title: Insufficient error handling when parsing iscsiadm -m node output with blank iscsi target Status in Cinder: New Status in os-brick: New Bug description: We encountered the following error when attempting to reboot a VM with multiple attached volumes - 2020-07-02 05:46:05.960 ERROR oslo_messaging.rpc.server [req-0c171deb-bc82-4687-91d6-76e8f95b8e19 service] Exception during message handling: IndexError: list index out of range ... 2020-07-02 05:46:05.960 TRACE oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/os_brick/initiator/connectors/iscsi.py", line 157, in _get_iscsi_nodes 2020-07-02 05:46:05.960 TRACE oslo_messaging.rpc.server lines.append((info[0].split(',')[0], info[1])) 2020-07-02 05:46:05.960 TRACE oslo_messaging.rpc.server IndexError: list index out of range This is observed on os-brick version - 1.15.9 The same code in current master branch - https://github.com/openstack/os-brick/blob/master/os_brick/initiator/connectors/iscsi.py#L136 iscsiadm -m node output - 172.30.0.191:3260,-1 iqn.2010-10.org.openstack:volume-f1ff35f1-9716-4929-831f-32e7b207c742 172.30.0.193:3260,-1 iqn.2010-10.org.openstack:volume-5393e371-337f-4332-b39f-4926e4a1f9f7 172.30.0.193:3260,-1 iqn.2010-10.org.openstack:volume-1520a7d6-4351-416a-a703-c82f1bc9839d []:3260,-1 172.30.0.191:3260,-1 iqn.2010-10.org.openstack:volume-fd632af2-45d9-4266-be67-b84e61fb3cbb 172.30.0.193:3260,-1 iqn.2010-10.org.openstack:volume-c1b325b9-7bd2-4d91-a3ef-295736e52eca 172.30.0.191:3260,-1 iqn.2010-10.org.openstack:volume-6a1a112e-1140-482d-9064-fe1b03391f2b The blank target causes an unhandled exception. 
A simple Python snippet reproduces the problem:

>>> out = "172.30.0.193:3260,-1 iqn.2010-10.org.openstack:volume-1520a7d6-4351-416a-a703-c82f1bc9839d\n[]:3260,-1\n172.30.0.191:3260,-1 iqn.2010-10.org.openstack:volume-fd632af2-45d9-4266-be67-b84e61fb3cbb"
>>> lines = []
>>> out.splitlines()
['172.30.0.193:3260,-1 iqn.2010-10.org.openstack:volume-1520a7d6-4351-416a-a703-c82f1bc9839d', '[]:3260,-1', '172.30.0.191:3260,-1 iqn.2010-10.org.openstack:volume-fd632af2-45d9-4266-be67-b84e61fb3cbb']
>>> for line in out.splitlines():
...     if line:
...         info = line.split()
...         lines.append((info[0].split(',')[0], info[1]))
...
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
IndexError: list index out of range

The blank iscsi target was most probably due to corruption of the iscsi data during discovery. Using strace we could trace that the blank target belongs to an OpenStack volume:

open("/var/lib/iscsi/nodes/iqn.2010-10.org.openstack:volume-d88869e6-d27b-4121-9bd1-d8c86ce9d7e1/172.30.0.193,3260", O_RDONLY) = 5

Expected: only management of the single volume associated with the blank iscsi target to be affected.
Observed: none of the volumes or volume-backed VMs can be managed on the affected host.

To manage notifications about this bug go to: https://bugs.launchpad.net/cinder/+bug/1886855/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
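The loop above can be hardened by validating each record before indexing it. A minimal sketch, skipping (rather than crashing on) records that don't split into the expected two fields; the function name is illustrative and not the actual os-brick fix:

```python
# Defensive variant of the parsing loop: a blank target such as
# "[]:3260,-1" yields only one field, so it is skipped instead of
# raising IndexError for the whole host.
def parse_iscsiadm_nodes(out):
    entries = []
    for line in out.splitlines():
        info = line.split()
        if len(info) == 2:
            # (portal without the ",-1" tag suffix, target IQN)
            entries.append((info[0].split(',')[0], info[1]))
        elif line.strip():
            # Malformed record; a real implementation would log a warning.
            continue
    return entries

out = ("172.30.0.193:3260,-1 iqn.2010-10.org.openstack:volume-1520a7d6-4351-416a-a703-c82f1bc9839d\n"
       "[]:3260,-1\n"
       "172.30.0.191:3260,-1 iqn.2010-10.org.openstack:volume-fd632af2-45d9-4266-be67-b84e61fb3cbb")
print(parse_iscsiadm_nodes(out))
```

With this shape, only the volume behind the corrupted record is lost from the result; the other attachments stay manageable, which matches the "Expected" behaviour above.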
[Yahoo-eng-team] [Bug 1907756] Re: ERROR: No matching distribution found for hacking<3.1.0, >=3.0.1
** Changed in: nova Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1907756 Title: ERROR: No matching distribution found for hacking<3.1.0,>=3.0.1 Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) train series: Fix Released Status in OpenStack Compute (nova) ussuri series: Fix Released Bug description: The openstack-tox-lower-constraints job fails in stable/ussuri. https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_0ce/765082/1/check/openstack-tox-lower-constraints/0ceb0d5/job-output.txt

2020-12-11 04:45:08.261271 | ubuntu-bionic | === log start ===
2020-12-11 04:45:08.261311 | ubuntu-bionic | Looking in indexes: https://mirror.dfw.rax.opendev.org/pypi/simple, https://mirror.dfw.rax.opendev.org/wheel/ubuntu-18.04-x86_64
2020-12-11 04:45:08.261335 | ubuntu-bionic | ERROR: Could not find a version that satisfies the requirement hacking<3.1.0,>=3.0.1
2020-12-11 04:45:08.261360 | ubuntu-bionic | ERROR: No matching distribution found for hacking<3.1.0,>=3.0.1

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1907756/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1918250] Re: live migration is failing with libvirt >= 6.8.0
@Martin: You reported this against the upstream nova project but you are linking to the RDO-specific nova wrapper code. Does the reported problem really affect the upstream nova project? I'm marking this Invalid from the upstream nova perspective. If you disagree then please set it back to New and help us by pointing to the fault in upstream nova. ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1918250 Title: live migration is failing with libvirt >= 6.8.0 Status in OpenStack Compute (nova): Invalid Status in tripleo: In Progress Bug description: libvirt 6.8.0 introduced virt-ssh-helper:

+ * remote: ``virt-ssh-helper`` replaces ``nc`` for SSH tunnelling
+
+Libvirt now provides a ``virt-ssh-helper`` binary on the server
+side. The libvirt remote client will use this binary for setting
+up an SSH tunnelled connection to hosts. If not present, it will
+transparently fallback to the traditional ``nc`` tunnel. The new
+binary makes it possible for libvirt to transparently connect
+across hosts even if libvirt is built with a different installation
+prefix on the client vs server. It also enables remote access to
+the unprivileged per-user libvirt daemons (eg using a URI such as
+``qemu+ssh://hostname/session``). The only requirement is that
+``virt-ssh-helper`` is present in $PATH of the remote host.

Libvirt first checks for the `virt-ssh-helper` binary; if it's not present, it falls back to `nc`. The code where the 'nova-migration-wrapper' script looks for the "nc" binary is here[1]. libvirt used to first check for `nc` (netcat), but these two libvirt commits[2][3] -- which are present in the libvirt build used in this bug -- have changed it to first look for `virt-ssh-helper` and, if it is not available, fall back to `nc`. The nova-migration-wrapper doesn't accept this command and denies the connection.
Mar 08 16:52:39 overcloud-novacompute-1 nova_migration_wrapper[240622]: Denying connection='192.168.24.18 54668 192.168.24.9 2022' command=['sh', '-c', "'which", 'virt-ssh-helper', '1>/dev/null', '2>&1;', 'if', 'test', '$?', '=', '0;', 'then', '', '', '', '', 'virt-ssh-helper', "'qemu:///system';", 'else', '', '', '', 'if', "'nc'", '-q', '2>&1', '|', 'grep', '"requires', 'an', 'argument"', '>/dev/null', '2>&1;', 'then', 'ARG=-q0;else', "ARG=;fi;'nc'", '$ARG', '-U', '/var/run/libvirt/libvirt-sock;', "fi'"]

A possible workaround is to force the use of netcat (`nc`) by appending "=netcat" to the migration URI, so the `diff` of the URI is:

- qemu+ssh://nova_migration@compute-0.ctlplane.redhat.local:2022/system?keyfile=/etc/nova/migration/identity
+ qemu+ssh://nova_migration@compute-0.ctlplane.redhat.local:2022/system?keyfile=/etc/nova/migration/identity=netcat

But longer term we want to allow virt-ssh-helper, because that's needed to work properly with the split daemons, as the socket path has changed.

[1] https://github.com/rdo-packages/nova-distgit/blob/rpm-master/nova-migration-wrapper#L32
[2] https://libvirt.org/git/?p=libvirt.git;a=commit;h=f8ec7c842d (rpc: use new virt-ssh-helper binary for remote tunnelling, 2020-07-08)
[3] https://libvirt.org/git/?p=libvirt.git;a=commit;h=7d959c302d (rpc: Fix virt-ssh-helper detection, 2020-10-27)

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1918250/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
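Stripped of the shell-quoting noise, the denied command boils down to a probe-and-fallback. A rough sketch in Python of that logic as reconstructed from the log above (the returned argument lists stand in for the real invocations; this is not libvirt's or the wrapper's actual code):

```python
# Probe for virt-ssh-helper on PATH; if absent, fall back to nc
# against the libvirt socket, mirroring the remote command libvirt
# now sends over SSH. (Socket path taken from the log; the -q0 flag
# is what the probed "nc -q" check in the log selects on GNU netcat.)
import shutil

def pick_tunnel_command(socket_path="/var/run/libvirt/libvirt-sock"):
    if shutil.which("virt-ssh-helper"):
        return ["virt-ssh-helper", "qemu:///system"]
    return ["nc", "-q0", "-U", socket_path]

print(pick_tunnel_command())
```

An allow-list wrapper like nova-migration-wrapper that only matches the old `nc` shape will reject the new compound command, which is exactly the denial logged above.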
[Yahoo-eng-team] [Bug 1917409] Re: neutron-l3-agents won't become active
*** This bug is a duplicate of bug 1883089 *** https://bugs.launchpad.net/bugs/1883089 ** This bug has been marked a duplicate of bug 1883089 [L3] floating IP failed to bind due to no agent gateway port(fip-ns) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1917409 Title: neutron-l3-agents won't become active Status in neutron: Fix Released Status in neutron package in Ubuntu: New Bug description: We have an Ubuntu Ussuri cloud deployed on Ubuntu 20.04 using the juju charms from the 20.08 bundle (planning to upgrade soon). The problem that is occurring is that all l3 agents for routers using a particular external network show up with their ha_state in standby. I've tried removing and re-adding, and we never see the state go to active.

$ neutron l3-agent-list-hosting-router bradm-router
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+-------------+----------------+-------+----------+
| id                                   | host        | admin_state_up | alive | ha_state |
+--------------------------------------+-------------+----------------+-------+----------+
| 09ae92c9-ae8f-4209-b1a8-d593cc6d6602 | oschv1.maas | True           | :-)   | standby  |
| 4d9fe934-b1f8-4c2b-83ea-04971f827209 | oschv2.maas | True           | :-)   | standby  |
| 70b8b60e-7fbd-4b3a-80a3-90875ca72ce6 | oschv4.maas | True           | :-)   | standby  |
+--------------------------------------+-------------+----------------+-------+----------+

This generates a stack trace:

2021-03-01 02:59:47.344 3675486 ERROR neutron.agent.l3.router_info [-] 'NoneType' object has no attribute 'get'
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
    res = self.dispatcher.dispatch(message)
  File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/dispatcher.py", line 276, in dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)
  File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/dispatcher.py", line 196, in _do_dispatch
    result = func(ctxt, **new_args)
  File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 139, in wrapped
    setattr(e, '_RETRY_EXCEEDED', True)
  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 135, in wrapped
    return f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/oslo_db/api.py", line 154, in wrapper
    ectxt.value = e.inner_exc
  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/oslo_db/api.py", line 142, in wrapper
    return f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 183, in wrapped
    LOG.debug("Retry wrapper got retriable exception: %s", e)
  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python3/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/neutron_lib/db/api.py", line 179, in wrapped
    return f(*dup_args, **dup_kwargs)
  File "/usr/lib/python3/dist-packages/neutron/api/rpc/handlers/l3_rpc.py", line 306, in get_agent_gateway_port
    agent_port = self.l3plugin.create_fip_agent_gw_port_if_not_exists(
  File "/usr/lib/python3/dist-packages/neutron/db/l3_dvr_db.py", line 1101, in create_fip_agent_gw_port_if_not_exists
    self._populate_mtu_and_subnets_for_ports(context, [agent_port])
  File "/usr/lib/python3/dist-packages/neutron/db/l3_db.py", line 1772, in _populate_mtu_and_subnets_for_ports
    network_ids = [p['network_id']
  File "/usr/lib/python3/dist-packages/neutron/db/l3_db.py", line 1772, in <listcomp>
    network_ids = [p['network_id']
  File "/usr/lib/python3/dist-packages/neutron/db/l3_db.py", line 1720, in _each_port_having_fixed_ips
    fixed_ips = port.get('fixed_ips', [])

This system was running successfully after deployment,
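The tail of the traceback above can be reproduced in isolation. A minimal sketch (an assumed simplification, not neutron's actual code): when create_fip_agent_gw_port_if_not_exists hands a None port into the list comprehension, calling .get() on it raises exactly this error:

```python
# Minimal reproduction of the failure mode: the agent gateway port
# lookup returned None, and iterating [None] while calling .get() on
# each element raises the AttributeError seen in the l3-agent log.
agent_port = None  # stands in for a missing fip agent gateway port
try:
    network_ids = [p['network_id'] for p in [agent_port]
                   if p.get('fixed_ips')]
except AttributeError as exc:
    print(exc)  # 'NoneType' object has no attribute 'get'
```

This matches the duplicate bug's summary: the floating IP fails to bind because there is no agent gateway port, so None propagates into code that expects a port dict.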
[Yahoo-eng-team] [Bug 1919357] Re: "Secure live migration with QEMU-native TLS in nova"-guide misses essential config option
** Changed in: nova Status: New => In Progress ** Changed in: nova Assignee: (unassigned) => Josephine Seifert (josei) ** Changed in: nova Importance: Undecided => High ** Also affects: nova/stein Importance: Undecided Status: New ** Also affects: nova/victoria Importance: Undecided Status: New ** Also affects: nova/train Importance: Undecided Status: New ** Also affects: nova/ussuri Importance: Undecided Status: New ** Changed in: nova/stein Importance: Undecided => High ** Changed in: nova/train Importance: Undecided => High ** Changed in: nova/ussuri Importance: Undecided => High ** Changed in: nova/victoria Importance: Undecided => High ** Tags added: tls ** Tags added: live-migration -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1919357 Title: "Secure live migration with QEMU-native TLS in nova"-guide misses essential config option Status in OpenStack Compute (nova): In Progress Status in OpenStack Compute (nova) stein series: New Status in OpenStack Compute (nova) train series: New Status in OpenStack Compute (nova) ussuri series: New Status in OpenStack Compute (nova) victoria series: New Status in OpenStack Security Advisory: Won't Fix Status in OpenStack Security Notes: New Bug description: - [x] This doc is inaccurate in this way: __ I followed the guide to set up QEMU-native TLS for live migration. After checking that libvirt is able to use TLS (using tcpdump to listen on the TLS port), I also wanted to check that it works when I live migrate an instance. Apparently it didn't: it used the port for unencrypted TCP [1].
After digging through documentation and code afterwards I found this code part: https://github.com/openstack/nova/blob/stable/victoria/nova/virt/libvirt/driver.py#L1120

    @staticmethod
    def _live_migration_uri(dest):
        uris = {
            'kvm': 'qemu+%(scheme)s://%(dest)s/system',
            'qemu': 'qemu+%(scheme)s://%(dest)s/system',
            'xen': 'xenmigr://%(dest)s/system',
            'parallels': 'parallels+tcp://%(dest)s/system',
        }
        dest = oslo_netutils.escape_ipv6(dest)
        virt_type = CONF.libvirt.virt_type
        # TODO(pkoniszewski): Remove fetching live_migration_uri in Pike
        uri = CONF.libvirt.live_migration_uri
        if uri:
            return uri % dest
        uri = uris.get(virt_type)
        if uri is None:
            raise exception.LiveMigrationURINotAvailable(virt_type=virt_type)
        str_format = {
            'dest': dest,
            'scheme': CONF.libvirt.live_migration_scheme or 'tcp',
        }
        return uri % str_format

Here the URI is built from the config parameter 'live_migration_scheme', falling back to the hard-coded 'tcp' scheme. Coming from the guide for QEMU-native TLS, there was no hint that this config option needs to be set. In fact, without setting 'live_migration_scheme' to tls there is no way to see that the live migration still uses the unencrypted TCP connection - one has to use tcpdump and listen for tcp or tls to recognize it. Neither in the logs nor in any debug output is there any hint that it is still unencrypted! Thus I conclude there might be OpenStack deployments which are configured as the guide says but where these config changes have no effect!

- [x] This is a doc addition request. To fix this, the config parameter 'live_migration_scheme' should be set to tls, and maybe there should be a warning in the documentation that without doing this the traffic is still unencrypted.

- [ ] I have a fix to the document that I can paste below including example: input and output.
[1] without setting 'live_migration_scheme' in the nova.conf

$ tcpdump -i INTERFACE -n -X port 16509 and '(tcp[((tcp[12] & 0xf0) >> 2)] < 0x14 || tcp[((tcp[12] & 0xf0) >> 2)] > 0x17)'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on INTERFACE, link-type EN10MB (Ethernet), capture size 262144 bytes
17:10:56.387407 IP 192.168.70.101.50900 > 192.168.70.100.16509: Flags [P.], seq 304:6488, ack 285, win 502, options [nop,nop,TS val 424149655 ecr 1875309961], length 6184
[hex dump trimmed; the cleartext payload visibly contains 'destination_xml' and the start of the domain XML ('<domain type='kv...'), i.e. the migration stream is unencrypted]
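Based on the description above, the missing step amounts to a one-line change in nova.conf on the compute hosts. A sketch only (the section and option names come from the code quoted in the bug; this is not the full QEMU-native TLS setup from the guide):

```ini
# nova.conf on each compute node: make _live_migration_uri() build a
# qemu+tls:// URI instead of the hard-coded qemu+tcp:// fallback.
[libvirt]
live_migration_scheme = tls
```

Without this, the TLS-capable libvirt setup from the guide sits unused and migration traffic stays on the plain TCP port, as the tcpdump capture above shows.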
[Yahoo-eng-team] [Bug 1552042] Re: Host data corruption through nova inject_key feature
The fix merged to master: https://review.opendev.org/c/openstack/nova/+/324720 ** Changed in: nova Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1552042 Title: Host data corruption through nova inject_key feature Status in OpenStack Compute (nova): Fix Released Status in OpenStack Security Advisory: Incomplete Bug description: Reported by Garth Mollett from Red Hat. The nova.virt.disk.vfs.VFSLocalFS class has measures to prevent symlink traversal outside of the root of the images directory, but it does not prevent access to device nodes inside the image itself. A simple fix should be to mount with the 'nodev' option. Under certain circumstances, the boot process will fall back to VFSLocalFS when trying to inject the public key; for libvirt:

* when libguestfs is not installed or can't be loaded.
* when use_cow_images=false and inject_partition is set, for non-nbd.
* for loopback mount at least, there is a race condition to win in virt/disk/mount/api.py between kpartx and a /dev/mapper/ file creation: os.path.exists can run before the path exists even though it's there half a second later.

The xenapi driver is also likely vulnerable, though untested. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1552042/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
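The suggested 'nodev' hardening can be pictured as adjusting the options the local-mount path passes to mount(8). A minimal sketch under stated assumptions (the helper name and paths are illustrative, not nova's actual mount code):

```python
# Illustrative only: build a mount command whose options include
# 'nodev', so device nodes inside an untrusted guest image are inert
# on the host; 'nosuid' is added in the same spirit.
def build_mount_cmd(image, mountpoint, extra_opts=("nodev", "nosuid")):
    opts = ",".join(("loop",) + tuple(extra_opts))
    return ["mount", "-o", opts, image, mountpoint]

print(build_mount_cmd("/var/lib/nova/instances/disk", "/mnt/guest"))
```

The design point is that the restriction lives in the mount options rather than in path-validation code, so it holds even for attack paths (like crafted device nodes) that symlink checks never see.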