Reviewed:  https://review.opendev.org/c/openstack/nova/+/890406
Committed: https://opendev.org/openstack/nova/commit/f1dc4ec39bcfda1bd4b97e233a9da498b6378c4f
Submitter: "Zuul (22348)"
Branch:    master
commit f1dc4ec39bcfda1bd4b97e233a9da498b6378c4f
Author: Bence Romsics <bence.roms...@gmail.com>
Date:   Wed Aug 2 16:22:55 2023 +0200

    Do not untrack resources of a server being unshelved

    This patch concerns the window during unshelve after the compute
    manager has set the task_state to spawning, claimed resources for
    the VM and called driver.spawn(), so the instance is in vm_state
    SHELVED_OFFLOADED with task_state spawning.

    If a new update_available_resource periodic job starts at this
    point, it collects all the instances assigned to the node to
    calculate resource usage. However, the calculation assumed that a
    VM in SHELVED_OFFLOADED state does not need resource allocation on
    the node (as it is presumably being removed from the node while
    offloaded) and therefore deleted the resource claim. As a result we
    ended up with a VM that spawned successfully but lost its resource
    claim on the node.

    This patch changes what we do in vm_state SHELVED_OFFLOADED,
    task_state spawning: we no longer delete the resource claim in this
    state and we keep tracking the resource in stats.

    Change-Id: I8c9944810c09d501a6d3f60f095d9817b756872d
    Closes-Bug: #2025480
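For illustration, a minimal sketch of the state check the commit message
describes. The function name should_keep_claim and the literal state
strings are assumptions made for this sketch; it is not the actual patch:

    # Minimal sketch, not the real nova code: should the resource
    # tracker keep an instance's claim while recalculating usage?
    def should_keep_claim(vm_state, task_state):
        if vm_state == 'shelved_offloaded':
            # Before the fix: instances in this vm_state were always
            # untracked. After the fix: an actively spawning unshelve
            # keeps its claim.
            return task_state == 'spawning'
        return True

    assert should_keep_claim('shelved_offloaded', 'spawning') is True
    assert should_keep_claim('shelved_offloaded', None) is False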
** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2025480

Title:
  overlapping pinned CPUs after unshelve

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  It seems that after unshelve, occasionally the request for a dedicated
  CPU is ignored. More precisely, the first pinned CPU does not seem to
  be marked as consumed, so the second may end up on the same CPU. This
  was first observed on victoria (6 times out of 46 tries), but then I
  was able to reproduce it on master too (6 times out of 20 tries). The
  logs attached are from the victoria environment, which was a
  single-host all-in-one devstack running only the VMs used for this
  reproduction.

  stable/victoria
  devstack 3eb6e2d7
  nova 1aca09b966

  master
  devstack b10c0602
  nova 2aea80c0af

  config:

  [[post-config|/etc/nova/nova.conf]]
  [DEFAULT]
  scheduler_default_filters = NUMATopologyFilter, ...
  [compute]
  cpu_dedicated_set = 0,1

  Confirming this config in placement:

  $ openstack --os-placement-api-version 1.17 resource provider inventory show 46b3d4de-bb45-4607-8860-040eb2dcd0d7 PCPU
  +------------------+-------+
  | Field            | Value |
  +------------------+-------+
  | allocation_ratio | 1.0   |
  | min_unit         | 1     |
  | max_unit         | 2     |
  | reserved         | 0     |
  | step_size        | 1     |
  | total            | 2     |
  +------------------+-------+

  Reproduction steps:

  openstack flavor create cirros256-pinned --public --vcpus 1 --ram 256 --disk 1 --property hw_rng:allowed=True --property hw:cpu_policy=dedicated
  openstack server list -f value -c ID | xargs -r openstack server delete --wait
  openstack server create --flavor cirros256-pinned --image cirros-0.5.1-x86_64-disk --nic net-id=private vm0 --wait
  openstack server shelve vm0
  sleep 10 # make sure shelve finished
  openstack server create --flavor cirros256-pinned --image cirros-0.5.1-x86_64-disk --nic net-id=private vm1 --wait
  openstack server shelve vm1
  sleep 10
  openstack server unshelve vm0 ; sleep 15 ; openstack server unshelve vm1 # the amount of sleep could easily be relevant
  watch openstack server list # wait until both go ACTIVE

  # both vms ended up on the same cpu
  $ for vm in $( sudo virsh list --name ) ; do sudo virsh dumpxml $vm | xmlstarlet sel -t -v '//vcpupin/@cpuset' ; echo ; done
  0
  0

  (A small script automating this overlap check is sketched below.)
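  As referenced above, a minimal Python sketch of the same overlap check
  (assuming virsh is available and sudo rights; comparing cpuset values
  as exact strings is enough for single-CPU pinnings like the '0' seen
  in this reproduction):

    # Sketch: flag <vcpupin cpuset="..."> values shared by more than
    # one running libvirt domain, i.e. the symptom of this bug.
    import subprocess
    import xml.etree.ElementTree as ET

    def pinned_sets():
        # List running domains, then pull each domain's vcpupin cpusets.
        names = subprocess.check_output(
            ['sudo', 'virsh', 'list', '--name'], text=True).split()
        for name in names:
            xml = subprocess.check_output(
                ['sudo', 'virsh', 'dumpxml', name], text=True)
            cpusets = [pin.get('cpuset')
                       for pin in ET.fromstring(xml).iter('vcpupin')]
            yield name, cpusets

    seen = {}  # cpuset string -> first domain seen using it
    for name, cpusets in pinned_sets():
        for cpuset in cpusets:
            if cpuset in seen:
                print('overlap: %s and %s both pin cpuset %s'
                      % (name, seen[cpuset], cpuset))
            seen.setdefault(cpuset, name)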
  Data collected from the environment where the above reproduction
  triggered the bug:

  $ openstack server list
  +--------------------------------------+------+--------+--------------------------------------------------------+--------------------------+------------------+
  | ID                                   | Name | Status | Networks                                               | Image                    | Flavor           |
  +--------------------------------------+------+--------+--------------------------------------------------------+--------------------------+------------------+
  | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc | vm1  | ACTIVE | private=10.0.0.27, fdfb:ab27:b2b2:0:f816:3eff:fe80:2fd | cirros-0.5.1-x86_64-disk | cirros256-pinned |
  | e30de509-6988-4535-a6f5-520c52fba087 | vm0  | ACTIVE | private=10.0.0.6, fdfb:ab27:b2b2:0:f816:3eff:fe78:d368 | cirros-0.5.1-x86_64-disk | cirros256-pinned |
  +--------------------------------------+------+--------+--------------------------------------------------------+--------------------------+------------------+

  $ openstack server show vm0
  +-------------------------------------+-----------------------------------------------------------------+
  | Field                               | Value                                                           |
  +-------------------------------------+-----------------------------------------------------------------+
  | OS-DCF:diskConfig                   | MANUAL                                                          |
  | OS-EXT-AZ:availability_zone         | nova                                                            |
  | OS-EXT-SRV-ATTR:host                | devstack1v                                                      |
  | OS-EXT-SRV-ATTR:hypervisor_hostname | devstack1v                                                      |
  | OS-EXT-SRV-ATTR:instance_name       | instance-0000001f                                               |
  | OS-EXT-STS:power_state              | Running                                                         |
  | OS-EXT-STS:task_state               | None                                                            |
  | OS-EXT-STS:vm_state                 | active                                                          |
  | OS-SRV-USG:launched_at              | 2023-06-29T10:45:25.000000                                      |
  | OS-SRV-USG:terminated_at            | None                                                            |
  | accessIPv4                          |                                                                 |
  | accessIPv6                          |                                                                 |
  | addresses                           | private=10.0.0.6, fdfb:ab27:b2b2:0:f816:3eff:fe78:d368          |
  | config_drive                        |                                                                 |
  | created                             | 2023-06-29T10:44:42Z                                            |
  | flavor                              | cirros256-pinned (a61fac7f-9d3a-4788-8dd8-b84534ed5ba6)         |
  | hostId                              | a660a4498447edab1528a7c92f7eaccd7aad2c258c896f7820aef9f6        |
  | id                                  | e30de509-6988-4535-a6f5-520c52fba087                            |
  | image                               | cirros-0.5.1-x86_64-disk (aaefde72-6da8-45e3-bc58-d7d1b9333d0e) |
  | key_name                            | None                                                            |
  | name                                | vm0                                                             |
  | progress                            | 0                                                               |
  | project_id                          | 0ad9573f6c57482bbd8202ce43706ca6                                |
  | properties                          |                                                                 |
  | security_groups                     | name='default'                                                  |
  | status                              | ACTIVE                                                          |
  | updated                             | 2023-06-29T10:45:26Z                                            |
  | user_id                             | 971105d66de542588ddccdef15594fab                                |
  | volumes_attached                    |                                                                 |
  +-------------------------------------+-----------------------------------------------------------------+
  $ openstack server show vm1
  +-------------------------------------+-----------------------------------------------------------------+
  | Field                               | Value                                                           |
  +-------------------------------------+-----------------------------------------------------------------+
  | OS-DCF:diskConfig                   | MANUAL                                                          |
  | OS-EXT-AZ:availability_zone         | nova                                                            |
  | OS-EXT-SRV-ATTR:host                | devstack1v                                                      |
  | OS-EXT-SRV-ATTR:hypervisor_hostname | devstack1v                                                      |
  | OS-EXT-SRV-ATTR:instance_name       | instance-00000020                                               |
  | OS-EXT-STS:power_state              | Running                                                         |
  | OS-EXT-STS:task_state               | None                                                            |
  | OS-EXT-STS:vm_state                 | active                                                          |
  | OS-SRV-USG:launched_at              | 2023-06-29T10:45:45.000000                                      |
  | OS-SRV-USG:terminated_at            | None                                                            |
  | accessIPv4                          |                                                                 |
  | accessIPv6                          |                                                                 |
  | addresses                           | private=10.0.0.27, fdfb:ab27:b2b2:0:f816:3eff:fe80:2fd          |
  | config_drive                        |                                                                 |
  | created                             | 2023-06-29T10:45:00Z                                            |
  | flavor                              | cirros256-pinned (a61fac7f-9d3a-4788-8dd8-b84534ed5ba6)         |
  | hostId                              | a660a4498447edab1528a7c92f7eaccd7aad2c258c896f7820aef9f6        |
  | id                                  | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc                            |
  | image                               | cirros-0.5.1-x86_64-disk (aaefde72-6da8-45e3-bc58-d7d1b9333d0e) |
  | key_name                            | None                                                            |
  | name                                | vm1                                                             |
  | progress                            | 0                                                               |
  | project_id                          | 0ad9573f6c57482bbd8202ce43706ca6                                |
  | properties                          |                                                                 |
  | security_groups                     | name='default'                                                  |
  | status                              | ACTIVE                                                          |
  | updated                             | 2023-06-29T10:45:45Z                                            |
  | user_id                             | 971105d66de542588ddccdef15594fab                                |
  | volumes_attached                    |                                                                 |
  +-------------------------------------+-----------------------------------------------------------------+

  $ openstack flavor show cirros256-pinned
  +----------------------------+--------------------------------------------------+
  | Field                      | Value                                            |
  +----------------------------+--------------------------------------------------+
  | OS-FLV-DISABLED:disabled   | False                                            |
  | OS-FLV-EXT-DATA:ephemeral  | 0                                                |
  | access_project_ids         | None                                             |
  | disk                       | 1                                                |
  | id                         | a61fac7f-9d3a-4788-8dd8-b84534ed5ba6             |
  | name                       | cirros256-pinned                                 |
  | os-flavor-access:is_public | True                                             |
  | properties                 | hw:cpu_policy='dedicated', hw_rng:allowed='True' |
  | ram                        | 256                                              |
  | rxtx_factor                | 1.0                                              |
  | swap                       |                                                  |
  | vcpus                      | 1                                                |
  +----------------------------+--------------------------------------------------+

  $ openstack server event list vm0
  +------------------------------------------+--------------------------------------+----------+----------------------------+
  | Request ID                               | Server ID                            | Action   | Start Time                 |
  +------------------------------------------+--------------------------------------+----------+----------------------------+
  | req-e09149a8-6ef9-49c3-a757-57bbc14d6004 | e30de509-6988-4535-a6f5-520c52fba087 | unshelve | 2023-06-29T10:45:19.000000 |
  | req-4628f975-a57b-468e-a41c-0ef405df4bae | e30de509-6988-4535-a6f5-520c52fba087 | shelve   | 2023-06-29T10:44:49.000000 |
  | req-d7595149-3cc6-472c-9da8-7a40c020a620 | e30de509-6988-4535-a6f5-520c52fba087 | create   | 2023-06-29T10:44:42.000000 |
  +------------------------------------------+--------------------------------------+----------+----------------------------+

  $ openstack server event show vm0 req-e09149a8-6ef9-49c3-a757-57bbc14d6004
  +---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+
  | Field         | Value                                                                                                                                                    |
  +---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+
  | action        | unshelve                                                                                                                                                 |
  | events        | [{'event': 'compute_unshelve_instance', 'start_time': '2023-06-29T10:45:19.000000', 'finish_time': '2023-06-29T10:45:26.000000', 'result': 'Success',   |
  |               |  'traceback': None},                                                                                                                                     |
  |               |  {'event': 'schedule_instances', 'start_time': '2023-06-29T10:45:19.000000', 'finish_time': '2023-06-29T10:45:19.000000', 'result': 'Success',          |
  |               |  'traceback': None},                                                                                                                                     |
  |               |  {'event': 'get_image_info', 'start_time': '2023-06-29T10:45:19.000000', 'finish_time': '2023-06-29T10:45:19.000000', 'result': 'Success',              |
  |               |  'traceback': None}]                                                                                                                                     |
  | instance_uuid | e30de509-6988-4535-a6f5-520c52fba087                                                                                                                     |
  | message       | None                                                                                                                                                     |
  | project_id    | 0ad9573f6c57482bbd8202ce43706ca6                                                                                                                         |
  | request_id    | req-e09149a8-6ef9-49c3-a757-57bbc14d6004                                                                                                                 |
  | start_time    | 2023-06-29T10:45:19.000000                                                                                                                               |
  | user_id       | 971105d66de542588ddccdef15594fab                                                                                                                         |
  +---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+

  $ openstack server event show vm0 req-4628f975-a57b-468e-a41c-0ef405df4bae
  +---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  | Field         | Value                                                                                                                                                                   |
  +---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  | action        | shelve                                                                                                                                                                  |
  | events        | [{'event': 'compute_shelve_instance', 'start_time': '2023-06-29T10:44:49.000000', 'finish_time': '2023-06-29T10:44:55.000000', 'result': 'Success', 'traceback': None}] |
  | instance_uuid | e30de509-6988-4535-a6f5-520c52fba087                                                                                                                                    |
  | message       | None                                                                                                                                                                    |
  | project_id    | 0ad9573f6c57482bbd8202ce43706ca6                                                                                                                                        |
  | request_id    | req-4628f975-a57b-468e-a41c-0ef405df4bae                                                                                                                                |
  | start_time    | 2023-06-29T10:44:49.000000                                                                                                                                              |
  | user_id       | 971105d66de542588ddccdef15594fab                                                                                                                                        |
  +---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  $ openstack server event list vm1
  +------------------------------------------+--------------------------------------+----------+----------------------------+
  | Request ID                               | Server ID                            | Action   | Start Time                 |
  +------------------------------------------+--------------------------------------+----------+----------------------------+
  | req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc | unshelve | 2023-06-29T10:45:35.000000 |
  | req-1dc9b980-d765-415a-b946-2daf53f361b1 | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc | shelve   | 2023-06-29T10:45:07.000000 |
  | req-5071fd9f-00e2-44f8-9b6c-8715fe918e98 | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc | create   | 2023-06-29T10:45:00.000000 |
  +------------------------------------------+--------------------------------------+----------+----------------------------+

  $ openstack server event show vm1 req-9ab0eabc-bf97-48bf-b534-dd22df3c6452
  +---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+
  | Field         | Value                                                                                                                                                    |
  +---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+
  | action        | unshelve                                                                                                                                                 |
  | events        | [{'event': 'compute_unshelve_instance', 'start_time': '2023-06-29T10:45:36.000000', 'finish_time': '2023-06-29T10:45:45.000000', 'result': 'Success',   |
  |               |  'traceback': None},                                                                                                                                     |
  |               |  {'event': 'schedule_instances', 'start_time': '2023-06-29T10:45:35.000000', 'finish_time': '2023-06-29T10:45:36.000000', 'result': 'Success',          |
  |               |  'traceback': None},                                                                                                                                     |
  |               |  {'event': 'get_image_info', 'start_time': '2023-06-29T10:45:35.000000', 'finish_time': '2023-06-29T10:45:35.000000', 'result': 'Success',              |
  |               |  'traceback': None}]                                                                                                                                     |
  | instance_uuid | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc                                                                                                                     |
  | message       | None                                                                                                                                                     |
  | project_id    | 0ad9573f6c57482bbd8202ce43706ca6                                                                                                                         |
  | request_id    | req-9ab0eabc-bf97-48bf-b534-dd22df3c6452                                                                                                                 |
  | start_time    | 2023-06-29T10:45:35.000000                                                                                                                               |
  | user_id       | 971105d66de542588ddccdef15594fab                                                                                                                         |
  +---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+

  $ openstack server event show vm1 req-1dc9b980-d765-415a-b946-2daf53f361b1
  +---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  | Field         | Value                                                                                                                                                                   |
  +---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  | action        | shelve                                                                                                                                                                  |
  | events        | [{'event': 'compute_shelve_instance', 'start_time': '2023-06-29T10:45:08.000000', 'finish_time': '2023-06-29T10:45:13.000000', 'result': 'Success', 'traceback': None}] |
  | instance_uuid | 4734b8a5-a6dd-432a-86c9-ba0367bb86cc                                                                                                                                    |
  | message       | None                                                                                                                                                                    |
  | project_id    | 0ad9573f6c57482bbd8202ce43706ca6                                                                                                                                        |
  | request_id    | req-1dc9b980-d765-415a-b946-2daf53f361b1                                                                                                                                |
  | start_time    | 2023-06-29T10:45:07.000000                                                                                                                                              |
  | user_id       | 971105d66de542588ddccdef15594fab                                                                                                                                        |
  +---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

  Logs that seemed relevant to me:

  vm0 unshelve:

  jún 29 10:45:19 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] Attempting to fit instance cell InstanceNUMACell(cpu_pinning_raw={0=0},cpu_policy='dedicated',cpu_thread_policy=None,cpu_topology=VirtCPUTopology,cpuset=set([]),cpuset_reserved=None,id=0,memory=256,pagesize=None,pcpuset=set([0])) on host_cell NUMACell(cpu_usage=0,cpuset=set([]),id=0,memory=7957,memory_usage=0,mempages=[NUMAPagesTopology,NUMAPagesTopology],network_metadata=NetworkMetadata,pcpuset=set([0,1]),pinned_cpus=set([]),siblings=[set([0]),set([1])]) {{(pid=379150) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1046}}
  jún 29 10:45:19 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] Pinning has been requested {{(pid=379150) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1133}}
  jún 29 10:45:19 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] Packing an instance onto a set of siblings: host_cell_free_siblings: [{0}, {1}] instance_cell: InstanceNUMACell(cpu_pinning_raw={0=0},cpu_policy='dedicated',cpu_thread_policy=None,cpu_topology=VirtCPUTopology,cpuset=set([]),cpuset_reserved=None,id=0,memory=256,pagesize=None,pcpuset=set([0])) host_cell_id: 0 threads_per_core: 1 num_cpu_reserved: 0 {{(pid=379150) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:731}}
  jún 29 10:45:19 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] Built sibling_sets: defaultdict(<class 'list'>, {1: [{0}, {1}]}) {{(pid=379150) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:752}}
  jún 29 10:45:19 devstack1v nova-compute[379150]: INFO nova.virt.hardware [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] Computed NUMA topology CPU pinning: usable pCPUs: [[0], [1]], vCPUs mapping: [(0, 0)]
  jún 29 10:45:19 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] Selected cores for pinning: [(0, 0)], in cell 0 {{(pid=379150) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:1013}}
  jún 29 10:45:19 devstack1v nova-compute[379150]: INFO nova.compute.claims [None req-e09149a8-6ef9-49c3-a757-57bbc14d6004 admin admin] [instance: e30de509-6988-4535-a6f5-520c52fba087] Claim successful on node devstack1v
  vm1 unshelve:

  jún 29 10:45:37 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] Attempting to fit instance cell InstanceNUMACell(cpu_pinning_raw={0=0},cpu_policy='dedicated',cpu_thread_policy=None,cpu_topology=VirtCPUTopology,cpuset=set([]),cpuset_reserved=None,id=0,memory=256,pagesize=None,pcpuset=set([0])) on host_cell NUMACell(cpu_usage=0,cpuset=set([]),id=0,memory=7957,memory_usage=0,mempages=[NUMAPagesTopology,NUMAPagesTopology],network_metadata=NetworkMetadata,pcpuset=set([0,1]),pinned_cpus=set([]),siblings=[set([0]),set([1])]) {{(pid=379150) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1046}}
  jún 29 10:45:37 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] Pinning has been requested {{(pid=379150) _numa_fit_instance_cell /opt/stack/nova/nova/virt/hardware.py:1133}}
  jún 29 10:45:37 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] Packing an instance onto a set of siblings: host_cell_free_siblings: [{0}, {1}] instance_cell: InstanceNUMACell(cpu_pinning_raw={0=0},cpu_policy='dedicated',cpu_thread_policy=None,cpu_topology=VirtCPUTopology,cpuset=set([]),cpuset_reserved=None,id=0,memory=256,pagesize=None,pcpuset=set([0])) host_cell_id: 0 threads_per_core: 1 num_cpu_reserved: 0 {{(pid=379150) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:731}}
  jún 29 10:45:37 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] Built sibling_sets: defaultdict(<class 'list'>, {1: [{0}, {1}]}) {{(pid=379150) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:752}}
  jún 29 10:45:37 devstack1v nova-compute[379150]: INFO nova.virt.hardware [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] Computed NUMA topology CPU pinning: usable pCPUs: [[0], [1]], vCPUs mapping: [(0, 0)]
  jún 29 10:45:37 devstack1v nova-compute[379150]: DEBUG nova.virt.hardware [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] Selected cores for pinning: [(0, 0)], in cell 0 {{(pid=379150) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:1013}}
  jún 29 10:45:37 devstack1v nova-compute[379150]: INFO nova.compute.claims [None req-9ab0eabc-bf97-48bf-b534-dd22df3c6452 admin admin] [instance: 4734b8a5-a6dd-432a-86c9-ba0367bb86cc] Claim successful on node devstack1v
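  Note that both claims above start from pinned_cpus=set([]) and free
  siblings [{0}, {1}], so the packing step sees identical input twice
  and picks CPU 0 both times. An illustrative, deliberately simplified
  stand-in for that selection (not nova's actual
  _pack_instance_onto_cores):

    # Simplified illustration: pick CPUs from the first free sibling
    # set that can fit the request.
    def pack_onto_cores(free_siblings, vcpus_needed):
        for sibling_set in free_siblings:
            if len(sibling_set) >= vcpus_needed:
                return sorted(sibling_set)[:vcpus_needed]
        return None

    # With the stale view from the logs (nothing marked pinned), both
    # unshelves get the same answer:
    print(pack_onto_cores([{0}, {1}], 1))  # [0]  vm0's claim
    print(pack_onto_cores([{0}, {1}], 1))  # [0]  vm1's claim, overlapping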
  The first error in the logs came not during unshelve but well after:

  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Error updating resources for node devstack1v.: nova.exception.CPUPinningInvalid: CPU set to pin [0] must be a subset of free CPU set [1]
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager Traceback (most recent call last):
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/manager.py", line 9973, in _update_available_resource_for_node
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     self.rt.update_available_resource(context, nodename,
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 896, in update_available_resource
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     self._update_available_resource(context, resources, startup=startup)
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/usr/local/lib/python3.8/dist-packages/oslo_concurrency/lockutils.py", line 360, in inner
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     return f(*args, **kwargs)
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 968, in _update_available_resource
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     instance_by_uuid = self._update_usage_from_instances(
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 1534, in _update_usage_from_instances
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     self._update_usage_from_instance(context, instance, nodename)
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 1497, in _update_usage_from_instance
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     self._update_usage(self._get_usage_dict(instance, instance),
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/resource_tracker.py", line 1301, in _update_usage
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     cn.numa_topology = hardware.numa_usage_from_instance_numa(
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/virt/hardware.py", line 2450, in numa_usage_from_instance_numa
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     new_cell.pin_cpus(pinned_cpus)
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager   File "/opt/stack/nova/nova/objects/numa.py", line 91, in pin_cpus
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager     raise exception.CPUPinningInvalid(requested=list(cpus),
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager nova.exception.CPUPinningInvalid: CPU set to pin [0] must be a subset of free CPU set [1]
  jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager
  jún 29 10:47:28 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager._instance_usage_audit {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}}
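  The exception comes from the subset check in NUMACell.pin_cpus
  (nova/objects/numa.py, per the traceback above): pinning is only
  valid if the requested CPUs are all still free. A minimal sketch of
  that validation with simplified names, not the real implementation:

    # Sketch of the pin-time validation that raises here: once one
    # instance pins CPU 0, re-pinning CPU 0 for the overlapping
    # instance fails because only CPU 1 is free.
    class CPUPinningInvalid(Exception):
        pass

    def pin_cpus(free_cpus, requested):
        if not requested <= free_cpus:  # set subset check
            raise CPUPinningInvalid(
                'CPU set to pin %s must be a subset of free CPU set %s'
                % (sorted(requested), sorted(free_cpus)))
        return free_cpus - requested

    free = {0, 1}
    free = pin_cpus(free, {0})  # first instance pins CPU 0; free = {1}
    free = pin_cpus(free, {0})  # second, overlapping pin: raises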
"/opt/stack/nova/nova/objects/numa.py", line 91, in pin_cpus jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager raise exception.CPUPinningInvalid(requested=list(cpus), jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager nova.exception.CPUPinningInvalid: CPU set to pin [0] must be a subset of free CPU set [1] jún 29 10:47:27 devstack1v nova-compute[379150]: ERROR nova.compute.manager jún 29 10:47:28 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager._instance_usage_audit {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}} Between the two unshelves the update_available_resource task ran once: $ egrep 'Running periodic task ComputeManager.update_available_resource' ~/n-cpu.log jún 29 10:35:16 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}} jún 29 10:36:17 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}} jún 29 10:37:17 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}} jún 29 10:38:19 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}} jún 29 10:39:20 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}} jún 29 10:40:22 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}} jún 29 10:41:22 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}} jún 29 10:42:23 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic task ComputeManager.update_available_resource {{(pid=379150) run_periodic_tasks /usr/local/lib/python3.8/dist-packages/oslo_service/periodic_task.py:210}} jún 29 10:43:24 devstack1v nova-compute[379150]: DEBUG oslo_service.periodic_task [None req-a52cbcd3-ad79-457e-a0e1-3ec225226411 None None] Running periodic 
  I also attached a .tar.gz with the following contents:

  for service in n-api n-cond-cell1 n-cpu n-sch n-super-cond
  do
      sudo journalctl -u devstack@$service -S '2023-06-29 10:35:00' -U '2023-06-29 10:50:00' > ~/$service.log
  done

  $ sudo virsh list
   Id   Name                State
  -----------------------------------
   63   instance-0000001f   running
   64   instance-00000020   running

  sudo virsh dumpxml instance-0000001f > ~/instance-0000001f.xml
  sudo virsh dumpxml instance-00000020 > ~/instance-00000020.xml

  Let me know if I can collect more information.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2025480/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp