** Summary changed:
- Nova scheduler tries to assign an already-in-use SRIOV QAT VF to a new
instance
+ [SRU] Nova scheduler tries to assign an already-in-use SRIOV QAT VF to a new
instance
** Description changed:
+ [Impact]
+ This patch is required to prevent nova from accidentally marking pci_device
allocations as deleted when it incorrectly reads the passthrough whitelist
+
+ [Test Case]
+ * deploy openstack (any version that supports sriov)
+ * single compute configured for sriov with at least one device in
pci_passthrough_whitelist
+ * create a vm and attach sriov port
+ * remove device from pci_passthrough_whitelist and restart nova-compute
+ * check that pci_devices allocations have not been marked as deleted
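The last step of the test case can be scripted. The following is only a minimal sketch, assuming the rows of nova's pci_devices table have already been fetched as dictionaries with `address`, `status`, `instance_uuid` and `deleted` keys, and that a non-zero `deleted` value means the row was soft-deleted. The helper name `find_deleted_allocations` and the sample rows are hypothetical and not part of nova.

```python
def find_deleted_allocations(rows):
    """Return rows that are still allocated to an instance but soft-deleted."""
    return [
        row for row in rows
        if row["status"] == "allocated"
        and row["instance_uuid"] is not None
        and row["deleted"] != 0
    ]


# Hypothetical sample data, not taken from a real deployment.
sample_rows = [
    {"address": "0000:88:04.6", "status": "allocated",
     "instance_uuid": "uuid-a", "deleted": 0},
    {"address": "0000:88:04.7", "status": "allocated",
     "instance_uuid": "uuid-b", "deleted": 42},
    {"address": "0000:88:04.5", "status": "available",
     "instance_uuid": None, "deleted": 0},
]

bad = find_deleted_allocations(sample_rows)
for row in bad:
    print(f"{row['address']} is allocated to {row['instance_uuid']} "
          "but marked deleted")
```

An empty result after restarting nova-compute would indicate that the test case passes.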
+
+ [Regression Potential]
+ None anticipated
+ ----------------------------------------------------------------------------
Creating a VM instance (say A) with one QAT VF fails with the following
error: "Requested operation is not valid: PCI device 0000:88:04.7 is in use
by driver QEMU, domain instance-00000081". Note that PCI device 0000:88:04.7
is already assigned to another VM (say B).
We have installed the openstack-mitaka release on a CentOS 7 system. It has
two Intel QAT devices, with 32 VF devices available per QAT device
(DH895xCC). Out of 64 VFs, only 8 are allocated to VM instances, and the rest
should be available.
- But the nova scheduler tries to assign an already-in-use SRIOV VF to a new
instance and instance fails. It appears that the nova database is not tracking
which VF's have already been taken. But if I shut down VM B instance, then
other instance VM A boots up and vice-versa. Note that, both the VM instances
cannot run simultaneously because of the aforesaid issue.
+ But the nova scheduler tries to assign an already-in-use SRIOV VF to a new
instance, and the instance fails to boot. It appears that the nova database is
not tracking which VFs have already been taken. If I shut down VM B, then
VM A boots up, and vice versa. The two VM instances cannot run
simultaneously because of this issue.
We should always be able to create as many instances with the requested
PCI devices as there are available VFs.
Please feel free to let me know if additional information is needed. Can
anyone please suggest why it tries to assign same PCI device which has
been assigned already? Is there any way to resolve this issue? Thank you
in advance for your support and help.
[root@localhost ~(keystone_admin)]# lspci -d:435
83:00.0 Co-processor: Intel Corporation DH895XCC Series QAT
88:00.0 Co-processor: Intel Corporation DH895XCC Series QAT
[root@localhost ~(keystone_admin)]#
-
[root@localhost ~(keystone_admin)]# lspci -d:443 | grep "QAT Virtual Function" | wc -l
64
[root@localhost ~(keystone_admin)]#
-
-
+
[root@localhost ~(keystone_admin)]# mysql -u root nova -e "SELECT
hypervisor_hostname, address, instance_uuid, status FROM pci_devices JOIN
compute_nodes on compute_nodes.id=compute_node_id" | grep 0000:88:04.7
localhost 0000:88:04.7 e10a76f3-e58e-4071-a4dd-7a545e8000de allocated
localhost 0000:88:04.7 c3dbac90-198d-4150-ba0f-a80b912d8021 allocated
localhost 0000:88:04.7 c7f6adad-83f0-4881-b68f-6d154d565ce3 allocated
localhost.nfv.benunets.com 0000:88:04.7 0c3c11a5-f9a4-4f0d-b120-40e4dde843d4 allocated
[root@localhost ~(keystone_admin)]#
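The conflict visible in the query output above can also be detected programmatically. This is a rough sketch rather than anything nova itself does: it groups the rows by PCI address and reports addresses claimed by more than one instance UUID. It assumes that `localhost` and `localhost.nfv.benunets.com` refer to the same physical host, so the hostname is ignored when grouping. The function name `find_multiply_allocated` is hypothetical.

```python
from collections import defaultdict

# Rows copied from the query output above:
# (hypervisor_hostname, address, instance_uuid, status)
rows = [
    ("localhost", "0000:88:04.7",
     "e10a76f3-e58e-4071-a4dd-7a545e8000de", "allocated"),
    ("localhost", "0000:88:04.7",
     "c3dbac90-198d-4150-ba0f-a80b912d8021", "allocated"),
    ("localhost", "0000:88:04.7",
     "c7f6adad-83f0-4881-b68f-6d154d565ce3", "allocated"),
    ("localhost.nfv.benunets.com", "0000:88:04.7",
     "0c3c11a5-f9a4-4f0d-b120-40e4dde843d4", "allocated"),
]


def find_multiply_allocated(rows):
    """Map each PCI address to its instance UUIDs when more than one claims it."""
    owners = defaultdict(set)
    for _host, address, instance_uuid, status in rows:
        # Assumption: the hostname is ignored, both names mean the same host.
        if status == "allocated":
            owners[address].add(instance_uuid)
    return {addr: sorted(uuids)
            for addr, uuids in owners.items() if len(uuids) > 1}


conflicts = find_multiply_allocated(rows)
for address, uuids in conflicts.items():
    print(f"{address} is allocated to {len(uuids)} instances")
```

A single VF should map to at most one instance, so any entry in `conflicts` points at inconsistent tracking data.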
-
+
[root@localhost ~(keystone_admin)]# grep -r
e10a76f3-e58e-4071-a4dd-7a545e8000de /etc/libvirt/qemu
/etc/libvirt/qemu/instance-00000081.xml:
<uuid>e10a76f3-e58e-4071-a4dd-7a545e8000de</uuid>
/etc/libvirt/qemu/instance-00000081.xml: <entry
name='uuid'>e10a76f3-e58e-4071-a4dd-7a545e8000de</entry>
/etc/libvirt/qemu/instance-00000081.xml: <source
file='/var/lib/nova/instances/e10a76f3-e58e-4071-a4dd-7a545e8000de/disk'/>
/etc/libvirt/qemu/instance-00000081.xml: <source
path='/var/lib/nova/instances/e10a76f3-e58e-4071-a4dd-7a545e8000de/console.log'/>
/etc/libvirt/qemu/instance-00000081.xml: <source
path='/var/lib/nova/instances/e10a76f3-e58e-4071-a4dd-7a545e8000de/console.log'/>
[root@localhost ~(keystone_admin)]#
[root@localhost ~(keystone_admin)]# grep -r
0c3c11a5-f9a4-4f0d-b120-40e4dde843d4 /etc/libvirt/qemu
/etc/libvirt/qemu/instance-000000ab.xml:
<uuid>0c3c11a5-f9a4-4f0d-b120-40e4dde843d4</uuid>
/etc/libvirt/qemu/instance-000000ab.xml: <entry
name='uuid'>0c3c11a5-f9a4-4f0d-b120-40e4dde843d4</entry>
/etc/libvirt/qemu/instance-000000ab.xml: <source
file='/var/lib/nova/instances/0c3c11a5-f9a4-4f0d-b120-40e4dde843d4/disk'/>
/etc/libvirt/qemu/instance-000000ab.xml: <source
path='/var/lib/nova/instances/0c3c11a5-f9a4-4f0d-b120-40e4dde843d4/console.log'/>
/etc/libvirt/qemu/instance-000000ab.xml: <source
path='/var/lib/nova/instances/0c3c11a5-f9a4-4f0d-b120-40e4dde843d4/console.log'/>
[root@localhost ~(keystone_admin)]#
-
- On the controller, , it appears there are duplicate PCI device entries in the
Database:
-
+
+ On the controller, it appears there are duplicate PCI device entries
+ in the database:
+
MariaDB [nova]> select hypervisor_hostname,address,count(*) from pci_devices
JOIN compute_nodes on compute_nodes.id=compute_node_id group by
hypervisor_hostname,address having count(*) > 1;
+---------------------+--------------+----------+
| hypervisor_hostname | address | count(*) |
+---------------------+--------------+----------+
| localhost | 0000:05:00.0 | 3 |
| localhost | 0000:05:00.1 | 3 |
| localhost | 0000:83:01.0 | 3 |
| localhost | 0000:83:01.1 | 3 |
| localhost | 0000:83:01.2 | 3 |
| localhost | 0000:83:01.3 | 3 |
| localhost | 0000:83:01.4 | 3 |
| localhost | 0000:83:01.5 | 3 |
| localhost | 0000:83:01.6 | 3 |
| localhost | 0000:83:01.7 | 3 |
| localhost | 0000:83:02.0 | 3 |
| localhost | 0000:83:02.1 | 3 |
| localhost | 0000:83:02.2 | 3 |
| localhost | 0000:83:02.3 | 3 |
| localhost | 0000:83:02.4 | 3 |
| localhost | 0000:83:02.5 | 3 |
| localhost | 0000:83:02.6 | 3 |
| localhost | 0000:83:02.7 | 3 |
| localhost | 0000:83:03.0 | 3 |
| localhost | 0000:83:03.1 | 3 |
| localhost | 0000:83:03.2 | 3 |
| localhost | 0000:83:03.3 | 3 |
| localhost | 0000:83:03.4 | 3 |
| localhost | 0000:83:03.5 | 3 |
| localhost | 0000:83:03.6 | 3 |
| localhost | 0000:83:03.7 | 3 |
| localhost | 0000:83:04.0 | 3 |
| localhost | 0000:83:04.1 | 3 |
| localhost | 0000:83:04.2 | 3 |
| localhost | 0000:83:04.3 | 3 |
| localhost | 0000:83:04.4 | 3 |
| localhost | 0000:83:04.5 | 3 |
| localhost | 0000:83:04.6 | 3 |
| localhost | 0000:83:04.7 | 3 |
| localhost | 0000:88:01.0 | 3 |
| localhost | 0000:88:01.1 | 3 |
| localhost | 0000:88:01.2 | 3 |
| localhost | 0000:88:01.3 | 3 |
| localhost | 0000:88:01.4 | 3 |
| localhost | 0000:88:01.5 | 3 |
| localhost | 0000:88:01.6 | 3 |
| localhost | 0000:88:01.7 | 3 |
| localhost | 0000:88:02.0 | 3 |
| localhost | 0000:88:02.1 | 3 |
| localhost | 0000:88:02.2 | 3 |
| localhost | 0000:88:02.3 | 3 |
| localhost | 0000:88:02.4 | 3 |
| localhost | 0000:88:02.5 | 3 |
| localhost | 0000:88:02.6 | 3 |
| localhost | 0000:88:02.7 | 3 |
| localhost | 0000:88:03.0 | 3 |
| localhost | 0000:88:03.1 | 3 |
| localhost | 0000:88:03.2 | 3 |
| localhost | 0000:88:03.3 | 3 |
| localhost | 0000:88:03.4 | 3 |
| localhost | 0000:88:03.5 | 3 |
| localhost | 0000:88:03.6 | 3 |
| localhost | 0000:88:03.7 | 3 |
| localhost | 0000:88:04.0 | 3 |
| localhost | 0000:88:04.1 | 3 |
| localhost | 0000:88:04.2 | 3 |
| localhost | 0000:88:04.3 | 3 |
| localhost | 0000:88:04.4 | 3 |
| localhost | 0000:88:04.5 | 3 |
| localhost | 0000:88:04.6 | 3 |
| localhost | 0000:88:04.7 | 3 |
+---------------------+--------------+----------+
66 rows in set (0.00 sec)
-
+
MariaDB [nova]>
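The duplicate check above can be tried out without a live deployment. The sketch below runs the same GROUP BY / HAVING query against an in-memory SQLite database. The two-table schema is a heavily simplified, hypothetical stand-in for nova's real schema, the inserted rows are made-up sample data, and the `deleted = 0` filter is an added assumption meant to ignore soft-deleted rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical subset of the nova schema.
conn.executescript("""
    CREATE TABLE compute_nodes (
        id INTEGER PRIMARY KEY,
        hypervisor_hostname TEXT
    );
    CREATE TABLE pci_devices (
        id INTEGER PRIMARY KEY,
        compute_node_id INTEGER,
        address TEXT,
        deleted INTEGER DEFAULT 0
    );
""")
conn.execute("INSERT INTO compute_nodes VALUES (1, 'localhost')")
# Made-up data: one address recorded three times, another only once.
conn.executemany(
    "INSERT INTO pci_devices (compute_node_id, address, deleted) "
    "VALUES (?, ?, ?)",
    [(1, "0000:88:04.7", 0), (1, "0000:88:04.7", 0),
     (1, "0000:88:04.7", 0), (1, "0000:88:04.6", 0)],
)

DUPLICATES_SQL = """
    SELECT hypervisor_hostname, address, COUNT(*)
    FROM pci_devices
    JOIN compute_nodes ON compute_nodes.id = compute_node_id
    WHERE pci_devices.deleted = 0
    GROUP BY hypervisor_hostname, address
    HAVING COUNT(*) > 1
"""
duplicates = conn.execute(DUPLICATES_SQL).fetchall()
for host, address, count in duplicates:
    print(host, address, count)
```

Dropping the `WHERE` clause gives the query exactly as it was run on the controller, which also counts rows that are already marked deleted.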
** Tags added: sts-sru-needed
** Also affects: nova (Ubuntu)
Importance: Undecided
Status: New
** Also affects: cloud-archive
Importance: Undecided
Status: New
** Also affects: cloud-archive/mitaka
Importance: Undecided
Status: New
** Also affects: cloud-archive/rocky
Importance: Undecided
Status: New
** Also affects: cloud-archive/ocata
Importance: Undecided
Status: New
** Also affects: cloud-archive/stein
Importance: Undecided
Status: New
** Also affects: cloud-archive/queens
Importance: Undecided
Status: New
** Also affects: nova (Ubuntu Bionic)
Importance: Undecided
Status: New
** Also affects: nova (Ubuntu Cosmic)
Importance: Undecided
Status: New
** Also affects: nova (Ubuntu Xenial)
Importance: Undecided
Status: New
** Also affects: nova (Ubuntu Eoan)
Importance: Undecided
Status: New
** Also affects: nova (Ubuntu Disco)
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1633120
Title:
[SRU] Nova scheduler tries to assign an already-in-use SRIOV QAT VF to
a new instance
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1633120/+subscriptions
--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs