Public bug reported:

Creating a VM instance (say A) with one QAT VF fails with the following
error: "Requested operation is not valid: PCI device 0000:88:04.7 is in use
by driver QEMU, domain instance-00000081". Note that PCI device 0000:88:04.7
is already assigned to another VM (say B).

We have installed the openstack-mitaka release on a CentOS 7 system. It has
two Intel QAT devices (DH895xCC), with 32 VFs available per device. Out of
64 VFs, only 8 are allocated to VM instances, and the rest should be
available.

However, the nova scheduler tries to assign an already-in-use SRIOV VF to the
new instance, and the instance fails to boot. It appears that the nova
database is not tracking which VFs have already been taken. If I shut down
VM B, VM A boots, and vice versa; the two instances cannot run
simultaneously.

We should always be able to create as many instances with the requested
PCI devices as there are available VFs.

Please let me know if additional information is needed. Can anyone
suggest why nova tries to assign a PCI device that has already been
assigned? Is there any way to resolve this issue? Thank you in advance
for your support and help.

[root@localhost ~(keystone_admin)]# lspci -d:435
83:00.0 Co-processor: Intel Corporation DH895XCC Series QAT
88:00.0 Co-processor: Intel Corporation DH895XCC Series QAT
[root@localhost ~(keystone_admin)]#


[root@localhost ~(keystone_admin)]# lspci -d:443 | grep "QAT Virtual Function" | wc -l
64
[root@localhost ~(keystone_admin)]#
 
 
[root@localhost ~(keystone_admin)]# mysql -u root nova -e "SELECT
hypervisor_hostname, address, instance_uuid, status FROM pci_devices JOIN
compute_nodes ON compute_nodes.id=compute_node_id" | grep 0000:88:04.7
localhost  0000:88:04.7    e10a76f3-e58e-4071-a4dd-7a545e8000de    allocated
localhost  0000:88:04.7    c3dbac90-198d-4150-ba0f-a80b912d8021    allocated
localhost  0000:88:04.7    c7f6adad-83f0-4881-b68f-6d154d565ce3    allocated
localhost.nfv.benunets.com 0000:88:04.7    0c3c11a5-f9a4-4f0d-b120-40e4dde843d4    allocated
[root@localhost ~(keystone_admin)]#
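
The query output above already shows the conflict: one VF address is marked
"allocated" for four different instance UUIDs. A minimal sketch of that
check follows, assuming the rows have been parsed into tuples. The helper
name find_double_allocations is hypothetical and is not part of nova.

```python
from collections import defaultdict

# Rows copied from the query output above:
# (hypervisor_hostname, address, instance_uuid, status)
ROWS = [
    ("localhost", "0000:88:04.7",
     "e10a76f3-e58e-4071-a4dd-7a545e8000de", "allocated"),
    ("localhost", "0000:88:04.7",
     "c3dbac90-198d-4150-ba0f-a80b912d8021", "allocated"),
    ("localhost", "0000:88:04.7",
     "c7f6adad-83f0-4881-b68f-6d154d565ce3", "allocated"),
    ("localhost.nfv.benunets.com", "0000:88:04.7",
     "0c3c11a5-f9a4-4f0d-b120-40e4dde843d4", "allocated"),
]


def find_double_allocations(rows):
    """Return {address: [(hostname, instance_uuid), ...]} for every
    address marked 'allocated' to more than one instance."""
    owners = defaultdict(list)
    for hostname, address, instance_uuid, status in rows:
        if status == "allocated":
            owners[address].append((hostname, instance_uuid))
    return {addr: who for addr, who in owners.items() if len(who) > 1}


if __name__ == "__main__":
    for address, who in find_double_allocations(ROWS).items():
        print(address, "is allocated", len(who), "times:")
        for hostname, instance_uuid in who:
            print("  ", hostname, instance_uuid)
```

Note that the four rows span two hypervisor hostnames (localhost and
localhost.nfv.benunets.com), which may be relevant to how the duplicate
records came to exist.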
 
[root@localhost ~(keystone_admin)]# grep -r e10a76f3-e58e-4071-a4dd-7a545e8000de /etc/libvirt/qemu
/etc/libvirt/qemu/instance-00000081.xml:  <uuid>e10a76f3-e58e-4071-a4dd-7a545e8000de</uuid>
/etc/libvirt/qemu/instance-00000081.xml:      <entry name='uuid'>e10a76f3-e58e-4071-a4dd-7a545e8000de</entry>
/etc/libvirt/qemu/instance-00000081.xml:      <source file='/var/lib/nova/instances/e10a76f3-e58e-4071-a4dd-7a545e8000de/disk'/>
/etc/libvirt/qemu/instance-00000081.xml:      <source path='/var/lib/nova/instances/e10a76f3-e58e-4071-a4dd-7a545e8000de/console.log'/>
/etc/libvirt/qemu/instance-00000081.xml:      <source path='/var/lib/nova/instances/e10a76f3-e58e-4071-a4dd-7a545e8000de/console.log'/>
[root@localhost ~(keystone_admin)]#
[root@localhost ~(keystone_admin)]# grep -r 0c3c11a5-f9a4-4f0d-b120-40e4dde843d4 /etc/libvirt/qemu
/etc/libvirt/qemu/instance-000000ab.xml:  <uuid>0c3c11a5-f9a4-4f0d-b120-40e4dde843d4</uuid>
/etc/libvirt/qemu/instance-000000ab.xml:      <entry name='uuid'>0c3c11a5-f9a4-4f0d-b120-40e4dde843d4</entry>
/etc/libvirt/qemu/instance-000000ab.xml:      <source file='/var/lib/nova/instances/0c3c11a5-f9a4-4f0d-b120-40e4dde843d4/disk'/>
/etc/libvirt/qemu/instance-000000ab.xml:      <source path='/var/lib/nova/instances/0c3c11a5-f9a4-4f0d-b120-40e4dde843d4/console.log'/>
/etc/libvirt/qemu/instance-000000ab.xml:      <source path='/var/lib/nova/instances/0c3c11a5-f9a4-4f0d-b120-40e4dde843d4/console.log'/>
[root@localhost ~(keystone_admin)]#
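
The grep output above only maps instance UUIDs to libvirt domain files. To
see which VF a domain actually holds, the <hostdev> entries of the domain
XML can be read. The sketch below assumes the usual libvirt layout for PCI
passthrough devices; the embedded XML is an illustrative sample, not the
real contents of instance-00000081.xml, and the function name
hostdev_pci_addresses is hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only: the real domain XML was not included in the
# report. The name, UUID, and PCI address are taken from the error message
# and the grep output above.
SAMPLE_DOMAIN_XML = """
<domain type='kvm'>
  <name>instance-00000081</name>
  <uuid>e10a76f3-e58e-4071-a4dd-7a545e8000de</uuid>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x88' slot='0x04' function='0x7'/>
      </source>
    </hostdev>
  </devices>
</domain>
"""


def hostdev_pci_addresses(domain_xml):
    """Return (name, uuid, [pci addresses]) for a libvirt domain XML string.

    Addresses are formatted like the lspci / nova form, e.g. 0000:88:04.7.
    """
    root = ET.fromstring(domain_xml)
    addresses = []
    for hostdev in root.findall("./devices/hostdev[@type='pci']"):
        addr = hostdev.find("./source/address")
        if addr is None:
            continue
        # Attribute values are hex strings such as '0x88'.
        domain, bus, slot, function = (
            int(addr.get(key), 16)
            for key in ("domain", "bus", "slot", "function")
        )
        addresses.append("%04x:%02x:%02x.%x" % (domain, bus, slot, function))
    return root.findtext("name"), root.findtext("uuid"), addresses


if __name__ == "__main__":
    print(hostdev_pci_addresses(SAMPLE_DOMAIN_XML))
```

Running this over each file in /etc/libvirt/qemu would give a view of VF
ownership as libvirt sees it, which can then be compared with the
pci_devices table.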
 
On the controller, it appears there are duplicate PCI device entries in the
database:
 
MariaDB [nova]> select hypervisor_hostname,address,count(*) from pci_devices 
JOIN compute_nodes on compute_nodes.id=compute_node_id group by 
hypervisor_hostname,address having count(*) > 1;
+---------------------+--------------+----------+
| hypervisor_hostname | address      | count(*) |
+---------------------+--------------+----------+
| localhost              | 0000:05:00.0 |        3 |
| localhost              | 0000:05:00.1 |        3 |
| localhost              | 0000:83:01.0 |        3 |
| localhost              | 0000:83:01.1 |        3 |
| localhost              | 0000:83:01.2 |        3 |
| localhost              | 0000:83:01.3 |        3 |
| localhost              | 0000:83:01.4 |        3 |
| localhost              | 0000:83:01.5 |        3 |
| localhost              | 0000:83:01.6 |        3 |
| localhost              | 0000:83:01.7 |        3 |
| localhost              | 0000:83:02.0 |        3 |
| localhost              | 0000:83:02.1 |        3 |
| localhost              | 0000:83:02.2 |        3 |
| localhost              | 0000:83:02.3 |        3 |
| localhost              | 0000:83:02.4 |        3 |
| localhost              | 0000:83:02.5 |        3 |
| localhost              | 0000:83:02.6 |        3 |
| localhost              | 0000:83:02.7 |        3 |
| localhost              | 0000:83:03.0 |        3 |
| localhost              | 0000:83:03.1 |        3 |
| localhost              | 0000:83:03.2 |        3 |
| localhost              | 0000:83:03.3 |        3 |
| localhost              | 0000:83:03.4 |        3 |
| localhost              | 0000:83:03.5 |        3 |
| localhost              | 0000:83:03.6 |        3 |
| localhost              | 0000:83:03.7 |        3 |
| localhost              | 0000:83:04.0 |        3 |
| localhost              | 0000:83:04.1 |        3 |
| localhost              | 0000:83:04.2 |        3 |
| localhost              | 0000:83:04.3 |        3 |
| localhost              | 0000:83:04.4 |        3 |
| localhost              | 0000:83:04.5 |        3 |
| localhost              | 0000:83:04.6 |        3 |
| localhost              | 0000:83:04.7 |        3 |
| localhost              | 0000:88:01.0 |        3 |
| localhost              | 0000:88:01.1 |        3 |
| localhost              | 0000:88:01.2 |        3 |
| localhost              | 0000:88:01.3 |        3 |
| localhost              | 0000:88:01.4 |        3 |
| localhost              | 0000:88:01.5 |        3 |
| localhost              | 0000:88:01.6 |        3 |
| localhost              | 0000:88:01.7 |        3 |
| localhost              | 0000:88:02.0 |        3 |
| localhost              | 0000:88:02.1 |        3 |
| localhost              | 0000:88:02.2 |        3 |
| localhost              | 0000:88:02.3 |        3 |
| localhost              | 0000:88:02.4 |        3 |
| localhost              | 0000:88:02.5 |        3 |
| localhost              | 0000:88:02.6 |        3 |
| localhost              | 0000:88:02.7 |        3 |
| localhost              | 0000:88:03.0 |        3 |
| localhost              | 0000:88:03.1 |        3 |
| localhost              | 0000:88:03.2 |        3 |
| localhost              | 0000:88:03.3 |        3 |
| localhost              | 0000:88:03.4 |        3 |
| localhost              | 0000:88:03.5 |        3 |
| localhost              | 0000:88:03.6 |        3 |
| localhost              | 0000:88:03.7 |        3 |
| localhost              | 0000:88:04.0 |        3 |
| localhost              | 0000:88:04.1 |        3 |
| localhost              | 0000:88:04.2 |        3 |
| localhost              | 0000:88:04.3 |        3 |
| localhost              | 0000:88:04.4 |        3 |
| localhost              | 0000:88:04.5 |        3 |
| localhost              | 0000:88:04.6 |        3 |
| localhost              | 0000:88:04.7 |        3 |
+---------------------+--------------+----------+
66 rows in set (0.00 sec)
 
MariaDB [nova]>
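
The query above groups by hostname, so it cannot tell whether the three rows
per address belong to one compute node record or to several records that
share the hostname "localhost". A sketch of a finer breakdown follows, run
against an in-memory SQLite database so that it is self-contained. The
schema is heavily simplified, the compute node ids (1, 2, 3) are made-up
sample data, and multiple compute node records are only one possible
explanation; none of this is confirmed by the report.

```python
import sqlite3


def build_sample_db():
    """Create a simplified stand-in for the nova tables (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE compute_nodes (
            id INTEGER PRIMARY KEY,
            hypervisor_hostname TEXT
        );
        CREATE TABLE pci_devices (
            id INTEGER PRIMARY KEY,
            compute_node_id INTEGER,
            address TEXT,
            status TEXT
        );
    """)
    # Hypothetical sample rows: three compute node records, one hostname.
    conn.executemany(
        "INSERT INTO compute_nodes (id, hypervisor_hostname) VALUES (?, ?)",
        [(1, "localhost"), (2, "localhost"), (3, "localhost")],
    )
    conn.executemany(
        "INSERT INTO pci_devices (compute_node_id, address, status) "
        "VALUES (?, ?, ?)",
        [(1, "0000:88:04.7", "allocated"),
         (2, "0000:88:04.7", "allocated"),
         (3, "0000:88:04.7", "allocated")],
    )
    return conn


def duplicates_by_compute_node(conn):
    """Same grouping as the query above, plus the number of distinct
    compute node records behind each duplicated address."""
    return conn.execute("""
        SELECT c.hypervisor_hostname,
               p.address,
               COUNT(*)             AS row_count,
               COUNT(DISTINCT c.id) AS node_records
        FROM pci_devices p
        JOIN compute_nodes c ON c.id = p.compute_node_id
        GROUP BY c.hypervisor_hostname, p.address
        HAVING COUNT(*) > 1
    """).fetchall()


if __name__ == "__main__":
    for row in duplicates_by_compute_node(build_sample_db()):
        print(row)
```

If node_records equals row_count on the real database, the duplicates come
from several compute node records with the same hostname rather than from
repeated rows under a single compute node.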

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1633120

Title:
  Nova scheduler tries to assign an already-in-use SRIOV QAT VF to a new
  instance (openstack-mitaka)

Status in OpenStack Compute (nova):
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1633120/+subscriptions
