[Bug 1771662] Re: libvirtError: Node device not found: no node device with matching name

 Christian Ehrhardt  Fri, 18 May 2018 00:22:03 -0700

I deployed a fresh Cavium system with 18.04 to get my own view of this
(without openstack/charms in the way).


1. start a basic guest
   $ sudo apt install uvtool-libvirt qemu-efi-aarch64
   $ uvt-simplestreams-libvirt --verbose sync --source \
       http://cloud-images.ubuntu.com/daily arch=arm64 label=daily release=bionic
   $ uvt-kvm create --password=ubuntu b1 release=bionic arch=arm64 label=daily

=> Just works, nothing special in the logs.
Since it was stated that the special VF/PF are not used, this already breaks the 
argument made in the bug report: my guest just works on this system.

2. check the odd PF/VF situation

Please note that I had only the initial renames to the new naming scheme, but 
no others:
dmesg | grep renamed
[   10.450002] thunder-nicvf 0002:01:00.2 enP2p1s0f2: renamed from eth1
[   10.489989] thunder-nicvf 0002:01:00.1 enP2p1s0f1: renamed from eth0
[   10.629936] thunder-nicvf 0002:01:00.4 enP2p1s0f4: renamed from eth3
[   10.877936] thunder-nicvf 0002:01:00.3 enP2p1s0f3: renamed from eth2
[   10.957933] thunder-nicvf 0002:01:00.5 enP2p1s0f5: renamed from eth4
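
To tell boot-time renames like the ones above apart from later ones, the grep output can be parsed into records. This is only a rough sketch: the `Rename` tuple and `parse_renames` helper are hypothetical names of mine, and the regex assumes exactly the dmesg line layout shown above.

```python
import re
from typing import List, NamedTuple


class Rename(NamedTuple):
    timestamp: float  # seconds since boot, as printed by dmesg
    driver: str       # e.g. thunder-nicvf
    pci_addr: str     # e.g. 0002:01:00.2
    new_name: str     # e.g. enP2p1s0f2
    old_name: str     # e.g. eth1


# Assumption: lines look like
# "[   10.450002] thunder-nicvf 0002:01:00.2 enP2p1s0f2: renamed from eth1"
_RENAME_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<driver>\S+)\s+"
    r"(?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s+"
    r"(?P<new>\S+): renamed from (?P<old>\S+)"
)


def parse_renames(dmesg_text: str) -> List[Rename]:
    """Extract interface rename events from dmesg output."""
    renames = []
    for line in dmesg_text.splitlines():
        m = _RENAME_RE.search(line)
        if m:
            renames.append(
                Rename(float(m["ts"]), m["driver"], m["pci"], m["new"], m["old"])
            )
    return renames
```

Feeding it the output of `dmesg | grep renamed` and looking at the timestamps shows quickly whether a rename happened during boot or much later.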

None of the devices has phys_port_id, but that is not fatal,
because I found the same on other platforms; e.g. on ppc64el some devices have 
it and some don't:
/sys/devices/pci0003:00/0003:00:00.0/0003:01:00.0/0003:02:09.0/0003:09:00.0/net/enP3p9s0f0/phys_port_id':
 Operation not supported
/sys/devices/pci0005:00/0005:00:00.0/0005:01:00.3/net/enP5p1s0f3/phys_port_id 
0400000000334233343130363730453131

It will just use NULL, which essentially means there is just one physical
port, and that is fine.
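
The "missing phys_port_id is not fatal" behaviour can be sketched as follows. This is not libvirt's code, just an illustration; `read_phys_port_id` is a hypothetical helper, and it assumes the attribute is reachable via `/sys/class/net/<ifname>/phys_port_id`, which points at the same per-device net directory as the long paths above.

```python
import errno
from pathlib import Path
from typing import Optional


def read_phys_port_id(ifname: str, sysfs_net: str = "/sys/class/net") -> Optional[str]:
    """Return the phys_port_id of an interface, or None if it has none."""
    path = Path(sysfs_net) / ifname / "phys_port_id"
    try:
        value = path.read_text().strip()
    except FileNotFoundError:
        # No such attribute (or no such interface): treat as "no port id".
        return None
    except OSError as exc:
        # Drivers without support answer "Operation not supported".
        if exc.errno in (errno.EOPNOTSUPP, errno.ENOTSUP):
            return None
        raise
    # An empty attribute is treated like a missing one.
    return value or None
```

A `None` result here corresponds to the NULL case described above: the caller simply assumes a single physical port.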

It is more interesting that it later checks physfn which exists on Cavium (but 
not on ppc64 for example)
ll /sys/devices/pci0002:00/0002:00:02.0/0002:01:01.4/physfn
lrwxrwxrwx 1 root root 0 May 18 06:23 
/sys/devices/pci0002:00/0002:00:02.0/0002:01:01.4/physfn -> ../0002:01:00.0/

If this did NOT exist, libvirt would give up here.
But it does exist, so it tries to go on with it and then fails as it doesn't 
find anything.
That would match what we read in the reported upstream mail discussion.

But none of this matters as per jhobbs it should not use those devices
at all.

FYI code in libvirt around that:
virNetDevGetPhysicalFunction
-> virNetDevGetPhysPortID
   -> virNetDevSysfsFile
   This gives you something like
   /sys/devices/pci0002:00/0002:00:02.0/0002:01:00.4/net/enP2p1s0f4/phys_port_id
-> virNetDevSysfsDeviceFile
-> virPCIGetNetName
If none of these functions failed BUT returned no path then the reported 
message appears.
On other HW it either works OR just doesn't find the paths and gives up before 
the error message.
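
To make the three possible outcomes easier to see, here is a much simplified sketch of the decision described above. It is not the libvirt implementation: `pf_netdev_for_vf` and its status strings are made up for illustration, it skips the phys_port_id step, and it only looks at the `physfn` link and the PF's `net/` directory in sysfs.

```python
from pathlib import Path
from typing import Optional, Tuple


def pf_netdev_for_vf(vf_dir: str) -> Tuple[str, Optional[str]]:
    """Classify a PCI device directory such as /sys/bus/pci/devices/0002:01:00.2.

    Returns one of (hypothetical status names):
      ("no-physfn", None)          no physfn link, so the lookup gives up quietly
      ("pf-without-netdev", None)  physfn exists, but the PF has no net device
      ("ok", "<name>")             physfn exists and the PF has a net device
    """
    physfn = Path(vf_dir) / "physfn"
    if not physfn.exists():
        return ("no-physfn", None)

    # The PF's network interface names show up as entries under <PF>/net/.
    net_dir = physfn / "net"
    names = sorted(p.name for p in net_dir.iterdir()) if net_dir.is_dir() else []
    if not names:
        return ("pf-without-netdev", None)
    return ("ok", names[0])
```

On the Cavium system the VFs would land in the middle case (physfn present, nothing found behind it), while on the ppc64 system the first case applies and nothing is reported.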


3. check libvirt capabilities and status
As I asked before, we would need to know the libvirt action that fails, as all 
I tried just works.

Also general probing like one would expect on an initial nova node setup:
  $ virsh capabilities
  $ virsh domcapabilities
  $ virsh sysinfo
  $ virsh nodeinfo
works just fine without the reported errors.

4. Let's even use those devices now
The host uses enP2p1s0f1, that is:
0002:01:00.1 Ethernet controller: Cavium, Inc. THUNDERX Network Interface 
Controller virtual function (rev 09)
So let's use its siblings.
As passthrough host interface:
  0002:01:00.2 Ethernet controller: Cavium, Inc. THUNDERX Network Interface 
Controller virtual function (rev 09)
  <interface type='hostdev' managed='yes'>
    <source>
      <address type='pci' domain='0x0002' bus='0x1' slot='0x0' function='0x2'/>
    </source>
  </interface>
As passthrough generic hostdev:
  0002:01:00.3 Ethernet controller: Cavium, Inc. THUNDERX Network Interface 
Controller virtual function (rev 09)
   <hostdev mode='subsystem' type='pci' managed='yes'>
     <driver name='vfio'/>
     <source>
       <address type='pci' domain='0x0002' bus='0x1' slot='0x0' function='0x3'/>
     </source>
   </hostdev>

Note: please follow the upstream mailing list discussion on the
difference of those.
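
If you need the generic hostdev snippet for several VFs, it can be generated from the PCI address that lspci prints. A minimal sketch; the helper names `pci_address_attrs` and `hostdev_xml` are mine, and it writes zero-padded hex values (e.g. bus='0x01'), which should be equivalent to the shorter form used above.

```python
import re
import xml.etree.ElementTree as ET

# Matches addresses in the lspci form DDDD:BB:SS.F, e.g. 0002:01:00.3
_PCI_RE = re.compile(
    r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def pci_address_attrs(pci_addr: str) -> dict:
    """Split a PCI address into the attributes of an <address type='pci'> element."""
    m = _PCI_RE.match(pci_addr)
    if m is None:
        raise ValueError(f"not a PCI address: {pci_addr!r}")
    domain, bus, slot, func = (int(g, 16) for g in m.groups())
    return {
        "type": "pci",
        "domain": f"0x{domain:04x}",
        "bus": f"0x{bus:02x}",
        "slot": f"0x{slot:02x}",
        "function": f"0x{func:x}",
    }


def hostdev_xml(pci_addr: str) -> str:
    """Build the generic PCI passthrough XML (not the <interface> variant)."""
    hostdev = ET.Element(
        "hostdev", {"mode": "subsystem", "type": "pci", "managed": "yes"}
    )
    ET.SubElement(hostdev, "driver", {"name": "vfio"})
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(source, "address", pci_address_attrs(pci_addr))
    return ET.tostring(hostdev, encoding="unicode")
```

For example, the output of `hostdev_xml("0002:01:00.3")` can be written to hostdev.xml and passed to `virsh attach-device`.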

$ virsh attach-device b1 interface.xml
error: Failed to attach device from interface.xml
error: internal error: The PF device for VF /sys/bus/pci/devices/0002:01:00.2 
has no network device name
And in Log:
4624: error : virPCIGetVirtualFunctionInfo:3016 : internal error: The PF device 
for VF /sys/bus/pci/devices/0002:01:00.2 has no network device name

As outlined in the mail-thread these special devices can still be attached, if 
you let libvirt handle it not as VFs but as generic PCI.
$ virsh attach-device b1 hostdev.xml 
Device attached successfully
My guest can work fine with this now.

And voilà, when you attach it as hostdev then (due to unplugging/plugging on 
the host) you get the device renames you have seen.
[ 3222.919212] vfio-pci 0002:01:00.3: enabling device (0004 -> 0006)
[ 3229.172142] thunder-nicvf 0002:01:00.3: enabling device (0004 -> 0006)
[ 3229.219106] thunder-nicvf 0002:01:00.3 enP2p1s0f3: renamed from eth0


This is your error IMHO, but you said multiple times you are not doing that.
I assume you really want to use the VFs as passthrough devices - which is a 
whole other story than "just set up openstack".

If you really just set up the base nova node, then total +1 on Ryan's comment:
"At this point, we can compare the logs to Xenial, but I think the next
step is back to the charms/nova-compute to determine how a node reports
back to openstack that a compute node is ready."

** Changed in: libvirt (Ubuntu)
       Status: New => Incomplete

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1771662

Title:
  libvirtError: Node device not found: no node device with matching name

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-nova-compute/+bug/1771662/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
