Hello,

In the context of running OpenStack on a cluster of Cavium ThunderX cn8890 aarch64 servers, we are trying to attach virtual functions to a VM.
First, some introduction. This Cavium SoC takes a different approach to Virtual
Functions than x86 NICs do: VFs are always enabled, and there are two
types of VFs and *one single* PF, as follows:
- primary VFs - these are in fact assigned by the system to the physical ports
of the server, e.g. em2p1s0f1, em2p1s0f3 etc. below.
- secondary VFs - the main purpose of these is to provide additional HW queues
under SW control (usually DPDK applications) by automatically binding them to
the needed physical port.
- one single "physical" function, device 0002:01:00.0 below, which to the best
of my knowledge acts merely as a stub and cannot be assigned an interface name.
Below is the output of "dpdk-devbind.py -s" which provides some useful
information.
Network devices using DPDK-compatible driver
============================================
0002:01:00.2 'Device a034' drv=vfio-pci unused=nicvf
Network devices using kernel driver
===================================
0000:01:10.0 'THUNDERX BGX (Common Ethernet Interface)' if= drv=thunder-BGX unused=thunder_bgx,vfio-pci
0000:01:10.1 'THUNDERX BGX (Common Ethernet Interface)' if= drv=thunder-BGX unused=thunder_bgx,vfio-pci
0002:01:00.0 'THUNDERX Network Interface Controller' if= drv=thunder-nic unused=nicpf,vfio-pci
0002:01:00.1 'Device a034' if=em2p1s0f1 drv=thunder-nicvf unused=nicvf,vfio-pci
0002:01:00.3 'Device a034' if=em2p1s0f3 drv=thunder-nicvf unused=nicvf,vfio-pci
0002:01:00.4 'Device a034' if=em2p1s0f4 drv=thunder-nicvf unused=nicvf,vfio-pci
0002:01:00.5 'Device a034' if=em2p1s0f5 drv=thunder-nicvf unused=nicvf,vfio-pci
0002:01:00.6 'Device a034' if= drv=thunder-nicvf unused=nicvf,vfio-pci
0002:01:00.7 'Device a034' if= drv=thunder-nicvf unused=nicvf,vfio-pci
0002:01:01.0 'Device a034' if= drv=thunder-nicvf unused=nicvf,vfio-pci
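
To make the listing easier to reason about, here is a minimal Python sketch that parses device lines of this shape and reports which PCI addresses have an interface name. The function name parse_devbind and the regular expression are my own assumptions for illustration; they are not part of DPDK, and the parser only handles the one-line-per-device layout shown here.

```python
import re

# One device line looks like:
# 0002:01:00.1 'Device a034' if=em2p1s0f1 drv=thunder-nicvf unused=nicvf,vfio-pci
LINE_RE = re.compile(
    r"^(?P<addr>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s+"
    r"'(?P<desc>[^']*)'\s+(?P<rest>.*)$"
)

# A few lines copied from the listing in this mail.
SAMPLE = """\
Network devices using DPDK-compatible driver
0002:01:00.2 'Device a034' drv=vfio-pci unused=nicvf
Network devices using kernel driver
0002:01:00.0 'THUNDERX Network Interface Controller' if= drv=thunder-nic unused=nicpf,vfio-pci
0002:01:00.3 'Device a034' if=em2p1s0f3 drv=thunder-nicvf unused=nicvf,vfio-pci
0002:01:00.6 'Device a034' if= drv=thunder-nicvf unused=nicvf,vfio-pci
"""


def parse_devbind(text):
    """Return one dict per device line; headers and other lines are skipped."""
    devices = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        # Split the trailing "key=value" tokens; "if=" yields an empty value.
        fields = dict(
            token.split("=", 1) for token in match.group("rest").split() if "=" in token
        )
        devices.append(
            {
                "addr": match.group("addr"),
                "desc": match.group("desc"),
                "ifname": fields.get("if") or None,  # empty or missing -> None
                "driver": fields.get("drv"),
            }
        )
    return devices


if __name__ == "__main__":
    for dev in parse_devbind(SAMPLE):
        print(dev["addr"], dev["driver"], dev["ifname"])
```

The point of interest is that 0002:01:00.0, the PF, comes back with no interface name, just like the secondary VFs.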
Now for the problem. I don't have a domain definition because libvirt fails to
start the domain, though I might be able to find what nova generates. What it
tries to do is pass through em2p1s0f3, at address 0002:01:00.3:
<interface type='hostdev' managed='yes'>
<source>
<address type='pci' domain='0x0002' bus='0x1' slot='0x0' function='0x3'/>
</source>
</interface>
You can find attached a trimmed libvirtd.log where the main error is:
43236: error : virPCIGetVirtualFunctionInfo:2927 : internal error: The PF
device for VF /sys/bus/pci/devices/0002:01:00.3 has no network device name
I have actually spent a few days trying some hacks and learning more.
The main idea is that virPCIGetVirtualFunctionInfo fails to find a network
device name for the PF of the VF at address 0002:01:00.3; as I explained in
the introduction, the PF on this Cavium SoC does not get one.
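
For anyone who wants to reproduce the check outside libvirt, here is a rough Python sketch of the sysfs lookup as I understand it: follow the VF's physfn link to the PF, then look for an entry under the PF's net/ directory. This is not libvirt's code. The function name pf_netdev_for_vf and the sysfs_root parameter are hypothetical, and I am assuming the usual kernel layout under /sys/bus/pci/devices.

```python
import os


def pf_netdev_for_vf(vf_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return (pf_pci_address, pf_ifname or None) for the VF at vf_addr."""
    physfn = os.path.join(sysfs_root, vf_addr, "physfn")
    # os.path.exists follows symlinks, so this also catches dangling links.
    if not os.path.exists(physfn):
        raise LookupError(f"{vf_addr} has no physfn link; is it a VF?")
    # The link target's last path component is the PF's PCI address.
    pf_addr = os.path.basename(os.path.realpath(physfn))
    # A PF bound to a netdev driver exposes its interface name(s) under net/.
    net_dir = os.path.join(physfn, "net")
    names = sorted(os.listdir(net_dir)) if os.path.isdir(net_dir) else []
    return pf_addr, (names[0] if names else None)


if __name__ == "__main__":
    import sys

    # Usage: python3 pf_lookup.py 0002:01:00.3
    print(pf_netdev_for_vf(sys.argv[1]))
```

On this platform I would expect the second element to be None for 0002:01:00.3, which matches the error libvirt reports.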
Looking further downstream, almost all of the helper functions need a
linkdev for the physical function, so making libvirt work on this
system would mean some heavy refactoring; one solution would be to use the
sysfs path rather than the interface name.
Even that will not work 100% from what I've seen: at least virNetDevGetVfConfig
uses netlink to save the admin MAC (part of virNetDevSaveNetConfig), and netlink
needs the ifname.
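
To illustrate the dependency from the command line side: the iproute2 equivalent of setting a VF's admin MAC is "ip link set dev PF vf N mac ADDR", which is also addressed through the PF's interface plus a VF index. The small Python sketch below only builds that command; the helper name vf_mac_command is hypothetical, and the example MAC and interface name are made up.

```python
def vf_mac_command(pf_ifname, vf_index, mac):
    """Build the iproute2 argv that sets the admin MAC of one VF via its PF."""
    if not pf_ifname:
        # This is the situation on ThunderX: the PF has no netdev to address.
        raise ValueError("cannot configure a VF without the PF interface name")
    return ["ip", "link", "set", "dev", pf_ifname, "vf", str(vf_index), "mac", mac]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(" ".join(vf_mac_command("eth0", 2, "52:54:00:12:34:56")))
```

With no PF interface name there is nothing to put in the "dev" slot, which is why switching libvirt to sysfs paths alone would not cover this part.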
So I'm quite stuck on finding a workaround/fix for this platform that could
potentially be upstreamed, so that we, ENEA, are not burdened with
maintaining an ugly hack. Right now we are using libvirt 3.5.0, but we can
upgrade to something newer if needed.
My questions are therefore:
1. Is this problem known in the libvirt community?
2. Is there any plan to make it work?
3. Can you give some pointers on an approach to adapt libvirt to this system?
4. Maybe it's worth changing the kernel to assign a sort of dummy interface to
the physical function?
Thanks and sorry for the long email,
/Ciprian
libvirtd_fragment.log
-- libvir-list mailing list [email protected] https://www.redhat.com/mailman/listinfo/libvir-list
