On Wed, 21 Mar 2018 15:46:01 +0000
Ciprian Barbu <ciprian.ba...@enea.com> wrote:

> Hello,
> 
> In the context of running Openstack on a cluster of Cavium ThunderX cn8890 
> aarch64 servers, we are trying to attach virtual functions to a VM.
> 
> First some introduction. This Cavium SoC takes a different approach to Virtual 
> Functions than x86 NICs: VFs are always enabled, and there are two types of 
> VFs and *one single* PF, as follows:
> - primary VFs - these are in fact assigned by the system to the physical 
> ports of the server, e.g. em2p1s0f1, em2p1s0f3 etc. below.
> - secondary VFs - the main purpose of these is to provide additional HW 
> queues under SW control (usually DPDK applications) by automatically binding 
> them to the needed physical port.
> - one single "physical" function, device 0002:01:00.0 below, which to the 
> best of my knowledge acts merely as a stub and cannot be assigned an 
> interface name.
> 
> Below is the output of "dpdk-devbind.py -s" which provides some useful 
> information.
> 
> Network devices using DPDK-compatible driver 
> ============================================
> 0002:01:00.2 'Device a034' drv=vfio-pci unused=nicvf
> 
> Network devices using kernel driver
> ===================================
> 0000:01:10.0 'THUNDERX BGX (Common Ethernet Interface)' if= drv=thunder-BGX unused=thunder_bgx,vfio-pci
> 0000:01:10.1 'THUNDERX BGX (Common Ethernet Interface)' if= drv=thunder-BGX unused=thunder_bgx,vfio-pci
> 0002:01:00.0 'THUNDERX Network Interface Controller' if= drv=thunder-nic unused=nicpf,vfio-pci
> 0002:01:00.1 'Device a034' if=em2p1s0f1 drv=thunder-nicvf unused=nicvf,vfio-pci
> 0002:01:00.3 'Device a034' if=em2p1s0f3 drv=thunder-nicvf unused=nicvf,vfio-pci
> 0002:01:00.4 'Device a034' if=em2p1s0f4 drv=thunder-nicvf unused=nicvf,vfio-pci
> 0002:01:00.5 'Device a034' if=em2p1s0f5 drv=thunder-nicvf unused=nicvf,vfio-pci
> 0002:01:00.6 'Device a034' if= drv=thunder-nicvf unused=nicvf,vfio-pci
> 0002:01:00.7 'Device a034' if= drv=thunder-nicvf unused=nicvf,vfio-pci
> 0002:01:01.0 'Device a034' if= drv=thunder-nicvf unused=nicvf,vfio-pci
> 
> Now for the problem. I don't have a domain definition because libvirt fails 
> to start a domain, but I might be able to find what nova generates. What 
> it tries to do is pass through em2p1s0f3, address 0002:01:00.3:
> <interface type='hostdev' managed='yes'>
>   <source>
>     <address type='pci' domain='0x0002' bus='0x1' slot='0x0' function='0x3'/>
>   </source>
> </interface>

When you use an <interface> definition, I believe libvirt interprets
this specifically as a network device and perhaps expects to find an
interface on the PF through which it can do setup.  You can also
specify assigned devices via a <hostdev> entry, such as:

 <hostdev mode='subsystem' type='pci' managed='yes'>
   <driver name='vfio'/>
   <source>
     <address type='pci' domain='0x0002' bus='0x1' slot='0x0' function='0x3'/>
   </source>
 </hostdev>

In that case libvirt shouldn't care that the device is a VF and
should have no dependency on a PF interface (or on the ability to
configure the VF via the PF), I think.  Cc'ing libvirt experts.  There's
a proposed stub driver in the upstream kernel that would act in a
similar fashion: the host PF driver is nothing more than a stub that
enables the VFs.  So libvirt would need to handle those VFs in a way
that has no dependency on the PF being a network interface, or any
other sort of interface.  Thanks,

Alex
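
The <hostdev> form suggested above can be generated from a PCI address
string such as 0002:01:00.3. The following is a minimal Python sketch, not
something taken from libvirt or nova: the helper name hostdev_xml and the
address regular expression are assumptions made for illustration, and the
element and attribute names simply mirror the XML quoted in this message.

```python
import re
import xml.etree.ElementTree as ET

# Matches "dddd:bb:ss.f" as printed by lspci -D or dpdk-devbind.py.
PCI_ADDR_RE = re.compile(
    r"^(?P<domain>[0-9a-fA-F]{4}):(?P<bus>[0-9a-fA-F]{2}):"
    r"(?P<slot>[0-9a-fA-F]{2})\.(?P<function>[0-7])$"
)


def hostdev_xml(pci_addr: str) -> str:
    """Build a <hostdev> element for one PCI device (hypothetical helper)."""
    m = PCI_ADDR_RE.match(pci_addr)
    if m is None:
        raise ValueError(f"not a PCI address: {pci_addr!r}")

    hostdev = ET.Element("hostdev", mode="subsystem", type="pci", managed="yes")
    ET.SubElement(hostdev, "driver", name="vfio")
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(
        source,
        "address",
        type="pci",
        domain=f"0x{int(m['domain'], 16):04x}",
        bus=f"0x{int(m['bus'], 16):02x}",
        slot=f"0x{int(m['slot'], 16):02x}",
        function=f"0x{int(m['function'], 16):x}",
    )
    return ET.tostring(hostdev, encoding="unicode")


if __name__ == "__main__":
    print(hostdev_xml("0002:01:00.3"))
```

The output is a single-line element that could be pasted into a domain
definition or passed to "virsh attach-device"; whether nova can be made to
emit this form instead of <interface type='hostdev'> is a separate question
that this sketch does not address.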

> You can find attached a trimmed libvirtd.log where the main error is:
> 43236: error : virPCIGetVirtualFunctionInfo:2927 : internal error: The PF 
> device for VF /sys/bus/pci/devices/0002:01:00.3 has no network device name
> 
> I have actually spent a few days trying to do some hacks and learn some more. 
> The main idea is that virPCIGetVirtualFunctionInfo fails to find the physical 
> name for the virtual device at address 0002:01:00.3, which as I explained in 
> the introduction is something that this Cavium SoC does not do.
> 
> Looking further down the stream, almost all of the helper functions need a 
> linkdev for the physical function, which means that making libvirt work on 
> this system means some heavy refactoring, a solution being to use the sysfs 
> path rather than the interface name.
> This will not work 100% from what I've seen, at least virNetDevGetVfConfig 
> uses netlink to save the admin MAC (part of virNetDevSaveNetConfig), and 
> netlink needs the ifname.
> 
> So I'm quite stuck on finding a workaround/fix for this platform which would 
> potentially be something upstreamable, so that we, ENEA, are not burdened with 
> maintaining an ugly hack. Right now we are using libvirt 3.5.0 but we can 
> upgrade to something newer if needed.
> 
> The questions, then, are:
> 1. Is this problem known in the libvirt community?
> 2. Is there any plan to make it work?
> 3. Can you give some pointers on an approach to adapt libvirt to this system?
> 4. Maybe it's worth changing the kernel to assign a sort of dummy interface 
> to the physical function?
> 
> Thanks and sorry for the long email,
> /Ciprian
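
The condition behind the "has no network device name" error in the log
above can be checked by hand before starting a domain. The sketch below is
only an approximation of that check, written under the assumption that the
PF is reached through the VF's "physfn" link in sysfs and that a PF with a
network interface exposes it under a "net" subdirectory; the function name
pf_netdev_name and the sysfs_root parameter are hypothetical and exist only
so the logic can be exercised against a fake directory tree.

```python
import os
from typing import Optional


def pf_netdev_name(
    vf_addr: str, sysfs_root: str = "/sys/bus/pci/devices"
) -> Optional[str]:
    """Return the netdev name of the PF behind a VF, or None if it has none."""
    physfn = os.path.join(sysfs_root, vf_addr, "physfn")
    if not os.path.exists(physfn):
        raise ValueError(f"{vf_addr} has no physfn link; is it a VF?")

    # A PF driver that registers a network interface exposes it as
    # <pf>/net/<ifname>; a stub PF has no "net" directory at all.
    net_dir = os.path.join(physfn, "net")
    try:
        names = sorted(os.listdir(net_dir))
    except (FileNotFoundError, NotADirectoryError):
        return None
    return names[0] if names else None


if __name__ == "__main__":
    name = pf_netdev_name("0002:01:00.3")
    print(name if name is not None else "PF has no network device name")
```

On a system like the one described here, where the PF at 0002:01:00.0 is
only a stub, this would be expected to report that the PF has no network
device name, which is the same situation the libvirt error describes.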

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
