> -----Original Message-----
> From: Peter Xu <pet...@redhat.com>
> Sent: Wednesday, March 27, 2019 10:43 AM
> To: Knut Omang <knut.om...@oracle.com>
> Cc: Elijah Shakkour <elija...@mellanox.com>; Michael S. Tsirkin
> <m...@redhat.com>; Alex Williamson <alex.william...@redhat.com>;
> Marcel Apfelbaum <marcel.apfelb...@gmail.com>; Stefan Hajnoczi
> <stefa...@gmail.com>; qemu-devel@nongnu.org
> Subject: Re: QEMU and vIOMMU support for emulated VF passthrough to
> nested (L2) VM
>
> On Wed, Mar 27, 2019 at 08:57:56AM +0100, Knut Omang wrote:
> > On Wed, 2019-03-27 at 14:41 +0800, Peter Xu wrote:
> > > On Tue, Mar 26, 2019 at 01:23:12PM +0000, Elijah Shakkour wrote:
> > > > Adding QEMU-devel
> > >
> > > Hi, Elijah,
> > >
> > > >
> > > > -----Original Message-----
> > > > From: Michael S. Tsirkin <m...@redhat.com>
> > > > Sent: Tuesday, March 26, 2019 2:53 PM
> > > > To: Elijah Shakkour <elija...@mellanox.com>
> > > > Cc: Knut Omang <knut.om...@oracle.com>; Alex Williamson
> > > > <alex.william...@redhat.com>; Marcel Apfelbaum
> > > > <marcel.apfelb...@gmail.com>; Stefan Hajnoczi
> > > > <stefa...@gmail.com>; pet...@redhat.com
> > > > Subject: Re: QEMU and vIOMMU support for emulated VF passthrough
> > > > to nested (L2) VM
> > > >
> > > > I think you forgot to copy the qemu mailing list.
> > > >
> > > > On Tue, Mar 26, 2019 at 10:08:17AM +0000, Elijah Shakkour wrote:
> > > > > My questions are:
> > > > >
> > > > > - Suppose that there is an emulated NIC that supports SRIOV (I
> > > > > implemented such a NIC), now does QEMU support a scenario of an
> > > > > emulated NIC that supports SRIOV in Hyper-V L1 guest, that
> > > > > invokes VF and pass it to nested linux L2 guest?
> > >
> > > I am not an expert of SR-IOV but I can't see a limitation to not
> > > allow that to happen.
> > >
> > > > > - I'm using vIOMMU in L1, so what is needed to be done in QEMU
> > > > > or maybe in emulated NIC PF/VF to allow DMA remapping and INT
> > > > > remapping work as expected?
> > >
> > > Your below command line should work, and even it seems to be an
> > > overkill.

I had neither DMA nor MMIO read/write working with my old command line.
But when I removed all the CPU flags and passed only "-cpu host", MMIO started working.
DMA read/write from the emulated device still doesn't work for the VF.
For example: the driver gives me a buffer pointer through an MMIO write. This address is an L2 GPA, and when I call pci_dma_read() with it I get:
"Unassigned mem read 0000000000000000"
I expected this address to be translated correctly by the vIOMMU.
Is there something that I'm missing here?

> > >
> > > If your device is completely emulated, IIUC you only simply need
> > > this on the latest QEMU:
> > >
> > >   -M q35 -device intel-iommu
> > >
> > > Split-irqchip and IR is on by default now, so you'll naturally gain
> > > x2apic if it's supported.  You can use x-aw-bits but only if you
> > > really need address space beyond 39 bits (which I suspect).  The
> > > rest parameters are optional too.

What do you mean by "completely emulated"?
I think that, for the sake of this discussion, all I need is configuration space and MMIO callbacks, plus the MSI-X and PCIe capabilities in both PF and VF, and the SR-IOV capability in the PF.

> > >
> > > > > - Does the command line below -that I use to run QEMU- seem ok
> > > > > for the scenario I described to work?
> > >
> > > Before I look into details of the cmdline - I'd say MMIO in L2
> > > should have nothing to do with IOMMU...
> >
> > The addresses used in L2 is the GPAs of the L2, which would typically
> > be different from the L2 HPAs == L1 GPAs, so I think the IOMMU
> > mappings must work.
> >
> > You would need something like 'intel_iommu=on iommu=pt' as boot
> > parameters for L1.
>
> Yes, the IOMMU must work to do the assignment.  What I meant was that
> IOMMU should not be in the code path of MMIO accesses even for L2.
> IIUC that's the processor who reads/writes to the memory region and if
> it's a MMIO issue then it probably has little to do with IOMMU.
>
> Thanks,

As I said, MMIO works, but DMA from the VF device to an L2 GPA fails.
I expected this DMA read to work without any further changes.

> > >
> > > Are you sure the MMIO traps are
> > > setup correctly?  Can the VF do IO properly even without L2?

The answer is yes to both questions.

> >
> > I agree with Peter that just running the VF as another function in L1
> > would be good to test before trying to get L2 passthrough to work.
> >
> > I recommend you also verify that passing the PF through works as
> > expected, unless you already have done so.
> >
> > And do you see correct BAR address values in the lspci -vvv output in
> > the L2 instance?
> >
> > The SR/IOV logic is from QEMU's perspective just another device
> > instance apart from the differences in the BAR setup code, so if
> > passing through a non-virtual device works, and VF BAR addresses
> > appear right, I believe VFs should work as well.
> >
> > > Also I don't know whether there can be some tricks when you boot L2
> > > with vfio-pci when the device to assign is a VF.
> >
> > A lot has happened since I was actively using the SR/IOV patch set
> > myself so that might entirely be possible from my perspective.
> >
> > Thanks,
> > Knut
> >

As I said, my problem now is the translation of the L2 GPA provided by the driver when I issue a DMA read/write to that address from the VF.
Any insights?

> > > > >
> > > > > -----Original Message-----
> > > > > From: Michael S. Tsirkin <m...@redhat.com>
> > > > > Sent: Monday, March 25, 2019 4:14 AM
> > > > > To: Elijah Shakkour <elija...@mellanox.com>
> > > > > Cc: Knut Omang <knut.om...@oracle.com>; Alex Williamson
> > > > > <alex.william...@redhat.com>; Marcel Apfelbaum
> > > > > <marcel.apfelb...@gmail.com>; Stefan Hajnoczi
> > > > > <stefa...@gmail.com>
> > > > > Subject: Re: QEMU and vIOMMU support for emulated VF passthrough
> > > > > to nested (L2) VM
> > > > >
> > > > > Pls post all questions on list.
> > > > > I have a policy against answering off-list mail.
> > > > > Cc Peter Xu might be a good idea, too.
> > > > >
> > > > > On Sun, Mar 24, 2019 at 09:56:26PM +0000, Elijah Shakkour wrote:
> > > > > > Hey,
> > > > > >
> > > > > > I'm emulating Mellanox ConnectX-4 in QEMU and right now, I'm
> > > > > > adding SRIOV capability.
> > > > > > I'm using Knut Omang SRIOV patches rebased to QEMU v2.12.
> > > > > > My server (L0) is Linux. L1 guest is Windows2016 Hyper-V and
> > > > > > L2 guest is Linux RH7.2.
> > > > > > I can see my device in L1 VM and I see the invocation of the
> > > > > > VF via SRIOV capability.
> > > > > > Inside L2 guest I see the virtual function in "lspci" command.
> > > > > > But when driver of L2 guest issues MMIO read/write, my MMIO
> > > > > > ops don't get called.
> > > > > > I implemented my VF basically like Omang SRIOV example patch.
> > > > > >
> > > > > > Could you please shed some light on what you think I might be
> > > > > > missing?
> > > > > >
> > > > > > Here is the command line I run:
> > > > > >
> > > > > > ./x86_64-softmmu/qemu-system-x86_64 \
> > > > > > -machine q35,accel=kvm,usb=off,dump-guest-core=off,kernel-irqchip=split \
> > > > > > -m 32G \
> > > > > > -smp 2 \
> > > > > > -enable-kvm \
> > > > > > -cpu host,vmx=on,ss=on,cx16=on,x2apic=on,hypervisor=on,lahf_lm=on,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,kvm=on \
> > > > > > -vnc 127.0.0.1:0,to=99,id=default \
> > > > > > -drive file=$IMAGE,format=qcow2,if=none,id=drive-sata0-0-0 \
> > > > > > -chardev pty,id=charserial0 \
> > > > > > -device intel-iommu,intremap=on,caching-mode=on,device-iotlb=on,eim=on,x-aw-bits=48 \
> > > > > > -device ide-hd,bus=ide.0,drive=drive-sata0-0-0,id=sata0-0-0,bootindex=0 \
> > > > > > -device pcie-root-port,pref64-reserve=500M,slot=0,id=pcie_port.1,bus=pcie.0,multifunction=on \
> > > > > > -netdev tap,id=tap5,ifname=tap5,script=no,downscript=no \
> > > > > > -device connectx4,netdev=tap5,bus=pcie_port.1,multifunction=on
> > > > > >
> > > > > > Regards,
> > > > > >
> --
> Peter Xu
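
For reference, here is a rough sketch of what a reduced invocation might look like once the two changes discussed above are combined: only "-cpu host" (which is what made MMIO work for me) and Peter's plain "-device intel-iommu". It is an untested assumption, not a known-good configuration. The script only prints the command line and does not start QEMU. "connectx4" is my own emulated device and is not in upstream QEMU, and the default value for IMAGE is a made-up placeholder. I kept kernel-irqchip=split explicit in case the default differs between QEMU versions.

```shell
#!/bin/sh
# Sketch only: assembles and prints a reduced command line; it does not
# launch QEMU. "connectx4" is the emulated device from this thread and is
# not part of upstream QEMU. The IMAGE default is a hypothetical path.
IMAGE="${IMAGE:-l1-hyperv.qcow2}"

qemu_cmdline() {
    # intel-iommu is listed before the other -device options.
    printf '%s ' \
        ./x86_64-softmmu/qemu-system-x86_64 \
        -machine q35,accel=kvm,kernel-irqchip=split \
        -cpu host \
        -m 32G -smp 2 \
        -device intel-iommu \
        -drive "file=$IMAGE,format=qcow2,if=none,id=drive-sata0-0-0" \
        -device ide-hd,bus=ide.0,drive=drive-sata0-0-0,id=sata0-0-0,bootindex=0 \
        -device pcie-root-port,pref64-reserve=500M,slot=0,id=pcie_port.1,bus=pcie.0,multifunction=on \
        -netdev tap,id=tap5,ifname=tap5,script=no,downscript=no \
        -device connectx4,netdev=tap5,bus=pcie_port.1,multifunction=on
    printf '\n'
}

# Print the assembled command line for inspection.
qemu_cmdline
```

Printing instead of executing makes it easy to diff this against the full command line quoted above before trying it.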