On Sat, 1 Dec 2018 10:52:21 -0800 (PST) Dongli Zhang <dongli.zh...@oracle.com> wrote:
> Hi,
>
> I obtained below error when assigning an intel 760p 128GB nvme to guest via
> vfio on my desktop:
>
> qemu-system-x86_64: -device vfio-pci,host=0000:01:00.0: vfio 0000:01:00.0:
> failed to add PCI capability 0x11[0x50]@0xb0: table & pba overlap, or they
> don't fit in BARs, or don't align
>
>
> This is because the msix table is overlapping with pba. According to below
> 'lspci -vv' from host, the distance between msix table offset and pba offset
> is only 0x100, although there are 22 entries supported (22 entries need
> 0x160). Looks qemu supports at most 0x800.
>
> # sudo lspci -vv
> ... ...
> 01:00.0 Non-Volatile memory controller: Intel Corporation Device f1a6 (rev 03) (prog-if 02 [NVM Express])
>         Subsystem: Intel Corporation Device 390b
> ... ...
>         Capabilities: [b0] MSI-X: Enable- Count=22 Masked-
>                 Vector table: BAR=0 offset=00002000
>                 PBA: BAR=0 offset=00002100
>
>
>
> A patch below could workaround the issue and passthrough nvme successfully.
>
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 5c7bd96..54fc25e 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -1510,6 +1510,11 @@ static void vfio_msix_early_setup(VFIOPCIDevice *vdev, Error **errp)
>      msix->pba_offset = pba & ~PCI_MSIX_FLAGS_BIRMASK;
>      msix->entries = (ctrl & PCI_MSIX_FLAGS_QSIZE) + 1;
>
> +    if (msix->table_bar == msix->pba_bar &&
> +        msix->table_offset + msix->entries * PCI_MSIX_ENTRY_SIZE > msix->pba_offset) {
> +        msix->entries = (msix->pba_offset - msix->table_offset) / PCI_MSIX_ENTRY_SIZE;
> +    }
> +
>      /*
>       * Test the size of the pba_offset variable and catch if it extends outside
>       * of the specified BAR. If it is the case, we need to apply a hardware
>
>
> Would you please help confirm if this can be regarded as bug in qemu, or issue
> with nvme hardware? Should we fix thin in qemu, or we should never use such
> buggy hardware with vfio?

It's a hardware bug, is there perhaps a firmware update for the device that
resolves it?
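
For reference, the arithmetic behind the reported overlap can be sketched as
follows. This is only an illustration, assuming 16-byte MSI-X table entries
and the N-1 encoded Table Size field (PCI_MSIX_ENTRY_SIZE and
PCI_MSIX_FLAGS_QSIZE in the headers). The helper names are made up for this
mail and are not QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

/* Each MSI-X table entry is 16 bytes (PCI_MSIX_ENTRY_SIZE in the headers). */
#define MSIX_ENTRY_SIZE  16u
/* Table Size field of Message Control, stored as N-1 (PCI_MSIX_FLAGS_QSIZE). */
#define MSIX_FLAGS_QSIZE 0x07FFu

/* Hypothetical helper: number of entries advertised by Message Control. */
static inline uint32_t msix_entries_from_ctrl(uint16_t ctrl)
{
    return (uint32_t)(ctrl & MSIX_FLAGS_QSIZE) + 1u;
}

/* Hypothetical helper: bytes of BAR space the vector table occupies. */
static inline uint32_t msix_table_bytes(uint32_t entries)
{
    return entries * MSIX_ENTRY_SIZE;
}

/*
 * Hypothetical helper: how many entries fit between the table and the PBA
 * when both live in the same BAR and the PBA follows the table.
 */
static inline uint32_t msix_entries_that_fit(uint32_t table_offset,
                                             uint32_t pba_offset)
{
    assert(pba_offset >= table_offset);
    return (pba_offset - table_offset) / MSIX_ENTRY_SIZE;
}
```

With the values from the lspci output above, msix_table_bytes(22) is 0x160,
while msix_entries_that_fit(0x2000, 0x2100) is 16, so the advertised table
runs 0x60 bytes into the PBA.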

It's curious that a vector table size of 0x100 gives us 16 entries, and that
22 in hex is 0x16 (the table size field would report 0x15 given the N-1
encoding). I wonder if there's a hex vs decimal mismatch going on.

We don't really know if the workaround above is correct: are there really 16
entries, or does the PBA actually start at a different offset? We wouldn't
want to generically assume one or the other. I think we need Intel to tell us
in which way their hardware is broken and whether it can be, or already is,
fixed in a firmware update. Thanks,

Alex