On Sun, Sep 18, 2016 at 01:33:27PM +0800, Peter Xu wrote:
> Good to know that we have a solution. :)
> I think this is a good fix, however I still do not understand why this
> is happening... Please see below comment.
> On Thu, Sep 15, 2016 at 04:11:48PM +1000, David Gibson wrote:
> > d1f6af6 "kvm-irqchip: simplify kvm_irqchip_add_msi_route" was a cleanup
> > of kvmchip routing configuration, that was mostly intended for x86.
> > However, it also contains a subtle change in behaviour which breaks EEH[1]
> > error recovery on certain VFIO passthrough devices on spapr guests.  So far
> > it's only been seen on a BCM5719 NIC on a POWER8 server, but there may be
> > other hardware with the same problem.  It's also possible there could be
> > circumstances where it causes a bug on x86 as well, though I don't know of
> > any obvious candidates.
> > 
> > Prior to d1f6af6, both vfio_msix_vector_do_use() and
> > vfio_add_kvm_msi_virq() used msg == NULL as a special flag to mark this
> > as the "dummy" vector used to make the host hardware state sync with the
> > guest expected hardware state in terms of MSI configuration.
> > 
> > Specifically that flag caused vfio_add_kvm_msi_virq() to become a no-op,
> > meaning the dummy irq would always be delivered via qemu.
> AFAICT, QEMU is not delivering that *dummy* IRQ as well. IIUC here the
> dummy IRQ refers to the one in vfio_msix_enable():
>     vfio_msix_vector_do_use(&vdev->pdev, 0, NULL, NULL);
>     vfio_msix_vector_release(&vdev->pdev, 0);
> Here we registered this dummy one to sync up the guest and host
> hardware state for some reason. However, the handler is NULL here,
> means that even QEMU won't handle it if it happens. And the only
> difference is that during this extremely small period, kvm will handle
> the interrupt with an uninitialized MSI message, but assuming that
> would never happen since no one should start to use MSI yet. After
> release(), interrupts will be dropped for all cases.
> So looks like it is not related to "whether QEMU or KVM will handle
> the interrupt", but something else.
> Generally speaking, the process of registering the IRQ should be
> something like (using QEMU with d1f6af6 change):
> 1. vfio_msix_enable(): when MSIX is enabled. this will trigger
>    vfio_add_kvm_msi_virq(), but quickly it is unregistered (virq will
>    be there, but VFIO will assign the VFIO handler to
>    vector->interrupt, rather than vector->kvm_interrupt, so that virq
>    won't take effect).
> 2. vfio_msix_vector_use(): when an MSIX entry is finally used and
>    setup. this will trigger vfio_update_kvm_msi_virq(), to update the
>    interrupt information to the "real one".
> IMHO we should always receive interrupts after step (2). And as we
> have traced (with David), step (2) was updating the correct interrupt
> information even with commit d1f6af6. But I think I might be wrong (or
> say, the above assumption is not correct), because commit d1f6af6
> indeed caused problem with EHH and this specific hardware. I am just
> do not know why it is triggering the issue. And that's why I want to
> know whether there is anything special with that NIC's driver.

It's quite possible there's something unusual about the driver - this
didn't happen for the other NIC I tried, or for an XHCI.
Unfortunately, I don't know what it would be, and I really have no
idea where one would start looking.

> Again, this is totally a discussion only, and I am totally agree with
> current change to fix the issue. Just in case someone knows the reason
> behind this.
> Thanks,
> -- peterx

David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!

Attachment: signature.asc
Description: PGP signature

Reply via email to