On 03/03/2014 02:51 PM, Benjamin Herrenschmidt wrote:
> On Mon, 2014-03-03 at 11:24 +0800, Gavin Shan wrote:
>> The problem is specific to the case of BIST issued applied to IPR
>> adapter on the guest side. After BIST reset, we lose everything
>> in MSIx table and we never have chance update MSIx messages for
>> those enabled interrupts to MSIx table.
>>
>> The patch fixes it by writing MSIx message to MSIx table before
>> reenabling them.
>
> That needs a better explanation... the guest does try to restore the
> MSI-X (in a way similar to resuming from suspend) by writing the
> message value back into the table.
>
> My understanding is that we trap that and turn that into a call to
> vfio_msi_set_vector_signal, is that correct ?
Yep.
Gavin sent me a backtrace of what happens when the guest tries writing to a
BAR:
qemu/hw/pci/msix.c::msix_table_mmio_write
msix_handle_mask_update
msix_fire_vector_notifier
qemu/hw/misc/vfio.c::vfio_msix_vector_use
vfio_msix_vector_do_use
IOCTL-to-VFIO: VFIO_DEVICE_SET_IRQS
host/drivers/vfio/pci/vfio_pci.c::vfio_pci_ioctl
host/drivers/vfio/pci/vfio_pci_intrs.c::vfio_pci_set_irqs_ioctl
vfio_pci_set_msi_trigger
vfio_msi_set_block
vfio_msi_set_vector_signal
While it works for our particular problem and seems correct, it has one
flaw - hw/pci/msix.c will not generate this backtrace if masking bit does
not change which can happen in general:
===
static void msix_handle_mask_update(PCIDevice *dev, int vector, bool
was_masked)
{
bool is_masked = msix_is_masked(dev, vector);
if (is_masked == was_masked) {
return;
}
===
Or if masking bit is the same, nothing bad is expected?...
> In that case, it does indeed make sense to have that function rewrite
> the cached message.
>
> (Just trying to understand how this patch fixes the problem we saw)
>
> I suppose other drivers would have similar issues if they have some
> kind of internal "reset" path and try to save and restore the MSI-X
> table.
> Cheers,
> Ben.
>
>> Reported-by: Wen Xiong <[email protected]>
>> Signed-off-by: Gavin Shan <[email protected]>
>> ---
>> drivers/vfio/pci/vfio_pci_intrs.c | 19 +++++++++++++++++++
>> 1 file changed, 19 insertions(+)
>>
>> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c
>> b/drivers/vfio/pci/vfio_pci_intrs.c
>> index 2103576..279ebd0 100644
>> --- a/drivers/vfio/pci/vfio_pci_intrs.c
>> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
>> @@ -17,6 +17,7 @@
>> #include <linux/interrupt.h>
>> #include <linux/eventfd.h>
>> #include <linux/pci.h>
>> +#include <linux/msi.h>
>> #include <linux/file.h>
>> #include <linux/poll.h>
>> #include <linux/vfio.h>
>> @@ -517,6 +518,7 @@ static int vfio_msi_set_vector_signal(struct
>> vfio_pci_device *vdev,
>> struct pci_dev *pdev = vdev->pdev;
>> int irq = msix ? vdev->msix[vector].vector : pdev->irq + vector;
>> char *name = msix ? "vfio-msix" : "vfio-msi";
>> + struct msi_msg msg;
>> struct eventfd_ctx *trigger;
>> int ret;
>>
>> @@ -544,6 +546,23 @@ static int vfio_msi_set_vector_signal(struct
>> vfio_pci_device *vdev,
>> return PTR_ERR(trigger);
>> }
>>
>> + /* We possiblly lose the MSI/MSIx message in some cases.
>> + * For example, BIST reset on IPR adapter. The MSIx table
>> + * is cleaned out. However, we never get chance to put
>> + * MSIx messages to MSIx table because all MSIx stuff is
>> + * being cached in QEMU. Here, we had the trick to put the
>> + * MSI/MSIx message back.
>> + *
>> + * Basically, we needn't worry about MSI messages. However,
>> + * it's not harmful and there might be cases of PCI config data
>> + * lost because of cached PCI config data in QEMU again.
>> + *
>> + * Note that we should flash the message prior to enabling
>> + * the corresponding interrupt by request_irq().
>> + */
>> + get_cached_msi_msg(irq, &msg);
>> + write_msi_msg(irq, &msg);
>> +
>> ret = request_irq(irq, vfio_msihandler, 0,
>> vdev->ctx[vector].name, trigger);
>> if (ret) {
>
>
--
Alexey
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html