So I was finally able to figure out what the hell is going on:

Suspend:

 - The device suspend code puts the graphics card into a power
   state != PCI_D0.

 - The non-boot CPUs are offlined.

 - Offlining breaks interrupt affinity: a new vector is allocated on
   CPU 0 and the MSI message is composed and written, which ends up in:

   __pci_write_msi_msg(entry, msg)
   {
        if (dev->current_state != PCI_D0 || pci_dev_is_disconnected(dev)) {
           /* Don't touch the hardware now */
        } else {
           ....
        }
        entry->msg = *msg;
   }
 
   So because the device is not in PCI_D0 the message is only cached in
   entry->msg and not written to the hardware. The actual write happens
   later, in the device resume path.
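
To make the consequence of that deferral concrete, here is a minimal
userspace sketch. It is not kernel code: struct model_dev, the model_*
helpers, the STATE_* values and the message contents used with it are all
made up for illustration, and the pci_dev_is_disconnected() condition is
left out. It only shows that between the affinity change and the device
resume, the hardware copy of the message and the cached copy can disagree.

```c
/* Userspace model only. All names here are hypothetical stand-ins,
 * not the kernel's data structures or API. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum power_state { STATE_D0, STATE_D3 };  /* stand-ins for PCI_D0 / != PCI_D0 */

struct msi_msg {
	uint32_t address;
	uint32_t data;
};

struct model_dev {
	enum power_state current_state;
	struct msi_msg hw_msg;      /* what the device would actually signal */
	struct msi_msg cached_msg;  /* counterpart of entry->msg */
};

/* Mirrors the quoted logic: always cache, touch the hardware only in D0. */
static void model_write_msi_msg(struct model_dev *dev, const struct msi_msg *msg)
{
	if (dev->current_state == STATE_D0)
		dev->hw_msg = *msg;
	dev->cached_msg = *msg;
}

/* Device resume: go back to D0, then replay the cached message. */
static void model_resume(struct model_dev *dev)
{
	dev->current_state = STATE_D0;
	dev->hw_msg = dev->cached_msg;
}

/* True while the hardware still holds a message older than the cache. */
static bool model_hw_is_stale(const struct model_dev *dev)
{
	return dev->hw_msg.address != dev->cached_msg.address ||
	       dev->hw_msg.data != dev->cached_msg.data;
}
```

In this model, writing a new message while the device is in STATE_D3
leaves model_hw_is_stale() true until model_resume() runs. Any interrupt
the device raised in that window would still use the old message.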

Resume:

[  139.670446] ACPI: Low-level resume complete
[  139.670541] PM: Restoring platform NVS memory
[  139.672462] do_IRQ: 0.55 No irq handler for vector
[  139.672475] Enabling non-boot CPUs ...

So the spurious interrupt arrives early in resume, well before the device
resume code writes the new MSI message.

I checked the behaviour on 4.14. The MSI write is delayed there in the same
way, but there is no spurious interrupt. There is no interrupt coming in at
all _BEFORE_ the device is put out of PCI_D0.

And this certainly has nothing to do with the vector management changes,
but I can't figure out yet what causes that spurious interrupt to be sent.

Any ideas welcome.

Thanks,

        tglx
