[tip:irq/urgent] genirq/msi, x86/vector: Prevent reservation mode for non maskable MSI

2017-12-29 Thread tip-bot for Thomas Gleixner
Commit-ID:  bc976233a872c0f20f018fb1e89264a541584e25
Gitweb: https://git.kernel.org/tip/bc976233a872c0f20f018fb1e89264a541584e25
Author: Thomas Gleixner 
AuthorDate: Fri, 29 Dec 2017 10:47:22 +0100
Committer:  Thomas Gleixner 
CommitDate: Fri, 29 Dec 2017 21:13:05 +0100

genirq/msi, x86/vector: Prevent reservation mode for non maskable MSI

The new reservation mode for interrupts assigns a dummy vector when the
interrupt is allocated and assigns a real vector when the interrupt is
requested. The reservation mode prevents vector pressure when devices with
a large number of queues/interrupts are initialized, but only a minimal
subset of those queues/interrupts is actually used.
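
The effect on vector pressure can be illustrated with a small user-space
model. This is only a sketch under stated assumptions, not kernel code:
every toy_* name, the pool size and the vector numbers are hypothetical and
chosen for illustration. It shows that interrupts parked on the dummy
vector consume nothing from the pool until they are requested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of reservation mode; none of this is kernel API. */
#define TOY_DUMMY_VECTOR 0	/* shared placeholder, consumes no pool entry */
#define TOY_FIRST_VECTOR 32	/* first "real" vector number in this model */
#define TOY_NUM_VECTORS  8	/* deliberately tiny pool */

struct toy_pool {
	bool used[TOY_NUM_VECTORS];
};

struct toy_irq {
	int vector;		/* TOY_DUMMY_VECTOR, a real vector, or -1 */
};

/* Hand out the first free real vector, or -1 if the pool is exhausted. */
static int toy_pool_get(struct toy_pool *p)
{
	assert(p != NULL);
	for (int i = 0; i < TOY_NUM_VECTORS; i++) {
		if (!p->used[i]) {
			p->used[i] = true;
			return TOY_FIRST_VECTOR + i;
		}
	}
	return -1;
}

/* Count how many real vectors are currently handed out. */
static int toy_pool_in_use(const struct toy_pool *p)
{
	int n = 0;

	for (int i = 0; i < TOY_NUM_VECTORS; i++)
		n += p->used[i] ? 1 : 0;
	return n;
}

/* Allocation: with reserve set, park the interrupt on the dummy vector. */
static int toy_irq_alloc(struct toy_pool *p, struct toy_irq *irq, bool reserve)
{
	irq->vector = reserve ? TOY_DUMMY_VECTOR : toy_pool_get(p);
	return irq->vector < 0 ? -1 : 0;
}

/* Request: an interrupt still on the dummy vector gets a real one now. */
static int toy_irq_request(struct toy_pool *p, struct toy_irq *irq)
{
	if (irq->vector == TOY_DUMMY_VECTOR)
		irq->vector = toy_pool_get(p);
	return irq->vector < 0 ? -1 : 0;
}
```

In this model, allocating 16 interrupts with reserve set leaves the pool
untouched, and only the interrupts that are later requested take a real
vector. Allocating the same 16 without reservation exhausts the 8-entry
pool halfway through.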

This mode has an issue with MSI interrupts which cannot be masked. If the
driver is not careful or the hardware emits an interrupt before the device
irq is requested by the driver, then the interrupt ends up on the dummy
vector as a spurious interrupt, which can cause malfunction of the device
or, in the worst case, a lockup of the machine.

Change the logic for the reservation mode so that the early activation of
MSI interrupts checks whether:

 - the device is a PCI/MSI device
 - the reservation mode of the underlying irqdomain is activated
 - PCI/MSI masking is globally enabled
 - the PCI/MSI device uses either MSI-X, which supports masking, or
   MSI with the maskbit supported.

If one of those conditions is false, then clear the reservation mode flag
in the irq data of the interrupt and invoke irq_domain_activate_irq() with
the reserve argument cleared. In the x86 vector code, clear the can_reserve
flag in the vector allocation data so a subsequent free_irq() won't create
the same situation again. The interrupt stays assigned to a real vector
until pci_disable_msi() is invoked and all allocations are undone.
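
The decision described above can be sketched as a standalone C model. This
is an illustration under stated assumptions rather than the code of the
patch: all toy_* types, fields and functions are hypothetical stand-ins,
and the global masking switch is modeled as a plain boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the state the check looks at. */
enum toy_bus { TOY_BUS_PCI_MSI, TOY_BUS_OTHER };

struct toy_msi_dev {
	enum toy_bus bus;		/* is this a PCI/MSI device? */
	bool domain_reservation_mode;	/* irqdomain has reservation mode on */
	bool is_msix;			/* MSI-X entries can always be masked */
	bool msi_maskbit;		/* plain MSI with per-vector mask bit */
};

/* Stand-in for "PCI/MSI masking is globally enabled". */
static bool toy_pci_msi_masking_enabled = true;

/* All four conditions from the list above must hold. */
static bool toy_can_use_reservation(const struct toy_msi_dev *dev)
{
	assert(dev != NULL);
	if (dev->bus != TOY_BUS_PCI_MSI)
		return false;
	if (!dev->domain_reservation_mode)
		return false;
	if (!toy_pci_msi_masking_enabled)
		return false;
	return dev->is_msix || dev->msi_maskbit;
}

struct toy_irq_state {
	bool can_reserve;	/* models the reservation mode flag */
	bool on_real_vector;	/* false means parked on the dummy vector */
};

/* Early activation: reserve only if the interrupt can be masked. */
static void toy_early_activate(const struct toy_msi_dev *dev,
			       struct toy_irq_state *st)
{
	bool reserve = toy_can_use_reservation(dev);

	st->can_reserve = reserve;
	st->on_real_vector = !reserve;
}

/* free_irq(): fall back to the dummy vector only if reservation is allowed. */
static void toy_free_irq(struct toy_irq_state *st)
{
	if (st->can_reserve)
		st->on_real_vector = false;
}
```

In this model a plain MSI device without the mask bit is activated on a
real vector and stays there across toy_free_irq(), which mirrors the
behaviour the commit message describes for a subsequent free_irq().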

Fixes: 4900be83602b ("x86/vector/msi: Switch to global reservation mode")
Reported-by: Alexandru Chirvasitu 
Reported-by: Andy Shevchenko 
Signed-off-by: Thomas Gleixner 
Tested-by: Alexandru Chirvasitu 
Tested-by: Andy Shevchenko 
Cc: Dou Liyang 
Cc: Pavel Machek 
Cc: Maciej W. Rozycki 
Cc: Mikael Pettersson 
Cc: Josh Poulson 
Cc: Mihai Costache 
Cc: Stephen Hemminger 
Cc: Marc Zyngier 
Cc: linux-...@vger.kernel.org
Cc: Haiyang Zhang 
Cc: Dexuan Cui 
Cc: Simon Xiao 
Cc: Saeed Mahameed 
Cc: Jork Loeser 
Cc: Bjorn Helgaas 
Cc: de...@linuxdriverproject.org
Cc: KY Srinivasan 
Cc: Alan Cox 
Cc: Sakari Ailus
Cc: linux-me...@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291406420.1899@nanos
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291409460.1899@nanos
---
 arch/x86/kernel/apic/vector.c | 12 +++-
 kernel/irq/msi.c  | 37 +
 2 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 52c85c8..f8b03bb 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -369,8 +369,18 @@ static int activate_reserved(struct irq_data *irqd)
 	int ret;
 
 	ret = assign_irq_vector_any_locked(irqd);
-	if (!ret)
+	if (!ret) {
 		apicd->has_reserved = false;
+		/*
+		 * Core might have disabled reservation mode after
+		 * allocating the irq descriptor. Ideally this should
+		 * happen before allocation time, but that would require
+		 * completely convoluted ways of transporting that
+		 * information.
+		 */
+		if (!irqd_can_reserve(irqd))
+			apicd->can_reserve = false;
+	}
 	return ret;
 }
 
diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index 9ba9543..2f3c4f5 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -339,11 +339,38 @@ int msi_domain_populate_irqs(struct irq_domain *domain, struct device *dev,
 	return ret;
 }
 
-static bool msi_check_reservation_mode(struct msi_domain_info *info)
+/*
+ * Carefully check whether the device can use reservation mode. If
+ * reservation mode is enabled then the early activation will assign a
+ * dummy vector to the device. If the PCI/MSI device does not support
+ * masking of the entry then this can result in spurious interrupts when
+ * the device driver is not absolutely careful. But even then a malfunction
+ * of the hardware could result in a spurious interrupt on the dummy vector
+ * and render the device unusable. If the entry can be masked then the core
+ * logic will prevent the spurious interrupt and reservation mode can be
+ * used. For now reservation mode is restricted to PCI/MSI.
+ */
+static bool msi_check_reservation_mode(struct irq_domain *domain,
+				       struct msi_domain_info *info,
+				       struct device *dev)
 {
+	struct msi_desc *desc;
+
+	if (domain->bus_token != DOMAIN_BUS_PCI_MSI)
+		return false;
+
 	if (!(info->flags &