[tip: x86/apic] iommu/hyper-v: Remove I/O-APIC ID check from hyperv_irq_remapping_select()

2020-12-02 Thread tip-bot2 for Dexuan Cui
The following commit has been merged into the x86/apic branch of tip:

Commit-ID:     26ab12bb9d96133b7880141d68b5e01a8783de9d
Gitweb:        https://git.kernel.org/tip/26ab12bb9d96133b7880141d68b5e01a8783de9d
Author:        Dexuan Cui
AuthorDate:    Tue, 01 Dec 2020 16:45:10 -08:00
Committer:     Thomas Gleixner
CommitterDate: Wed, 02 Dec 2020 11:22:55 +01:00

iommu/hyper-v: Remove I/O-APIC ID check from hyperv_irq_remapping_select()

commit a491bb19f728 ("iommu/hyper-v: Implement select() method on remapping
irqdomain") restricted the irq_domain_ops::select() callback to match on
I/O-APIC index 0, which was correct until the parameter was changed to
carry the I/O APIC ID in commit f36a74b9345a.

If the ID is not 0 then the match fails. Therefore I/O-APIC init fails to
retrieve the parent irqdomain for the I/O-APIC resulting in a boot panic:

kernel BUG at arch/x86/kernel/apic/io_apic.c:2408!

Fix it by matching the I/O-APIC independent of the ID as there is only one
I/O APIC emulated by Hyper-V.
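
The difference between the old and the new match can be sketched in
user-space C. This is only an illustration: struct demo_fwspec and the
demo_* helpers are hypothetical stand-ins for struct irq_fwspec and
x86_fwspec_is_ioapic(), and the ID value 2 is an arbitrary non-zero example,
not a value reported by Hyper-V.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct irq_fwspec; only the fields needed here. */
struct demo_fwspec {
	bool is_ioapic;     /* models the result of x86_fwspec_is_ioapic() */
	uint32_t param[1];  /* param[0]: I/O-APIC ID (formerly the index) */
};

/* Pre-fix behaviour: claims the I/O-APIC only when param[0] is 0. */
static bool demo_select_old(const struct demo_fwspec *fw)
{
	return fw->is_ioapic && fw->param[0] == 0;
}

/* Post-fix behaviour: claims the I/O-APIC regardless of its ID. */
static bool demo_select_new(const struct demo_fwspec *fw)
{
	return fw->is_ioapic;
}

/* True when the chosen select variant claims an I/O-APIC with this ID. */
static bool demo_claims_ioapic(bool use_fix, uint32_t ioapic_id)
{
	struct demo_fwspec fw = { .is_ioapic = true, .param = { ioapic_id } };

	return use_fix ? demo_select_new(&fw) : demo_select_old(&fw);
}

/* Small self-check; 2 is an arbitrary non-zero ID chosen for illustration. */
static void demo_self_check(void)
{
	assert(demo_claims_ioapic(false, 0));   /* old match works for ID 0 */
	assert(!demo_claims_ioapic(false, 2));  /* old match misses ID 2 */
	assert(demo_claims_ioapic(true, 2));    /* new match ignores the ID */
}
```

With the old match, a non-zero ID leaves the I/O-APIC without a parent
irqdomain, which is the condition behind the BUG quoted above.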

[ tglx: Amended changelog ]

Fixes: f36a74b9345a ("x86/ioapic: Use I/O-APIC ID for finding irqdomain, not index")
Signed-off-by: Dexuan Cui 
Signed-off-by: Thomas Gleixner 
Reviewed-by: David Woodhouse 
Link: https://lore.kernel.org/r/20201202004510.1818-1-de...@microsoft.com
---
 drivers/iommu/hyperv-iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/hyperv-iommu.c b/drivers/iommu/hyperv-iommu.c
index 9438daa..1d21a0b 100644
--- a/drivers/iommu/hyperv-iommu.c
+++ b/drivers/iommu/hyperv-iommu.c
@@ -105,8 +105,8 @@ static int hyperv_irq_remapping_select(struct irq_domain *d,
 					struct irq_fwspec *fwspec,
 					enum irq_domain_bus_token bus_token)
 {
-	/* Claim only the first (and only) I/OAPIC */
-	return x86_fwspec_is_ioapic(fwspec) && fwspec->param[0] == 0;
+	/* Claim the only I/O APIC emulated by Hyper-V */
+	return x86_fwspec_is_ioapic(fwspec);
 }
 
 static const struct irq_domain_ops hyperv_ir_domain_ops = {


[tip: x86/apic] x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it

2020-11-04 Thread tip-bot2 for Dexuan Cui
The following commit has been merged into the x86/apic branch of tip:

Commit-ID:     d981059e13ffa9ed03a73472e932d070323bd057
Gitweb:        https://git.kernel.org/tip/d981059e13ffa9ed03a73472e932d070323bd057
Author:        Dexuan Cui
AuthorDate:    Mon, 02 Nov 2020 17:11:36 -08:00
Committer:     Thomas Gleixner
CommitterDate: Wed, 04 Nov 2020 11:10:52 +01:00

x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it

When a Linux VM runs on Hyper-V, if the VM has CPUs with >255 APIC IDs,
the CPUs can't be the destination of IOAPIC interrupts, because the
IOAPIC RTE's Dest Field has only 8 bits. Currently the hackery driver
drivers/iommu/hyperv-iommu.c is used to ensure IOAPIC interrupts are
only routed to CPUs that don't have >255 APIC IDs. However, there is
an issue with kdump, because the kdump kernel can run on any CPU, and
hence IOAPIC interrupts can't work if the kdump kernel runs on a CPU
with a >255 APIC ID.

The kdump issue can be fixed by the Extended Dest ID, which was recently
introduced by David Woodhouse (for IOAPIC, see the field virt_destid_8_14 in
struct IO_APIC_route_entry). The Extended Dest ID needs support from the
underlying hypervisor, which the latest Hyper-V has added. With this
commit, on such a Hyper-V host, a Linux VM does not use hyperv-iommu.c
because hyperv_prepare_irq_remapping() returns -ENODEV; instead, the
kernel's generic Extended Dest ID support is used, meaning that a Linux VM
can support up to 32K CPUs, and IOAPIC interrupts can be routed to all of
them.
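
As a rough illustration of why 15 bits give 32K destinations, the sketch
below splits an APIC ID into the classic 8-bit Dest Field plus 7 extended
bits and joins them again. The struct and helper names are hypothetical;
only virt_destid_8_14 is taken from the text above, and the real
struct IO_APIC_route_entry layout is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of the destination bits of an RTE. */
struct demo_rte_dest {
	uint8_t destid_0_7;        /* classic 8-bit Dest Field: APIC ID bits 0-7 */
	uint8_t virt_destid_8_14;  /* Extended Dest ID: APIC ID bits 8-14 */
};

/* Split a 15-bit APIC ID into the two destination fields. */
static struct demo_rte_dest demo_split_apic_id(uint32_t apic_id)
{
	struct demo_rte_dest d;

	assert(apic_id < (1u << 15));  /* 15 bits: IDs 0..32767 */
	d.destid_0_7 = apic_id & 0xff;
	d.virt_destid_8_14 = (apic_id >> 8) & 0x7f;
	return d;
}

/* Recombine the two fields into the full APIC ID. */
static uint32_t demo_join_apic_id(struct demo_rte_dest d)
{
	return ((uint32_t)d.virt_destid_8_14 << 8) | d.destid_0_7;
}
```

For example, APIC ID 300 does not fit in 8 bits; here it splits into a low
part of 44 and an extended part of 1, and joining them yields 300 again.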

On an old Hyper-V host that doesn't support the Extended Dest ID, nothing
changes with this commit: a Linux VM is still able to bring up the CPUs
with >255 APIC IDs with the help of hyperv-iommu.c, but IOAPIC interrupts
still cannot go to such CPUs, and the kdump kernel still cannot work
properly on such CPUs.

[ tglx: Updated comment as suggested by David ]

Signed-off-by: Dexuan Cui 
Signed-off-by: Thomas Gleixner 
Acked-by: David Woodhouse 
Link: https://lore.kernel.org/r/20201103011136.59108-1-de...@microsoft.com
---
 arch/x86/include/asm/hyperv-tlfs.h |  7 +++++++
 arch/x86/kernel/cpu/mshyperv.c     | 29 +++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 0ed20e8..6bf42ae 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -23,6 +23,13 @@
 #define HYPERV_CPUID_IMPLEMENT_LIMITS		0x40000005
 #define HYPERV_CPUID_NESTED_FEATURES		0x4000000A
 
+#define HYPERV_CPUID_VIRT_STACK_INTERFACE	0x40000081
+#define HYPERV_VS_INTERFACE_EAX_SIGNATURE	0x31235356  /* "VS#1" */
+
+#define HYPERV_CPUID_VIRT_STACK_PROPERTIES	0x40000082
+/* Support for the extended IOAPIC RTE format */
+#define HYPERV_VS_PROPERTIES_EAX_EXTENDED_IOAPIC_RTE	BIT(2)
+
 #define HYPERV_HYPERVISOR_PRESENT_BIT		0x80000000
 #define HYPERV_CPUID_MIN			0x40000005
 #define HYPERV_CPUID_MAX			0x4000ffff
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 05ef1f4..f628e3d 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -366,9 +366,38 @@ static void __init ms_hyperv_init_platform(void)
 #endif
 }
 
+static bool __init ms_hyperv_x2apic_available(void)
+{
+	return x2apic_supported();
+}
+
+/*
+ * If ms_hyperv_msi_ext_dest_id() returns true, hyperv_prepare_irq_remapping()
+ * returns -ENODEV and the Hyper-V IOMMU driver is not used; instead, the
+ * generic support of the 15-bit APIC ID is used: see __irq_msi_compose_msg().
+ *
+ * Note: for a VM on Hyper-V, the I/O-APIC is the only device which
+ * (logically) generates MSIs directly to the system APIC irq domain.
+ * There is no HPET, and PCI MSI/MSI-X interrupts are remapped by the
+ * pci-hyperv host bridge.
+ */
+static bool __init ms_hyperv_msi_ext_dest_id(void)
+{
+	u32 eax;
+
+	eax = cpuid_eax(HYPERV_CPUID_VIRT_STACK_INTERFACE);
+	if (eax != HYPERV_VS_INTERFACE_EAX_SIGNATURE)
+		return false;
+
+	eax = cpuid_eax(HYPERV_CPUID_VIRT_STACK_PROPERTIES);
+	return eax & HYPERV_VS_PROPERTIES_EAX_EXTENDED_IOAPIC_RTE;
+}
+
 const __initconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
 	.name			= "Microsoft Hyper-V",
 	.detect			= ms_hyperv_platform,
 	.type			= X86_HYPER_MS_HYPERV,
+	.init.x2apic_available	= ms_hyperv_x2apic_available,
+	.init.msi_ext_dest_id	= ms_hyperv_msi_ext_dest_id,
 	.init.init_platform	= ms_hyperv_init_platform,
 };
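
The two-step CPUID probe in ms_hyperv_msi_ext_dest_id() can be exercised
outside the kernel with a sketch like the one below. It assumes the leaf
numbers are 0x40000081 and 0x40000082 as in the upstream header. The DEMO_*
macros, the function-pointer parameter and the three fake hosts are
hypothetical test scaffolding, not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the values from hyperv-tlfs.h, renamed for the sketch. */
#define DEMO_CPUID_VIRT_STACK_INTERFACE   0x40000081u
#define DEMO_VS_INTERFACE_EAX_SIGNATURE   0x31235356u  /* "VS#1" */
#define DEMO_CPUID_VIRT_STACK_PROPERTIES  0x40000082u
#define DEMO_VS_PROP_EXTENDED_IOAPIC_RTE  (1u << 2)

/* Hypothetical hook that returns EAX for a CPUID leaf. */
typedef uint32_t (*demo_cpuid_eax_fn)(uint32_t leaf);

/* Same two-step check as the patch: signature first, then the property bit. */
static bool demo_msi_ext_dest_id(demo_cpuid_eax_fn cpuid_eax)
{
	assert(cpuid_eax);
	if (cpuid_eax(DEMO_CPUID_VIRT_STACK_INTERFACE) !=
	    DEMO_VS_INTERFACE_EAX_SIGNATURE)
		return false;

	return (cpuid_eax(DEMO_CPUID_VIRT_STACK_PROPERTIES) &
		DEMO_VS_PROP_EXTENDED_IOAPIC_RTE) != 0;
}

/* Fake host that advertises the interface and the extended RTE format. */
static uint32_t demo_fake_new_host(uint32_t leaf)
{
	switch (leaf) {
	case DEMO_CPUID_VIRT_STACK_INTERFACE:
		return DEMO_VS_INTERFACE_EAX_SIGNATURE;
	case DEMO_CPUID_VIRT_STACK_PROPERTIES:
		return DEMO_VS_PROP_EXTENDED_IOAPIC_RTE;
	default:
		return 0;
	}
}

/* Fake host that does not expose the virt-stack interface at all. */
static uint32_t demo_fake_old_host(uint32_t leaf)
{
	(void)leaf;
	return 0;
}

/* Fake host with the signature present but the property bit clear. */
static uint32_t demo_fake_no_ext_rte(uint32_t leaf)
{
	return leaf == DEMO_CPUID_VIRT_STACK_INTERFACE ?
		DEMO_VS_INTERFACE_EAX_SIGNATURE : 0;
}
```

Checking the signature before reading the properties leaf avoids
interpreting bits from a leaf that the host may not define.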


[tip: x86/apic] x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it

2020-11-03 Thread tip-bot2 for Dexuan Cui
The following commit has been merged into the x86/apic branch of tip:

Commit-ID:     af2abc92c5ddf5fc5a2036bc106c4d9a80a4d5f7
Gitweb:        https://git.kernel.org/tip/af2abc92c5ddf5fc5a2036bc106c4d9a80a4d5f7
Author:        Dexuan Cui
AuthorDate:    Mon, 02 Nov 2020 17:11:36 -08:00
Committer:     Thomas Gleixner
CommitterDate: Tue, 03 Nov 2020 09:16:46 +01:00

x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it

When a Linux VM runs on Hyper-V, if the VM has CPUs with >255 APIC IDs,
the CPUs can't be the destination of IOAPIC interrupts, because the
IOAPIC RTE's Dest Field has only 8 bits. Currently the hackery driver
drivers/iommu/hyperv-iommu.c is used to ensure IOAPIC interrupts are
only routed to CPUs that don't have >255 APIC IDs. However, there is
an issue with kdump, because the kdump kernel can run on any CPU, and
hence IOAPIC interrupts can't work if the kdump kernel runs on a CPU
with a >255 APIC ID.

The kdump issue can be fixed by the Extended Dest ID, which was recently
introduced by David Woodhouse (for IOAPIC, see the field virt_destid_8_14 in
struct IO_APIC_route_entry). The Extended Dest ID needs support from the
underlying hypervisor, which the latest Hyper-V has added. With this
commit, on such a Hyper-V host, a Linux VM does not use hyperv-iommu.c
because hyperv_prepare_irq_remapping() returns -ENODEV; instead, the
kernel's generic Extended Dest ID support is used, meaning that a Linux VM
can support up to 32K CPUs, and IOAPIC interrupts can be routed to all of
them.

On an old Hyper-V host that doesn't support the Extended Dest ID, nothing
changes with this commit: a Linux VM is still able to bring up the CPUs
with >255 APIC IDs with the help of hyperv-iommu.c, but IOAPIC interrupts
still cannot go to such CPUs, and the kdump kernel still cannot work
properly on such CPUs.

Signed-off-by: Dexuan Cui 
Signed-off-by: Thomas Gleixner 
Acked-by: David Woodhouse
Link: https://lore.kernel.org/r/20201103011136.59108-1-de...@microsoft.com
---
 arch/x86/include/asm/hyperv-tlfs.h |  7 +++++++
 arch/x86/kernel/cpu/mshyperv.c     | 30 ++++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 0ed20e8..6bf42ae 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -23,6 +23,13 @@
 #define HYPERV_CPUID_IMPLEMENT_LIMITS		0x40000005
 #define HYPERV_CPUID_NESTED_FEATURES		0x4000000A
 
+#define HYPERV_CPUID_VIRT_STACK_INTERFACE	0x40000081
+#define HYPERV_VS_INTERFACE_EAX_SIGNATURE	0x31235356  /* "VS#1" */
+
+#define HYPERV_CPUID_VIRT_STACK_PROPERTIES	0x40000082
+/* Support for the extended IOAPIC RTE format */
+#define HYPERV_VS_PROPERTIES_EAX_EXTENDED_IOAPIC_RTE	BIT(2)
+
 #define HYPERV_HYPERVISOR_PRESENT_BIT		0x80000000
 #define HYPERV_CPUID_MIN			0x40000005
 #define HYPERV_CPUID_MAX			0x4000ffff
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 05ef1f4..cc4037d 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -366,9 +366,39 @@ static void __init ms_hyperv_init_platform(void)
 #endif
 }
 
+static bool __init ms_hyperv_x2apic_available(void)
+{
+	return x2apic_supported();
+}
+
+/*
+ * If ms_hyperv_msi_ext_dest_id() returns true, hyperv_prepare_irq_remapping()
+ * returns -ENODEV and the Hyper-V IOMMU driver is not used; instead, the
+ * generic support of the 15-bit APIC ID is used: see __irq_msi_compose_msg().
+ *
+ * Note: For a VM on Hyper-V, no emulated legacy device supports PCI MSI/MSI-X,
+ * and PCI MSI/MSI-X only come from the assigned physical PCIe device, and the
+ * PCI MSI/MSI-X interrupts are handled by the pci-hyperv driver. Here despite
+ * the word "msi" in the name "msi_ext_dest_id", actually the callback only
+ * affects how IOAPIC interrupts are routed.
+ */
+static bool __init ms_hyperv_msi_ext_dest_id(void)
+{
+	u32 eax;
+
+	eax = cpuid_eax(HYPERV_CPUID_VIRT_STACK_INTERFACE);
+	if (eax != HYPERV_VS_INTERFACE_EAX_SIGNATURE)
+		return false;
+
+	eax = cpuid_eax(HYPERV_CPUID_VIRT_STACK_PROPERTIES);
+	return eax & HYPERV_VS_PROPERTIES_EAX_EXTENDED_IOAPIC_RTE;
+}
+
 const __initconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
 	.name			= "Microsoft Hyper-V",
 	.detect			= ms_hyperv_platform,
 	.type			= X86_HYPER_MS_HYPERV,
+	.init.x2apic_available	= ms_hyperv_x2apic_available,
+	.init.msi_ext_dest_id	= ms_hyperv_msi_ext_dest_id,
 	.init.init_platform	= ms_hyperv_init_platform,
 };


[tip: irq/core] irqdomain: Add the missing assignment of domain->fwnode for named fwnode

2019-09-06 Thread tip-bot2 for Dexuan Cui
The following commit has been merged into the irq/core branch of tip:

Commit-ID:     711419e504ebd68c8f03656616829c8ad7829389
Gitweb:        https://git.kernel.org/tip/711419e504ebd68c8f03656616829c8ad7829389
Author:        Dexuan Cui
AuthorDate:    Mon, 02 Sep 2019 23:14:56
Committer:     Marc Zyngier
CommitterDate: Tue, 03 Sep 2019 09:16:50 +01:00

irqdomain: Add the missing assignment of domain->fwnode for named fwnode

Device pass-through recently stopped working for Linux VMs running on
Hyper-V.

git-bisect shows the regression is caused by the recent commit
467a3bb97432 ("PCI: hv: Allocate a named fwnode ..."), but the root cause
is that commit d59f6617eef0 forgot to set domain->fwnode for
IRQCHIP_FWNODE_NAMED*, and as a result:

1. The domain->fwnode remains NULL.

2. irq_find_matching_fwspec() returns NULL since "h->fwnode == fwnode" is
false, and pci_set_bus_msi_domain() sets the Hyper-V PCI root bus's
msi_domain to NULL.

3. When the device is added onto the root bus, the device's dev->msi_domain
is set to NULL in pci_set_msi_domain().

4. When a device driver tries to enable MSI-X, pci_msi_setup_msi_irqs()
calls arch_setup_msi_irqs(), which uses the native MSI chip (i.e.
arch/x86/kernel/apic/msi.c: pci_msi_controller) to set up the irqs, but
actually pci_msi_setup_msi_irqs() is supposed to call
msi_domain_alloc_irqs() with the hbus->irq_domain, which is created in
hv_pcie_init_irq_domain() and is associated with the Hyper-V chip
hv_msi_irq_chip. Consequently, the irq line is not properly set up, and
the device driver cannot receive any interrupts.
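
The first two steps of the failure chain can be modelled in a few lines of
user-space C. Everything named demo_* is a hypothetical simplification: the
sketch only models the pointer comparison quoted in step 2, not the real
irq_find_matching_fwspec() or __irq_domain_add().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct fwnode_handle and struct irq_domain. */
struct demo_fwnode {
	const char *name;
};

struct demo_domain {
	const struct demo_fwnode *fwnode;
	const char *name;
};

/*
 * Models the named-fwnode branch of domain creation. Without the fix the
 * name is copied but the fwnode pointer is left NULL.
 */
static void demo_domain_add(struct demo_domain *d,
			    const struct demo_fwnode *fw, bool apply_fix)
{
	assert(d && fw);
	d->fwnode = apply_fix ? fw : NULL;
	d->name = fw->name;
}

/* Models the lookup: a domain matches only if "h->fwnode == fwnode". */
static const struct demo_domain *
demo_find_matching(const struct demo_domain *domains, size_t n,
		   const struct demo_fwnode *fw)
{
	for (size_t i = 0; i < n; i++) {
		if (fw != NULL && domains[i].fwnode == fw)
			return &domains[i];
	}
	return NULL;
}

/* True when a domain created from a named fwnode can be found again. */
static bool demo_lookup_succeeds(bool apply_fix)
{
	struct demo_fwnode fw = { .name = "demo-named-fwnode" };
	struct demo_domain dom;

	demo_domain_add(&dom, &fw, apply_fix);
	return demo_find_matching(&dom, 1, &fw) == &dom;
}
```

In this model the lookup fails without the assignment and succeeds with it,
which mirrors why the root bus ends up with a NULL msi_domain before the fix.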

Fixes: d59f6617eef0 ("genirq: Allow fwnode to carry name information only")
Fixes: 467a3bb97432 ("PCI: hv: Allocate a named fwnode instead of an address-based one")
Reported-by: Lili Deng 
Signed-off-by: Dexuan Cui 
Signed-off-by: Marc Zyngier 
Link: https://lore.kernel.org/r/pu1p153mb01694d9af625ac335c600c5fbf...@pu1p153mb0169.apcp153.prod.outlook.com
---
 kernel/irq/irqdomain.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index e7bbab1..132672b 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -149,6 +149,7 @@ struct irq_domain *__irq_domain_add(struct fwnode_handle *fwnode, int size,
 		switch (fwid->type) {
 		case IRQCHIP_FWNODE_NAMED:
 		case IRQCHIP_FWNODE_NAMED_ID:
+			domain->fwnode = fwnode;
 			domain->name = kstrdup(fwid->name, GFP_KERNEL);
 			if (!domain->name) {
 				kfree(domain);