Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-18 Thread David Woodhouse
On Wed, 2020-11-18 at 23:51 +0700, Suravee Suthikulpanit wrote:
> Yes, this fixes the issue. Now I can receive the IOMMU event log
> interrupts for IO_PAGE_FAULT event, which is triggered 
> using the injection interface via debugfs.

Thanks, Suravee.


smime.p7s
Description: S/MIME cryptographic signature


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-18 Thread Suravee Suthikulpanit

Tglx,

On 11/18/20 9:06 PM, Thomas Gleixner wrote:

Suravee,

On Wed, Nov 18 2020 at 17:29, Suravee Suthikulpanit wrote:

On 11/17/20 9:00 AM, Suravee Suthikulpanit wrote:

I might need your help debugging this issue. I'm seeing the following error:

[   14.005937] irq 29, desc: d200500b, depth: 0, count: 0, unhandled: 0
[   14.006234] ->handle_irq():  eab4b6eb, handle_bad_irq+0x0/0x230
[   14.006234] ->irq_data.chip(): 1cce6d6b, 
intcapxt_controller+0x0/0x120
[   14.006234] ->action(): 83bfd734
[   14.006234] ->action->handler(): 94806345, 
amd_iommu_int_handler+0x0/0x10
[   14.006234] unexpected IRQ trap at vector 1d

Do you have any idea what might have gone wrong here?


Yes. This lacks setting up the low level flow handler. Delta patch
below.

Thanks,

 tglx
---
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2033,6 +2033,7 @@ static int intcapxt_irqdomain_alloc(stru
  
  		irqd->chip = _controller;

irqd->chip_data = info->data;
+   __irq_set_handler(i, handle_edge_irq, 0, "edge");
}
  
  	return ret;




Yes, this fixes the issue. Now I can receive the IOMMU event log interrupts for IO_PAGE_FAULT event, which is triggered 
using the injection interface via debugfs.


Thanks,
Suravee


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-18 Thread Thomas Gleixner
Suravee,

On Wed, Nov 18 2020 at 17:29, Suravee Suthikulpanit wrote:
> On 11/17/20 9:00 AM, Suravee Suthikulpanit wrote:
>
> I might need your help debugging this issue. I'm seeing the following error:
>
> [   14.005937] irq 29, desc: d200500b, depth: 0, count: 0, unhandled: > 0
> [   14.006234] ->handle_irq():  eab4b6eb, handle_bad_irq+0x0/0x230
> [   14.006234] ->irq_data.chip(): 1cce6d6b, 
> intcapxt_controller+0x0/0x120
> [   14.006234] ->action(): 83bfd734
> [   14.006234] ->action->handler(): 94806345, 
> amd_iommu_int_handler+0x0/0x10
> [   14.006234] unexpected IRQ trap at vector 1d
>
> Do you have any idea what might have gone wrong here?

Yes. This lacks setting up the low level flow handler. Delta patch
below.

Thanks,

tglx
---
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -2033,6 +2033,7 @@ static int intcapxt_irqdomain_alloc(stru
 
irqd->chip = _controller;
irqd->chip_data = info->data;
+   __irq_set_handler(i, handle_edge_irq, 0, "edge");
}
 
return ret;


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-18 Thread David Woodhouse
On Wed, 2020-11-18 at 17:29 +0700, Suravee Suthikulpanit wrote:
> I might need your help debugging this issue. I'm seeing the following error:
> 
> [   14.005937] irq 29, desc: d200500b, depth: 0, count: 0, unhandled: > 0
> [   14.006234] ->handle_irq():  eab4b6eb, handle_bad_irq+0x0/0x230

Where's that line coming from?

> [   14.006234] ->irq_data.chip(): 1cce6d6b, 
> intcapxt_controller+0x0/0x120
> [   14.006234] ->action(): 83bfd734
> [   14.006234] ->action->handler(): 94806345, 
> amd_iommu_int_handler+0x0/0x10
> [   14.006234] unexpected IRQ trap at vector 1d
> 
> Do you have any idea what might have gone wrong here?

Hm, vector 0x1d is vector 29. It's also IRQ 29. I wonder if that's a 
coincidence.

Can you make intcapxt_irqdomain_activate() print the 64-bit value that
it's writing to the intcapxt registers?


smime.p7s
Description: S/MIME cryptographic signature


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-18 Thread Suravee Suthikulpanit

David

On 11/17/20 9:00 AM, Suravee Suthikulpanit wrote:

David,

On 11/13/20 10:14 PM, David Woodhouse wrote:

On Wed, 2020-11-11 at 14:30 -0600, Tom Lendacky wrote:

I had trouble cloning your tree for some reason, so just took the top
three patches and applied them to the tip tree. This all appears to be
working. I'll let the IOMMU experts take a closer look (adding Suravee).


Thanks. I see Thomas has taken the first two into the tip.git x86/apic
branch already, so we're just looking for an ack on the third. Which is
this one...

 From 49ee4fa51b8c06d14b7c4c74d15a7d76f865a8ea Mon Sep 17 00:00:00 2001
From: David Woodhouse 
Date: Wed, 11 Nov 2020 12:09:01 +
Subject: [PATCH] iommu/amd: Fix IOMMU interrupt generation in X2APIC mode

The AMD IOMMU has two modes for generating its own interrupts.

The first is very much based on PCI MSI, and can be configured by Linux
precisely that way. But like legacy unmapped PCI MSI it's limited to
8 bits of APIC ID.

The second method does not use PCI MSI at all in hardawre, and instead
configures the INTCAPXT registers in the IOMMU directly with the APIC ID
and vector.

In the latter case, the IOMMU driver would still use pci_enable_msi(),
read back (through MMIO) the MSI message that Linux wrote to the PCI MSI
table, then swizzle those bits into the appropriate register.

Historically, this worked because__irq_compose_msi_msg() would silently
generate an invalid MSI message with the high bits of the APIC ID in the
high bits of the MSI address. That hack was intended only for the Intel
IOMMU, and I recently enforced that, introducing a warning in
__irq_msi_compose_msg() if it was invoked with an APIC ID above 255.

Fix the AMD IOMMU not to depend on that hack any more, by having its own
irqdomain and directly putting the bits from the irq_cfg into the right
place in its ->activate() method.

Fixes: 47bea873cf80 "x86/msi: Only use high bits of MSI address for DMAR unit")
Signed-off-by: David Woodhouse 


I'm still working on testing this series using IO_PAGE_FAULT injection to trigger the IOMMU interrupts. I am still 
debugging some issues, and I'll keep you updated on the findings.


Thanks,
Suravee


I might need your help debugging this issue. I'm seeing the following error:

[   14.005937] irq 29, desc: d200500b, depth: 0, count: 0, unhandled: 0
[   14.006234] ->handle_irq():  eab4b6eb, handle_bad_irq+0x0/0x230
[   14.006234] ->irq_data.chip(): 1cce6d6b, 
intcapxt_controller+0x0/0x120
[   14.006234] ->action(): 83bfd734
[   14.006234] ->action->handler(): 94806345, 
amd_iommu_int_handler+0x0/0x10
[   14.006234] unexpected IRQ trap at vector 1d

Do you have any idea what might have gone wrong here?

Thanks,
Suravee


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-16 Thread Suravee Suthikulpanit

David,

On 11/13/20 10:14 PM, David Woodhouse wrote:

On Wed, 2020-11-11 at 14:30 -0600, Tom Lendacky wrote:

I had trouble cloning your tree for some reason, so just took the top
three patches and applied them to the tip tree. This all appears to be
working. I'll let the IOMMU experts take a closer look (adding Suravee).


Thanks. I see Thomas has taken the first two into the tip.git x86/apic
branch already, so we're just looking for an ack on the third. Which is
this one...

 From 49ee4fa51b8c06d14b7c4c74d15a7d76f865a8ea Mon Sep 17 00:00:00 2001
From: David Woodhouse 
Date: Wed, 11 Nov 2020 12:09:01 +
Subject: [PATCH] iommu/amd: Fix IOMMU interrupt generation in X2APIC mode

The AMD IOMMU has two modes for generating its own interrupts.

The first is very much based on PCI MSI, and can be configured by Linux
precisely that way. But like legacy unmapped PCI MSI it's limited to
8 bits of APIC ID.

The second method does not use PCI MSI at all in hardawre, and instead
configures the INTCAPXT registers in the IOMMU directly with the APIC ID
and vector.

In the latter case, the IOMMU driver would still use pci_enable_msi(),
read back (through MMIO) the MSI message that Linux wrote to the PCI MSI
table, then swizzle those bits into the appropriate register.

Historically, this worked because__irq_compose_msi_msg() would silently
generate an invalid MSI message with the high bits of the APIC ID in the
high bits of the MSI address. That hack was intended only for the Intel
IOMMU, and I recently enforced that, introducing a warning in
__irq_msi_compose_msg() if it was invoked with an APIC ID above 255.

Fix the AMD IOMMU not to depend on that hack any more, by having its own
irqdomain and directly putting the bits from the irq_cfg into the right
place in its ->activate() method.

Fixes: 47bea873cf80 "x86/msi: Only use high bits of MSI address for DMAR unit")
Signed-off-by: David Woodhouse 


I'm still working on testing this series using IO_PAGE_FAULT injection to trigger the IOMMU interrupts. I am still 
debugging some issues, and I'll keep you updated on the findings.


Thanks,
Suravee


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-16 Thread David Woodhouse
On Fri, 2020-11-13 at 15:14 +, David Woodhouse wrote:
> On Wed, 2020-11-11 at 14:30 -0600, Tom Lendacky wrote:
> > I had trouble cloning your tree for some reason, so just took the top
> > three patches and applied them to the tip tree. This all appears to be
> > working. I'll let the IOMMU experts take a closer look (adding Suravee).
> 
> Thanks. I see Thomas has taken the first two into the tip.git x86/apic
> branch already, so we're just looking for an ack on the third. Which is
> this one...
> 
> From 49ee4fa51b8c06d14b7c4c74d15a7d76f865a8ea Mon Sep 17 00:00:00 2001
> From: David Woodhouse 
> Date: Wed, 11 Nov 2020 12:09:01 +
> Subject: [PATCH] iommu/amd: Fix IOMMU interrupt generation in X2APIC mode

Ping?



smime.p7s
Description: S/MIME cryptographic signature


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-13 Thread David Woodhouse
On Wed, 2020-11-11 at 14:30 -0600, Tom Lendacky wrote:
> I had trouble cloning your tree for some reason, so just took the top
> three patches and applied them to the tip tree. This all appears to be
> working. I'll let the IOMMU experts take a closer look (adding Suravee).

Thanks. I see Thomas has taken the first two into the tip.git x86/apic
branch already, so we're just looking for an ack on the third. Which is
this one...

From 49ee4fa51b8c06d14b7c4c74d15a7d76f865a8ea Mon Sep 17 00:00:00 2001
From: David Woodhouse 
Date: Wed, 11 Nov 2020 12:09:01 +
Subject: [PATCH] iommu/amd: Fix IOMMU interrupt generation in X2APIC mode

The AMD IOMMU has two modes for generating its own interrupts.

The first is very much based on PCI MSI, and can be configured by Linux
precisely that way. But like legacy unmapped PCI MSI it's limited to
8 bits of APIC ID.

The second method does not use PCI MSI at all in hardawre, and instead
configures the INTCAPXT registers in the IOMMU directly with the APIC ID
and vector.

In the latter case, the IOMMU driver would still use pci_enable_msi(),
read back (through MMIO) the MSI message that Linux wrote to the PCI MSI
table, then swizzle those bits into the appropriate register.

Historically, this worked because__irq_compose_msi_msg() would silently
generate an invalid MSI message with the high bits of the APIC ID in the
high bits of the MSI address. That hack was intended only for the Intel
IOMMU, and I recently enforced that, introducing a warning in
__irq_msi_compose_msg() if it was invoked with an APIC ID above 255.

Fix the AMD IOMMU not to depend on that hack any more, by having its own
irqdomain and directly putting the bits from the irq_cfg into the right
place in its ->activate() method.

Fixes: 47bea873cf80 "x86/msi: Only use high bits of MSI address for DMAR unit")
Signed-off-by: David Woodhouse 
---
 arch/x86/include/asm/hw_irq.h |   1 +
 drivers/iommu/amd/init.c  | 190 +++---
 2 files changed, 132 insertions(+), 59 deletions(-)

diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 458f5a676402..d465ece58151 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -39,6 +39,7 @@ enum irq_alloc_type {
X86_IRQ_ALLOC_TYPE_PCI_MSI,
X86_IRQ_ALLOC_TYPE_PCI_MSIX,
X86_IRQ_ALLOC_TYPE_DMAR,
+   X86_IRQ_ALLOC_TYPE_AMDVI,
X86_IRQ_ALLOC_TYPE_UV,
 };
 
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index a94b96f1e13a..a57400a73abb 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1557,14 +1558,7 @@ static int __init init_iommu_one(struct amd_iommu 
*iommu, struct ivhd_header *h)
break;
}
 
-   /*
-* Note: Since iommu_update_intcapxt() leverages
-* the IOMMU MMIO access to MSI capability block registers
-* for MSI address lo/hi/data, we need to check both
-* EFR[XtSup] and EFR[MsiCapMmioSup] for x2APIC support.
-*/
-   if ((h->efr_reg & BIT(IOMMU_EFR_XTSUP_SHIFT)) &&
-   (h->efr_reg & BIT(IOMMU_EFR_MSICAPMMIOSUP_SHIFT)))
+   if (h->efr_reg & BIT(IOMMU_EFR_XTSUP_SHIFT))
amd_iommu_xt_mode = IRQ_REMAP_X2APIC_MODE;
break;
default:
@@ -1981,27 +1975,32 @@ union intcapxt {
 } __attribute__ ((packed));
 
 /*
- * Setup the IntCapXT registers with interrupt routing information
- * based on the PCI MSI capability block registers, accessed via
- * MMIO MSI address low/hi and MSI data registers.
+ * There isn't really any need to mask/unmask at the irqchip level because
+ * the 64-bit INTCAPXT registers can be updated atomically without tearing
+ * when the affinity is being updated.
  */
-static void iommu_update_intcapxt(struct amd_iommu *iommu)
+static void intcapxt_unmask_irq(struct irq_data *data)
 {
-   struct msi_msg msg;
-   union intcapxt xt;
-   u32 destid;
+}
 
-   msg.address_lo = readl(iommu->mmio_base + MMIO_MSI_ADDR_LO_OFFSET);
-   msg.address_hi = readl(iommu->mmio_base + MMIO_MSI_ADDR_HI_OFFSET);
-   msg.data = readl(iommu->mmio_base + MMIO_MSI_DATA_OFFSET);
+static void intcapxt_mask_irq(struct irq_data *data)
+{
+}
 
-   destid = x86_msi_msg_get_destid(, x2apic_enabled());
+static struct irq_chip intcapxt_controller;
+
+static int intcapxt_irqdomain_activate(struct irq_domain *domain,
+  struct irq_data *irqd, bool reserve)
+{
+   struct amd_iommu *iommu = irqd->chip_data;
+   struct irq_cfg *cfg = irqd_cfg(irqd);
+   union intcapxt xt;
 
xt.capxt = 0ULL;
-   xt.dest_mode_logical = msg.arch_data.dest_mode_logical;
-   xt.vector = msg.arch_data.vector;
-   xt.destid_0_23 = destid & GENMASK(23, 0);
-   

Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-11 Thread David Woodhouse
On Wed, 2020-11-11 at 14:30 -0600, Tom Lendacky wrote:
> On 11/11/20 6:32 AM, David Woodhouse wrote:
> > On Wed, 2020-11-11 at 10:36 +, David Woodhouse wrote:
> > > On Wed, 2020-11-11 at 10:46 +0100, Thomas Gleixner wrote:
> > > > Looking at it now with brain awake, the XTSUP stuff is pretty much
> > > > the same as DMAR, which I didn't realize yesterday. The affinity
> > > > notifier muck is not needed when we have a write_msg() function which
> > > > twiddles the bits into those other locations.
> > > 
> > > I kind of hate the fact that it's swizzling those bits through invalid
> > > MSI messages, so I did it as its own irqdomain using
> > > irq_domain_create_hierarchy() directly instead of
> > > msi_create_irq_domain().
> > 
> > Please give this a spin:
> > 
> > https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/amdvi
> 
> I had trouble cloning your tree for some reason, so just took the top
> three patches and applied them to the tip tree. This all appears to be
> working. I'll let the IOMMU experts take a closer look (adding Suravee).

Thanks. Exercising it by actually triggering faults and observing the
IOMMU interrupt really working would be quite useful — on hardware with
both INTCAPXT and the older MSI delivery.


smime.p7s
Description: S/MIME cryptographic signature


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-11 Thread Tom Lendacky
On 11/11/20 6:32 AM, David Woodhouse wrote:
> On Wed, 2020-11-11 at 10:36 +, David Woodhouse wrote:
>> On Wed, 2020-11-11 at 10:46 +0100, Thomas Gleixner wrote:
>>> Looking at it now with brain awake, the XTSUP stuff is pretty much
>>> the same as DMAR, which I didn't realize yesterday. The affinity
>>> notifier muck is not needed when we have a write_msg() function which
>>> twiddles the bits into those other locations.
>>
>> I kind of hate the fact that it's swizzling those bits through invalid
>> MSI messages, so I did it as its own irqdomain using
>> irq_domain_create_hierarchy() directly instead of
>> msi_create_irq_domain().
> 
> Please give this a spin:
> 
> https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/amdvi

I had trouble cloning your tree for some reason, so just took the top
three patches and applied them to the tip tree. This all appears to be
working. I'll let the IOMMU experts take a closer look (adding Suravee).

Thanks,
Tom

> 


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-11 Thread David Woodhouse
On Wed, 2020-11-11 at 10:36 +, David Woodhouse wrote:
> On Wed, 2020-11-11 at 10:46 +0100, Thomas Gleixner wrote:
> > Looking at it now with brain awake, the XTSUP stuff is pretty much
> > the same as DMAR, which I didn't realize yesterday. The affinity
> > notifier muck is not needed when we have a write_msg() function which
> > twiddles the bits into those other locations.
> 
> I kind of hate the fact that it's swizzling those bits through invalid
> MSI messages, so I did it as its own irqdomain using
> irq_domain_create_hierarchy() directly instead of
> msi_create_irq_domain().

Please give this a spin:

https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/amdvi


smime.p7s
Description: S/MIME cryptographic signature


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-11 Thread David Woodhouse
On Wed, 2020-11-11 at 10:46 +0100, Thomas Gleixner wrote:
> Looking at it now with brain awake, the XTSUP stuff is pretty much
> the same as DMAR, which I didn't realize yesterday. The affinity
> notifier muck is not needed when we have a write_msg() function which
> twiddles the bits into those other locations.

I kind of hate the fact that it's swizzling those bits through invalid
MSI messages, so I did it as its own irqdomain using
irq_domain_create_hierarchy() directly instead of
msi_create_irq_domain().

Looks something like this, although I believe I'm missing a call to
irq_domain_set_info() somewhere which means that the irq_chip
'intcapxt_controller' isn't being used at all...


diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 90a8add186e0..b68fe10aa67d 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1563,8 +1564,7 @@ static int __init init_iommu_one(struct amd_iommu *iommu, 
struct ivhd_header *h)
 * for MSI address lo/hi/data, we need to check both
 * EFR[XtSup] and EFR[MsiCapMmioSup] for x2APIC support.
 */
-   if ((h->efr_reg & BIT(IOMMU_EFR_XTSUP_SHIFT)) &&
-   (h->efr_reg & BIT(IOMMU_EFR_MSICAPMMIOSUP_SHIFT)))
+   if (h->efr_reg & BIT(IOMMU_EFR_XTSUP_SHIFT))
amd_iommu_xt_mode = IRQ_REMAP_X2APIC_MODE;
break;
default:
@@ -1978,28 +1978,27 @@ union intcapxt {
destid_24_31:  8;
 } __attribute__ ((packed));
 
-/*
- * Setup the IntCapXT registers with interrupt routing information
- * based on the PCI MSI capability block registers, accessed via
- * MMIO MSI address low/hi and MSI data registers.
- */
-static void iommu_update_intcapxt(struct amd_iommu *iommu)
+/* We weren't actually doing anything before. Should we? */
+static void intcapxt_unmask_irq(struct irq_data *data)
 {
-   struct msi_msg msg;
-   union intcapxt xt;
-   u32 destid;
+}
+static void intcapxt_mask_irq(struct irq_data *data)
+{
+}
 
-   msg.address_lo = readl(iommu->mmio_base + MMIO_MSI_ADDR_LO_OFFSET);
-   msg.address_hi = readl(iommu->mmio_base + MMIO_MSI_ADDR_HI_OFFSET);
-   msg.data = readl(iommu->mmio_base + MMIO_MSI_DATA_OFFSET);
 
-   destid = x86_msi_msg_get_destid(, x2apic_enabled());
+static int intcapxt_irqdomain_activate(struct irq_domain *domain,
+  struct irq_data *irqd, bool reserve)
+{
+   struct amd_iommu *iommu = domain->host_data;
+   struct irq_cfg *cfg = irqd_cfg(irqd);
+   union intcapxt xt;
 
xt.capxt = 0ULL;
-   xt.dest_mode_logical = msg.arch_data.dest_mode_logical;
-   xt.vector = msg.arch_data.vector;
-   xt.destid_0_23 = destid & GENMASK(23, 0);
-   xt.destid_24_31 = destid >> 24;
+   xt.dest_mode_logical = apic->dest_mode_logical;
+   xt.vector = cfg->vector;
+   xt.destid_0_23 = cfg->dest_apicid & GENMASK(23, 0);
+   xt.destid_24_31 = cfg->dest_apicid >> 24;
 
/**
 * Current IOMMU implemtation uses the same IRQ for all
@@ -2008,64 +2007,117 @@ static void iommu_update_intcapxt(struct amd_iommu 
*iommu)
writeq(xt.capxt, iommu->mmio_base + MMIO_INTCAPXT_EVT_OFFSET);
writeq(xt.capxt, iommu->mmio_base + MMIO_INTCAPXT_PPR_OFFSET);
writeq(xt.capxt, iommu->mmio_base + MMIO_INTCAPXT_GALOG_OFFSET);
+
+   return 0;
 }
 
-static void _irq_notifier_notify(struct irq_affinity_notify *notify,
-const cpumask_t *mask)
+static void intcapxt_irqdomain_deactivate(struct irq_domain *domain,
+ struct irq_data *irqd)
 {
-   struct amd_iommu *iommu;
+   intcapxt_mask_irq(irqd);
+}
 
-   for_each_iommu(iommu) {
-   if (iommu->dev->irq == notify->irq) {
-   iommu_update_intcapxt(iommu);
-   break;
-   }
-   }
+
+static int intcapxt_irqdomain_alloc(struct irq_domain *domain, unsigned int 
virq,
+   unsigned int nr_irqs, void *arg)
+{
+   return irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg);
 }
 
-static void _irq_notifier_release(struct kref *ref)
+static void intcapxt_irqdomain_free(struct irq_domain *domain, unsigned int 
virq,
+   unsigned int nr_irqs)
 {
+   irq_domain_free_irqs_top(domain, virq, nr_irqs);
 }
 
-static int iommu_init_intcapxt(struct amd_iommu *iommu)
+static int intcapxt_set_affinity(struct irq_data *irqd,
+const struct cpumask *mask, bool force)
 {
+   struct irq_data *parent = irqd->parent_data;
int ret;
-   struct irq_affinity_notify *notify = >intcapxt_notify;
 
-   /**
-* IntCapXT requires XTSup=1 and MsiCapMmioSup=1,
-* which can be inferred 

Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-11 Thread Thomas Gleixner
On Wed, Nov 11 2020 at 08:16, David Woodhouse wrote:
> On Tue, 2020-11-10 at 23:48 +0100, Thomas Gleixner wrote:
>> + * IRQCHIP_MSI_EXTID The MSI message created for this chip 
>> can
>> + *   have an otherwise forbidden extended ID
>
> If we're going to do that then we could ditch the separate
> iommu_compose_msi_msg() function too, right?
>
> But actually I'd be more inclined to fix this differently, in a way
> that doesn't still leave AMD's iommu_init_intcapxt() having to set use
> irq_set_affinity_notifier() to update its own registers. That's icky.
>
> Given that this is *its* irqdomain in the first place, it should just
> sit at ->set_affinity() for itself, and call its parent as usual
> without having to use a notifier.
>
> We should also leave it using the basic PCI MSI support in the case
> where the IOMMU doesn't have XTSUP support. It doesn't need its own
> irqdomain for that.

Looking at it now with brain awake, the XTSUP stuff is pretty much
the same as DMAR, which I didn't realize yesterday. The affinity
notifier muck is not needed when we have a write_msg() function which
twiddles the bits into those other locations.

Thanks,

tglx




Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-11 Thread David Woodhouse
On Tue, 2020-11-10 at 23:48 +0100, Thomas Gleixner wrote:
> + * IRQCHIP_MSI_EXTID The MSI message created for this chip 
> can
> + *   have an otherwise forbidden extended ID

If we're going to do that then we could ditch the separate
iommu_compose_msi_msg() function too, right?

But actually I'd be more inclined to fix this differently, in a way
that doesn't still leave AMD's iommu_init_intcapxt() having to set use
irq_set_affinity_notifier() to update its own registers. That's icky.

Given that this is *its* irqdomain in the first place, it should just
sit at ->set_affinity() for itself, and call its parent as usual
without having to use a notifier.

We should also leave it using the basic PCI MSI support in the case
where the IOMMU doesn't have XTSUP support. It doesn't need its own
irqdomain for that.



smime.p7s
Description: S/MIME cryptographic signature


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-10 Thread Tom Lendacky
On 11/10/20 4:48 PM, Thomas Gleixner wrote:
> On Tue, Nov 10 2020 at 16:00, Tom Lendacky wrote:
>> On 11/10/20 3:30 PM, David Woodhouse wrote:
>> [   15.581115] WARNING: CPU: 6 PID: 1 at arch/x86/kernel/apic/apic.c:2527 
>> __irq_msi_compose_msg+0x9f/0xb0
>> [   15.581115] Call Trace:
>> [   15.581115]  irq_msi_update_msg+0x4d/0x80
>> [   15.581115]  msi_set_affinity+0x160/0x190
> 
> Duh. Yes, that want's some love as well. Delta patch below.
> 
> Thanks,
> 
> tglx

That fixed it.

Thanks,
Tom

> ---
> --- a/arch/x86/kernel/apic/msi.c
> +++ b/arch/x86/kernel/apic/msi.c
> @@ -24,10 +24,11 @@ struct irq_domain *x86_pci_msi_default_d
>  
>  static void irq_msi_update_msg(struct irq_data *irqd, struct irq_cfg *cfg)
>  {
> + struct irq_chip *chip = irq_data_get_irq_chip(irqd);
>   struct msi_msg msg[2] = { [1] = { }, };
>  
> - __irq_msi_compose_msg(cfg, msg, false);
> - irq_data_get_irq_chip(irqd)->irq_write_msi_msg(irqd, msg);
> + __irq_msi_compose_msg(cfg, msg, chip->flags & IRQCHIP_MSI_EXTID);
> + chip->irq_write_msi_msg(irqd, msg);
>  }
>  
>  static int
> @@ -271,7 +272,7 @@ static struct irq_chip iommu_msi_control
>   .irq_retrigger  = irq_chip_retrigger_hierarchy,
>   .irq_set_affinity   = msi_set_affinity,
>   .irq_compose_msi_msg= iommu_msi_compose_msg,
> - .flags  = IRQCHIP_SKIP_SET_WAKE,
> + .flags  = IRQCHIP_SKIP_SET_WAKE | IRQCHIP_MSI_EXTID,
>  };
>  
>  static struct msi_domain_info iommu_msi_domain_info = {
> @@ -310,7 +311,7 @@ static struct irq_chip dmar_msi_controll
>   .irq_retrigger  = irq_chip_retrigger_hierarchy,
>   .irq_compose_msi_msg= iommu_msi_compose_msg,
>   .irq_write_msi_msg  = dmar_msi_write_msg,
> - .flags  = IRQCHIP_SKIP_SET_WAKE,
> + .flags  = IRQCHIP_SKIP_SET_WAKE | IRQCHIP_MSI_EXTID,
>  };
>  
>  static int dmar_msi_init(struct irq_domain *domain,
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -567,6 +567,8 @@ struct irq_chip {
>   * IRQCHIP_SUPPORTS_NMI:  Chip can deliver NMIs, only for root 
> irqchips
>   * IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND:  Invokes __enable_irq()/__disable_irq() 
> for wake irqs
>   *in the suspend path if they are in 
> disabled state
> + * IRQCHIP_MSI_EXTID   The MSI message created for this chip can
> + * have an otherwise forbidden extended ID
>   */
>  enum {
>   IRQCHIP_SET_TYPE_MASKED = (1 <<  0),
> @@ -579,6 +581,7 @@ enum {
>   IRQCHIP_SUPPORTS_LEVEL_MSI  = (1 <<  7),
>   IRQCHIP_SUPPORTS_NMI= (1 <<  8),
>   IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND= (1 <<  9),
> + IRQCHIP_MSI_EXTID   = (1 << 10),
>  };
>  
>  #include 
> 


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-10 Thread Thomas Gleixner
On Tue, Nov 10 2020 at 16:00, Tom Lendacky wrote:
> On 11/10/20 3:30 PM, David Woodhouse wrote:
> [   15.581115] WARNING: CPU: 6 PID: 1 at arch/x86/kernel/apic/apic.c:2527 
> __irq_msi_compose_msg+0x9f/0xb0
> [   15.581115] Call Trace:
> [   15.581115]  irq_msi_update_msg+0x4d/0x80
> [   15.581115]  msi_set_affinity+0x160/0x190

Duh. Yes, that want's some love as well. Delta patch below.

Thanks,

tglx
---
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -24,10 +24,11 @@ struct irq_domain *x86_pci_msi_default_d
 
 static void irq_msi_update_msg(struct irq_data *irqd, struct irq_cfg *cfg)
 {
+   struct irq_chip *chip = irq_data_get_irq_chip(irqd);
struct msi_msg msg[2] = { [1] = { }, };
 
-   __irq_msi_compose_msg(cfg, msg, false);
-   irq_data_get_irq_chip(irqd)->irq_write_msi_msg(irqd, msg);
+   __irq_msi_compose_msg(cfg, msg, chip->flags & IRQCHIP_MSI_EXTID);
+   chip->irq_write_msi_msg(irqd, msg);
 }
 
 static int
@@ -271,7 +272,7 @@ static struct irq_chip iommu_msi_control
.irq_retrigger  = irq_chip_retrigger_hierarchy,
.irq_set_affinity   = msi_set_affinity,
.irq_compose_msi_msg= iommu_msi_compose_msg,
-   .flags  = IRQCHIP_SKIP_SET_WAKE,
+   .flags  = IRQCHIP_SKIP_SET_WAKE | IRQCHIP_MSI_EXTID,
 };
 
 static struct msi_domain_info iommu_msi_domain_info = {
@@ -310,7 +311,7 @@ static struct irq_chip dmar_msi_controll
.irq_retrigger  = irq_chip_retrigger_hierarchy,
.irq_compose_msi_msg= iommu_msi_compose_msg,
.irq_write_msi_msg  = dmar_msi_write_msg,
-   .flags  = IRQCHIP_SKIP_SET_WAKE,
+   .flags  = IRQCHIP_SKIP_SET_WAKE | IRQCHIP_MSI_EXTID,
 };
 
 static int dmar_msi_init(struct irq_domain *domain,
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -567,6 +567,8 @@ struct irq_chip {
  * IRQCHIP_SUPPORTS_NMI:  Chip can deliver NMIs, only for root 
irqchips
  * IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND:  Invokes __enable_irq()/__disable_irq() 
for wake irqs
  *in the suspend path if they are in 
disabled state
+ * IRQCHIP_MSI_EXTID The MSI message created for this chip can
+ *   have an otherwise forbidden extended ID
  */
 enum {
IRQCHIP_SET_TYPE_MASKED = (1 <<  0),
@@ -579,6 +581,7 @@ enum {
IRQCHIP_SUPPORTS_LEVEL_MSI  = (1 <<  7),
IRQCHIP_SUPPORTS_NMI= (1 <<  8),
IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND= (1 <<  9),
+   IRQCHIP_MSI_EXTID   = (1 << 10),
 };
 
 #include 


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-10 Thread Tom Lendacky
On 11/10/20 3:30 PM, David Woodhouse wrote:
> 
> 
> On 10 November 2020 21:01:17 GMT, Thomas Gleixner  wrote:
>> On Tue, Nov 10 2020 at 19:21, David Woodhouse wrote:
>>
>>> On 10 November 2020 18:56:17 GMT, Thomas Gleixner
>>  wrote:
 On Tue, Nov 10 2020 at 18:50, Thomas Gleixner wrote:
> On Tue, Nov 10 2020 at 16:33, David Woodhouse wrote:
>> If I could get post-5.5 kernels to boot at all with the AMD IOMMU
>> enabled, I'd have a go at throwing that together now...
>
> It can share the dmar domain code. Let me frob something.

 Not much to share there and I can't access my AMD machine at the
 moment. Something like the untested below should work.
>>>
>>> Does it even need its own irqdomain? Can it not just allocate
>> directly
>>> from the vector domain then program its own register directly based
>> on
>>> the irq_cfg?
>>
>> It uses pci_enable_msi() and I have no clue about that piece of
>> hardware
>> and whether that is actually required or not. If it is, then it needs a
>> domain because that's what pci_enable_msi() uses.
> 
> I'd be kind of surprised if it is required, but testing on qemu is obviously 
> not going to cut it. Tom?

Was just in the process of testing it... I still get a warning. Here's
the new backtrace:

[   15.581115] WARNING: CPU: 6 PID: 1 at arch/x86/kernel/apic/apic.c:2527 
__irq_msi_compose_msg+0x9f/0xb0
[   15.581115] Modules linked in:
[   15.581115] CPU: 6 PID: 1 Comm: swapper/0 Not tainted 5.10.0-rc3-sos-custom 
#1
[   15.581115] Hardware name: AMD Corporation ETHANOL_X/ETHANOL_X, BIOS 
REX1006G 01/25/2020
[   15.581115] RIP: 0010:__irq_msi_compose_msg+0x9f/0xb0
[   15.581115] Code: 01 00 74 1e 3d ff 7f 00 00 77 1f 0f b7 16 c1 e8 08 83 e0 
7f c1 e0 05 66 81 e2 1f f0 09 d0 66 89 06 c3 3d ff 00 00 00 77 01 c3 <0f> 0b c3 
66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44 00 00
[   15.581115] RSP: 0018:c90c7c18 EFLAGS: 00010012
[   15.581115] RAX: 0100 RBX:  RCX: 
[   15.581115] RDX:  RSI: c90c7c20 RDI: 8881088341c0
[   15.581115] RBP: 8881599e0428 R08:  R09: 
[   15.581115] R10: 0003 R11: 0004 R12: 8881088341c0
[   15.581115] R13:  R14: 0004 R15: 888108834180
[   15.581115] FS:  () GS:88900d38() 
knlGS:
[   15.581115] CS:  0010 DS:  ES:  CR0: 80050033
[   15.581115] CR2:  CR3: 00015240a000 CR4: 00350ee0
[   15.581115] Call Trace:
[   15.581115]  irq_msi_update_msg+0x4d/0x80
[   15.581115]  msi_set_affinity+0x160/0x190
[   15.581115]  irq_do_set_affinity+0x52/0x190
[   15.581115]  irq_setup_affinity+0xd7/0x170
[   15.581115]  irq_startup+0x5d/0xf0
[   15.581115]  __setup_irq+0x6b9/0x700
[   15.581115]  request_threaded_irq+0xf8/0x160
[   15.581115]  ? irq_remapping_alloc+0x4d0/0x4d0
[   15.581115]  ? e820__memblock_setup+0x7d/0x7d
[   15.581115]  iommu_init_msi+0x60/0x190
[   15.581115]  state_next+0x39d/0x665
[   15.581115]  ? e820__memblock_setup+0x7d/0x7d
[   15.581115]  iommu_go_to_state+0x24/0x28
[   15.581115]  amd_iommu_init+0x11/0x46
[   15.581115]  pci_iommu_init+0x16/0x3f
[   15.581115]  do_one_initcall+0x44/0x1d0
[   15.581115]  kernel_init_freeable+0x1e7/0x249
[   15.581115]  ? rest_init+0xb4/0xb4
[   15.581115]  kernel_init+0xa/0x10c
[   15.581115]  ret_from_fork+0x22/0x30
[   15.581115] CPU: 6 PID: 1 Comm: swapper/0 Not tainted 5.10.0-rc3-sos-custom 
#1
[   15.581115] Hardware name: AMD Corporation ETHANOL_X/ETHANOL_X, BIOS 
REX1006G 01/25/2020
[   15.581115] Call Trace:
[   15.581115]  dump_stack+0x6d/0x88
[   15.581115]  __warn.cold+0x24/0x3d
[   15.581115]  ? __irq_msi_compose_msg+0x9f/0xb0
[   15.581115]  report_bug+0xd1/0x100
[   15.581115]  handle_bug+0x35/0x80
[   15.581115]  exc_invalid_op+0x14/0x70
[   15.581115]  asm_exc_invalid_op+0x12/0x20
[   15.581115] RIP: 0010:__irq_msi_compose_msg+0x9f/0xb0
[   15.581115] Code: 01 00 74 1e 3d ff 7f 00 00 77 1f 0f b7 16 c1 e8 08 83 e0 
7f c1 e0 05 66 81 e2 1f f0 09 d0 66 89 06 c3 3d ff 00 00 00 77 01 c3 <0f> 0b c3 
66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44 00 00
[   15.581115] RSP: 0018:c90c7c18 EFLAGS: 00010012
[   15.581115] RAX: 0100 RBX:  RCX: 
[   15.581115] RDX:  RSI: c90c7c20 RDI: 8881088341c0
[   15.581115] RBP: 8881599e0428 R08:  R09: 
[   15.581115] R10: 0003 R11: 0004 R12: 8881088341c0
[   15.581115] R13:  R14: 0004 R15: 888108834180
[   15.581115]  irq_msi_update_msg+0x4d/0x80
[   15.581115]  msi_set_affinity+0x160/0x190
[   15.581115]  irq_do_set_affinity+0x52/0x190
[   15.581115]  irq_setup_affinity+0xd7/0x170
[   15.581115]  irq_startup+0x5d/0xf0
[   15.581115]  __setup_irq+0x6b9/0x700
[   15.581115]  

Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-10 Thread David Woodhouse



On 10 November 2020 21:01:17 GMT, Thomas Gleixner  wrote:
>On Tue, Nov 10 2020 at 19:21, David Woodhouse wrote:
>
>> On 10 November 2020 18:56:17 GMT, Thomas Gleixner
> wrote:
>>>On Tue, Nov 10 2020 at 18:50, Thomas Gleixner wrote:
 On Tue, Nov 10 2020 at 16:33, David Woodhouse wrote:
> If I could get post-5.5 kernels to boot at all with the AMD IOMMU
> enabled, I'd have a go at throwing that together now...

 It can share the dmar domain code. Let me frob something.
>>>
>>>Not much to share there and I can't access my AMD machine at the
>>>moment. Something like the untested below should work.
>>
>> Does it even need its own irqdomain? Can it not just allocate
>directly
>> from the vector domain then program its own register directly based
>on
>> the irq_cfg?
>
>It uses pci_enable_msi() and I have no clue about that piece of
>hardware
>and whether that is actually required or not. If it is, then it needs a
>domain because that's what pci_enable_msi() uses.

I'd be kind of surprised if it is required, but testing on qemu is obviously 
not going to cut it. Tom?

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-10 Thread Thomas Gleixner
On Tue, Nov 10 2020 at 19:21, David Woodhouse wrote:

> On 10 November 2020 18:56:17 GMT, Thomas Gleixner  wrote:
>>On Tue, Nov 10 2020 at 18:50, Thomas Gleixner wrote:
>>> On Tue, Nov 10 2020 at 16:33, David Woodhouse wrote:
 If I could get post-5.5 kernels to boot at all with the AMD IOMMU
 enabled, I'd have a go at throwing that together now...
>>>
>>> It can share the dmar domain code. Let me frob something.
>>
>>Not much to share there and I can't access my AMD machine at the
>>moment. Something like the untested below should work.
>
> Does it even need its own irqdomain? Can it not just allocate directly
> from the vector domain then program its own register directly based on
> the irq_cfg?

It uses pci_enable_msi() and I have no clue about that piece of hardware
and whether that is actually required or not. If it is, then it needs a
domain because that's what pci_enable_msi() uses.

Thanks,

tglx


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-10 Thread David Woodhouse



On 10 November 2020 18:56:17 GMT, Thomas Gleixner  wrote:
>On Tue, Nov 10 2020 at 18:50, Thomas Gleixner wrote:
>> On Tue, Nov 10 2020 at 16:33, David Woodhouse wrote:
>>> If I could get post-5.5 kernels to boot at all with the AMD IOMMU
>>> enabled, I'd have a go at throwing that together now...
>>
>> It can share the dmar domain code. Let me frob something.
>
>Not much to share there and I can't access my AMD machine at the
>moment. Something like the untested below should work.

Does it even need its own irqdomain? Can it not just allocate directly from the 
vector domain then program its own register directly based on the irq_cfg?


-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-10 Thread Thomas Gleixner
On Tue, Nov 10 2020 at 18:50, Thomas Gleixner wrote:
> On Tue, Nov 10 2020 at 16:33, David Woodhouse wrote:
>> If I could get post-5.5 kernels to boot at all with the AMD IOMMU
>> enabled, I'd have a go at throwing that together now...
>
> It can share the dmar domain code. Let me frob something.

Not much to share there and I can't access my AMD machine at the
moment. Something like the untested below should work.

Thanks,

tglx
---
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -523,7 +523,7 @@ struct msi_msg;
 struct irq_cfg;
 
 extern void __irq_msi_compose_msg(struct irq_cfg *cfg, struct msi_msg *msg,
- bool dmar);
+ bool iommu);
 
 extern void ioapic_zap_locks(void);
 
--- a/arch/x86/include/asm/irqdomain.h
+++ b/arch/x86/include/asm/irqdomain.h
@@ -63,4 +63,6 @@ static inline void x86_create_pci_msi_do
 #define x86_pci_msi_default_domain NULL
 #endif
 
+struct irq_domain *x86_create_iommu_msi_domain(void);
+
 #endif
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2498,7 +2498,7 @@ int hard_smp_processor_id(void)
 }
 
 void __irq_msi_compose_msg(struct irq_cfg *cfg, struct msi_msg *msg,
-  bool dmar)
+  bool iommu)
 {
memset(msg, 0, sizeof(*msg));
 
@@ -2519,7 +2519,7 @@ void __irq_msi_compose_msg(struct irq_cf
 * some hypervisors allow the extended destination ID field in bits
 * 5-11 to be used, giving support for 15 bits of APIC IDs in total.
 */
-   if (dmar)
+   if (iommu)
msg->arch_addr_hi.destid_8_31 = cfg->dest_apicid >> 8;
else if (virt_ext_dest_id && cfg->dest_apicid < 0x8000)
msg->arch_addr_lo.virt_destid_8_14 = cfg->dest_apicid >> 8;
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -184,7 +184,8 @@ static struct msi_domain_info pci_msi_do
.handler_name   = "edge",
 };
 
-struct irq_domain * __init native_create_pci_msi_domain(void)
+static struct irq_domain *__init
+__create_pci_msi_domain(struct msi_domain_info *info, const char *name)
 {
struct fwnode_handle *fn;
struct irq_domain *d;
@@ -192,21 +193,25 @@ struct irq_domain * __init native_create
if (disable_apic)
return NULL;
 
-   fn = irq_domain_alloc_named_fwnode("PCI-MSI");
+   fn = irq_domain_alloc_named_fwnode(name);
if (!fn)
return NULL;
 
-   d = pci_msi_create_irq_domain(fn, _msi_domain_info,
- x86_vector_domain);
+   d = pci_msi_create_irq_domain(fn, info, x86_vector_domain);
if (!d) {
irq_domain_free_fwnode(fn);
-   pr_warn("Failed to initialize PCI-MSI irqdomain.\n");
+   pr_warn("Failed to initialize %s irqdomain.\n", name);
} else {
d->flags |= IRQ_DOMAIN_MSI_NOMASK_QUIRK;
}
return d;
 }
 
+struct irq_domain * __init native_create_pci_msi_domain(void)
+{
+   return __create_pci_msi_domain(_msi_domain_info, "PCI-MSI");
+}
+
 void __init x86_create_pci_msi_domain(void)
 {
x86_pci_msi_default_domain = x86_init.irqs.create_pci_msi_domain();
@@ -247,6 +252,43 @@ struct irq_domain *arch_create_remap_msi
 }
 #endif
 
+static void __maybe_unused iommu_msi_compose_msg(struct irq_data *data,
+struct msi_msg *msg)
+{
+   __irq_msi_compose_msg(irqd_cfg(data), msg, true);
+}
+
+#ifdef CONFIG_AMD_IOMMU
+/*
+ * Similar to the Intel IOMMU abuse below. The resulting irq domain is
+ * associated to the IOMMU pci device, so that pci_enable_msi() works.
+ */
+static struct irq_chip iommu_msi_controller = {
+   .name   = "IOMMU-MSI",
+   .irq_unmask = pci_msi_unmask_irq,
+   .irq_mask   = pci_msi_mask_irq,
+   .irq_ack= irq_chip_ack_parent,
+   .irq_retrigger  = irq_chip_retrigger_hierarchy,
+   .irq_set_affinity   = msi_set_affinity,
+   .irq_compose_msi_msg= iommu_msi_compose_msg,
+   .flags  = IRQCHIP_SKIP_SET_WAKE,
+};
+
+static struct msi_domain_info iommu_msi_domain_info = {
+   .flags  = MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
+ MSI_FLAG_PCI_MSIX,
+   .ops= _msi_domain_ops,
+   .chip   = _msi_controller,
+   .handler= handle_edge_irq,
+   .handler_name   = "edge",
+};
+
+struct irq_domain __init *x86_create_iommu_msi_domain(void)
+{
+   return __create_pci_msi_domain(_msi_domain_info, "IOMMU-MSI");
+}
+#endif
+
 #ifdef CONFIG_DMAR_TABLE
 /*
  * The Intel IOMMU (ab)uses the high bits of the MSI address to contain the
@@ -254,11 +296,6 @@ struct irq_domain *arch_create_remap_msi
  * case for MSIs as it would be targeting real memory above 4GiB not the
  * APIC.
  */
-static void 

Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-10 Thread Thomas Gleixner
On Tue, Nov 10 2020 at 16:33, David Woodhouse wrote:
> On Tue, 2020-11-10 at 10:17 -0600, Tom Lendacky wrote:
>> Yep. The warning started triggering with:
>> 47bea873cf80 ("x86/msi: Only use high bits of MSI address for DMAR unit")
>> 
>> Here's the backtrace:
>> 
>> [   15.745929]  irq_chip_compose_msi_msg+0x2e/0x40
>> [   15.750984]  msi_domain_activate+0x4b/0x90
>> [   15.76]  __irq_domain_activate_irq+0x53/0x80
>> [   15.760707]  ? irq_set_msi_desc_off+0x5a/0x90
>> [   15.765568]  irq_domain_activate_irq+0x25/0x40
>> [   15.770525]  __msi_domain_alloc_irqs+0x16a/0x310
>> [   15.775680]  __pci_enable_msi_range+0x182/0x2b0
>> [   15.780738]  ? e820__memblock_setup+0x7d/0x7d
>> [   15.785597]  pci_enable_msi+0x16/0x30
>> [   15.789685]  iommu_init_msi+0x30/0x190
>
> It's asking the core code to generate a PCI MSI message for it and
> actually program that to the PCI device (since the IOMMU itself is a
> PCI device).
>
> That isn't actually used for generating MSI, but is instead interpreted
> to write the intcapxt registers which *do* generate the interrupts.
>
> That wants fixing, preferably not to go via MSI format at all, or maybe
> just to use the 'dmar' flag to __irq_msi_compose_msg(). Either way by
> having an irqdomain of its own like the Intel IOMMU does.
>
> If I could get post-5.5 kernels to boot at all with the AMD IOMMU
> enabled, I'd have a go at throwing that together now...

It can share the dmar domain code. Let me frob something.


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-10 Thread Tom Lendacky
On 11/10/20 10:33 AM, David Woodhouse wrote:
> On Tue, 2020-11-10 at 10:17 -0600, Tom Lendacky wrote:
>> Yep. The warning started triggering with:
>> 47bea873cf80 ("x86/msi: Only use high bits of MSI address for DMAR unit")
>>
>> Here's the backtrace:
>>
>> [   15.611109] [ cut here ]
>> [   15.616274] WARNING: CPU: 184 PID: 1 at arch/x86/kernel/apic/apic.c:2505 
>> __irq_msi_compose_msg+0x79/0x80
>> [   15.626855] Modules linked in:
>> [   15.630263] CPU: 184 PID: 1 Comm: swapper/0 Not tainted 
>> 5.10.0-rc1-sos-custom #1
>> [   15.638516] Hardware name: AMD Corporation ETHANOL_X/ETHANOL_X, BIOS 
>> REX1006G 01/25/2020
>> [   15.647549] RIP: 0010:__irq_msi_compose_msg+0x79/0x80
>> [   15.653188] Code: 0f f0 ff 09 d0 89 06 8b 47 04 c7 46 04 00 00 00 00 88 
>> 46 08 45 84 c0 74 08 8b 07 30 c0 89 46 04 c3 81 3f ff 00 00 00 77 01 c3 <0f> 
>> 0b c3 0f 1f 40 00 0f 1f 44 00 00 8b 05 81 f9 04 02 85 c0 75 16
>> [   15.674140] RSP: 0018:c90c7c30 EFLAGS: 00010212
>> [   15.679971] RAX: 0021 RBX: 888143789028 RCX: 
>> 
>> [   15.687936] RDX:  RSI: c90c7c40 RDI: 
>> 8881447dd1c0
>> [   15.695899] RBP: 888143789028 R08:  R09: 
>> 0005
>> [   15.703861] R10: 0002 R11: 0004 R12: 
>> 88900e7b80c0
>> [   15.711825] R13:  R14: 88907646ba80 R15: 
>> 
>> [   15.719789] FS:  () GS:88900f00() 
>> knlGS:
>> [   15.728821] CS:  0010 DS:  ES:  CR0: 80050033
>> [   15.735234] CR2:  CR3: 00114d40a000 CR4: 
>> 00350ee0
>> [   15.743197] Call Trace:
>> [   15.745929]  irq_chip_compose_msi_msg+0x2e/0x40
>> [   15.750984]  msi_domain_activate+0x4b/0x90
>> [   15.76]  __irq_domain_activate_irq+0x53/0x80
>> [   15.760707]  ? irq_set_msi_desc_off+0x5a/0x90
>> [   15.765568]  irq_domain_activate_irq+0x25/0x40
>> [   15.770525]  __msi_domain_alloc_irqs+0x16a/0x310
>> [   15.775680]  __pci_enable_msi_range+0x182/0x2b0
>> [   15.780738]  ? e820__memblock_setup+0x7d/0x7d
>> [   15.785597]  pci_enable_msi+0x16/0x30
>> [   15.789685]  iommu_init_msi+0x30/0x190
>> [   15.793860]  state_next+0x39d/0x665
>> [   15.797754]  ? e820__memblock_setup+0x7d/0x7d
>> [   15.802615]  iommu_go_to_state+0x24/0x28
>> [   15.806993]  amd_iommu_init+0x11/0x46
>> [   15.811076]  pci_iommu_init+0x16/0x3f
> 
> Oh joy.
> 
> It's asking the core code to generate a PCI MSI message for it and
> actually program that to the PCI device (since the IOMMU itself is a
> PCI device).
> 
> That isn't actually used for generating MSI, but is instead interpreted
> to write the intcapxt registers which *do* generate the interrupts.
> 
> That wants fixing, preferably not to go via MSI format at all, or maybe
> just to use the 'dmar' flag to __irq_msi_compose_msg(). Either way by
> having an irqdomain of its own like the Intel IOMMU does.
> 
> If I could get post-5.5 kernels to boot at all with the AMD IOMMU
> enabled, I'd have a go at throwing that together now...

That's odd. I haven't had an issue with or heard of any issues with
booting post-5.5 kernels with the AMD IOMMU enabled. What kind of problems
are you encountering?

Thanks,
Tom

> 


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-10 Thread David Woodhouse
On Tue, 2020-11-10 at 10:17 -0600, Tom Lendacky wrote:
> Yep. The warning started triggering with:
> 47bea873cf80 ("x86/msi: Only use high bits of MSI address for DMAR unit")
> 
> Here's the backtrace:
> 
> [   15.611109] [ cut here ]
> [   15.616274] WARNING: CPU: 184 PID: 1 at arch/x86/kernel/apic/apic.c:2505 
> __irq_msi_compose_msg+0x79/0x80
> [   15.626855] Modules linked in:
> [   15.630263] CPU: 184 PID: 1 Comm: swapper/0 Not tainted 
> 5.10.0-rc1-sos-custom #1
> [   15.638516] Hardware name: AMD Corporation ETHANOL_X/ETHANOL_X, BIOS 
> REX1006G 01/25/2020
> [   15.647549] RIP: 0010:__irq_msi_compose_msg+0x79/0x80
> [   15.653188] Code: 0f f0 ff 09 d0 89 06 8b 47 04 c7 46 04 00 00 00 00 88 46 
> 08 45 84 c0 74 08 8b 07 30 c0 89 46 04 c3 81 3f ff 00 00 00 77 01 c3 <0f> 0b 
> c3 0f 1f 40 00 0f 1f 44 00 00 8b 05 81 f9 04 02 85 c0 75 16
> [   15.674140] RSP: 0018:c90c7c30 EFLAGS: 00010212
> [   15.679971] RAX: 0021 RBX: 888143789028 RCX: 
> 
> [   15.687936] RDX:  RSI: c90c7c40 RDI: 
> 8881447dd1c0
> [   15.695899] RBP: 888143789028 R08:  R09: 
> 0005
> [   15.703861] R10: 0002 R11: 0004 R12: 
> 88900e7b80c0
> [   15.711825] R13:  R14: 88907646ba80 R15: 
> 
> [   15.719789] FS:  () GS:88900f00() 
> knlGS:
> [   15.728821] CS:  0010 DS:  ES:  CR0: 80050033
> [   15.735234] CR2:  CR3: 00114d40a000 CR4: 
> 00350ee0
> [   15.743197] Call Trace:
> [   15.745929]  irq_chip_compose_msi_msg+0x2e/0x40
> [   15.750984]  msi_domain_activate+0x4b/0x90
> [   15.76]  __irq_domain_activate_irq+0x53/0x80
> [   15.760707]  ? irq_set_msi_desc_off+0x5a/0x90
> [   15.765568]  irq_domain_activate_irq+0x25/0x40
> [   15.770525]  __msi_domain_alloc_irqs+0x16a/0x310
> [   15.775680]  __pci_enable_msi_range+0x182/0x2b0
> [   15.780738]  ? e820__memblock_setup+0x7d/0x7d
> [   15.785597]  pci_enable_msi+0x16/0x30
> [   15.789685]  iommu_init_msi+0x30/0x190
> [   15.793860]  state_next+0x39d/0x665
> [   15.797754]  ? e820__memblock_setup+0x7d/0x7d
> [   15.802615]  iommu_go_to_state+0x24/0x28
> [   15.806993]  amd_iommu_init+0x11/0x46
> [   15.811076]  pci_iommu_init+0x16/0x3f

Oh joy.

It's asking the core code to generate a PCI MSI message for it and
actually program that to the PCI device (since the IOMMU itself is a
PCI device).

That isn't actually used for generating MSI, but is instead interpreted
to write the intcapxt registers which *do* generate the interrupts.

That wants fixing, preferably not to go via MSI format at all, or maybe
just to use the 'dmar' flag to __irq_msi_compose_msg(). Either way by
having an irqdomain of its own like the Intel IOMMU does.

If I could get post-5.5 kernels to boot at all with the AMD IOMMU
enabled, I'd have a go at throwing that together now...



smime.p7s
Description: S/MIME cryptographic signature


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-10 Thread David Woodhouse
On Tue, 2020-11-10 at 16:54 +0100, Thomas Gleixner wrote:
> On Tue, Nov 10 2020 at 08:55, Tom Lendacky wrote:
> > On 11/10/20 8:34 AM, Thomas Gleixner wrote:
> > I was about to send the dmesg output when I saw this. A quick test
> > with
> > this change resolves the boot issue, thanks!
> 
> /me feels stupid
> 
> > I'm still seeing the warning at arch/x86/kernel/apic/apic.c:2527,
> > but I'll
> > start a separate thread on that.
> 
> Can you please provide the backtrace?

We'll want something like this, to stop AMD IOMMU from registering its
irqdomain even when it shouldn't. But that's the wrong way round; that
would give you remapped MSIs when it should be using the
x86_vector_domain for them. And you have the converse, it seems.

--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1601,9 +1601,11 @@ static int __init init_iommu_one(struct amd_iommu 
*iommu, struct ivhd_header *h)
if (ret)
return ret;
 
-   ret = amd_iommu_create_irq_domain(iommu);
-   if (ret)
-   return ret;
+   if (amd_iommu_irq_remap) {
+   ret = amd_iommu_create_irq_domain(iommu);
+   if (ret)
+   return ret;
+   }
 
/*
 * Make sure IOMMU is not considered to translate itself. The IVRS



smime.p7s
Description: S/MIME cryptographic signature


Re: [EXTERNAL] [tip: x86/apic] x86/io_apic: Cleanup trigger/polarity helpers

2020-11-09 Thread David Woodhouse
On Mon, 2020-11-09 at 17:15 -0600, Tom Lendacky wrote:
> On 10/29/20 7:15 AM, tip-bot2 for Thomas Gleixner wrote:
> > The following commit has been merged into the x86/apic branch of tip:
> > 
> > Commit-ID: a27dca645d2c0f31abb7858aa0e10b2fa0f2f659
> > Gitweb:
> > https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=a27dca645d2c0f31abb7858aa0e10b2fa0f2f659
> > Author:Thomas Gleixner 
> > AuthorDate:Sat, 24 Oct 2020 22:35:19 +01:00
> > Committer: Thomas Gleixner 
> > CommitterDate: Wed, 28 Oct 2020 20:26:26 +01:00
> > 
> > x86/io_apic: Cleanup trigger/polarity helpers
> > 
> > 'trigger' and 'polarity' are used throughout the I/O-APIC code for handling
> > the trigger type (edge/level) and the active low/high configuration. While
> > there are defines for initializing these variables and struct members, they
> > are not used consequently and the meaning of 'trigger' and 'polarity' is
> > opaque and confusing at best.
> > 
> > Rename them to 'is_level' and 'active_low' and make them boolean in various
> > structs so it's entirely clear what the meaning is.
> 
> Running the tip tree on my second generation EPYC system I'm seeing lots
> of the following:
> 
> [  105.325371] hpet: Lost 9601 RTC interrupts
> [  105.485766] hpet: Lost 9600 RTC interrupts
> [  105.639182] hpet: Lost 9601 RTC interrupts
> [  105.792155] hpet: Lost 9601 RTC interrupts
> [  105.947076] hpet: Lost 9601 RTC interrupts
> [  106.100876] hpet: Lost 9600 RTC interrupts
> [  106.253444] hpet: Lost 9601 RTC interrupts
> [  106.406722] hpet: Lost 9601 RTC interrupts
> 
> preventing the system from booting. I bisected it to this commit.
> 
> Additionally, I'm seeing warnings and error messages (which I haven't
> bisected, yet) along these lines:
> 
> [   12.790801] WARNING: CPU: 135 PID: 1 at arch/x86/kernel/apic/apic.c:2505 
> __irq_msi_compose_msg+0x79/0x80
> [   98.121716] irq 3: nobody cared (try booting with the "irqpoll" option)
> [  100.692087] irq 15: nobody cared (try booting with the "irqpoll" option)
> [  100.800217] irq 11: nobody cared (try booting with the "irqpoll" option)
> [  100.800407] irq 10: nobody cared (try booting with the "irqpoll" option)

Odd. The warning suggests we're asking __irq_msi_compose_msg() to
compose an MSI targeted at an APIC ID above 255, which shouldn't ever
happen.

Can we see a full boot log with apic=verbose please? 


smime.p7s
Description: S/MIME cryptographic signature