Re: [Xen-devel] HPET interrupt remapping during boot

2019-10-09 Thread Jan Beulich
On 09.10.2019 15:56, Roger Pau Monné  wrote:
> On Wed, Oct 09, 2019 at 02:03:12PM +0200, Jan Beulich wrote:
>> On 09.10.2019 13:29, Roger Pau Monné  wrote:
>>> Right, then I guess we either mask all the io-apic pins or we setup
>>> proper remapping entries for non-masked pins? (in order to avoid iommu
>>> faults)
>>
>> Making the ExtInt entry is at least worth an experiment, to
>> (hopefully) confirm that this would take care of the IOMMU
>> fault. But I'm afraid (as per above) it's not an option in
>> general. What I could see us doing is mask the entry if all
>> legacy IRQs are handled through the IO-APIC. This would take
>> care of spurious interrupts, as these are the only ones
>> which can make it through when the PIC mask bits are all set.
>> However, maybe it is legitimate to mask the ExtInt entry
>> when an IOMMU comes into play.
> 
> That was my thinking, ie: make sure every io-apic pin is masked before
> enabling iommu interrupt remapping. Nothing useful can happen of
> having io-apic pins unmasked, as the remapping is not setup anyway.
> If/when those pins get used a proper remapping entry is going to be
> setup, and the pin would then be unmasked.

Well, this isn't the only option. Another would be to transform
all unmasked entries to be translated.

>> As to "proper" remapping entries: I'll have to look at the
>> spec what they say about this. There's only one IRT index
>> that we can put in the RTE, yet this would need to serve all
>> 15 IRQs potentially coming through the PIC. Recall that the
>> vector gets supplied by the PIC in the ExtInt case, not by
>> the IO-APIC RTE.
> 
> You can set the delivery mode of the IRTE to ExtINT, much like how this
> is done on the io-apic, and then poke the PIC to figure out which IRQ
> triggered?

Hmm, yes - it didn't even occur to me that VT-d might allow
ExtInt as delivery mode; too much AMD IOMMU work recently,
where the only way to deal with ExtInt is a "don't remap"
flag in the device table entry.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] HPET interrupt remapping during boot

2019-10-09 Thread Roger Pau Monné
On Wed, Oct 09, 2019 at 02:03:12PM +0200, Jan Beulich wrote:
> On 09.10.2019 13:29, Roger Pau Monné  wrote:
> > On Wed, Oct 09, 2019 at 12:41:05PM +0200, Jan Beulich wrote:
> >> On 09.10.2019 12:11, Roger Pau Monné  wrote:
> >>> And it does print the following when setting up the iommu:
> >>>
> >>> (XEN) ioapic 0 pin 0 not masked
> >>> (XEN) vec=00 delivery=ExINT dest=P status=0 polarity=0 irr=0 trig=E 
> >>> mask=0 dest_id:0001
> >>>
> >>> I wonder, shouldn't all pins of all the io-apics be masked at boot?
> >>
> >> I think you might get different answers here depending on whether
> >> you ask firmware or OS people. In fact there are cases where the
> >> IO-APIC needs to be left in this state, I think, but such would
> >> likely need properly reflecting in ACPI tables (albeit I don't
> >> know/recall how this would be done; looking at the code ). This goes back 
> >> to times
> >> when IO-APICs were new and OSes would not even know about them,
> >> yet they wouldn't get any interrupts to work if fiddling with
> >> only the PIC (sitting behind IO-APIC pin 0).
> >>
> >> See enable_IO_APIC(), where we actually use this property to
> >> determine the pin behind which the 8259 sits.
> >>
> >> I've seen quite many systems where in the BIOS setup you have an
> >> option to select whether you have an "ACPI OS" (wording of course
> >> varies). I've never checked whether this may e.g. reflect itself
> >> in the handover state of the GSI 0 RTE.
> >>
> >> In your testing patch, could you also log the PIC mask bytes?
> >> There ought to be at least one unmasked; or wait - there actually
> >> is a spurious interrupt there (right before IOMMU initialization):
> >>
> >> (XEN) spurious 8259A interrupt: IRQ7.
> > 
> > So I've added a log of the PIC masks just before checking the ioapic
> > masks:
> > 
> > (XEN) 8259A-1 mask: fe 8259A-2 mask: ff
> > 
> > AFAICT IRQ7 seems to be unmasked? Sorry my knowledge of PICs is quite
> > limited since I've never had to deal with them.
> 
> That's IRQ0 then which is unmasked. As said the spurious one
> (IRQ7) can't be masked (at the PIC); only the "normal" IRQ7 can
> be.
> 
> > The line I've added is:
> > 
> > printk("8259A-1 mask: %x 8259A-2 mask: %x\n", inb(0x21), inb(0xA1));
> > 
> > I wonder why does Xen even has any code to deal with the PICs,
> > shouldn't we rely on io-apics only for legacy delivery?
> 
> There are (were?) systems where things wouldn't work without.
> 
> >> Hence I wonder if there's not possibly a 2nd one once the IOMMU
> >> has been set up.
> > 
> > Right, then I guess we either mask all the io-apic pins or we setup
> > proper remapping entries for non-masked pins? (in order to avoid iommu
> > faults)
> 
> Making the ExtInt entry is at least worth an experiment, to
> (hopefully) confirm that this would take care of the IOMMU
> fault. But I'm afraid (as per above) it's not an option in
> general. What I could see us doing is mask the entry if all
> legacy IRQs are handled through the IO-APIC. This would take
> care of spurious interrupts, as these are the only ones
> which can make it through when the PIC mask bits are all set.
> However, maybe it is legitimate to mask the ExtInt entry
> when an IOMMU comes into play.

That was my thinking, ie: make sure every io-apic pin is masked before
enabling iommu interrupt remapping. Nothing useful can happen of
having io-apic pins unmasked, as the remapping is not setup anyway.
If/when those pins get used a proper remapping entry is going to be
setup, and the pin would then be unmasked.

> As to "proper" remapping entries: I'll have to look at the
> spec what they say about this. There's only one IRT index
> that we can put in the RTE, yet this would need to serve all
> 15 IRQs potentially coming through the PIC. Recall that the
> vector gets supplied by the PIC in the ExtInt case, not by
> the IO-APIC RTE.

You can set the delivery mode of the IRTE to ExtINT, much like how this
is done on the io-apic, and then poke the PIC to figure out which IRQ
triggered?

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] HPET interrupt remapping during boot

2019-10-09 Thread Jan Beulich
On 09.10.2019 13:29, Roger Pau Monné  wrote:
> On Wed, Oct 09, 2019 at 12:41:05PM +0200, Jan Beulich wrote:
>> On 09.10.2019 12:11, Roger Pau Monné  wrote:
>>> And it does print the following when setting up the iommu:
>>>
>>> (XEN) ioapic 0 pin 0 not masked
>>> (XEN) vec=00 delivery=ExINT dest=P status=0 polarity=0 irr=0 trig=E mask=0 
>>> dest_id:0001
>>>
>>> I wonder, shouldn't all pins of all the io-apics be masked at boot?
>>
>> I think you might get different answers here depending on whether
>> you ask firmware or OS people. In fact there are cases where the
>> IO-APIC needs to be left in this state, I think, but such would
>> likely need properly reflecting in ACPI tables (albeit I don't
>> know/recall how this would be done; looking at the code ). This goes back to 
>> times
>> when IO-APICs were new and OSes would not even know about them,
>> yet they wouldn't get any interrupts to work if fiddling with
>> only the PIC (sitting behind IO-APIC pin 0).
>>
>> See enable_IO_APIC(), where we actually use this property to
>> determine the pin behind which the 8259 sits.
>>
>> I've seen quite many systems where in the BIOS setup you have an
>> option to select whether you have an "ACPI OS" (wording of course
>> varies). I've never checked whether this may e.g. reflect itself
>> in the handover state of the GSI 0 RTE.
>>
>> In your testing patch, could you also log the PIC mask bytes?
>> There ought to be at least one unmasked; or wait - there actually
>> is a spurious interrupt there (right before IOMMU initialization):
>>
>> (XEN) spurious 8259A interrupt: IRQ7.
> 
> So I've added a log of the PIC masks just before checking the ioapic
> masks:
> 
> (XEN) 8259A-1 mask: fe 8259A-2 mask: ff
> 
> AFAICT IRQ7 seems to be unmasked? Sorry my knowledge of PICs is quite
> limited since I've never had to deal with them.

That's IRQ0 then which is unmasked. As said the spurious one
(IRQ7) can't be masked (at the PIC); only the "normal" IRQ7 can
be.

> The line I've added is:
> 
> printk("8259A-1 mask: %x 8259A-2 mask: %x\n", inb(0x21), inb(0xA1));
> 
> I wonder why does Xen even has any code to deal with the PICs,
> shouldn't we rely on io-apics only for legacy delivery?

There are (were?) systems where things wouldn't work without.

>> Hence I wonder if there's not possibly a 2nd one once the IOMMU
>> has been set up.
> 
> Right, then I guess we either mask all the io-apic pins or we setup
> proper remapping entries for non-masked pins? (in order to avoid iommu
> faults)

Making the ExtInt entry is at least worth an experiment, to
(hopefully) confirm that this would take care of the IOMMU
fault. But I'm afraid (as per above) it's not an option in
general. What I could see us doing is mask the entry if all
legacy IRQs are handled through the IO-APIC. This would take
care of spurious interrupts, as these are the only ones
which can make it through when the PIC mask bits are all set.
However, maybe it is legitimate to mask the ExtInt entry
when an IOMMU comes into play.

As to "proper" remapping entries: I'll have to look at the
spec what they say about this. There's only one IRT index
that we can put in the RTE, yet this would need to serve all
15 IRQs potentially coming through the PIC. Recall that the
vector gets supplied by the PIC in the ExtInt case, not by
the IO-APIC RTE.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] HPET interrupt remapping during boot

2019-10-09 Thread Roger Pau Monné
On Wed, Oct 09, 2019 at 12:41:05PM +0200, Jan Beulich wrote:
> On 09.10.2019 12:11, Roger Pau Monné  wrote:
> > And it does print the following when setting up the iommu:
> > 
> > (XEN) ioapic 0 pin 0 not masked
> > (XEN) vec=00 delivery=ExINT dest=P status=0 polarity=0 irr=0 trig=E mask=0 
> > dest_id:0001
> > 
> > I wonder, shouldn't all pins of all the io-apics be masked at boot?
> 
> I think you might get different answers here depending on whether
> you ask firmware or OS people. In fact there are cases where the
> IO-APIC needs to be left in this state, I think, but such would
> likely need properly reflecting in ACPI tables (albeit I don't
> know/recall how this would be done; looking at the code ). This goes back to 
> times
> when IO-APICs were new and OSes would not even know about them,
> yet they wouldn't get any interrupts to work if fiddling with
> only the PIC (sitting behind IO-APIC pin 0).
> 
> See enable_IO_APIC(), where we actually use this property to
> determine the pin behind which the 8259 sits.
> 
> I've seen quite many systems where in the BIOS setup you have an
> option to select whether you have an "ACPI OS" (wording of course
> varies). I've never checked whether this may e.g. reflect itself
> in the handover state of the GSI 0 RTE.
> 
> In your testing patch, could you also log the PIC mask bytes?
> There ought to be at least one unmasked; or wait - there actually
> is a spurious interrupt there (right before IOMMU initialization):
> 
> (XEN) spurious 8259A interrupt: IRQ7.

So I've added a log of the PIC masks just before checking the ioapic
masks:

(XEN) 8259A-1 mask: fe 8259A-2 mask: ff

AFAICT IRQ7 seems to be unmasked? Sorry my knowledge of PICs is quite
limited since I've never had to deal with them.

The line I've added is:

printk("8259A-1 mask: %x 8259A-2 mask: %x\n", inb(0x21), inb(0xA1));

I wonder why does Xen even has any code to deal with the PICs,
shouldn't we rely on io-apics only for legacy delivery?

> Hence I wonder if there's not possibly a 2nd one once the IOMMU
> has been set up.

Right, then I guess we either mask all the io-apic pins or we setup
proper remapping entries for non-masked pins? (in order to avoid iommu
faults)

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] HPET interrupt remapping during boot

2019-10-09 Thread Jan Beulich
On 09.10.2019 12:11, Roger Pau Monné  wrote:
> And it does print the following when setting up the iommu:
> 
> (XEN) ioapic 0 pin 0 not masked
> (XEN) vec=00 delivery=ExINT dest=P status=0 polarity=0 irr=0 trig=E mask=0 
> dest_id:0001
> 
> I wonder, shouldn't all pins of all the io-apics be masked at boot?

I think you might get different answers here depending on whether
you ask firmware or OS people. In fact there are cases where the
IO-APIC needs to be left in this state, I think, but such would
likely need properly reflecting in ACPI tables (albeit I don't
know/recall how this would be done; looking at the code ). This goes back to 
times
when IO-APICs were new and OSes would not even know about them,
yet they wouldn't get any interrupts to work if fiddling with
only the PIC (sitting behind IO-APIC pin 0).

See enable_IO_APIC(), where we actually use this property to
determine the pin behind which the 8259 sits.

I've seen quite many systems where in the BIOS setup you have an
option to select whether you have an "ACPI OS" (wording of course
varies). I've never checked whether this may e.g. reflect itself
in the handover state of the GSI 0 RTE.

In your testing patch, could you also log the PIC mask bytes?
There ought to be at least one unmasked; or wait - there actually
is a spurious interrupt there (right before IOMMU initialization):

(XEN) spurious 8259A interrupt: IRQ7.

Hence I wonder if there's not possibly a 2nd one once the IOMMU
has been set up.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] HPET interrupt remapping during boot

2019-10-09 Thread Roger Pau Monné
On Wed, Oct 09, 2019 at 11:31:59AM +0200, Jan Beulich wrote:
> On 08.10.2019 20:30, Andrew Cooper wrote:
> > Hello,
> > 
> > I have no idea if this is a regression or not.  I suspect it might not
> > be, and has always been broken.
> > 
> > Either way, I'm seeing occasional single interrupt remapping errors when
> > booting a range of Intel systems
> > 
> > (XEN) x2APIC mode is already enabled by BIOS.
> > (XEN) Using APIC driver x2apic_cluster
> > ...
> > (XEN) Platform timer is 23.999MHz HPET
> > (XEN) Detected 2194.922 MHz processor.
> > ...
> > (XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
> > (XEN) alt table 82d08047a070 -> 82d080486c6c
> > (XEN) [VT-D]INTR-REMAP: Request device [:f0:1f.0] fault index 0,
> > iommu reg = 82c00072d000
> > (XEN) [VT-D]INTR-REMAP: reason 22 - Present field in the IRTE entry is clear
> > (XEN) microcode: CPU2 updated from revision 0x521 to 0x52b, date
> > = 2019-08-12
> > 
> > From other debugging, I know that this happens after CPU 1 (which is a
> > hyperthread) has passed through start_secondary().
> > 
> > f0:1f.0 is one of the IO-APICs, and if I've cross referenced the DMAR
> > and APIC tables properly, is the IO-APIC on the PCH, making the
> > problematic IRQ GSI0.
> > 
> > This suggests that we have an error setting up the timer IRQ (as the
> > HPET isn't MSI-capable), but we have already allegedly used it
> > successfully earlier on boot.
> 
> Wait - is this really a system with the timer at GSI 0, i.e. without
> an interrupt source override like this (as seen in Linux logs):
> 
> ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)

I also have such a system, and I do have and entry like:

(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)

On the Xen log. I've added some more debug info to my build with the
following patch:

diff --git a/xen/drivers/passthrough/vtd/iommu.c 
b/xen/drivers/passthrough/vtd/iommu.c
index f08eec070d..0d7abcd8fa 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -993,6 +993,7 @@ static void iommu_page_fault(int irq, void *dev_id,
  * specs since a new interrupt won't be generated until we clear all
  * the faults that caused this one to happen.
  */
+WARN();
 tasklet_schedule(_fault_tasklet);
 }
 
diff --git a/xen/drivers/passthrough/x86/iommu.c 
b/xen/drivers/passthrough/x86/iommu.c
index 59905629e1..2dbc7a3607 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -23,12 +23,59 @@
 #include 
 #include 
 
+#include 
+
 const struct iommu_init_ops *__initdata iommu_init_ops;
 struct iommu_ops __read_mostly iommu_ops;
 
+static const char * delivery_mode_2_str(
+const enum ioapic_irq_destination_types mode)
+{
+switch ( mode )
+{
+case dest_Fixed: return "Fixed";
+case dest_LowestPrio: return "LoPri";
+case dest_SMI: return "SMI";
+case dest_NMI: return "NMI";
+case dest_INIT: return "INIT";
+case dest_ExtINT: return "ExINT";
+case dest__reserved_1:
+case dest__reserved_2: return "Resvd";
+default: return "INVAL";
+}
+}
+
 int __init iommu_hardware_setup(void)
 {
 int rc;
+int apic;
+
+for(apic = 0; apic < nr_ioapics; apic++)
+{
+int pin;
+/* See if any of the pins is in ExtINT mode */
+for (pin = 0; pin < nr_ioapic_entries[apic]; pin++) {
+struct IO_APIC_route_entry rte = __ioapic_read_entry(apic, pin, 0);
+
+/* If the interrupt line is enabled and in ExtInt mode
+ * I have found the pin where the i8259 is connected.
+ */
+if (!rte.mask)
+{
+printk("ioapic %d pin %d not masked\n", apic, pin);
+printk("vec=%02x delivery=%-5s dest=%c status=%d "
+   "polarity=%d irr=%d trig=%c mask=%d dest_id:%0*x\n",
+   rte.vector, delivery_mode_2_str(rte.delivery_mode),
+   rte.dest_mode ? 'L' : 'P',
+   rte.delivery_status, rte.polarity, rte.irr,
+   rte.trigger ? 'L' : 'E', rte.mask,
+   x2apic_enabled ? 8 : 2,
+   x2apic_enabled ? rte.dest.dest32
+  : rte.dest.logical.logical_dest);
+}
+}
+}
+
 
 if ( !iommu_init_ops )
 return -ENODEV;

And it does print the following when setting up the iommu:

(XEN) ioapic 0 pin 0 not masked
(XEN) vec=00 delivery=ExINT dest=P status=0 polarity=0 irr=0 trig=E mask=0 
dest_id:0001

I wonder, shouldn't all pins of all the io-apics be masked at boot?

Full Xen boot log below.

Thanks, Roger.
---
(XEN) Xen version 4.13-unstable (root@xenrtcloud) (gcc (Debian 6.3.0-18+deb9u1) 
6.3.0 20170516) debug=y  Wed Oct  9 12:03:37 CEST 2019
(XEN) Latest ChangeSet:   Copyright (C) 1994-2013 H. Peter Anvin et al
(XEN) build-id: 52e9ffb079497b12ad075c4d09e1c9070151bfdf
(XEN) Console output is synchronous.
(XEN) Bootloader: 

Re: [Xen-devel] HPET interrupt remapping during boot

2019-10-09 Thread Jan Beulich
On 08.10.2019 20:30, Andrew Cooper wrote:
> Hello,
> 
> I have no idea if this is a regression or not.  I suspect it might not
> be, and has always been broken.
> 
> Either way, I'm seeing occasional single interrupt remapping errors when
> booting a range of Intel systems
> 
> (XEN) x2APIC mode is already enabled by BIOS.
> (XEN) Using APIC driver x2apic_cluster
> ...
> (XEN) Platform timer is 23.999MHz HPET
> (XEN) Detected 2194.922 MHz processor.
> ...
> (XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
> (XEN) alt table 82d08047a070 -> 82d080486c6c
> (XEN) [VT-D]INTR-REMAP: Request device [:f0:1f.0] fault index 0,
> iommu reg = 82c00072d000
> (XEN) [VT-D]INTR-REMAP: reason 22 - Present field in the IRTE entry is clear
> (XEN) microcode: CPU2 updated from revision 0x521 to 0x52b, date
> = 2019-08-12
> 
> From other debugging, I know that this happens after CPU 1 (which is a
> hyperthread) has passed through start_secondary().
> 
> f0:1f.0 is one of the IO-APICs, and if I've cross referenced the DMAR
> and APIC tables properly, is the IO-APIC on the PCH, making the
> problematic IRQ GSI0.
> 
> This suggests that we have an error setting up the timer IRQ (as the
> HPET isn't MSI-capable), but we have already allegedly used it
> successfully earlier on boot.

Wait - is this really a system with the timer at GSI 0, i.e. without
an interrupt source override like this (as seen in Linux logs):

ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)

Otherwise isn't this rather an ExtInt arriving through the PIC? We
mask all PIC interrupts first thing, but I'm not sure there aren't
paths where we'd unmask any. Additionally I seem to vaguely recall
that the spurious interrupt can't be masked at the PIC, and I do
recall seeing (randomly, like you) spurious interrupts during boot
(can't tell offhand whether this was on AMD and/or Intel, and
perhaps on IOMMU-less systems only). I've never seen though what
you describe here.

Then again the log message saying "fault index 0" doesn't tell us
what the GSI is, or what IO-APIC pin the IRQ can through. All not
yet set up IO-APIC RTEs would specify a remapping index of zero.
Of course all such IO-APIC entries ought to have their mask bit
set - question is whether the system comes out of boot with one
(perhaps indeed an ExtInt one) not masked.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] HPET interrupt remapping during boot

2019-10-08 Thread Andrew Cooper
Hello,

I have no idea if this is a regression or not.  I suspect it might not
be, and has always been broken.

Either way, I'm seeing occasional single interrupt remapping errors when
booting a range of Intel systems

(XEN) x2APIC mode is already enabled by BIOS.
(XEN) Using APIC driver x2apic_cluster
...
(XEN) Platform timer is 23.999MHz HPET
(XEN) Detected 2194.922 MHz processor.
...
(XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
(XEN) alt table 82d08047a070 -> 82d080486c6c
(XEN) [VT-D]INTR-REMAP: Request device [:f0:1f.0] fault index 0,
iommu reg = 82c00072d000
(XEN) [VT-D]INTR-REMAP: reason 22 - Present field in the IRTE entry is clear
(XEN) microcode: CPU2 updated from revision 0x521 to 0x52b, date
= 2019-08-12

From other debugging, I know that this happens after CPU 1 (which is a
hyperthread) has passed through start_secondary().

f0:1f.0 is one of the IO-APICs, and if I've cross referenced the DMAR
and APIC tables properly, is the IO-APIC on the PCH, making the
problematic IRQ GSI0.

This suggests that we have an error setting up the timer IRQ (as the
HPET isn't MSI-capable), but we have already allegedly used it
successfully earlier on boot.

I haven't investigated further yet, but it is an intermittent issue
(i.e. doesn't reproduce on each boot).  My gut feeling is that we have
something which corrects itself as a side effect of a later action.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel