Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-03-29 Thread Morton, Eric
+Frank G.

Eric
 
By reading this e-mail, you are consenting to agree with the opinions disclosed 
within.
On 3/29/18, 3:29 AM, "Paul Menzel"  wrote:

Dear Yazen, Eric, Tom,


On 02/26/18 17:42, Tom Lendacky wrote:
> On 2/26/2018 10:37 AM, Morton, Eric wrote:

>> Yazen dug out PLAT-21393 as sounding like this issue. I haven't had
>> a chance to digest it.
> 
> Yes, internally to AMD, that was the bug that tracked the issue I
> was referring to.

If you could point me to more details how to fix this in AGESA, I could 
probably fix the version in coreboot to fix the issue.


Kind regards,

Paul





Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-03-29 Thread Morton, Eric
+Frank G.

Eric
 
By reading this e-mail, you are consenting to agree with the opinions disclosed 
within.
On 3/29/18, 3:29 AM, "Paul Menzel"  wrote:

Dear Yazen, Eric, Tom,


On 02/26/18 17:42, Tom Lendacky wrote:
> On 2/26/2018 10:37 AM, Morton, Eric wrote:

>> Yazen dug out PLAT-21393 as sounding like this issue. I haven't had
>> a chance to digest it.
> 
> Yes, internally to AMD, that was the bug that tracked the issue I
> was referring to.

If you could point me to more details how to fix this in AGESA, I could 
probably fix the version in coreboot to fix the issue.


Kind regards,

Paul





Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-03-29 Thread Paul Menzel

Dear Yazen, Eric, Tom,


On 02/26/18 17:42, Tom Lendacky wrote:

On 2/26/2018 10:37 AM, Morton, Eric wrote:



Yazen dug out PLAT-21393 as sounding like this issue. I haven't had
a chance to digest it.


Yes, internally to AMD, that was the bug that tracked the issue I
was referring to.


If you could point me to more details how to fix this in AGESA, I could 
probably fix the version in coreboot to fix the issue.



Kind regards,

Paul



smime.p7s
Description: S/MIME Cryptographic Signature


Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-03-29 Thread Paul Menzel

Dear Yazen, Eric, Tom,


On 02/26/18 17:42, Tom Lendacky wrote:

On 2/26/2018 10:37 AM, Morton, Eric wrote:



Yazen dug out PLAT-21393 as sounding like this issue. I haven't had
a chance to digest it.


Yes, internally to AMD, that was the bug that tracked the issue I
was referring to.


If you could point me to more details how to fix this in AGESA, I could 
probably fix the version in coreboot to fix the issue.



Kind regards,

Paul



smime.p7s
Description: S/MIME Cryptographic Signature


Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-03-29 Thread Kai Heng Feng

Hi Tom & Thomas,


On Feb 27, 2018, at 12:42 AM, Tom Lendacky  wrote:

On 2/26/2018 10:37 AM, Morton, Eric wrote:

Thomas,

Yazen dug out PLAT-21393 as sounding like this issue. I haven't had a  
chance to digest it.


Yes, internally to AMD, that was the bug that tracked the issue I was
referring to.


There's also another user [1] affected by this issue.

Ethernet r8169 failed to work after system suspend:

[ 150.944101] do_IRQ: 3.37 No irq handler for vector
[ 150.944105] r8169 :01:00.0 enp1s0: link down

It's a regression started from v4.15-rc5.

[1] https://bugs.launchpad.net/bugs/1752772

Kai-Heng



Thanks,
Tom


Eric

On 2/26/18, 10:31 AM, "Borislav Petkov"  wrote:

On Mon, Feb 26, 2018 at 10:14:10AM -0600, Tom Lendacky wrote:

On 2/24/2018 2:59 AM, Thomas Gleixner wrote:

On Sat, 24 Feb 2018, Paul Menzel wrote:

Am 23.02.2018 um 20:09 schrieb Borislav Petkov:

On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
Borislav is seeing similar issues on larger AMD machines. The  
interrupt
seems to come from BIOS/microcode during bringup of secondary CPUs  
and we

have no idea why.


Paul, can you boot 4.14 and grep your dmesg for something like:

[0.00] spurious 8259A interrupt: IRQ7. >
?


No, I do not see that. Please find the logs attached.


From your 4.14 log:

Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000  
soft=e9b0c000

Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.


I think I remember seeing something like this previously and it turned  
out

to be a BIOS bug.  All the AP's were enabled to work with the legacy 8259
interrupt controller.  In an SMP system, only one processor in the system
should be configured to handle legacy 8259 interrupts (ExtINT delivery
mode - see Intel's SDM, Volume 3, section 10.5.1, Delivery Mode).  Once
the BIOS was fixed, the spurious interrupt message went away.

I believe at some point during UEFI, the APs were exposed to an ExtINT
interrupt.  Since they were configured to handle ExtINT delivery mode and
interrupts were not yet enabled, the interrupt was left pending.  When  
the

APs were started by the OS and interrupts were enabled, the interrupt
triggered. Since the original pending interrupt was handled by the BSP,
there was no longer an interrupt actually pending, so the 8259 responds
with IRQ 7 when queried by the OS.  This occurred for each AP.


Interesting - is this something that can happen on Zen too?

Because I have such reports too.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-03-29 Thread Kai Heng Feng

Hi Tom & Thomas,


On Feb 27, 2018, at 12:42 AM, Tom Lendacky  wrote:

On 2/26/2018 10:37 AM, Morton, Eric wrote:

Thomas,

Yazen dug out PLAT-21393 as sounding like this issue. I haven't had a  
chance to digest it.


Yes, internally to AMD, that was the bug that tracked the issue I was
referring to.


There's also another user [1] affected by this issue.

Ethernet r8169 failed to work after system suspend:

[ 150.944101] do_IRQ: 3.37 No irq handler for vector
[ 150.944105] r8169 :01:00.0 enp1s0: link down

It's a regression started from v4.15-rc5.

[1] https://bugs.launchpad.net/bugs/1752772

Kai-Heng



Thanks,
Tom


Eric

On 2/26/18, 10:31 AM, "Borislav Petkov"  wrote:

On Mon, Feb 26, 2018 at 10:14:10AM -0600, Tom Lendacky wrote:

On 2/24/2018 2:59 AM, Thomas Gleixner wrote:

On Sat, 24 Feb 2018, Paul Menzel wrote:

Am 23.02.2018 um 20:09 schrieb Borislav Petkov:

On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
Borislav is seeing similar issues on larger AMD machines. The  
interrupt
seems to come from BIOS/microcode during bringup of secondary CPUs  
and we

have no idea why.


Paul, can you boot 4.14 and grep your dmesg for something like:

[0.00] spurious 8259A interrupt: IRQ7. >
?


No, I do not see that. Please find the logs attached.


From your 4.14 log:

Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000  
soft=e9b0c000

Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.


I think I remember seeing something like this previously and it turned  
out

to be a BIOS bug.  All the AP's were enabled to work with the legacy 8259
interrupt controller.  In an SMP system, only one processor in the system
should be configured to handle legacy 8259 interrupts (ExtINT delivery
mode - see Intel's SDM, Volume 3, section 10.5.1, Delivery Mode).  Once
the BIOS was fixed, the spurious interrupt message went away.

I believe at some point during UEFI, the APs were exposed to an ExtINT
interrupt.  Since they were configured to handle ExtINT delivery mode and
interrupts were not yet enabled, the interrupt was left pending.  When  
the

APs were started by the OS and interrupts were enabled, the interrupt
triggered. Since the original pending interrupt was handled by the BSP,
there was no longer an interrupt actually pending, so the 8259 responds
with IRQ 7 when queried by the OS.  This occurred for each AP.


Interesting - is this something that can happen on Zen too?

Because I have such reports too.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-02-26 Thread Tom Lendacky
On 2/26/2018 10:37 AM, Morton, Eric wrote:
> Thomas,
> 
> Yazen dug out PLAT-21393 as sounding like this issue. I haven't had a chance 
> to digest it.

Yes, internally to AMD, that was the bug that tracked the issue I was
referring to.

Thanks,
Tom

> 
> Eric
> 
> On 2/26/18, 10:31 AM, "Borislav Petkov"  wrote:
> 
> On Mon, Feb 26, 2018 at 10:14:10AM -0600, Tom Lendacky wrote:
> > On 2/24/2018 2:59 AM, Thomas Gleixner wrote:
> > > On Sat, 24 Feb 2018, Paul Menzel wrote:
> > >> Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
> > >>> On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
> >  Borislav is seeing similar issues on larger AMD machines. The 
> interrupt
> >  seems to come from BIOS/microcode during bringup of secondary CPUs 
> and we
> >  have no idea why.
> > >>>
> > >>> Paul, can you boot 4.14 and grep your dmesg for something like:
> > >>>
> > >>> [0.00] spurious 8259A interrupt: IRQ7. >
> > >>> ?
> > >>
> > >> No, I do not see that. Please find the logs attached.
> > > 
> > > From your 4.14 log:
> > > 
> > > Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000 
> soft=e9b0c000
> > > Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.
> > 
> > I think I remember seeing something like this previously and it turned 
> out
> > to be a BIOS bug.  All the AP's were enabled to work with the legacy 
> 8259
> > interrupt controller.  In an SMP system, only one processor in the 
> system
> > should be configured to handle legacy 8259 interrupts (ExtINT delivery
> > mode - see Intel's SDM, Volume 3, section 10.5.1, Delivery Mode).  Once
> > the BIOS was fixed, the spurious interrupt message went away.
> > 
> > I believe at some point during UEFI, the APs were exposed to an ExtINT
> > interrupt.  Since they were configured to handle ExtINT delivery mode 
> and
> > interrupts were not yet enabled, the interrupt was left pending.  When 
> the
> > APs were started by the OS and interrupts were enabled, the interrupt
> > triggered. Since the original pending interrupt was handled by the BSP,
> > there was no longer an interrupt actually pending, so the 8259 responds
> > with IRQ 7 when queried by the OS.  This occurred for each AP.
> 
> Interesting - is this something that can happen on Zen too?
> 
> Because I have such reports too.
> 
> -- 
> Regards/Gruss,
> Boris.
> 
> Good mailing practices for 400: avoid top-posting and trim the reply.
> 
> 


Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-02-26 Thread Tom Lendacky
On 2/26/2018 10:37 AM, Morton, Eric wrote:
> Thomas,
> 
> Yazen dug out PLAT-21393 as sounding like this issue. I haven't had a chance 
> to digest it.

Yes, internally to AMD, that was the bug that tracked the issue I was
referring to.

Thanks,
Tom

> 
> Eric
> 
> On 2/26/18, 10:31 AM, "Borislav Petkov"  wrote:
> 
> On Mon, Feb 26, 2018 at 10:14:10AM -0600, Tom Lendacky wrote:
> > On 2/24/2018 2:59 AM, Thomas Gleixner wrote:
> > > On Sat, 24 Feb 2018, Paul Menzel wrote:
> > >> Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
> > >>> On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
> >  Borislav is seeing similar issues on larger AMD machines. The 
> interrupt
> >  seems to come from BIOS/microcode during bringup of secondary CPUs 
> and we
> >  have no idea why.
> > >>>
> > >>> Paul, can you boot 4.14 and grep your dmesg for something like:
> > >>>
> > >>> [0.00] spurious 8259A interrupt: IRQ7. >
> > >>> ?
> > >>
> > >> No, I do not see that. Please find the logs attached.
> > > 
> > > From your 4.14 log:
> > > 
> > > Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000 
> soft=e9b0c000
> > > Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.
> > 
> > I think I remember seeing something like this previously and it turned 
> out
> > to be a BIOS bug.  All the AP's were enabled to work with the legacy 
> 8259
> > interrupt controller.  In an SMP system, only one processor in the 
> system
> > should be configured to handle legacy 8259 interrupts (ExtINT delivery
> > mode - see Intel's SDM, Volume 3, section 10.5.1, Delivery Mode).  Once
> > the BIOS was fixed, the spurious interrupt message went away.
> > 
> > I believe at some point during UEFI, the APs were exposed to an ExtINT
> > interrupt.  Since they were configured to handle ExtINT delivery mode 
> and
> > interrupts were not yet enabled, the interrupt was left pending.  When 
> the
> > APs were started by the OS and interrupts were enabled, the interrupt
> > triggered. Since the original pending interrupt was handled by the BSP,
> > there was no longer an interrupt actually pending, so the 8259 responds
> > with IRQ 7 when queried by the OS.  This occurred for each AP.
> 
> Interesting - is this something that can happen on Zen too?
> 
> Because I have such reports too.
> 
> -- 
> Regards/Gruss,
> Boris.
> 
> Good mailing practices for 400: avoid top-posting and trim the reply.
> 
> 


Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-02-26 Thread Tom Lendacky
On 2/26/2018 10:31 AM, Borislav Petkov wrote:
> On Mon, Feb 26, 2018 at 10:14:10AM -0600, Tom Lendacky wrote:
>> On 2/24/2018 2:59 AM, Thomas Gleixner wrote:
>>> On Sat, 24 Feb 2018, Paul Menzel wrote:
 Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
> On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
>> Borislav is seeing similar issues on larger AMD machines. The interrupt
>> seems to come from BIOS/microcode during bringup of secondary CPUs and we
>> have no idea why.
>
> Paul, can you boot 4.14 and grep your dmesg for something like:
>
> [0.00] spurious 8259A interrupt: IRQ7. >
> ?

 No, I do not see that. Please find the logs attached.
>>>
>>> From your 4.14 log:
>>>
>>> Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000 
>>> soft=e9b0c000
>>> Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.
>>
>> I think I remember seeing something like this previously and it turned out
>> to be a BIOS bug.  All the AP's were enabled to work with the legacy 8259
>> interrupt controller.  In an SMP system, only one processor in the system
>> should be configured to handle legacy 8259 interrupts (ExtINT delivery
>> mode - see Intel's SDM, Volume 3, section 10.5.1, Delivery Mode).  Once
>> the BIOS was fixed, the spurious interrupt message went away.
>>
>> I believe at some point during UEFI, the APs were exposed to an ExtINT
>> interrupt.  Since they were configured to handle ExtINT delivery mode and
>> interrupts were not yet enabled, the interrupt was left pending.  When the
>> APs were started by the OS and interrupts were enabled, the interrupt
>> triggered. Since the original pending interrupt was handled by the BSP,
>> there was no longer an interrupt actually pending, so the 8259 responds
>> with IRQ 7 when queried by the OS.  This occurred for each AP.
> 
> Interesting - is this something that can happen on Zen too?

Yes, that's where I remember seeing it.

Thanks,
Tom

> 
> Because I have such reports too.
> 


Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-02-26 Thread Tom Lendacky
On 2/26/2018 10:31 AM, Borislav Petkov wrote:
> On Mon, Feb 26, 2018 at 10:14:10AM -0600, Tom Lendacky wrote:
>> On 2/24/2018 2:59 AM, Thomas Gleixner wrote:
>>> On Sat, 24 Feb 2018, Paul Menzel wrote:
 Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
> On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
>> Borislav is seeing similar issues on larger AMD machines. The interrupt
>> seems to come from BIOS/microcode during bringup of secondary CPUs and we
>> have no idea why.
>
> Paul, can you boot 4.14 and grep your dmesg for something like:
>
> [0.00] spurious 8259A interrupt: IRQ7. >
> ?

 No, I do not see that. Please find the logs attached.
>>>
>>> From your 4.14 log:
>>>
>>> Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000 
>>> soft=e9b0c000
>>> Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.
>>
>> I think I remember seeing something like this previously and it turned out
>> to be a BIOS bug.  All the AP's were enabled to work with the legacy 8259
>> interrupt controller.  In an SMP system, only one processor in the system
>> should be configured to handle legacy 8259 interrupts (ExtINT delivery
>> mode - see Intel's SDM, Volume 3, section 10.5.1, Delivery Mode).  Once
>> the BIOS was fixed, the spurious interrupt message went away.
>>
>> I believe at some point during UEFI, the APs were exposed to an ExtINT
>> interrupt.  Since they were configured to handle ExtINT delivery mode and
>> interrupts were not yet enabled, the interrupt was left pending.  When the
>> APs were started by the OS and interrupts were enabled, the interrupt
>> triggered. Since the original pending interrupt was handled by the BSP,
>> there was no longer an interrupt actually pending, so the 8259 responds
>> with IRQ 7 when queried by the OS.  This occurred for each AP.
> 
> Interesting - is this something that can happen on Zen too?

Yes, that's where I remember seeing it.

Thanks,
Tom

> 
> Because I have such reports too.
> 


Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-02-26 Thread Morton, Eric
Thomas,

Yazen dug out PLAT-21393 as sounding like this issue. I haven't had a chance to 
digest it.

Eric

On 2/26/18, 10:31 AM, "Borislav Petkov"  wrote:

On Mon, Feb 26, 2018 at 10:14:10AM -0600, Tom Lendacky wrote:
> On 2/24/2018 2:59 AM, Thomas Gleixner wrote:
> > On Sat, 24 Feb 2018, Paul Menzel wrote:
> >> Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
> >>> On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
>  Borislav is seeing similar issues on larger AMD machines. The 
interrupt
>  seems to come from BIOS/microcode during bringup of secondary CPUs 
and we
>  have no idea why.
> >>>
> >>> Paul, can you boot 4.14 and grep your dmesg for something like:
> >>>
> >>> [0.00] spurious 8259A interrupt: IRQ7. >
> >>> ?
> >>
> >> No, I do not see that. Please find the logs attached.
> > 
> > From your 4.14 log:
> > 
> > Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000 
soft=e9b0c000
> > Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.
> 
> I think I remember seeing something like this previously and it turned out
> to be a BIOS bug.  All the AP's were enabled to work with the legacy 8259
> interrupt controller.  In an SMP system, only one processor in the system
> should be configured to handle legacy 8259 interrupts (ExtINT delivery
> mode - see Intel's SDM, Volume 3, section 10.5.1, Delivery Mode).  Once
> the BIOS was fixed, the spurious interrupt message went away.
> 
> I believe at some point during UEFI, the APs were exposed to an ExtINT
> interrupt.  Since they were configured to handle ExtINT delivery mode and
> interrupts were not yet enabled, the interrupt was left pending.  When the
> APs were started by the OS and interrupts were enabled, the interrupt
> triggered. Since the original pending interrupt was handled by the BSP,
> there was no longer an interrupt actually pending, so the 8259 responds
> with IRQ 7 when queried by the OS.  This occurred for each AP.

Interesting - is this something that can happen on Zen too?

Because I have such reports too.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.




Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-02-26 Thread Morton, Eric
Thomas,

Yazen dug out PLAT-21393 as sounding like this issue. I haven't had a chance to 
digest it.

Eric

On 2/26/18, 10:31 AM, "Borislav Petkov"  wrote:

On Mon, Feb 26, 2018 at 10:14:10AM -0600, Tom Lendacky wrote:
> On 2/24/2018 2:59 AM, Thomas Gleixner wrote:
> > On Sat, 24 Feb 2018, Paul Menzel wrote:
> >> Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
> >>> On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
>  Borislav is seeing similar issues on larger AMD machines. The 
interrupt
>  seems to come from BIOS/microcode during bringup of secondary CPUs 
and we
>  have no idea why.
> >>>
> >>> Paul, can you boot 4.14 and grep your dmesg for something like:
> >>>
> >>> [0.00] spurious 8259A interrupt: IRQ7. >
> >>> ?
> >>
> >> No, I do not see that. Please find the logs attached.
> > 
> > From your 4.14 log:
> > 
> > Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000 
soft=e9b0c000
> > Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.
> 
> I think I remember seeing something like this previously and it turned out
> to be a BIOS bug.  All the AP's were enabled to work with the legacy 8259
> interrupt controller.  In an SMP system, only one processor in the system
> should be configured to handle legacy 8259 interrupts (ExtINT delivery
> mode - see Intel's SDM, Volume 3, section 10.5.1, Delivery Mode).  Once
> the BIOS was fixed, the spurious interrupt message went away.
> 
> I believe at some point during UEFI, the APs were exposed to an ExtINT
> interrupt.  Since they were configured to handle ExtINT delivery mode and
> interrupts were not yet enabled, the interrupt was left pending.  When the
> APs were started by the OS and interrupts were enabled, the interrupt
> triggered. Since the original pending interrupt was handled by the BSP,
> there was no longer an interrupt actually pending, so the 8259 responds
> with IRQ 7 when queried by the OS.  This occurred for each AP.

Interesting - is this something that can happen on Zen too?

Because I have such reports too.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.




Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-02-26 Thread Borislav Petkov
On Mon, Feb 26, 2018 at 10:14:10AM -0600, Tom Lendacky wrote:
> On 2/24/2018 2:59 AM, Thomas Gleixner wrote:
> > On Sat, 24 Feb 2018, Paul Menzel wrote:
> >> Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
> >>> On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
>  Borislav is seeing similar issues on larger AMD machines. The interrupt
>  seems to come from BIOS/microcode during bringup of secondary CPUs and we
>  have no idea why.
> >>>
> >>> Paul, can you boot 4.14 and grep your dmesg for something like:
> >>>
> >>> [0.00] spurious 8259A interrupt: IRQ7. >
> >>> ?
> >>
> >> No, I do not see that. Please find the logs attached.
> > 
> > From your 4.14 log:
> > 
> > Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000 
> > soft=e9b0c000
> > Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.
> 
> I think I remember seeing something like this previously and it turned out
> to be a BIOS bug.  All the AP's were enabled to work with the legacy 8259
> interrupt controller.  In an SMP system, only one processor in the system
> should be configured to handle legacy 8259 interrupts (ExtINT delivery
> mode - see Intel's SDM, Volume 3, section 10.5.1, Delivery Mode).  Once
> the BIOS was fixed, the spurious interrupt message went away.
> 
> I believe at some point during UEFI, the APs were exposed to an ExtINT
> interrupt.  Since they were configured to handle ExtINT delivery mode and
> interrupts were not yet enabled, the interrupt was left pending.  When the
> APs were started by the OS and interrupts were enabled, the interrupt
> triggered. Since the original pending interrupt was handled by the BSP,
> there was no longer an interrupt actually pending, so the 8259 responds
> with IRQ 7 when queried by the OS.  This occurred for each AP.

Interesting - is this something that can happen on Zen too?

Because I have such reports too.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-02-26 Thread Borislav Petkov
On Mon, Feb 26, 2018 at 10:14:10AM -0600, Tom Lendacky wrote:
> On 2/24/2018 2:59 AM, Thomas Gleixner wrote:
> > On Sat, 24 Feb 2018, Paul Menzel wrote:
> >> Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
> >>> On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
>  Borislav is seeing similar issues on larger AMD machines. The interrupt
>  seems to come from BIOS/microcode during bringup of secondary CPUs and we
>  have no idea why.
> >>>
> >>> Paul, can you boot 4.14 and grep your dmesg for something like:
> >>>
> >>> [0.00] spurious 8259A interrupt: IRQ7. >
> >>> ?
> >>
> >> No, I do not see that. Please find the logs attached.
> > 
> > From your 4.14 log:
> > 
> > Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000 
> > soft=e9b0c000
> > Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.
> 
> I think I remember seeing something like this previously and it turned out
> to be a BIOS bug.  All the AP's were enabled to work with the legacy 8259
> interrupt controller.  In an SMP system, only one processor in the system
> should be configured to handle legacy 8259 interrupts (ExtINT delivery
> mode - see Intel's SDM, Volume 3, section 10.5.1, Delivery Mode).  Once
> the BIOS was fixed, the spurious interrupt message went away.
> 
> I believe at some point during UEFI, the APs were exposed to an ExtINT
> interrupt.  Since they were configured to handle ExtINT delivery mode and
> interrupts were not yet enabled, the interrupt was left pending.  When the
> APs were started by the OS and interrupts were enabled, the interrupt
> triggered. Since the original pending interrupt was handled by the BSP,
> there was no longer an interrupt actually pending, so the 8259 responds
> with IRQ 7 when queried by the OS.  This occurred for each AP.

Interesting - is this something that can happen on Zen too?

Because I have such reports too.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-02-26 Thread Tom Lendacky
On 2/24/2018 2:59 AM, Thomas Gleixner wrote:
> On Sat, 24 Feb 2018, Paul Menzel wrote:
>> Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
>>> On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
 Borislav is seeing similar issues on larger AMD machines. The interrupt
 seems to come from BIOS/microcode during bringup of secondary CPUs and we
 have no idea why.
>>>
>>> Paul, can you boot 4.14 and grep your dmesg for something like:
>>>
>>> [0.00] spurious 8259A interrupt: IRQ7. >
>>> ?
>>
>> No, I do not see that. Please find the logs attached.
> 
> From your 4.14 log:
> 
> Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000 
> soft=e9b0c000
> Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.

I think I remember seeing something like this previously and it turned out
to be a BIOS bug.  All the AP's were enabled to work with the legacy 8259
interrupt controller.  In an SMP system, only one processor in the system
should be configured to handle legacy 8259 interrupts (ExtINT delivery
mode - see Intel's SDM, Volume 3, section 10.5.1, Delivery Mode).  Once
the BIOS was fixed, the spurious interrupt message went away.

I believe at some point during UEFI, the APs were exposed to an ExtINT
interrupt.  Since they were configured to handle ExtINT delivery mode and
interrupts were not yet enabled, the interrupt was left pending.  When the
APs were started by the OS and interrupts were enabled, the interrupt
triggered. Since the original pending interrupt was handled by the BSP,
there was no longer an interrupt actually pending, so the 8259 responds
with IRQ 7 when queried by the OS.  This occurred for each AP.

Thanks,
Tom

> Feb 19 09:48:06.843258 kodi kernel: Console: colour VGA+ 80x25
> 
> So this has been there in 4.14 already just the detection mechanism has
> changed due to the modifications of the interrupt vector management code.
> It's not a real issue, just annoying 
> 
> Thanks,
> 
>   tglx
> 


Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-02-26 Thread Tom Lendacky
On 2/24/2018 2:59 AM, Thomas Gleixner wrote:
> On Sat, 24 Feb 2018, Paul Menzel wrote:
>> Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
>>> On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
 Borislav is seeing similar issues on larger AMD machines. The interrupt
 seems to come from BIOS/microcode during bringup of secondary CPUs and we
 have no idea why.
>>>
>>> Paul, can you boot 4.14 and grep your dmesg for something like:
>>>
>>> [0.00] spurious 8259A interrupt: IRQ7. >
>>> ?
>>
>> No, I do not see that. Please find the logs attached.
> 
> From your 4.14 log:
> 
> Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000 
> soft=e9b0c000
> Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.

I think I remember seeing something like this previously and it turned out
to be a BIOS bug.  All the AP's were enabled to work with the legacy 8259
interrupt controller.  In an SMP system, only one processor in the system
should be configured to handle legacy 8259 interrupts (ExtINT delivery
mode - see Intel's SDM, Volume 3, section 10.5.1, Delivery Mode).  Once
the BIOS was fixed, the spurious interrupt message went away.

I believe at some point during UEFI, the APs were exposed to an ExtINT
interrupt.  Since they were configured to handle ExtINT delivery mode and
interrupts were not yet enabled, the interrupt was left pending.  When the
APs were started by the OS and interrupts were enabled, the interrupt
triggered. Since the original pending interrupt was handled by the BSP,
there was no longer an interrupt actually pending, so the 8259 responds
with IRQ 7 when queried by the OS.  This occurred for each AP.

Thanks,
Tom

> Feb 19 09:48:06.843258 kodi kernel: Console: colour VGA+ 80x25
> 
> So this has been there in 4.14 already just the detection mechanism has
> changed due to the modifications of the interrupt vector management code.
> It's not a real issue, just annoying 
> 
> Thanks,
> 
>   tglx
> 


Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-02-24 Thread Thomas Gleixner
On Sat, 24 Feb 2018, Paul Menzel wrote:
> Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
> > On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
> > > Borislav is seeing similar issues on larger AMD machines. The interrupt
> > > seems to come from BIOS/microcode during bringup of secondary CPUs and we
> > > have no idea why.
> > 
> > Paul, can you boot 4.14 and grep your dmesg for something like:
> > 
> > [0.00] spurious 8259A interrupt: IRQ7. >
> > ?
> 
> No, I do not see that. Please find the logs attached.

>From your 4.14 log:

Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000 soft=e9b0c000
Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.
Feb 19 09:48:06.843258 kodi kernel: Console: colour VGA+ 80x25

So this has been there in 4.14 already just the detection mechanism has
changed due to the modifications of the interrupt vector management code.
It's not a real issue, just annoying 

Thanks,

tglx


Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-02-24 Thread Thomas Gleixner
On Sat, 24 Feb 2018, Paul Menzel wrote:
> Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
> > On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
> > > Borislav is seeing similar issues on larger AMD machines. The interrupt
> > > seems to come from BIOS/microcode during bringup of secondary CPUs and we
> > > have no idea why.
> > 
> > Paul, can you boot 4.14 and grep your dmesg for something like:
> > 
> > [0.00] spurious 8259A interrupt: IRQ7. >
> > ?
> 
> No, I do not see that. Please find the logs attached.

>From your 4.14 log:

Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000 soft=e9b0c000
Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.
Feb 19 09:48:06.843258 kodi kernel: Console: colour VGA+ 80x25

So this has been there in 4.14 already just the detection mechanism has
changed due to the modifications of the interrupt vector management code.
It's not a real issue, just annoying 

Thanks,

tglx


Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-02-23 Thread Borislav Petkov
On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
> Borislav is seeing similar issues on larger AMD machines. The interrupt
> seems to come from BIOS/microcode during bringup of secondary CPUs and we
> have no idea why.

Paul, can you boot 4.14 and grep your dmesg for something like:

[0.00] spurious 8259A interrupt: IRQ7.

?

Thx.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-02-23 Thread Borislav Petkov
On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
> Borislav is seeing similar issues on larger AMD machines. The interrupt
> seems to come from BIOS/microcode during bringup of secondary CPUs and we
> have no idea why.

Paul, can you boot 4.14 and grep your dmesg for something like:

[0.00] spurious 8259A interrupt: IRQ7.

?

Thx.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-02-23 Thread Thomas Gleixner
Paul,

On Fri, 23 Feb 2018, Paul Menzel wrote:
> 
> On the ASRock E350M1 (AMD A50M), since Linux 4.15 I get the message below.
> 
> do_IRQ: 1.55 No irq handler for vector

Thanks for the report, but the single line of dmesg w/o any context is not
really helpful. Can you please provide a larger piece of dmesg which shows
in which context this happens.

I assume that this happens during early boot right when CPU1 is brought up,
right?

Borislav is seeing similar issues on larger AMD machines. The interrupt
seems to come from BIOS/microcode during bringup of secondary CPUs and we
have no idea why.

Thanks,

tglx



Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

2018-02-23 Thread Thomas Gleixner
Paul,

On Fri, 23 Feb 2018, Paul Menzel wrote:
> 
> On the ASRock E350M1 (AMD A50M), since Linux 4.15 I get the message below.
> 
> do_IRQ: 1.55 No irq handler for vector

Thanks for the report, but the single line of dmesg w/o any context is not
really helpful. Can you please provide a larger piece of dmesg which shows
in which context this happens.

I assume that this happens during early boot right when CPU1 is brought up,
right?

Borislav is seeing similar issues on larger AMD machines. The interrupt
seems to come from BIOS/microcode during bringup of secondary CPUs and we
have no idea why.

Thanks,

tglx