Re: [PATCH 0/5] iommu/amd: fixes for suspend/resume

2022-01-27 Thread Mike Lothian
On Thu, 27 Jan 2022 at 10:22, Maxim Levitsky  wrote:
>
> On Thu, 2022-01-27 at 00:39 +, Mike Lothian wrote:
> > On Wed, 26 Jan 2022 at 10:12, Maxim Levitsky  wrote:
> > > Great, your system does seem to support GA log
> > > (but a patch to check if, other that assume blindly that it is supported 
> > > is
> > > something that should be done).
> > >
> > > So could you bump the LOOP_TIMEOUT like by 10x or so and see if the 
> > > problem goes away?
> > >
> > > (that code should be rewritten to time based wait and not just blindly 
> > > loop like that,
> > > I also can prepare a patch for that as well).
> > >
> > > Best regards,
> > > Maxim Levitsky
> > >
> >
> > Hi
> >
> > I've done quite a few restarts with the LOOP_TIMEOUT increased and
> > I've not seen the issue since
>
> Great, so the problem is solved I guess.
> Thanks for the help!
>
>
> I'll send a patch for this in few days to replace this and other similiar 
> timeouts
> with a proper udelay() wait.
>

Thanks for your help
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/5] iommu/amd: fixes for suspend/resume

2022-01-27 Thread Maxim Levitsky
On Thu, 2022-01-27 at 00:39 +0000, Mike Lothian wrote:
> On Wed, 26 Jan 2022 at 10:12, Maxim Levitsky  wrote:
> > Great, your system does seem to support GA log
> > (but a patch to check if, other that assume blindly that it is supported is
> > something that should be done).
> > 
> > So could you bump the LOOP_TIMEOUT like by 10x or so and see if the problem 
> > goes away?
> > 
> > (that code should be rewritten to time based wait and not just blindly loop 
> > like that,
> > I also can prepare a patch for that as well).
> > 
> > Best regards,
> > Maxim Levitsky
> > 
> 
> Hi
> 
> I've done quite a few restarts with the LOOP_TIMEOUT increased and
> I've not seen the issue since

Great, so the problem is solved I guess. 
Thanks for the help!


I'll send a patch for this in a few days to replace this and other similar
timeouts
with a proper udelay() wait.
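
As a rough illustration of the difference between counting loop iterations
and waiting against a clock, here is a minimal userspace sketch in C. It is
not the driver code: wait_for_bits(), read_status_fn, fake_dev and the 0x100
status bit are hypothetical names and values made up for this example, and in
the kernel the delay would come from udelay() or a similar helper rather than
clock_gettime().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for reading a device status register. */
typedef uint32_t (*read_status_fn)(void *ctx);

static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Poll until any bit in 'mask' is set or 'timeout_us' microseconds have
 * passed. The bound is wall-clock time, so it does not depend on how fast
 * the CPU happens to spin through the loop.
 */
static bool wait_for_bits(read_status_fn read_status, void *ctx,
                          uint32_t mask, uint64_t timeout_us)
{
    const uint64_t deadline = now_ns() + timeout_us * 1000ull;

    for (;;) {
        if (read_status(ctx) & mask)
            return true;
        if (now_ns() >= deadline)
            /* One last read so a late-but-successful device is not missed. */
            return (read_status(ctx) & mask) != 0;
    }
}

/* Fake device for the example: reports the bit after a number of reads. */
struct fake_dev {
    unsigned int reads_left;
};

static uint32_t fake_read(void *ctx)
{
    struct fake_dev *dev = ctx;

    if (dev->reads_left > 0) {
        dev->reads_left--;
        return 0;
    }
    return 0x100u; /* assumed "running" bit, for illustration only */
}
```

With a time bound like this, the same timeout value means the same thing on a
fast and on a slow machine, which is the point of moving away from a plain
iteration count.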

Best regards,
Maxim Levitsky

> 
> Cheers
> 
> Mike
> 




Re: [PATCH 0/5] iommu/amd: fixes for suspend/resume

2022-01-26 Thread Mike Lothian
On Wed, 26 Jan 2022 at 10:12, Maxim Levitsky  wrote:
>
> Great, your system does seem to support GA log
> (but a patch to check if, other that assume blindly that it is supported is
> something that should be done).
>
> So could you bump the LOOP_TIMEOUT like by 10x or so and see if the problem 
> goes away?
>
> (that code should be rewritten to time based wait and not just blindly loop 
> like that,
> I also can prepare a patch for that as well).
>
> Best regards,
> Maxim Levitsky
>

Hi

I've done quite a few restarts with the LOOP_TIMEOUT increased and
I've not seen the issue since

Cheers

Mike


Re: [PATCH 0/5] iommu/amd: fixes for suspend/resume

2022-01-26 Thread Maxim Levitsky
On Wed, 2022-01-26 at 09:54 +0000, Mike Lothian wrote:
> On Wed, 26 Jan 2022 at 07:34, Maxim Levitsky  wrote:
> > Could you post the whole dmesg, or at least:
> > 
> > dmesg | grep AMD-Vi
> > 
> > 
> > What CPU does your system have?
> > 
> > I suspect that your system doesn't GA log feature enabled in the IOMMU, and 
> > the code never checks
> > for that, and here it fails enabling it, which  before my patches was just
> > ignoring it silently.
> > 
> > 
> > Best regards,
> > Maxim Levitsky
> > > Hope that helps
> > > 
> > > Mike
> > > 
> 
> Hi
> 
> It's an AMD Ryzen 9 5900HX
> 
> [0.186350] AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR0, rdevid:160
> [0.186353] AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR1, rdevid:160
> [0.186354] AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR2, rdevid:160
> [0.186355] AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR3, rdevid:160
> [0.355628] pci :00:00.2: AMD-Vi: IOMMU performance counters supported
> [0.356134] pci :00:00.2: AMD-Vi: Found IOMMU cap 0x40
> [0.356136] AMD-Vi: Extended features (0x206d73ef22254ade): PPR
> X2APIC NX GT IA GA PC GA_vAPIC
> [0.356140] AMD-Vi: Interrupt remapping enabled
> [0.356141] AMD-Vi: Virtual APIC enabled
> [0.356142] AMD-Vi: X2APIC enabled
> [0.431377] AMD-Vi: AMD IOMMUv2 loaded and initialized
> 
> I've attached the dmesg, I notice that some boots it doesn't happen
> 
> Cheers
> 
> Mike

Great, your system does seem to support GA log
(but a patch that checks for this, rather than blindly assuming that it is
supported, is something that should be done).
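
A check of this kind could look roughly like the sketch below, which tests
bits of the extended feature value printed in the dmesg. This is only an
illustration: the EFR_GA and EFR_GA_VAPIC names are made up here, and the bit
positions (7 and 21) are assumptions chosen to be consistent with the
"Extended features" line quoted above, not taken from the driver headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define EFR_GA        (1ULL << 7)   /* printed as "GA" in the dmesg line */
#define EFR_GA_VAPIC  (1ULL << 21)  /* printed as "GA_vAPIC" */

/* True if the extended feature value advertises the "GA" bit. */
static bool efr_has_ga(uint64_t efr)
{
    return (efr & EFR_GA) != 0;
}

/* True if the extended feature value advertises the "GA_vAPIC" bit. */
static bool efr_has_ga_vapic(uint64_t efr)
{
    return (efr & EFR_GA_VAPIC) != 0;
}
```

Fed the value from the log above (0x206d73ef22254ade), both helpers return
true, which matches the GA and GA_vAPIC entries in the printed feature list.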

So could you bump LOOP_TIMEOUT by 10x or so and see if the problem
goes away?

(That code should be rewritten to use a time-based wait rather than blindly
looping like that;
I can prepare a patch for that as well.)

Best regards,
Maxim Levitsky



Re: [PATCH 0/5] iommu/amd: fixes for suspend/resume

2022-01-26 Thread Mike Lothian
On Wed, 26 Jan 2022 at 07:34, Maxim Levitsky  wrote:
>
> Could you post the whole dmesg, or at least:
>
> dmesg | grep AMD-Vi
>
>
> What CPU does your system have?
>
> I suspect that your system doesn't GA log feature enabled in the IOMMU, and 
> the code never checks
> for that, and here it fails enabling it, which  before my patches was just
> ignoring it silently.
>
>
> Best regards,
> Maxim Levitsky
> >
> > Hope that helps
> >
> > Mike
> >

Hi

It's an AMD Ryzen 9 5900HX

[0.186350] AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR0, rdevid:160
[0.186353] AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR1, rdevid:160
[0.186354] AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR2, rdevid:160
[0.186355] AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR3, rdevid:160
[0.355628] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[0.356134] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[0.356136] AMD-Vi: Extended features (0x206d73ef22254ade): PPR
X2APIC NX GT IA GA PC GA_vAPIC
[0.356140] AMD-Vi: Interrupt remapping enabled
[0.356141] AMD-Vi: Virtual APIC enabled
[0.356142] AMD-Vi: X2APIC enabled
[0.431377] AMD-Vi: AMD IOMMUv2 loaded and initialized

I've attached the dmesg. I notice that on some boots it doesn't happen

Cheers

Mike


x2apic.dmesg
Description: Binary data

Re: [PATCH 0/5] iommu/amd: fixes for suspend/resume

2022-01-25 Thread Maxim Levitsky
On Tue, 2022-01-25 at 23:25 +0000, Mike Lothian wrote:
> On Tue, 25 Jan 2022 at 19:26, Maxim Levitsky  wrote:
> > Could you just apply these patches on top of 5.15 kernel and see if you get 
> > the warning?
> > 
> > If something could case it is I think patch 1, it does move the GA log 
> > enabled
> > to be a bit later.
> > I also added few warnings there. I wonder why your dmesg quote doesn't 
> > contain the C line
> > where the warning happens.
> > 
> > In partucular I added:
> > 
> > if (WARN_ON(status & (MMIO_STATUS_GALOG_RUN_MASK)))
> > 
> > That will fire if GA log is already running (maybe BIOS enabled it? - it 
> > really shouldn't do that)
> > 
> > 
> > And that:
> > 
> > if (WARN_ON(i >= LOOP_TIMEOUT))
> > 
> > also should not happen and worth to be logged IMHO.
> > 
> > Best regards,
> > Maxim Levitsky
> > 
> 
> Hi
> 
> I applied on top of another kernel as you asked, I also enabled some debugging
> 
> [0.398833] [ cut here ]
> [0.398835] WARNING: CPU: 0 PID: 1 at drivers/iommu/amd/init.c:839
> amd_iommu_enable_interrupts+0x1da/0x440
> [0.398840] Modules linked in:
> [0.398841] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.16.0-rc5-agd5f+ 
> #1388
> [0.398843] Hardware name: ASUSTeK COMPUTER INC. ROG Strix
> G513QY_G513QY/G513QY, BIOS G513QY.316 11/29/2021
> [0.398845] RIP: 0010:amd_iommu_enable_interrupts+0x1da/0x440
> [0.398847] Code: 4b 38 48 89 41 18 b8 a0 86 01 00 0f 1f 44 00 00
> 48 8b 4b 38 8b 89 20 20 00 00 f7 c1 00 01 00 00 0f 85 7a fe ff ff ff
> c8 75 e6 <0f> 0b e9 6f fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44
> 00 00
> [0.398850] RSP: 0018:888100927c68 EFLAGS: 00010246
> [0.398851] RAX:  RBX: 88810004b000 RCX: 
> 0018
> [0.398853] RDX: 0008 RSI: 888100927c70 RDI: 
> c90800f0
> [0.398854] RBP: 888100927c68 R08: 8881015b8f88 R09: 
> 
> [0.398855] R10:  R11:  R12: 
> 7fff
> [0.398856] R13: 777f8000 R14:  R15: 
> 8357a758
> [0.398858] FS:  () GS:888fde40()
> knlGS:
> [0.398859] CS:  0010 DS:  ES:  CR0: 80050033
> [0.398860] CR2:  CR3: ac40c000 CR4: 
> 00150ef0
> [0.398862] Call Trace:
> [0.398864]  
> [0.398864]  ? iommu_setup+0x29a/0x29a
> [0.398867]  ? state_next+0x6e/0x1c9
> [0.398870]  ? iommu_setup+0x29a/0x29a
> [0.398872]  ? iommu_go_to_state+0x1f/0x33
> [0.398873]  ? amd_iommu_init+0xa/0x23
> [0.398875]  ? pci_iommu_init+0xf/0x45
> [0.398876]  ? iommu_setup+0x29a/0x29a
> [0.398878]  ? 
> __initstub__kmod_pci_dma__244_136_pci_iommu_initrootfs+0x5/0x8
> [0.398880]  ? do_one_initcall+0x100/0x290
> [0.398882]  ? do_initcall_level+0x8b/0xe5
> [0.398884]  ? do_initcalls+0x44/0x6d
> [0.398885]  ? kernel_init_freeable+0xc7/0x10d
> [0.398886]  ? rest_init+0xc0/0xc0
> [0.39]  ? kernel_init+0x11/0x150
> [0.398889]  ? ret_from_fork+0x22/0x30
> [0.398891]  
> [0.398892] ---[ end trace f048a4ec907dc976 ]---
> 
> Which points to patch one and "if (WARN_ON(i >= LOOP_TIMEOUT))"


Could you post the whole dmesg, or at least:

dmesg | grep AMD-Vi


What CPU does your system have?

I suspect that your system doesn't have the GA log feature enabled in the IOMMU,
and the code never checks
for that, so here it fails to enable it, a failure which before my patches was
just silently ignored.


Best regards,
Maxim Levitsky
> 
> Hope that helps
> 
> Mike
> 




Re: [PATCH 0/5] iommu/amd: fixes for suspend/resume

2022-01-25 Thread Mike Lothian
On Tue, 25 Jan 2022 at 19:26, Maxim Levitsky  wrote:
>
> Could you just apply these patches on top of 5.15 kernel and see if you get 
> the warning?
>
> If something could case it is I think patch 1, it does move the GA log enabled
> to be a bit later.
> I also added few warnings there. I wonder why your dmesg quote doesn't 
> contain the C line
> where the warning happens.
>
> In partucular I added:
>
> if (WARN_ON(status & (MMIO_STATUS_GALOG_RUN_MASK)))
>
> That will fire if GA log is already running (maybe BIOS enabled it? - it 
> really shouldn't do that)
>
>
> And that:
>
> if (WARN_ON(i >= LOOP_TIMEOUT))
>
> also should not happen and worth to be logged IMHO.
>
> Best regards,
> Maxim Levitsky
>

Hi

I applied on top of another kernel as you asked, I also enabled some debugging

[0.398833] ------------[ cut here ]------------
[0.398835] WARNING: CPU: 0 PID: 1 at drivers/iommu/amd/init.c:839
amd_iommu_enable_interrupts+0x1da/0x440
[0.398840] Modules linked in:
[0.398841] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.16.0-rc5-agd5f+ #1388
[0.398843] Hardware name: ASUSTeK COMPUTER INC. ROG Strix
G513QY_G513QY/G513QY, BIOS G513QY.316 11/29/2021
[0.398845] RIP: 0010:amd_iommu_enable_interrupts+0x1da/0x440
[0.398847] Code: 4b 38 48 89 41 18 b8 a0 86 01 00 0f 1f 44 00 00
48 8b 4b 38 8b 89 20 20 00 00 f7 c1 00 01 00 00 0f 85 7a fe ff ff ff
c8 75 e6 <0f> 0b e9 6f fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44
00 00
[0.398850] RSP: 0018:888100927c68 EFLAGS: 00010246
[0.398851] RAX:  RBX: 88810004b000 RCX: 0018
[0.398853] RDX: 0008 RSI: 888100927c70 RDI: c90800f0
[0.398854] RBP: 888100927c68 R08: 8881015b8f88 R09: 
[0.398855] R10:  R11:  R12: 7fff
[0.398856] R13: 777f8000 R14:  R15: 8357a758
[0.398858] FS:  () GS:888fde40()
knlGS:
[0.398859] CS:  0010 DS:  ES:  CR0: 80050033
[0.398860] CR2:  CR3: ac40c000 CR4: 00150ef0
[0.398862] Call Trace:
[0.398864]  
[0.398864]  ? iommu_setup+0x29a/0x29a
[0.398867]  ? state_next+0x6e/0x1c9
[0.398870]  ? iommu_setup+0x29a/0x29a
[0.398872]  ? iommu_go_to_state+0x1f/0x33
[0.398873]  ? amd_iommu_init+0xa/0x23
[0.398875]  ? pci_iommu_init+0xf/0x45
[0.398876]  ? iommu_setup+0x29a/0x29a
[0.398878]  ? __initstub__kmod_pci_dma__244_136_pci_iommu_initrootfs+0x5/0x8
[0.398880]  ? do_one_initcall+0x100/0x290
[0.398882]  ? do_initcall_level+0x8b/0xe5
[0.398884]  ? do_initcalls+0x44/0x6d
[0.398885]  ? kernel_init_freeable+0xc7/0x10d
[0.398886]  ? rest_init+0xc0/0xc0
[0.398888]  ? kernel_init+0x11/0x150
[0.398889]  ? ret_from_fork+0x22/0x30
[0.398891]  
[0.398892] ---[ end trace f048a4ec907dc976 ]---

Which points to patch one and "if (WARN_ON(i >= LOOP_TIMEOUT))"

Hope that helps

Mike


Re: [PATCH 0/5] iommu/amd: fixes for suspend/resume

2022-01-25 Thread Maxim Levitsky
On Tue, 2022-01-25 at 15:08 +0000, Mike Lothian wrote:
> Hi
> 
> I'm seeing a WARNING that I think might be related to these patches, 
> unfortunately another issue is making bisecting difficult
> 
> [0.359362] AMD-Vi: X2APIC enabled
> [0.395140] [ cut here ]
> [0.395142] WARNING: CPU: 0 PID: 1 at 
> amd_iommu_enable_interrupts+0x1da/0x440
> [0.395146] Modules linked in:
> [0.395148] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc1-tip+ #2995
> [0.395150] Hardware name: ASUSTeK COMPUTER INC. ROG Strix 
> G513QY_G513QY/G513QY, BIOS G513QY.316 11/29/2021
> [0.395152] RIP: 0010:amd_iommu_enable_interrupts+0x1da/0x440
> [0.395154] Code: 4b 38 48 89 41 18 b8 a0 86 01 00 0f 1f 44 00 00 48 8b 4b 
> 38 8b 89 20 20 00 00 f7 c1 00 01 00 00 0f 85 7a fe ff ff ff c8 75 e6 <0f> 0b 
> e9 6f fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
> [0.395157] RSP: 0018:88810022fc68 EFLAGS: 00010246
> [0.395158] RAX:  RBX: 88810004b000 RCX: 
> 0018
> [0.395160] RDX: 0008 RSI: 88810022fc70 RDI: 
> c90800f0
> [0.395161] RBP: 88810022fc68 R08: 888100fce088 R09: 
> 
> [0.395162] R10:  R11:  R12: 
> 7fff
> [0.395163] R13: 777f8000 R14:  R15: 
> 8357c9e8
> [0.395165] FS:  () GS:888fde40() 
> knlGS:
> [0.395166] CS:  0010 DS:  ES:  CR0: 80050033
> [0.395167] CR2: 88901e1ff000 CR3: b440c000 CR4: 
> 00150ef0
> [0.395169] Call Trace:
> [0.395170]  
> [0.395171]  ? iommu_setup+0x29a/0x29a
> [0.395174]  ? state_next+0x6e/0x1c9
> [0.395177]  ? iommu_setup+0x29a/0x29a
> [0.395178]  ? iommu_go_to_state+0x1f/0x33
> [0.395180]  ? amd_iommu_init+0xa/0x23
> [0.395182]  ? pci_iommu_init+0xf/0x45
> [0.395183]  ? iommu_setup+0x29a/0x29a
> [0.395184]  ? 
> __initstub__kmod_pci_dma__250_136_pci_iommu_initrootfs+0x5/0x8
> [0.395186]  ? do_one_initcall+0x100/0x290
> [0.395190]  ? do_initcall_level+0x8b/0xe5
> [0.395192]  ? do_initcalls+0x44/0x6d
> [0.395194]  ? kernel_init_freeable+0xc7/0x10d
> [0.395196]  ? rest_init+0xc0/0xc0
> [0.395198]  ? kernel_init+0x11/0x150
> [0.395200]  ? ret_from_fork+0x22/0x30
> [0.395201]  
> [0.395202] ---[ end trace  ]---
> [0.395204] PCI-DMA: Using software bounce buffer
> 
> Let me know if you need any more info
> 
> Cheers
> 
> Mike


Could you just apply these patches on top of a 5.15 kernel and see if you get the
warning?

If something could cause it, I think it is patch 1: it moves the GA log enabling
to a bit later.
I also added a few warnings there. I wonder why your dmesg quote doesn't contain
the C line
where the warning happens.

In particular I added:

if (WARN_ON(status & (MMIO_STATUS_GALOG_RUN_MASK)))

That will fire if GA log is already running (maybe BIOS enabled it? - it really 
shouldn't do that)


And that:

if (WARN_ON(i >= LOOP_TIMEOUT))

also should not happen and is worth logging IMHO.
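
To make the two checks concrete, here is a small userspace sketch in C of how
they sit around an iteration-bounded enable loop. It is an assumption-laden
model, not the code from drivers/iommu/amd/init.c: the WARN_ON() here is a
simple stand-in for the kernel macro, fake_iommu and read_status() are
invented for the example, and the values of GALOG_RUN_MASK and LOOP_TIMEOUT
are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define GALOG_RUN_MASK 0x100u  /* assumed status bit, illustration only */
#define LOOP_TIMEOUT   100000  /* assumed iteration budget */

/* Userspace stand-in for WARN_ON(): log once and return the condition. */
static bool warn_on_impl(bool cond, const char *expr, const char *file, int line)
{
    if (cond)
        fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
    return cond;
}
#define WARN_ON(cond) warn_on_impl((cond) != 0, #cond, __FILE__, __LINE__)

/* Hypothetical device model: the run bit appears after some polls. */
struct fake_iommu {
    uint32_t status;
    unsigned int polls_until_running;
};

static uint32_t read_status(struct fake_iommu *iommu)
{
    if (iommu->polls_until_running == 0)
        iommu->status |= GALOG_RUN_MASK;
    else
        iommu->polls_until_running--;
    return iommu->status;
}

/* Returns 0 on success, -1 if already running, -2 on loop timeout. */
static int enable_ga_log(struct fake_iommu *iommu)
{
    int i;

    /* First check: the log must not be running before we enable it. */
    if (WARN_ON(iommu->status & GALOG_RUN_MASK))
        return -1;

    for (i = 0; i < LOOP_TIMEOUT; i++) {
        if (read_status(iommu) & GALOG_RUN_MASK)
            break;
    }

    /* Second check: the loop ran out before the run bit showed up. */
    if (WARN_ON(i >= LOOP_TIMEOUT))
        return -2;
    return 0;
}
```

In this model a device that needs more than LOOP_TIMEOUT polls trips the
second warning even though it would eventually start, which is the kind of
situation a larger or time-based bound is meant to rule out.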

Best regards,
Maxim Levitsky





Re: [PATCH 0/5] iommu/amd: fixes for suspend/resume

2022-01-25 Thread Mike Lothian
Hi

I'm seeing a WARNING that I think might be related to these patches;
unfortunately another issue is making bisecting difficult

[0.359362] AMD-Vi: X2APIC enabled
[0.395140] ------------[ cut here ]------------
[0.395142] WARNING: CPU: 0 PID: 1 at amd_iommu_enable_interrupts+0x1da/0x440
[0.395146] Modules linked in:
[0.395148] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.0-rc1-tip+ #2995
[0.395150] Hardware name: ASUSTeK COMPUTER INC. ROG Strix 
G513QY_G513QY/G513QY, BIOS G513QY.316 11/29/2021
[0.395152] RIP: 0010:amd_iommu_enable_interrupts+0x1da/0x440
[0.395154] Code: 4b 38 48 89 41 18 b8 a0 86 01 00 0f 1f 44 00 00 48 8b 4b 
38 8b 89 20 20 00 00 f7 c1 00 01 00 00 0f 85 7a fe ff ff ff c8 75 e6 <0f> 0b e9 
6f fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
[0.395157] RSP: 0018:88810022fc68 EFLAGS: 00010246
[0.395158] RAX:  RBX: 88810004b000 RCX: 0018
[0.395160] RDX: 0008 RSI: 88810022fc70 RDI: c90800f0
[0.395161] RBP: 88810022fc68 R08: 888100fce088 R09: 
[0.395162] R10:  R11:  R12: 7fff
[0.395163] R13: 777f8000 R14:  R15: 8357c9e8
[0.395165] FS:  () GS:888fde40() 
knlGS:
[0.395166] CS:  0010 DS:  ES:  CR0: 80050033
[0.395167] CR2: 88901e1ff000 CR3: b440c000 CR4: 00150ef0
[0.395169] Call Trace:
[0.395170]  
[0.395171]  ? iommu_setup+0x29a/0x29a
[0.395174]  ? state_next+0x6e/0x1c9
[0.395177]  ? iommu_setup+0x29a/0x29a
[0.395178]  ? iommu_go_to_state+0x1f/0x33
[0.395180]  ? amd_iommu_init+0xa/0x23
[0.395182]  ? pci_iommu_init+0xf/0x45
[0.395183]  ? iommu_setup+0x29a/0x29a
[0.395184]  ? __initstub__kmod_pci_dma__250_136_pci_iommu_initrootfs+0x5/0x8
[0.395186]  ? do_one_initcall+0x100/0x290
[0.395190]  ? do_initcall_level+0x8b/0xe5
[0.395192]  ? do_initcalls+0x44/0x6d
[0.395194]  ? kernel_init_freeable+0xc7/0x10d
[0.395196]  ? rest_init+0xc0/0xc0
[0.395198]  ? kernel_init+0x11/0x150
[0.395200]  ? ret_from_fork+0x22/0x30
[0.395201]  
[0.395202] ---[ end trace 0000000000000000 ]---
[0.395204] PCI-DMA: Using software bounce buffer

Let me know if you need any more info

Cheers

Mike


Re: [PATCH 0/5] iommu/amd: fixes for suspend/resume

2021-12-17 Thread Joerg Roedel
On Tue, Nov 23, 2021 at 06:10:33PM +0200, Maxim Levitsky wrote:
> Maxim Levitsky (5):
>   iommu/amd: restore GA log/tail pointer on host resume
>   iommu/amd: x2apic mode: re-enable after resume
>   iommu/amd: x2apic mode: setup the INTX registers on mask/unmask
>   iommu/amd: x2apic mode: mask/unmask interrupts on suspend/resume
>   iommu/amd: remove useless irq affinity notifier
> 
>  drivers/iommu/amd/amd_iommu_types.h |   2 -
>  drivers/iommu/amd/init.c| 107 +++-
>  2 files changed, 58 insertions(+), 51 deletions(-)

Applied for v5.17, thanks.


Re: [PATCH 0/5] iommu/amd: fixes for suspend/resume

2021-12-10 Thread Maxim Levitsky
On Thu, 2021-12-02 at 01:08 +0200, Maxim Levitsky wrote:
> On Tue, 2021-11-23 at 18:10 +0200, Maxim Levitsky wrote:
> > As I sadly found out, a s3 cycle makes the AMD's iommu stop sending 
> > interrupts
> > until the system is rebooted.
> > 
> > I only noticed it now because otherwise the IOMMU works, and these 
> > interrupts
> > are only used for errors and for GA log which I tend not to use by
> > making my VMs do mwait/pause/etc in guest (cpu-pm=on).
> > 
> > There are two issues here that prevent interrupts from being generated after
> > s3 cycle:
> > 
> > 1. GA log base address was not restored after resume, and was all zeroed
> > after resume (by BIOS or such).
> > 
> > In theory if BIOS writes some junk to it, that can even cause a memory 
> > corruption.
> > Patch 2 fixes that.
> > 
> > 2. INTX (aka x2apic mode) settings were not restored after resume.
> > That mode is used regardless if the host uses/supports x2apic, but rather 
> > when
> > the IOMMU supports it, and mine does.
> > Patches 3-4 fix that.
> > 
> > Note that there is still one slight (userspace) bug remaining:
> > During suspend all but the boot CPU are offlined and then after resume
> > are onlined again.
> > 
> > The offlining moves all non-affinity managed interrupts to CPU0, and
> > later when all other CPUs are onlined, there is nothing in the kernel
> > to spread back the interrupts over the cores.
> > 
> > The userspace 'irqbalance' daemon does fix this but it seems to ignore
> > the IOMMU interrupts in INTX mode since they are not attached to any
> > PCI device, and thus they remain on CPU0 after a s3 cycle,
> > which is suboptimal when the system has multiple IOMMUs
> > (mine has 4 of them).
> > 
> > Setting the IRQ affinity manually via /proc/irq/ does work.
> > 
> > This was tested on my 3970X with both INTX and regular MSI mode (later was 
> > enabled
> > by patching out INTX detection), by running a guest with AVIC enabled and 
> > with
> > a PCI assigned device (network card), and observing interrupts from
> > IOMMU while guest is mostly idle.
> > 
> > This was also tested on my AMD laptop with 4650U (which has the same issue)
> > (I tested only INTX mode)
> > 
> > Patch 1 is a small refactoring to remove an unused struct field.
> > 
> > Best regards,
> >Maxim Levitsky
> > 
> > Maxim Levitsky (5):
> >   iommu/amd: restore GA log/tail pointer on host resume
> >   iommu/amd: x2apic mode: re-enable after resume
> >   iommu/amd: x2apic mode: setup the INTX registers on mask/unmask
> >   iommu/amd: x2apic mode: mask/unmask interrupts on suspend/resume
> >   iommu/amd: remove useless irq affinity notifier
> > 
> >  drivers/iommu/amd/amd_iommu_types.h |   2 -
> >  drivers/iommu/amd/init.c| 107 +++-
> >  2 files changed, 58 insertions(+), 51 deletions(-)
> > 
> > -- 
> > 2.26.3
> > 
> > 
> 
> Polite ping on these patches.

Another very polite ping on these patches :)

Best regards,
Maxim Levitsky

> Best regards,
>   Maxim Levitsky




Re: [PATCH 0/5] iommu/amd: fixes for suspend/resume

2021-12-06 Thread Joerg Roedel
On Tue, Nov 23, 2021 at 06:10:33PM +0200, Maxim Levitsky wrote:
> Best regards,
>Maxim Levitsky
> 
> Maxim Levitsky (5):
>   iommu/amd: restore GA log/tail pointer on host resume
>   iommu/amd: x2apic mode: re-enable after resume
>   iommu/amd: x2apic mode: setup the INTX registers on mask/unmask
>   iommu/amd: x2apic mode: mask/unmask interrupts on suspend/resume
>   iommu/amd: remove useless irq affinity notifier
> 
>  drivers/iommu/amd/amd_iommu_types.h |   2 -
>  drivers/iommu/amd/init.c| 107 +++-
>  2 files changed, 58 insertions(+), 51 deletions(-)

Suravee, can you please have a look? These look like v5.16 material.

Thanks,

Joerg


Re: [PATCH 0/5] iommu/amd: fixes for suspend/resume

2021-12-01 Thread Maxim Levitsky
On Tue, 2021-11-23 at 18:10 +0200, Maxim Levitsky wrote:
> As I sadly found out, a s3 cycle makes the AMD's iommu stop sending interrupts
> until the system is rebooted.
> 
> I only noticed it now because otherwise the IOMMU works, and these interrupts
> are only used for errors and for GA log which I tend not to use by
> making my VMs do mwait/pause/etc in guest (cpu-pm=on).
> 
> There are two issues here that prevent interrupts from being generated after
> s3 cycle:
> 
> 1. GA log base address was not restored after resume, and was all zeroed
> after resume (by BIOS or such).
> 
> In theory if BIOS writes some junk to it, that can even cause a memory 
> corruption.
> Patch 2 fixes that.
> 
> 2. INTX (aka x2apic mode) settings were not restored after resume.
> That mode is used regardless if the host uses/supports x2apic, but rather when
> the IOMMU supports it, and mine does.
> Patches 3-4 fix that.
> 
> Note that there is still one slight (userspace) bug remaining:
> During suspend all but the boot CPU are offlined and then after resume
> are onlined again.
> 
> The offlining moves all non-affinity managed interrupts to CPU0, and
> later when all other CPUs are onlined, there is nothing in the kernel
> to spread back the interrupts over the cores.
> 
> The userspace 'irqbalance' daemon does fix this but it seems to ignore
> the IOMMU interrupts in INTX mode since they are not attached to any
> PCI device, and thus they remain on CPU0 after a s3 cycle,
> which is suboptimal when the system has multiple IOMMUs
> (mine has 4 of them).
> 
> Setting the IRQ affinity manually via /proc/irq/ does work.
> 
> This was tested on my 3970X with both INTX and regular MSI mode (later was 
> enabled
> by patching out INTX detection), by running a guest with AVIC enabled and with
> a PCI assigned device (network card), and observing interrupts from
> IOMMU while guest is mostly idle.
> 
> This was also tested on my AMD laptop with 4650U (which has the same issue)
> (I tested only INTX mode)
> 
> Patch 1 is a small refactoring to remove an unused struct field.
> 
> Best regards,
>Maxim Levitsky
> 
> Maxim Levitsky (5):
>   iommu/amd: restore GA log/tail pointer on host resume
>   iommu/amd: x2apic mode: re-enable after resume
>   iommu/amd: x2apic mode: setup the INTX registers on mask/unmask
>   iommu/amd: x2apic mode: mask/unmask interrupts on suspend/resume
>   iommu/amd: remove useless irq affinity notifier
> 
>  drivers/iommu/amd/amd_iommu_types.h |   2 -
>  drivers/iommu/amd/init.c| 107 +++-
>  2 files changed, 58 insertions(+), 51 deletions(-)
> 
> -- 
> 2.26.3
> 
> 

Polite ping on these patches.
Best regards,
Maxim Levitsky



[PATCH 0/5] iommu/amd: fixes for suspend/resume

2021-11-23 Thread Maxim Levitsky
As I sadly found out, an S3 cycle makes the AMD IOMMU stop sending interrupts
until the system is rebooted.

I only noticed it now because otherwise the IOMMU works, and these interrupts
are only used for errors and for GA log which I tend not to use by
making my VMs do mwait/pause/etc in guest (cpu-pm=on).

There are two issues here that prevent interrupts from being generated after
an S3 cycle:

1. The GA log base address was not restored after resume, and was all zeroed
after resume (by BIOS or such).

In theory, if the BIOS writes some junk to it, that can even cause memory
corruption.
Patch 1 fixes that.

2. INTX (aka x2apic mode) settings were not restored after resume.
That mode is used regardless of whether the host uses/supports x2apic; it is
used whenever the IOMMU supports it, and mine does.
Patches 2-4 fix that.

Note that there is still one slight (userspace) bug remaining:
During suspend all but the boot CPU are offlined and then after resume
are onlined again.

The offlining moves all non-affinity managed interrupts to CPU0, and
later when all other CPUs are onlined, there is nothing in the kernel
to spread back the interrupts over the cores.

The userspace 'irqbalance' daemon does fix this but it seems to ignore
the IOMMU interrupts in INTX mode since they are not attached to any
PCI device, and thus they remain on CPU0 after an S3 cycle,
which is suboptimal when the system has multiple IOMMUs
(mine has 4 of them).

Setting the IRQ affinity manually via /proc/irq/ does work.
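
For reference, a minimal C sketch of doing that by hand is below. It assumes
the usual /proc/irq/<N>/smp_affinity file, which takes a hexadecimal CPU
mask; the helper names are made up for this example, the IRQ number has to be
looked up on the actual system (for instance in /proc/interrupts), and the
mask helper is deliberately limited to CPUs 0..31 so that a single 32-bit
word is enough.

```c
#include <stdio.h>

/* Format the path of the affinity file for a given IRQ number. */
static int irq_affinity_path(char *buf, size_t len, unsigned int irq)
{
    return snprintf(buf, len, "/proc/irq/%u/smp_affinity", irq);
}

/* Format a hex mask selecting exactly one CPU (0..31 only in this sketch). */
static int single_cpu_mask(char *buf, size_t len, unsigned int cpu)
{
    if (cpu >= 32)
        return -1;
    return snprintf(buf, len, "%x", 1u << cpu);
}

/* Write the mask to the affinity file; needs root. 0 on success, -1 on error. */
static int set_irq_affinity(unsigned int irq, unsigned int cpu)
{
    char path[64], mask[16];
    FILE *f;
    int ok;

    if (single_cpu_mask(mask, sizeof(mask), cpu) < 0)
        return -1;
    irq_affinity_path(path, sizeof(path), irq);

    f = fopen(path, "w");
    if (!f)
        return -1;
    ok = fputs(mask, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

For example, set_irq_affinity(irq, 3) would write the mask "8" for the chosen
IRQ, moving it to CPU3, provided the kernel accepts the change for that
interrupt.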

This was tested on my 3970X with both INTX and regular MSI mode (the latter was
enabled
by patching out INTX detection), by running a guest with AVIC enabled and with
a PCI assigned device (network card), and observing interrupts from the
IOMMU while the guest is mostly idle.

This was also tested on my AMD laptop with a 4650U (which has the same issue)
(I tested only INTX mode).

Patch 5 is a small refactoring to remove an unused struct field.

Best regards,
   Maxim Levitsky

Maxim Levitsky (5):
  iommu/amd: restore GA log/tail pointer on host resume
  iommu/amd: x2apic mode: re-enable after resume
  iommu/amd: x2apic mode: setup the INTX registers on mask/unmask
  iommu/amd: x2apic mode: mask/unmask interrupts on suspend/resume
  iommu/amd: remove useless irq affinity notifier

 drivers/iommu/amd/amd_iommu_types.h |   2 -
 drivers/iommu/amd/init.c| 107 +++-
 2 files changed, 58 insertions(+), 51 deletions(-)

-- 
2.26.3

