Re: ioremap() called early from pnv_pci_init_ioda_phb()
On Sun, May 10, 2020 at 1:51 AM Christophe Leroy wrote: > > > > Le 08/05/2020 à 19:41, Qian Cai a écrit : > > > > > >> On May 8, 2020, at 10:39 AM, Qian Cai wrote: > >> > >> Booting POWER9 PowerNV has this message, > >> > >> "ioremap() called early from pnv_pci_init_ioda_phb+0x420/0xdfc. Use > >> early_ioremap() instead” > >> > >> but use the patch below will result in leaks because it will never call > >> early_iounmap() anywhere. However, it looks me it was by design that > >> phb->regs mapping would be there forever where it would be used in > >> pnv_ioda_get_inval_reg(), so is just that check_early_ioremap_leak() > >> initcall too strong? > >> > >> --- a/arch/powerpc/platforms/powernv/pci-ioda.c > >> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c > >> @@ -36,6 +36,7 @@ > >> #include > >> #include > >> #include > >> +#include > >> > >> #include > >> > >> @@ -3827,7 +3828,7 @@ static void __init pnv_pci_init_ioda_phb(struct > >> device_node *np, > >> /* Get registers */ > >> if (!of_address_to_resource(np, 0, )) { > >> phb->regs_phys = r.start; > >> - phb->regs = ioremap(r.start, resource_size()); > >> + phb->regs = early_ioremap(r.start, resource_size()); > >> if (phb->regs == NULL) > >> pr_err(" Failed to map registers !\n”); > > > > This will also trigger a panic with debugfs reads, so isn’t that this > > commit bogus at least for powerpc64? > > > > d538aadc2718 (“powerpc/ioremap: warn on early use of ioremap()") > > No d538aadc2718 is not bogus. That's the point, we want to remove all > early usages of ioremap() in order to remove the hack with the > ioremap_bot stuff and all, and stick to the generic ioremap logic. > > In order to do so, all early use of ioremap() has to be converted to > early_ioremap() or to fixmap or anything else that allows to do ioremaps > before the slab is ready. > > early_ioremap() is for temporary mappings necessary at boottime. For > long lasting mappings, another method is to be used. > > Now, the point is that other architectures like for instance x86 don't > seem to have to use early_ioremap() much. Powerpc is for instance doing > early mappings for PCI. Seems like x86 initialises PCI once slab is > ready. Can't powerpc do the same ? Patches welcome.
Re: ioremap() called early from pnv_pci_init_ioda_phb()
Le 08/05/2020 à 19:41, Qian Cai a écrit : On May 8, 2020, at 10:39 AM, Qian Cai wrote: Booting POWER9 PowerNV has this message, "ioremap() called early from pnv_pci_init_ioda_phb+0x420/0xdfc. Use early_ioremap() instead” but use the patch below will result in leaks because it will never call early_iounmap() anywhere. However, it looks me it was by design that phb->regs mapping would be there forever where it would be used in pnv_ioda_get_inval_reg(), so is just that check_early_ioremap_leak() initcall too strong? --- a/arch/powerpc/platforms/powernv/pci-ioda.c +++ b/arch/powerpc/platforms/powernv/pci-ioda.c @@ -36,6 +36,7 @@ #include #include #include +#include #include @@ -3827,7 +3828,7 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np, /* Get registers */ if (!of_address_to_resource(np, 0, )) { phb->regs_phys = r.start; - phb->regs = ioremap(r.start, resource_size()); + phb->regs = early_ioremap(r.start, resource_size()); if (phb->regs == NULL) pr_err(" Failed to map registers !\n”); This will also trigger a panic with debugfs reads, so isn’t that this commit bogus at least for powerpc64? d538aadc2718 (“powerpc/ioremap: warn on early use of ioremap()") No d538aadc2718 is not bogus. That's the point, we want to remove all early usages of ioremap() in order to remove the hack with the ioremap_bot stuff and all, and stick to the generic ioremap logic. In order to do so, all early use of ioremap() has to be converted to early_ioremap() or to fixmap or anything else that allows to do ioremaps before the slab is ready. early_ioremap() is for temporary mappings necessary at boottime. For long lasting mappings, another method is to be used. Now, the point is that other architectures like for instance x86 don't seem to have to use early_ioremap() much. Powerpc is for instance doing early mappings for PCI. Seems like x86 initialises PCI once slab is ready. Can't powerpc do the same ? Christophe
Re: ioremap() called early from pnv_pci_init_ioda_phb()
> On May 9, 2020, at 4:38 AM, Nicholas Piggin wrote: > > Your patch to use early_ioremap is faulting? I wonder why? Yes, I don’t know the reasons either. I suppose not many places in other parts of the kernel which keep using those addresses from early_ioremap() after system booted. Otherwise, we would see those leak warnings elsewhere. Thus, we probably have to audit the code, and if still necessary, call early_ioremap() and then early_iounmap() followed by a ioremap() once slab allocator is available?
Re: ioremap() called early from pnv_pci_init_ioda_phb()
Excerpts from Oliver O'Halloran's message of May 9, 2020 6:11 pm: > On Sat, May 9, 2020 at 12:41 AM Qian Cai wrote: >> >> Booting POWER9 PowerNV has this message, >> >> "ioremap() called early from pnv_pci_init_ioda_phb+0x420/0xdfc. Use >> early_ioremap() instead” >> >> but use the patch below will result in leaks because it will never call >> early_iounmap() anywhere. However, it looks me it was by design that >> phb->regs mapping would be there forever where it would be used in >> pnv_ioda_get_inval_reg(), so is just that check_early_ioremap_leak() >> initcall too strong? > > The warning there is junk. The PHBs are setup at boot and never torn > down so we're not "leaking" the mapping. It's supposed to be there for > the lifetime of the kernel. > > That said, we could probably move the PCI setup to a point later in > boot where the normal ioremap can be be used. We would have to check > for initcalls which depend on the PHBs being setup and delay those too > though. I think it helps to unify code a bit more and take special cases out of ioremap to have all these early calls use early_ioremap. We actually do want to move these later if possible too, on radix they use memblock for page tables, and on hash they don't even set up proper kernel page tables but just bolt PTEs into the hash table. Thanks, Nick
Re: ioremap() called early from pnv_pci_init_ioda_phb()
Excerpts from Qian Cai's message of May 9, 2020 3:41 am: > > >> On May 8, 2020, at 10:39 AM, Qian Cai wrote: >> >> Booting POWER9 PowerNV has this message, >> >> "ioremap() called early from pnv_pci_init_ioda_phb+0x420/0xdfc. Use >> early_ioremap() instead” >> >> but use the patch below will result in leaks because it will never call >> early_iounmap() anywhere. However, it looks me it was by design that >> phb->regs mapping would be there forever where it would be used in >> pnv_ioda_get_inval_reg(), so is just that check_early_ioremap_leak() >> initcall too strong? >> >> --- a/arch/powerpc/platforms/powernv/pci-ioda.c >> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c >> @@ -36,6 +36,7 @@ >> #include >> #include >> #include >> +#include >> >> #include >> >> @@ -3827,7 +3828,7 @@ static void __init pnv_pci_init_ioda_phb(struct >> device_node *np, >>/* Get registers */ >>if (!of_address_to_resource(np, 0, )) { >>phb->regs_phys = r.start; >> - phb->regs = ioremap(r.start, resource_size()); >> + phb->regs = early_ioremap(r.start, resource_size()); >>if (phb->regs == NULL) >>pr_err(" Failed to map registers !\n”); > > This will also trigger a panic with debugfs reads, so isn’t that this commit > bogus at least for powerpc64? Your patch to use early_ioremap is faulting? I wonder why? Thanks, Nick > > d538aadc2718 (“powerpc/ioremap: warn on early use of ioremap()") > > 11017.617022][T122068] Faulting instruction address: 0xc00db564 > [11017.617257][T122066] Faulting instruction address: 0xc00db564 > [11017.617950][T122073] Faulting instruction address: 0xc00db564 > [11017.61][T122064] BUG: Unable to handle kernel data access on read at > 0xffe20e10 > [11017.618935][T122064] Faulting instruction address: 0xc00db564 > [11017.737996][T122072] > [11017.738010][T122073] Oops: Kernel access of bad area, sig: 11 [#2] > [11017.738024][T122073] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 > DEBUG_PAGEALLOC NUMA PowerNV > [11017.738051][T122073] Modules linked in: brd ext4 crc16 mbcache jbd2 loop > kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x ahci libahci tg3 mdio libata > libphy firmware_class dm_mirror dm_region_hash dm_log dm_mod > [11017.738110][T122073] CPU: 108 PID: 122073 Comm: read_all Tainted: G D > W 5.7.0-rc4-next-20200508+ #4 > [11017.738138][T122073] NIP: c00db564 LR: c056f660 CTR: > c00db550 > [11017.738173][T122073] REGS: c00374f6f980 TRAP: 0380 Tainted: G D > W (5.7.0-rc4-next-20200508+) > [11017.738234][T122073] MSR: 90009033 CR: > 22002282 XER: 2004 > [11017.738278][T122073] CFAR: c056f65c IRQMASK: 0 > [11017.738278][T122073] GPR00: c056f660 c00374f6fc10 > c1689400 c000201ffc41aa00 > [11017.738278][T122073] GPR04: c00374f6fc70 > 0001 > [11017.738278][T122073] GPR08: ffe2 > c008ee380080 > [11017.738278][T122073] GPR12: c00db550 c000201fff671280 > > [11017.738278][T122073] GPR16: 0002 10040800 > 1001ccd8 1001cc80 > [11017.738278][T122073] GPR20: 1001cc98 1001ccc8 > 1001cca8 1001cb48 > [11017.738278][T122073] GPR24: > 03ff 7fffebb67390 > [11017.738278][T122073] GPR28: c00374f6fd90 c000200c0c6a7550 > c000200c0c6a7500 > [11017.738542][T122073] NIP [c00db564] > pnv_eeh_dbgfs_get_inbB+0x14/0x30 > [11017.738579][T122073] LR [c056f660] simple_attr_read+0xa0/0x180 > [11017.738613][T122073] Call Trace: > [11017.738645][T122073] [c00374f6fc10] [c056f630] > simple_attr_read+0x70/0x180 (unreliable) > [11017.738672][T122073] [c00374f6fcb0] [c064a2e0] > full_proxy_read+0x90/0xe0 > [11017.738686][T122073] [c00374f6fd00] [c051fe0c] > __vfs_read+0x3c/0x70 > [11017.738722][T122073] [c00374f6fd20] [c051feec] > vfs_read+0xac/0x170 > [11017.738757][T122073] [c00374f6fd70] [c052034c] > ksys_read+0x7c/0x140 > [11017.738818][T122073] [c00374f6fdc0] [c0038af4] > system_call_exception+0x114/0x1e0 > [11017.738867][T122073] [c00374f6fe20] [c000c8f0] > system_call_common+0xf0/0x278 > [11017.738916][T122073] Instruction dump: > [11017.738948][T122073] 7c0004ac f9490d10 a14d0c78 3860 b14d0c7a 4e800020 > 6000 7c0802a6 > [11017.739001][T122073] 6000 e9230278 e9290028 7c0004ac > 0c09 4c00012c 3860
Re: ioremap() called early from pnv_pci_init_ioda_phb()
On Sat, May 9, 2020 at 12:41 AM Qian Cai wrote: > > Booting POWER9 PowerNV has this message, > > "ioremap() called early from pnv_pci_init_ioda_phb+0x420/0xdfc. Use > early_ioremap() instead” > > but use the patch below will result in leaks because it will never call > early_iounmap() anywhere. However, it looks me it was by design that > phb->regs mapping would be there forever where it would be used in > pnv_ioda_get_inval_reg(), so is just that check_early_ioremap_leak() initcall > too strong? The warning there is junk. The PHBs are setup at boot and never torn down so we're not "leaking" the mapping. It's supposed to be there for the lifetime of the kernel. That said, we could probably move the PCI setup to a point later in boot where the normal ioremap can be be used. We would have to check for initcalls which depend on the PHBs being setup and delay those too though. Oliver
Re: ioremap() called early from pnv_pci_init_ioda_phb()
> On May 8, 2020, at 10:39 AM, Qian Cai wrote: > > Booting POWER9 PowerNV has this message, > > "ioremap() called early from pnv_pci_init_ioda_phb+0x420/0xdfc. Use > early_ioremap() instead” > > but use the patch below will result in leaks because it will never call > early_iounmap() anywhere. However, it looks me it was by design that > phb->regs mapping would be there forever where it would be used in > pnv_ioda_get_inval_reg(), so is just that check_early_ioremap_leak() initcall > too strong? > > --- a/arch/powerpc/platforms/powernv/pci-ioda.c > +++ b/arch/powerpc/platforms/powernv/pci-ioda.c > @@ -36,6 +36,7 @@ > #include > #include > #include > +#include > > #include > > @@ -3827,7 +3828,7 @@ static void __init pnv_pci_init_ioda_phb(struct > device_node *np, >/* Get registers */ >if (!of_address_to_resource(np, 0, )) { >phb->regs_phys = r.start; > - phb->regs = ioremap(r.start, resource_size()); > + phb->regs = early_ioremap(r.start, resource_size()); >if (phb->regs == NULL) >pr_err(" Failed to map registers !\n”); This will also trigger a panic with debugfs reads, so isn’t that this commit bogus at least for powerpc64? d538aadc2718 (“powerpc/ioremap: warn on early use of ioremap()") 11017.617022][T122068] Faulting instruction address: 0xc00db564 [11017.617257][T122066] Faulting instruction address: 0xc00db564 [11017.617950][T122073] Faulting instruction address: 0xc00db564 [11017.61][T122064] BUG: Unable to handle kernel data access on read at 0xffe20e10 [11017.618935][T122064] Faulting instruction address: 0xc00db564 [11017.737996][T122072] [11017.738010][T122073] Oops: Kernel access of bad area, sig: 11 [#2] [11017.738024][T122073] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV [11017.738051][T122073] Modules linked in: brd ext4 crc16 mbcache jbd2 loop kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x ahci libahci tg3 mdio libata libphy firmware_class dm_mirror dm_region_hash dm_log dm_mod [11017.738110][T122073] CPU: 108 PID: 122073 Comm: read_all Tainted: G D W 5.7.0-rc4-next-20200508+ #4 [11017.738138][T122073] NIP: c00db564 LR: c056f660 CTR: c00db550 [11017.738173][T122073] REGS: c00374f6f980 TRAP: 0380 Tainted: G D W (5.7.0-rc4-next-20200508+) [11017.738234][T122073] MSR: 90009033 CR: 22002282 XER: 2004 [11017.738278][T122073] CFAR: c056f65c IRQMASK: 0 [11017.738278][T122073] GPR00: c056f660 c00374f6fc10 c1689400 c000201ffc41aa00 [11017.738278][T122073] GPR04: c00374f6fc70 0001 [11017.738278][T122073] GPR08: ffe2 c008ee380080 [11017.738278][T122073] GPR12: c00db550 c000201fff671280 [11017.738278][T122073] GPR16: 0002 10040800 1001ccd8 1001cc80 [11017.738278][T122073] GPR20: 1001cc98 1001ccc8 1001cca8 1001cb48 [11017.738278][T122073] GPR24: 03ff 7fffebb67390 [11017.738278][T122073] GPR28: c00374f6fd90 c000200c0c6a7550 c000200c0c6a7500 [11017.738542][T122073] NIP [c00db564] pnv_eeh_dbgfs_get_inbB+0x14/0x30 [11017.738579][T122073] LR [c056f660] simple_attr_read+0xa0/0x180 [11017.738613][T122073] Call Trace: [11017.738645][T122073] [c00374f6fc10] [c056f630] simple_attr_read+0x70/0x180 (unreliable) [11017.738672][T122073] [c00374f6fcb0] [c064a2e0] full_proxy_read+0x90/0xe0 [11017.738686][T122073] [c00374f6fd00] [c051fe0c] __vfs_read+0x3c/0x70 [11017.738722][T122073] [c00374f6fd20] [c051feec] vfs_read+0xac/0x170 [11017.738757][T122073] [c00374f6fd70] [c052034c] ksys_read+0x7c/0x140 [11017.738818][T122073] [c00374f6fdc0] [c0038af4] system_call_exception+0x114/0x1e0 [11017.738867][T122073] [c00374f6fe20] [c000c8f0] system_call_common+0xf0/0x278 [11017.738916][T122073] Instruction dump: [11017.738948][T122073] 7c0004ac f9490d10 a14d0c78 3860 b14d0c7a 4e800020 6000 7c0802a6 [11017.739001][T122073] 6000 e9230278 e9290028 7c0004ac 0c09 4c00012c 3860 [11017.739052][T122073] ---[ end trace f68728a0d3053b5e ]--- [11017.828156][T122073] [11017.828170][T122068] Oops: Kernel access of bad area, sig: 11 [#3] [11017.828184][T122068] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV [11017.828209][T122068] Modules linked in: brd ext4 crc16 mbcache jbd2 loop kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x ahci libahci tg3 mdio libata libphy firmwa
ioremap() called early from pnv_pci_init_ioda_phb()
Booting POWER9 PowerNV has this message, "ioremap() called early from pnv_pci_init_ioda_phb+0x420/0xdfc. Use early_ioremap() instead” but use the patch below will result in leaks because it will never call early_iounmap() anywhere. However, it looks me it was by design that phb->regs mapping would be there forever where it would be used in pnv_ioda_get_inval_reg(), so is just that check_early_ioremap_leak() initcall too strong? --- a/arch/powerpc/platforms/powernv/pci-ioda.c +++ b/arch/powerpc/platforms/powernv/pci-ioda.c @@ -36,6 +36,7 @@ #include #include #include +#include #include @@ -3827,7 +3828,7 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np, /* Get registers */ if (!of_address_to_resource(np, 0, )) { phb->regs_phys = r.start; - phb->regs = ioremap(r.start, resource_size()); + phb->regs = early_ioremap(r.start, resource_size()); if (phb->regs == NULL) pr_err(" Failed to map registers !\n”); [ 23.080069][T1] [ cut here ] [ 23.080089][T1] Debug warning: early ioremap leak of 10 areas detected. [ 23.080089][T1] please boot with early_ioremap_debug and report the dmesg. [ 23.080157][T1] WARNING: CPU: 4 PID: 1 at mm/early_ioremap.c:99 check_early_ioremap_leak+0xd4/0x108 [ 23.080171][T1] Modules linked in: [ 23.080192][T1] CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc4-next-20200508+ #4 [ 23.080214][T1] NIP: c103f2d8 LR: c103f2d4 CTR: [ 23.080226][T1] REGS: c0003df0f860 TRAP: 0700 Not tainted (5.7.0-rc4-next-20200508+) [ 23.080259][T1] MSR: 90029033 CR: 48000222 XER: 2004 [ 23.080296][T1] CFAR: c010d5a8 IRQMASK: 0 [ 23.080296][T1] GPR00: c103f2d4 c0003df0faf0 c1689400 0072 [ 23.080296][T1] GPR04: 0006 c0003df0f7e4 0004 [ 23.080296][T1] GPR08: 001ffbb6 c0003dee6680 0002 [ 23.080296][T1] GPR12: c01fae00 c1057860 c10578b0 [ 23.080296][T1] GPR16: c1002d38 c14f0660 c14f0680 c14f06a0 [ 23.080296][T1] GPR20: c14f06c0 c14f06e0 c14f0700 c14f0720 [ 23.080296][T1] GPR24: c0c4bc30 c00486b82000 c15a0fe0 c15a0fc0 [ 23.080296][T1] GPR28: 0010 0010 c1061e30 000a [ 23.080507][T1] NIP [c103f2d8] check_early_ioremap_leak+0xd4/0x108 [ 23.080530][T1] LR [c103f2d4] check_early_ioremap_leak+0xd0/0x108 [ 23.080552][T1] Call Trace: [ 23.080571][T1] [c0003df0faf0] [c103f2d4] check_early_ioremap_leak+0xd0/0x108 (unreliable) [ 23.080607][T1] [c0003df0fb80] [c001130c] do_one_initcall+0xcc/0x660 [ 23.080648][T1] [c0003df0fc80] [c1004c18] kernel_init_freeable+0x480/0x568 [ 23.080681][T1] [c0003df0fdb0] [c0012180] kernel_init+0x24/0x194 [ 23.080713][T1] [c0003df0fe20] [c000cb28] ret_from_kernel_thread+0x5c/0x74 This is from the early_ioremap_debug dmesg. [0.00][T0] [ cut here ] [0.00][T0] __early_ioremap(0x000600c3c001, 0001) [0] => + ffbe [0.00][T0] WARNING: CPU: 0 PID: 0 at mm/early_ioremap.c:162 __early_ioremap+0x2d8/0x408 [0.00][T0] Modules linked in: [0.00][T0] CPU: 0 PID: 0 Comm: swapper Not tainted 5.7.0-rc4-next-20200508+ #4 [0.00][T0] NIP: c103f5e4 LR: c103f5e0 CTR: c01e77f0 [0.00][T0] REGS: c168f980 TRAP: 0700 Not tainted (5.7.0-rc4-next-20200508+) [0.00][T0] MSR: 90021033 CR: 28000248 XER: 2004 [0.00][T0] CFAR: c010d5a8 IRQMASK: 1 [0.00][T0] GPR00: c103f5e0 c168fc10 c1689400 0050 [0.00][T0] GPR04: c152f6f8 c168f904 [0.00][T0] GPR08: c162f600 0002 [0.00][T0] GPR12: c01e77f0 c5b3 [0.00][T0] GPR16: 1000 [0.00][T0] GPR20: 81ae [0.00][T0] GPR24: 0001 c1061da8 0008 0008 [0.00][T0] GPR28: c1061db0 c1061eb8 [0.00][T0] NIP [c103f5e4] __early_ioremap+0x2d8/0x4