Re: [PATCH v2 15/16] powerpc/powernv/sriov: Make single PE mode a per-BAR setting
Nathan Chancellor writes: > On Sun, Aug 02, 2020 at 11:12:23PM +1000, Michael Ellerman wrote: >> Nathan Chancellor writes: >> > On Wed, Jul 22, 2020 at 04:57:14PM +1000, Oliver O'Halloran wrote: >> >> Using single PE BARs to map an SR-IOV BAR is really a choice about what >> >> strategy to use when mapping a BAR. It doesn't make much sense for this to >> >> be a global setting since a device might have one large BAR which needs to >> >> be mapped with single PE windows and another smaller BAR that can be >> >> mapped >> >> with a regular segmented window. Make the segmented vs single decision a >> >> per-BAR setting and clean up the logic that decides which mode to use. >> >> >> >> Signed-off-by: Oliver O'Halloran >> >> --- >> >> v2: Dropped unused total_vfs variables in >> >> pnv_pci_ioda_fixup_iov_resources() >> >> Dropped bar_no from pnv_pci_iov_resource_alignment() >> >> Minor re-wording of comments. >> >> --- >> >> arch/powerpc/platforms/powernv/pci-sriov.c | 131 ++--- >> >> arch/powerpc/platforms/powernv/pci.h | 11 +- >> >> 2 files changed, 73 insertions(+), 69 deletions(-) >> >> >> >> diff --git a/arch/powerpc/platforms/powernv/pci-sriov.c >> >> b/arch/powerpc/platforms/powernv/pci-sriov.c >> >> index ce8ad6851d73..76215d01405b 100644 >> >> --- a/arch/powerpc/platforms/powernv/pci-sriov.c >> >> +++ b/arch/powerpc/platforms/powernv/pci-sriov.c >> >> @@ -260,42 +256,40 @@ void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev) >> >> resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev, >> >> int resno) >> >> { >> >> - struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus); >> >> struct pnv_iov_data *iov = pnv_iov_get(pdev); >> >> resource_size_t align; >> >> >> >> + /* >> >> + * iov can be null if we have an SR-IOV device with IOV BAR that can't >> >> + * be placed in the m64 space (i.e. The BAR is 32bit or non-prefetch). >> >> + * In that case we don't allow VFs to be enabled since one of their >> >> + * BARs would not be placed in the correct PE. 
>> >> + */ >> >> + if (!iov) >> >> + return align; >> >> + if (!iov->vfs_expanded) >> >> + return align; >> >> + >> >> + align = pci_iov_resource_size(pdev, resno); >> >> That's, oof. >> >> > I am not sure if it has been reported yet but clang points out that >> > align is initialized after its use: >> > >> > arch/powerpc/platforms/powernv/pci-sriov.c:267:10: warning: variable >> > 'align' is uninitialized when used here [-Wuninitialized] >> > return align; >> >^ >> > arch/powerpc/platforms/powernv/pci-sriov.c:258:23: note: initialize the >> > variable 'align' to silence this warning >> > resource_size_t align; >> > ^ >> > = 0 >> > 1 warning generated. >> >> But I can't get gcc to warn about it? >> >> It produces some code, so it's not like the whole function has been >> elided or something. I'm confused. > > -Wmaybe-uninitialized was disabled in commit 78a5255ffb6a ("Stop the > ad-hoc games with -Wno-maybe-initialized") upstream so GCC won't warn on > stuff like this anymore. Seems so. Just that there's no "maybe" here, it's very uninitialised. > I would assume the function should still be generated since those checks > are relevant, just the return value is bogus. Yeah, just sometimes missing warnings boil down to the compiler eliding whole sections of code, if it can convince itself they're unreachable. AFAICS there's nothing weird going on here that should confuse GCC, it's about as straight forward as it gets. Actually I can reproduce it with: $ cat > test.c <
Re: [merge] Build failure selftest/powerpc/mm/pkey_exec_prot
> On 02-Aug-2020, at 10:58 PM, Sandipan Das wrote: > > Hi Sachin, > > On 02/08/20 4:45 pm, Sachin Sant wrote: >> pkey_exec_prot test from linuxppc merge branch (3f68564f1f5a) fails to >> build due to following error: >> >> gcc -std=gnu99 -O2 -Wall -Werror >> -DGIT_VERSION='"v5.8-rc7-1276-g3f68564f1f5a"' >> -I/home/sachin/linux/tools/testing/selftests/powerpc/include -m64 >> pkey_exec_prot.c >> /home/sachin/linux/tools/testing/selftests/kselftest_harness.h >> /home/sachin/linux/tools/testing/selftests/kselftest.h ../harness.c >> ../utils.c -o >> /home/sachin/linux/tools/testing/selftests/powerpc/mm/pkey_exec_prot >> In file included from pkey_exec_prot.c:18: >> /home/sachin/linux/tools/testing/selftests/powerpc/include/pkeys.h:34: >> error: "SYS_pkey_mprotect" redefined [-Werror] >> #define SYS_pkey_mprotect 386 >> >> In file included from /usr/include/sys/syscall.h:31, >> from >> /home/sachin/linux/tools/testing/selftests/powerpc/include/utils.h:47, >> from >> /home/sachin/linux/tools/testing/selftests/powerpc/include/pkeys.h:12, >> from pkey_exec_prot.c:18: >> /usr/include/bits/syscall.h:1583: note: this is the location of the previous >> definition >> # define SYS_pkey_mprotect __NR_pkey_mprotect >> >> commit 128d3d021007 introduced this error. >> selftests/powerpc: Move pkey helpers to headers >> >> Possibly the # defines for sys calls can be retained in pkey_exec_prot.c or >> > > I am unable to reproduce this on the latest merge branch (HEAD at > f59195f7faa4). > I don't see any redefinitions in pkey_exec_prot.c either. > I can still see this problem on latest merge branch. I have following gcc version gcc version 8.3.1 20191121 # git show commit f59195f7faa4896b7c1d947ac2dba29ec18ad569 (HEAD -> merge, origin/merge) Merge: 70ce795dac09 ac3a0c847296 Author: Michael Ellerman Date: Sun Aug 2 23:18:03 2020 +1000 Automatic merge of 'master', 'next' and 'fixes' (2020-08-02 23:18) # make -C powerpc …… …... 
BUILD_TARGET=/home/sachin/linux/tools/testing/selftests/powerpc/mm; mkdir -p $BUILD_TARGET; make OUTPUT=$BUILD_TARGET -k -C mm all make[1]: Entering directory '/home/sachin/linux/tools/testing/selftests/powerpc/mm' gcc -std=gnu99 -O2 -Wall -Werror -DGIT_VERSION='"v5.8-rc7-1456-gf59195f7faa4"' -I/home/sachin/linux/tools/testing/selftests/powerpc/include -m64 pkey_exec_prot.c /home/sachin/linux/tools/testing/selftests/kselftest_harness.h /home/sachin/linux/tools/testing/selftests/kselftest.h ../harness.c ../utils.c -o /home/sachin/linux/tools/testing/selftests/powerpc/mm/pkey_exec_prot In file included from pkey_exec_prot.c:18: /home/sachin/linux/tools/testing/selftests/powerpc/include/pkeys.h:34: error: "SYS_pkey_mprotect" redefined [-Werror] #define SYS_pkey_mprotect 386 In file included from /usr/include/sys/syscall.h:31, from /home/sachin/linux/tools/testing/selftests/powerpc/include/utils.h:47, from /home/sachin/linux/tools/testing/selftests/powerpc/include/pkeys.h:12, from pkey_exec_prot.c:18: /usr/include/bits/syscall.h:1583: note: this is the location of the previous definition # define SYS_pkey_mprotect __NR_pkey_mprotect In file included from pkey_exec_prot.c:18: /home/sachin/linux/tools/testing/selftests/powerpc/include/pkeys.h:35: error: "SYS_pkey_alloc" redefined [-Werror] #define SYS_pkey_alloc 384 In file included from /usr/include/sys/syscall.h:31, from /home/sachin/linux/tools/testing/selftests/powerpc/include/utils.h:47, from /home/sachin/linux/tools/testing/selftests/powerpc/include/pkeys.h:12, from pkey_exec_prot.c:18: /usr/include/bits/syscall.h:1575: note: this is the location of the previous definition # define SYS_pkey_alloc __NR_pkey_alloc In file included from pkey_exec_prot.c:18: /home/sachin/linux/tools/testing/selftests/powerpc/include/pkeys.h:36: error: "SYS_pkey_free" redefined [-Werror] #define SYS_pkey_free 385 In file included from /usr/include/sys/syscall.h:31, from 
/home/sachin/linux/tools/testing/selftests/powerpc/include/utils.h:47, from /home/sachin/linux/tools/testing/selftests/powerpc/include/pkeys.h:12, from pkey_exec_prot.c:18: /usr/include/bits/syscall.h:1579: note: this is the location of the previous definition # define SYS_pkey_free __NR_pkey_free cc1: all warnings being treated as errors make[1]: *** [../../lib.mk:142: /home/sachin/linux/tools/testing/selftests/powerpc/mm/pkey_exec_prot] Error 1 gcc -std=gnu99 -O2 -Wall -Werror -DGIT_VERSION='"v5.8-rc7-1456-gf59195f7faa4"' -I/home/sachin/linux/tools/testing/selftests/powerpc/include -m64 pkey_siginfo.c /home/sachin/linux/tools/testing/selftests/kselftest_harness.h /home/sachin/linux/tools/testing/selftests/kselftest.h ../harness.c ../utils.c -lpthread -o /home/sachin/linux/tools/testing/selftests/powerpc/mm/pkey_siginfo
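One way to avoid this kind of redefinition is sketched below; it is not necessarily the fix that was applied to the selftests. Each fallback define is guarded so that it only takes effect when the libc headers do not already provide the name. The numbers 386, 384 and 385 are the powerpc values shown in the error output above, and `pkey_numbers_defined()` is a hypothetical helper added only so the sketch can be checked.

```c
#include <sys/syscall.h>	/* newer glibc may already define SYS_pkey_* */

/* Fallbacks for older headers; values are the powerpc ones from the log. */
#ifndef SYS_pkey_mprotect
#define SYS_pkey_mprotect	386
#endif

#ifndef SYS_pkey_alloc
#define SYS_pkey_alloc		384
#endif

#ifndef SYS_pkey_free
#define SYS_pkey_free		385
#endif

/* Hypothetical helper: returns 1 if all three names resolved to a number. */
int pkey_numbers_defined(void)
{
	return SYS_pkey_mprotect > 0 && SYS_pkey_alloc > 0 && SYS_pkey_free > 0;
}
```

With guards like these the header builds against both old and new libc headers, because the local value is used only when the system one is missing.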
Re: powerpc: build failures in Linus' tree
On Mon, Aug 03, 2020 at 02:10:17PM +1000, Stephen Rothwell wrote:
> Our mails have crossed.

Ah indeed :-)

> I just sent a more comprehensive patch. I
> think your patch would require a lot of build testing and even then may
> fail for some CONFIG combination that we didn't test or added in the
> future (or someone just made up).

Yours looks far more complete and very likely more future-proof, I totally
agree.

Thanks!
Willy
Re: [PATCH] powerpc: fix up PPC_FSL_BOOK3E build
Hi all,

On Mon, 3 Aug 2020 13:54:47 +1000 Stephen Rothwell wrote:
>
> Commit
>
>   1c9df907da83 ("random: fix circular include dependency on arm64 after addition of percpu.h")
>
> exposed a circular include dependency:
>
> asm/mmu.h includes asm/percpu.h, which includes asm/paca.h, which
> includes asm/mmu.h
>
> So fix it by extracting the small part of asm/mmu.h that needs
> asm/percpu.h into a new file and including that where necessary.
>
> Cc: Willy Tarreau
> Cc:
> Signed-off-by: Stephen Rothwell

I should have put:

Fixes: 1c9df907da83 ("random: fix circular include dependency on arm64 after addition of percpu.h")

--
Cheers,
Stephen Rothwell
Re: [PATCH v2 15/16] powerpc/powernv/sriov: Make single PE mode a per-BAR setting
On Sun, Aug 02, 2020 at 11:12:23PM +1000, Michael Ellerman wrote:
> Nathan Chancellor writes:
> > On Wed, Jul 22, 2020 at 04:57:14PM +1000, Oliver O'Halloran wrote:
> >> Using single PE BARs to map an SR-IOV BAR is really a choice about what
> >> strategy to use when mapping a BAR. It doesn't make much sense for this to
> >> be a global setting since a device might have one large BAR which needs to
> >> be mapped with single PE windows and another smaller BAR that can be mapped
> >> with a regular segmented window. Make the segmented vs single decision a
> >> per-BAR setting and clean up the logic that decides which mode to use.
> >>
> >> Signed-off-by: Oliver O'Halloran
> >> ---
> >> v2: Dropped unused total_vfs variables in
> >>     pnv_pci_ioda_fixup_iov_resources()
> >>     Dropped bar_no from pnv_pci_iov_resource_alignment()
> >>     Minor re-wording of comments.
> >> ---
> >>  arch/powerpc/platforms/powernv/pci-sriov.c | 131 ++---
> >>  arch/powerpc/platforms/powernv/pci.h       |  11 +-
> >>  2 files changed, 73 insertions(+), 69 deletions(-)
> >>
> >> diff --git a/arch/powerpc/platforms/powernv/pci-sriov.c
> >> b/arch/powerpc/platforms/powernv/pci-sriov.c
> >> index ce8ad6851d73..76215d01405b 100644
> >> --- a/arch/powerpc/platforms/powernv/pci-sriov.c
> >> +++ b/arch/powerpc/platforms/powernv/pci-sriov.c
> >> @@ -260,42 +256,40 @@ void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev)
> >>  resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
> >>                                                 int resno)
> >>  {
> >> -       struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
> >>         struct pnv_iov_data *iov = pnv_iov_get(pdev);
> >>         resource_size_t align;
> >>
> >> +       /*
> >> +        * iov can be null if we have an SR-IOV device with IOV BAR that can't
> >> +        * be placed in the m64 space (i.e. The BAR is 32bit or non-prefetch).
> >> +        * In that case we don't allow VFs to be enabled since one of their
> >> +        * BARs would not be placed in the correct PE.
> >> +        */
> >> +       if (!iov)
> >> +               return align;
> >> +       if (!iov->vfs_expanded)
> >> +               return align;
> >> +
> >> +       align = pci_iov_resource_size(pdev, resno);
>
> That's, oof.
>
> > I am not sure if it has been reported yet but clang points out that
> > align is initialized after its use:
> >
> > arch/powerpc/platforms/powernv/pci-sriov.c:267:10: warning: variable
> > 'align' is uninitialized when used here [-Wuninitialized]
> >                 return align;
> >                        ^
> > arch/powerpc/platforms/powernv/pci-sriov.c:258:23: note: initialize the
> > variable 'align' to silence this warning
> >         resource_size_t align;
> >                              ^
> >                               = 0
> > 1 warning generated.
>
> But I can't get gcc to warn about it?
>
> It produces some code, so it's not like the whole function has been
> elided or something. I'm confused.
>
> cheers

-Wmaybe-uninitialized was disabled in commit 78a5255ffb6a ("Stop the
ad-hoc games with -Wno-maybe-initialized") upstream so GCC won't warn on
stuff like this anymore.

I would assume the function should still be generated since those checks
are relevant, just the return value is bogus.

Cheers,
Nathan
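The problem clang reports can be illustrated with a small standalone sketch in which `align` is assigned before any early return, so every path returns a defined value. This is only one plausible shape for a fix and is not taken from the thread: `struct sketch_iov`, `sketch_iov_alignment()` and the final scaling step are hypothetical, and `bar_size` stands in for the result of `pci_iov_resource_size(pdev, resno)`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct pnv_iov_data. */
struct sketch_iov {
	unsigned int vfs_expanded;
};

/*
 * 'align' is set up front, so the early returns below no longer read an
 * uninitialised variable.
 */
uint64_t sketch_iov_alignment(const struct sketch_iov *iov, uint64_t bar_size)
{
	uint64_t align = bar_size;

	if (!iov)
		return align;
	if (!iov->vfs_expanded)
		return align;

	/* Illustrative only: scale by the expanded VF count. */
	return align * iov->vfs_expanded;
}
```

Ordering the assignment first keeps the early-return structure of the patch while making the returned value well defined on all three paths.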
Re: powerpc: build failures in Linus' tree
Hi Willy,

On Mon, 3 Aug 2020 05:45:47 +0200 Willy Tarreau wrote:
>
> On Sun, Aug 02, 2020 at 07:20:19PM +0200, Willy Tarreau wrote:
> > On Sun, Aug 02, 2020 at 08:48:42PM +1000, Stephen Rothwell wrote:
> > >
> > > We are getting build failures in some PowerPC configs for Linus' tree.
> > > See e.g. http://kisskb.ellerman.id.au/kisskb/buildresult/14306515/
> > >
> > > In file included from /kisskb/src/arch/powerpc/include/asm/paca.h:18,
> > >                  from /kisskb/src/arch/powerpc/include/asm/percpu.h:13,
> > >                  from /kisskb/src/include/linux/random.h:14,
> > >                  from /kisskb/src/include/linux/net.h:18,
> > >                  from /kisskb/src/net/ipv6/ip6_fib.c:20:
> > > /kisskb/src/arch/powerpc/include/asm/mmu.h:139:22: error: unknown type
> > > name 'next_tlbcam_idx'
> > >   139 | DECLARE_PER_CPU(int, next_tlbcam_idx);
> > >
> > > I assume this is caused by commit
> > >
> > >   1c9df907da83 ("random: fix circular include dependency on arm64 after
> > >   addition of percpu.h")
> > >
> > > But I can't see how, sorry.
> >
> > So there, asm/mmu.h includes asm/percpu.h, which includes asm/paca.h, which
> > includes asm/mmu.h.
> >
> > I suspect that we can remove asm/paca.h from asm/percpu.h as it *seems*
> > to be only used by the #define __my_cpu_offset but I don't know if anything
> > will break further, especially if this __my_cpu_offset is used anywhere
> > without this paca definition.
>
> I tried this and it fixed 5.8 for me with your config above. I'm appending
> a patch that does just this. I didn't test other configs as I don't know
> which ones to test though. If it fixes the problem for you, maybe it can
> be picked by the PPC maintainers.

Our mails have crossed. I just sent a more comprehensive patch. I
think your patch would require a lot of build testing and even then may
fail for some CONFIG combination that we didn't test or added in the
future (or someone just made up).

--
Cheers,
Stephen Rothwell
[PATCH] powerpc: Fix P10 PVR revision in /proc/cpuinfo for SMT4 cores
On POWER10 bit 12 in the PVR indicates if the core is SMT4 or
SMT8. Bit 12 is set for SMT4.

Without this patch, /proc/cpuinfo on a SMT4 DD1 POWER10 looks like
this:
    cpu      : POWER10, altivec supported
    revision : 17.0 (pvr 0080 1100)

Signed-off-by: Michael Neuling
---
 arch/powerpc/kernel/setup-common.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index b198b0ff25..808ec9fab6 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -311,6 +311,7 @@ static int show_cpuinfo(struct seq_file *m, void *v)
                        min = pvr & 0xFF;
                        break;
                case 0x004e: /* POWER9 bits 12-15 give chip type */
+               case 0x0080: /* POWER10 bit 12 gives SMT8/4 */
                        maj = (pvr >> 8) & 0x0F;
                        min = pvr & 0xFF;
                        break;
--
2.26.2
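The effect of the extra case can be worked through for the PVR quoted above (0x00801100). The sketch below is standalone and all of its names are hypothetical. The unmasked decoding is assumed to be what applies when no case matches, which is consistent with the 17.0 shown in the commit message; the masked decoding is the one the patch selects for version 0x0080.

```c
#include <stdint.h>

/* Hypothetical container for a decoded revision. */
struct pvr_rev {
	unsigned int maj;
	unsigned int min;
};

/* Assumed fallback decoding: the whole second byte is the major number. */
struct pvr_rev pvr_rev_unmasked(uint32_t pvr)
{
	struct pvr_rev r = { (pvr >> 8) & 0xFF, pvr & 0xFF };
	return r;
}

/* Decoding used by the patched case: bits 12-15 are dropped from major. */
struct pvr_rev pvr_rev_masked(uint32_t pvr)
{
	struct pvr_rev r = { (pvr >> 8) & 0x0F, pvr & 0xFF };
	return r;
}

/* Bit 12 set means SMT4, per the commit message. */
int pvr_is_smt4(uint32_t pvr)
{
	return (pvr >> 12) & 1;
}
```

For 0x00801100 the unmasked form gives major 0x11 (17) and minor 0, while the masked form gives major 1 and minor 0, with bit 12 separately reporting an SMT4 core.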
[PATCH] powerpc: fix up PPC_FSL_BOOK3E build
Commit

  1c9df907da83 ("random: fix circular include dependency on arm64 after addition of percpu.h")

exposed a circular include dependency:

asm/mmu.h includes asm/percpu.h, which includes asm/paca.h, which
includes asm/mmu.h

So fix it by extracting the small part of asm/mmu.h that needs
asm/percpu.h into a new file and including that where necessary.

Cc: Willy Tarreau
Cc:
Signed-off-by: Stephen Rothwell
---

I have done powerpc test builds of allmodconfig, ppc64e_defconfig and
corenet64_smp_defconfig.

 arch/powerpc/include/asm/mmu.h              |  5 -
 arch/powerpc/include/asm/mmu_fsl_e.h        | 10 ++
 arch/powerpc/kernel/smp.c                   |  1 +
 arch/powerpc/mm/mem.c                       |  1 +
 arch/powerpc/mm/nohash/book3e_hugetlbpage.c |  1 +
 arch/powerpc/mm/nohash/tlb.c                |  1 +
 6 files changed, 14 insertions(+), 5 deletions(-)
 create mode 100644 arch/powerpc/include/asm/mmu_fsl_e.h

diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index f4ac25d4df05..fa602a4cf303 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -134,11 +134,6 @@
 typedef pte_t *pgtable_t;
-#ifdef CONFIG_PPC_FSL_BOOK3E
-#include
-DECLARE_PER_CPU(int, next_tlbcam_idx);
-#endif
-
 enum {
        MMU_FTRS_POSSIBLE =
 #ifdef CONFIG_PPC_BOOK3S
diff --git a/arch/powerpc/include/asm/mmu_fsl_e.h b/arch/powerpc/include/asm/mmu_fsl_e.h
new file mode 100644
index ..c74a81556ce5
--- /dev/null
+++ b/arch/powerpc/include/asm/mmu_fsl_e.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_POWERPC_MMU_FSL_E_H_
+#define _ASM_POWERPC_MMU_FSL_E_H_
+
+#ifdef CONFIG_PPC_FSL_BOOK3E
+#include
+DECLARE_PER_CPU(int, next_tlbcam_idx);
+#endif
+
+#endif
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 73199470c265..142b3e7882bf 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -35,6 +35,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index c2c11eb8dcfc..7371061b2126 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -37,6 +37,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
diff --git a/arch/powerpc/mm/nohash/book3e_hugetlbpage.c b/arch/powerpc/mm/nohash/book3e_hugetlbpage.c
index 8b88be91b622..cacda4ee5da5 100644
--- a/arch/powerpc/mm/nohash/book3e_hugetlbpage.c
+++ b/arch/powerpc/mm/nohash/book3e_hugetlbpage.c
@@ -9,6 +9,7 @@
 #include
 #include
+#include
 #include
 #ifdef CONFIG_PPC64
diff --git a/arch/powerpc/mm/nohash/tlb.c b/arch/powerpc/mm/nohash/tlb.c
index 696f568253a0..8b3a68ce7fde 100644
--- a/arch/powerpc/mm/nohash/tlb.c
+++ b/arch/powerpc/mm/nohash/tlb.c
@@ -171,6 +171,7 @@ int extlb_level_exc;
 #endif /* CONFIG_PPC64 */
 #ifdef CONFIG_PPC_FSL_BOOK3E
+#include
 /* next_tlbcam_idx is used to round-robin tlbcam entry assignment */
 DEFINE_PER_CPU(int, next_tlbcam_idx);
 EXPORT_PER_CPU_SYMBOL(next_tlbcam_idx);
--
2.28.0

--
Cheers,
Stephen Rothwell
Re: powerpc: build failures in Linus' tree
Hi again Stephen, On Sun, Aug 02, 2020 at 07:20:19PM +0200, Willy Tarreau wrote: > On Sun, Aug 02, 2020 at 08:48:42PM +1000, Stephen Rothwell wrote: > > Hi all, > > > > We are getting build failures in some PowerPC configs for Linus' tree. > > See e.g. http://kisskb.ellerman.id.au/kisskb/buildresult/14306515/ > > > > In file included from /kisskb/src/arch/powerpc/include/asm/paca.h:18, > > from /kisskb/src/arch/powerpc/include/asm/percpu.h:13, > > from /kisskb/src/include/linux/random.h:14, > > from /kisskb/src/include/linux/net.h:18, > > from /kisskb/src/net/ipv6/ip6_fib.c:20: > > /kisskb/src/arch/powerpc/include/asm/mmu.h:139:22: error: unknown type name > > 'next_tlbcam_idx' > > 139 | DECLARE_PER_CPU(int, next_tlbcam_idx); > > > > I assume this is caused by commit > > > > 1c9df907da83 ("random: fix circular include dependency on arm64 after > > addition of percpu.h") > > > > But I can't see how, sorry. > > So there, asm/mmu.h includes asm/percpu.h, which includes asm/paca.h, which > includes asm/mmu.h. > > I suspect that we can remove asm/paca.h from asm/percpu.h as it *seems* > to be only used by the #define __my_cpu_offset but I don't know if anything > will break further, especially if this __my_cpu_offset is used anywhere > without this paca definition. I tried this and it fixed 5.8 for me with your config above. I'm appending a patch that does just this. I didn't test other configs as I don't know which ones to test though. If it fixes the problem for you, maybe it can be picked by the PPC maintainers. Willy >From bcd64a7d0f3445c9a75d3b4dc4837d2ce61660c9 Mon Sep 17 00:00:00 2001 From: Willy Tarreau Date: Mon, 3 Aug 2020 05:27:57 +0200 Subject: powerpc: fix circular dependency in percpu.h After random.h started to include percpu.h (commit f227e3e), several archs broke in circular dependencies around percpu.h. In https://lore.kernel.org/lkml/20200802204842.36bca...@canb.auug.org.au/ Stephen Rothwell reported breakage for powerpc with CONFIG_PPC_FSL_BOOK3E. 
It turns out that asm/percpu.h includes asm/paca.h, which itself includes mmu.h, which includes percpu.h when CONFIG_PPC_FSL_BOOK3E=y. Percpu seems to include asm/paca.h only for local_paca which is used in the __my_cpu_offset macro. Removing this include solves the issue for this config. Reported-by: Stephen Rothwell Fixes: f227e3e ("random32: update the net random state on interrupt and activity") Link: https://lore.kernel.org/lkml/20200802204842.36bca...@canb.auug.org.au/ Cc: Linus Torvalds Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Signed-off-by: Willy Tarreau --- arch/powerpc/include/asm/percpu.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/powerpc/include/asm/percpu.h b/arch/powerpc/include/asm/percpu.h index dce863a..cd3f6e5 100644 --- a/arch/powerpc/include/asm/percpu.h +++ b/arch/powerpc/include/asm/percpu.h @@ -10,8 +10,6 @@ #ifdef CONFIG_SMP -#include - #define __my_cpu_offset local_paca->data_offset #endif /* CONFIG_SMP */ -- 2.9.0
[powerpc:merge] BUILD SUCCESS f59195f7faa4896b7c1d947ac2dba29ec18ad569
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git  merge
branch HEAD: f59195f7faa4896b7c1d947ac2dba29ec18ad569  Automatic merge of 'master', 'next' and 'fixes' (2020-08-02 23:18)

elapsed time: 801m

configs tested: 99
configs skipped: 10

The following configs have been built successfully.
More configs may be tested in the coming days.

arm         defconfig
arm64       allyesconfig
arm64       defconfig
arm         allyesconfig
arm         allmodconfig
powerpc     chrp32_defconfig
arc         nsimosci_hs_smp_defconfig
arm         pxa910_defconfig
sh          sh7757lcr_defconfig
arm         aspeed_g4_defconfig
sh          defconfig
mips        decstation_defconfig
sh          se7343_defconfig
mips        gcw0_defconfig
arm         pxa255-idp_defconfig
arm         mini2440_defconfig
sh          rts7751r2d1_defconfig
nios2       3c120_defconfig
c6x         evmc6457_defconfig
h8300       defconfig
arm         shmobile_defconfig
arm         pleb_defconfig
powerpc     mpc7448_hpc2_defconfig
arm         zeus_defconfig
c6x         dsk6455_defconfig
sh          ap325rxa_defconfig
nios2       10m50_defconfig
powerpc64   defconfig
m68k        bvme6000_defconfig
mips        tb0226_defconfig
sh          se7780_defconfig
arc         vdk_hs38_smp_defconfig
arm         milbeaut_m10v_defconfig
c6x         evmc6472_defconfig
mips        ath79_defconfig
mips        cavium_octeon_defconfig
powerpc     ppc64_defconfig
mips        mips_paravirt_defconfig
ia64        allmodconfig
ia64        defconfig
ia64        allyesconfig
m68k        allmodconfig
m68k        defconfig
m68k        allyesconfig
nds32       defconfig
nios2       allyesconfig
csky        defconfig
alpha       defconfig
alpha       allyesconfig
xtensa      allyesconfig
h8300       allyesconfig
arc         defconfig
sh          allmodconfig
parisc      defconfig
s390        allyesconfig
parisc      allyesconfig
s390        defconfig
i386        allyesconfig
sparc       allyesconfig
sparc       defconfig
i386        defconfig
nios2       defconfig
arc         allyesconfig
nds32       allnoconfig
c6x         allyesconfig
mips        allyesconfig
mips        allmodconfig
powerpc     defconfig
powerpc     allyesconfig
powerpc     allmodconfig
powerpc     allnoconfig
i386        randconfig-a004-20200802
i386        randconfig-a005-20200802
i386        randconfig-a001-20200802
i386        randconfig-a002-20200802
i386        randconfig-a003-20200802
i386        randconfig-a006-20200802
i386        randconfig-a011-20200802
i386        randconfig-a012-20200802
i386        randconfig-a015-20200802
i386        randconfig-a014-20200802
i386        randconfig-a013-20200802
i386        randconfig-a016-20200802
x86_64      randconfig-a006-20200802
x86_64      randconfig-a001-20200802
x86_64      randconfig-a004-20200802
x86_64      randconfig-a003-20200802
x86_64      randconfig-a002-20200802
x86_64      randconfig-a005-20200802
riscv       allyesconfig
riscv       allnoconfig
riscv       defconfig
riscv       allmodconfig
x86_64      rhel
x86_64      allyesconfig
x86_64      rhel-7.6-kselftests
x86_64      defconfig
x86_64      rhel-8.3
x86_64      kexec

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org
[powerpc:next] BUILD SUCCESS af0870c4e75655b1931d0a5ffde2f448a2794362
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git  next
branch HEAD: af0870c4e75655b1931d0a5ffde2f448a2794362  powerpc/papr_scm: Add support for fetching nvdimm 'fuel-gauge' metric

elapsed time: 800m

configs tested: 96
configs skipped: 12

The following configs have been built successfully.
More configs may be tested in the coming days.

arm         defconfig
arm64       allyesconfig
arm64       defconfig
arm         allyesconfig
arm         allmodconfig
mips        jazz_defconfig
mips        ar7_defconfig
c6x         dsk6455_defconfig
m68k        amcore_defconfig
ia64        gensparse_defconfig
mips        gcw0_defconfig
arm         pxa255-idp_defconfig
arm         mini2440_defconfig
powerpc     adder875_defconfig
powerpc     g5_defconfig
sh          rts7751r2dplus_defconfig
s390        defconfig
arm         jornada720_defconfig
sh          rts7751r2d1_defconfig
nios2       3c120_defconfig
c6x         evmc6457_defconfig
h8300       defconfig
arm         zeus_defconfig
m68k        defconfig
arm         bcm2835_defconfig
powerpc     pseries_defconfig
arm         cerfcube_defconfig
arm         multi_v7_defconfig
nds32       alldefconfig
arc         haps_hs_defconfig
arm         exynos_defconfig
mips        pistachio_defconfig
powerpc     amigaone_defconfig
arm         nhk8815_defconfig
mips        cavium_octeon_defconfig
powerpc     ppc64_defconfig
mips        mips_paravirt_defconfig
ia64        allmodconfig
ia64        defconfig
ia64        allyesconfig
m68k        allmodconfig
m68k        allyesconfig
nios2       defconfig
arc         allyesconfig
nds32       allnoconfig
c6x         allyesconfig
nds32       defconfig
nios2       allyesconfig
csky        defconfig
alpha       defconfig
alpha       allyesconfig
xtensa      allyesconfig
h8300       allyesconfig
arc         defconfig
sh          allmodconfig
parisc      defconfig
s390        allyesconfig
parisc      allyesconfig
i386        allyesconfig
sparc       allyesconfig
sparc       defconfig
i386        defconfig
mips        allyesconfig
mips        allmodconfig
powerpc     defconfig
powerpc     allyesconfig
powerpc     allmodconfig
powerpc     allnoconfig
i386        randconfig-a004-20200802
i386        randconfig-a005-20200802
i386        randconfig-a006-20200802
i386        randconfig-a001-20200802
i386        randconfig-a002-20200802
i386        randconfig-a003-20200802
x86_64      randconfig-a006-20200802
x86_64      randconfig-a001-20200802
x86_64      randconfig-a004-20200802
x86_64      randconfig-a003-20200802
x86_64      randconfig-a002-20200802
x86_64      randconfig-a005-20200802
i386        randconfig-a011-20200802
i386        randconfig-a012-20200802
i386        randconfig-a015-20200802
i386        randconfig-a014-20200802
i386        randconfig-a013-20200802
i386        randconfig-a016-20200802
riscv       allyesconfig
riscv       allnoconfig
riscv       defconfig
riscv       allmodconfig
x86_64      rhel
x86_64      allyesconfig
x86_64      rhel-7.6-kselftests
x86_64      defconfig
x86_64      kexec
x86_64      rhel-8.3

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
[powerpc:next-test] BUILD SUCCESS 7f7917ae4d306a72ef9f6265028d8d203702f0b8
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git  next-test
branch HEAD: 7f7917ae4d306a72ef9f6265028d8d203702f0b8  selftests/powerpc: Skip vmx/vsx tests on older CPUs

elapsed time: 799m

configs tested: 106
configs skipped: 12

The following configs have been built successfully.
More configs may be tested in the coming days.

arm         defconfig
arm64       allyesconfig
arm64       defconfig
arm         allyesconfig
arm         allmodconfig
powerpc     chrp32_defconfig
arm         pxa910_defconfig
sh          sh7757lcr_defconfig
sh          se7780_defconfig
arc         nsimosci_hs_smp_defconfig
arm         colibri_pxa300_defconfig
arm         gemini_defconfig
c6x         evmc6678_defconfig
arm         multi_v5_defconfig
m68k        m5272c3_defconfig
mips        gcw0_defconfig
arm         pxa255-idp_defconfig
arm         mini2440_defconfig
powerpc     adder875_defconfig
powerpc     g5_defconfig
sh          rts7751r2dplus_defconfig
s390        defconfig
arm         jornada720_defconfig
sh          rts7751r2d1_defconfig
nios2       3c120_defconfig
c6x         evmc6457_defconfig
h8300       defconfig
arm         zeus_defconfig
c6x         dsk6455_defconfig
m68k        defconfig
nds32       alldefconfig
arc         haps_hs_defconfig
arm         exynos_defconfig
mips        pistachio_defconfig
arm         am200epdkit_defconfig
arm         lpc32xx_defconfig
alpha       allyesconfig
mips        ath25_defconfig
mips        maltaaprp_defconfig
powerpc     amigaone_defconfig
arm         nhk8815_defconfig
arc         vdk_hs38_smp_defconfig
arm         milbeaut_m10v_defconfig
c6x         evmc6472_defconfig
mips        ath79_defconfig
mips        cavium_octeon_defconfig
powerpc     ppc64_defconfig
mips        mips_paravirt_defconfig
ia64        allmodconfig
ia64        defconfig
ia64        allyesconfig
m68k        allmodconfig
m68k        allyesconfig
nios2       defconfig
arc         allyesconfig
nds32       allnoconfig
c6x         allyesconfig
nds32       defconfig
nios2       allyesconfig
csky        defconfig
alpha       defconfig
xtensa      allyesconfig
h8300       allyesconfig
arc         defconfig
sh          allmodconfig
parisc      defconfig
s390        allyesconfig
parisc      allyesconfig
i386        allyesconfig
sparc       allyesconfig
sparc       defconfig
i386        defconfig
mips        allyesconfig
mips        allmodconfig
powerpc     allyesconfig
powerpc     allmodconfig
powerpc     allnoconfig
powerpc     defconfig
i386        randconfig-a004-20200802
i386        randconfig-a005-20200802
i386        randconfig-a001-20200802
i386        randconfig-a002-20200802
i386        randconfig-a003-20200802
i386        randconfig-a006-20200802
x86_64      randconfig-a006-20200802
x86_64      randconfig-a001-20200802
x86_64      randconfig-a004-20200802
x86_64      randconfig-a003-20200802
x86_64      randconfig-a002-20200802
x86_64      randconfig-a005-20200802
i386        randconfig-a011-20200802
i386        randconfig-a012-20200802
i386        randconfig-a015-20200802
i386        randconfig-a014-20200802
i386        randconfig-a013-20200802
i386        randconfig-a016-20200802
riscv       allyesconfig
riscv       allnoconfig
riscv       defconfig
riscv       allmodconfig
x86_64
[PATCH v2] selftests/powerpc: Skip vmx/vsx/tar/etc tests on older CPUs
Some of our tests use VSX or newer VMX instructions, so need to be skipped on older CPUs to avoid SIGILL'ing. Similarly TAR was added in v2.07, and the PMU event used in the stcx fail test only works on Power8 or later. Signed-off-by: Michael Ellerman --- tools/testing/selftests/powerpc/math/Makefile | 8 tools/testing/selftests/powerpc/math/vmx_preempt.c| 3 +++ tools/testing/selftests/powerpc/math/vmx_signal.c | 3 +++ tools/testing/selftests/powerpc/math/vmx_syscall.c| 7 ++- tools/testing/selftests/powerpc/math/vsx_preempt.c| 2 ++ tools/testing/selftests/powerpc/pmu/count_stcx_fail.c | 4 tools/testing/selftests/powerpc/ptrace/ptrace-tar.c | 3 +++ tools/testing/selftests/powerpc/ptrace/ptrace-vsx.c | 2 ++ tools/testing/selftests/powerpc/stringloops/Makefile | 2 +- tools/testing/selftests/powerpc/stringloops/memcmp.c | 6 ++ 10 files changed, 34 insertions(+), 6 deletions(-) v2: Skip a few more tests. diff --git a/tools/testing/selftests/powerpc/math/Makefile b/tools/testing/selftests/powerpc/math/Makefile index 4e2049d2fd8d..fcc91c205984 100644 --- a/tools/testing/selftests/powerpc/math/Makefile +++ b/tools/testing/selftests/powerpc/math/Makefile @@ -11,9 +11,9 @@ $(OUTPUT)/fpu_syscall: fpu_asm.S $(OUTPUT)/fpu_preempt: fpu_asm.S $(OUTPUT)/fpu_signal: fpu_asm.S -$(OUTPUT)/vmx_syscall: vmx_asm.S -$(OUTPUT)/vmx_preempt: vmx_asm.S -$(OUTPUT)/vmx_signal: vmx_asm.S +$(OUTPUT)/vmx_syscall: vmx_asm.S ../utils.c +$(OUTPUT)/vmx_preempt: vmx_asm.S ../utils.c +$(OUTPUT)/vmx_signal: vmx_asm.S ../utils.c $(OUTPUT)/vsx_preempt: CFLAGS += -mvsx -$(OUTPUT)/vsx_preempt: vsx_asm.S +$(OUTPUT)/vsx_preempt: vsx_asm.S ../utils.c diff --git a/tools/testing/selftests/powerpc/math/vmx_preempt.c b/tools/testing/selftests/powerpc/math/vmx_preempt.c index 2e059f154e77..6761d6ce30ec 100644 --- a/tools/testing/selftests/powerpc/math/vmx_preempt.c +++ b/tools/testing/selftests/powerpc/math/vmx_preempt.c @@ -57,6 +57,9 @@ int test_preempt_vmx(void) int i, rc, threads; pthread_t *tids; + // vcmpequd 
used in vmx_asm.S is v2.07 + SKIP_IF(!have_hwcap2(PPC_FEATURE2_ARCH_2_07)); + threads = sysconf(_SC_NPROCESSORS_ONLN) * THREAD_FACTOR; tids = malloc(threads * sizeof(pthread_t)); FAIL_IF(!tids); diff --git a/tools/testing/selftests/powerpc/math/vmx_signal.c b/tools/testing/selftests/powerpc/math/vmx_signal.c index 785a48e0976f..b340a5c4e79d 100644 --- a/tools/testing/selftests/powerpc/math/vmx_signal.c +++ b/tools/testing/selftests/powerpc/math/vmx_signal.c @@ -96,6 +96,9 @@ int test_signal_vmx(void) void *rc_p; pthread_t *tids; + // vcmpequd used in vmx_asm.S is v2.07 + SKIP_IF(!have_hwcap2(PPC_FEATURE2_ARCH_2_07)); + threads = sysconf(_SC_NPROCESSORS_ONLN) * THREAD_FACTOR; tids = malloc(threads * sizeof(pthread_t)); FAIL_IF(!tids); diff --git a/tools/testing/selftests/powerpc/math/vmx_syscall.c b/tools/testing/selftests/powerpc/math/vmx_syscall.c index 9ee293cc868e..03c78dfe3444 100644 --- a/tools/testing/selftests/powerpc/math/vmx_syscall.c +++ b/tools/testing/selftests/powerpc/math/vmx_syscall.c @@ -49,9 +49,14 @@ int test_vmx_syscall(void) * Setup an environment with much context switching */ pid_t pid2; - pid_t pid = fork(); + pid_t pid; int ret; int child_ret; + + // vcmpequd used in vmx_asm.S is v2.07 + SKIP_IF(!have_hwcap2(PPC_FEATURE2_ARCH_2_07)); + + pid = fork(); FAIL_IF(pid == -1); pid2 = fork(); diff --git a/tools/testing/selftests/powerpc/math/vsx_preempt.c b/tools/testing/selftests/powerpc/math/vsx_preempt.c index 63de9c6e2cd3..d1601bb889d4 100644 --- a/tools/testing/selftests/powerpc/math/vsx_preempt.c +++ b/tools/testing/selftests/powerpc/math/vsx_preempt.c @@ -92,6 +92,8 @@ int test_preempt_vsx(void) int i, rc, threads; pthread_t *tids; + SKIP_IF(!have_hwcap(PPC_FEATURE_HAS_VSX)); + threads = sysconf(_SC_NPROCESSORS_ONLN) * THREAD_FACTOR; tids = malloc(threads * sizeof(pthread_t)); FAIL_IF(!tids); diff --git a/tools/testing/selftests/powerpc/pmu/count_stcx_fail.c b/tools/testing/selftests/powerpc/pmu/count_stcx_fail.c index 
7b4ac4537702..2980abca31e0 100644 --- a/tools/testing/selftests/powerpc/pmu/count_stcx_fail.c +++ b/tools/testing/selftests/powerpc/pmu/count_stcx_fail.c @@ -9,6 +9,7 @@ #include #include #include +#include #include "event.h" #include "utils.h" @@ -104,6 +105,9 @@ static int test_body(void) struct event events[3]; u64 overhead; + // The STCX_FAIL event we use works on Power8 or later + SKIP_IF(!have_hwcap2(PPC_FEATURE2_ARCH_2_07)); + setup_event(&events[0], PERF_COUNT_HW_INSTRUCTIONS, PERF_TYPE_HARDWARE, "instructions"); setup_event(&events[1], PERF_COUNT_HW_CPU_CYCLES, PERF_TYPE_HARDWARE, "cycles"); setup_event(&events[2], PM_STCX_FAIL,
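The skip checks added by this patch come down to testing a feature bit reported through the auxiliary vector. A minimal userspace sketch of that idea follows. The names feature_present() and hwcap2_has() are hypothetical and chosen for illustration only; they are not the selftest helpers, and the real PPC_FEATURE2_ARCH_2_07 value is deliberately not reproduced here.

```c
#include <stdbool.h>
#include <sys/auxv.h>	/* getauxval(), AT_HWCAP2 (Linux libc) */

/* Pure helper: true if every bit in 'feature' is set in 'hwcap'. */
bool feature_present(unsigned long hwcap, unsigned long feature)
{
	return (hwcap & feature) == feature;
}

/*
 * Hypothetical wrapper (an assumption, not the selftest code): read
 * AT_HWCAP2 from the auxiliary vector and test the requested bit.
 */
bool hwcap2_has(unsigned long feature)
{
	return feature_present(getauxval(AT_HWCAP2), feature);
}
```

A test would then bail out early, before executing any instruction the CPU may not implement, when hwcap2_has() returns false for the architecture level it needs.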
Re: [PATCH v2 17/17] memblock: use separate iterators for memory and reserved regions
* Mike Rapoport wrote:

> From: Mike Rapoport
>
> for_each_memblock() is used to iterate over memblock.memory in
> a few places that use data from memblock_region rather than the memory
> ranges.
>
> Introduce separate for_each_mem_region() and for_each_reserved_mem_region()
> to improve encapsulation of memblock internals from its users.
>
> Signed-off-by: Mike Rapoport
> ---
>  .clang-format                  |  3 ++-
>  arch/arm64/kernel/setup.c      |  2 +-
>  arch/arm64/mm/numa.c           |  2 +-
>  arch/mips/netlogic/xlp/setup.c |  2 +-
>  arch/x86/mm/numa.c             |  2 +-
>  include/linux/memblock.h       | 19 ---
>  mm/memblock.c                  |  4 ++--
>  mm/page_alloc.c                |  8
>  8 files changed, 28 insertions(+), 14 deletions(-)

The x86 part:

Acked-by: Ingo Molnar

Thanks,

	Ingo
Re: [PATCH v2 14/17] x86/setup: simplify reserve_crashkernel()
* Mike Rapoport wrote:

> From: Mike Rapoport
>
> * Replace magic numbers with defines
> * Replace memblock_find_in_range() + memblock_reserve() with
>   memblock_phys_alloc_range()
> * Stop checking for low memory size in reserve_crashkernel_low(). The
>   allocation from limited range will anyway fail if there is not enough
>   memory, so there is no need for extra traversal of memblock.memory
>
> Signed-off-by: Mike Rapoport

Assuming that this got or will get tested with a crash kernel:

Acked-by: Ingo Molnar

Thanks,

	Ingo
Re: [PATCH v2 13/17] x86/setup: simplify initrd relocation and reservation
* Mike Rapoport wrote:

> From: Mike Rapoport
>
> Currently, initrd image is reserved very early during setup and then it
> might be relocated and re-reserved after the initial physical memory
> mapping is created. The "late" reservation of memblock verifies that mapped
> memory size exceeds the size of initrd, then checks whether relocation is
> required and, if so, relocates the initrd to new memory allocated from
> memblock and frees the old location.
>
> The check for memory size is excessive as memblock allocation will anyway
> fail if there is not enough memory. Besides, there is no point to allocate
> memory from memblock using memblock_find_in_range() + memblock_reserve()
> when there exists memblock_phys_alloc_range() with required functionality.
>
> Remove the redundant check and simplify memblock allocation.
>
> Signed-off-by: Mike Rapoport

Assuming there's no hidden dependency here breaking something:

Acked-by: Ingo Molnar

Thanks,

	Ingo
Re: powerpc: build failures in Linus' tree
On Sun, Aug 02, 2020 at 08:48:42PM +1000, Stephen Rothwell wrote:
> Hi all,
>
> We are getting build failures in some PowerPC configs for Linus' tree.
> See e.g. http://kisskb.ellerman.id.au/kisskb/buildresult/14306515/
>
> In file included from /kisskb/src/arch/powerpc/include/asm/paca.h:18,
>                  from /kisskb/src/arch/powerpc/include/asm/percpu.h:13,
>                  from /kisskb/src/include/linux/random.h:14,
>                  from /kisskb/src/include/linux/net.h:18,
>                  from /kisskb/src/net/ipv6/ip6_fib.c:20:
> /kisskb/src/arch/powerpc/include/asm/mmu.h:139:22: error: unknown type name
> 'next_tlbcam_idx'
>   139 | DECLARE_PER_CPU(int, next_tlbcam_idx);
>
> I assume this is caused by commit
>
>   1c9df907da83 ("random: fix circular include dependency on arm64 after
>   addition of percpu.h")
>
> But I can't see how, sorry.

So there, asm/mmu.h includes asm/percpu.h, which includes asm/paca.h, which
includes asm/mmu.h.

I suspect that we can remove asm/paca.h from asm/percpu.h as it *seems* to
be only used by the #define __my_cpu_offset but I don't know if anything
will break further, especially if this __my_cpu_offset is used anywhere
without this paca definition.

Willy
Re: [merge] Build failure selftest/powerpc/mm/pkey_exec_prot
Hi Sachin,

On 02/08/20 4:45 pm, Sachin Sant wrote:
> pkey_exec_prot test from linuxppc merge branch (3f68564f1f5a) fails to
> build due to following error:
>
> gcc -std=gnu99 -O2 -Wall -Werror
> -DGIT_VERSION='"v5.8-rc7-1276-g3f68564f1f5a"'
> -I/home/sachin/linux/tools/testing/selftests/powerpc/include -m64
> pkey_exec_prot.c
> /home/sachin/linux/tools/testing/selftests/kselftest_harness.h
> /home/sachin/linux/tools/testing/selftests/kselftest.h ../harness.c
> ../utils.c -o
> /home/sachin/linux/tools/testing/selftests/powerpc/mm/pkey_exec_prot
> In file included from pkey_exec_prot.c:18:
> /home/sachin/linux/tools/testing/selftests/powerpc/include/pkeys.h:34: error:
> "SYS_pkey_mprotect" redefined [-Werror]
>  #define SYS_pkey_mprotect 386
>
> In file included from /usr/include/sys/syscall.h:31,
>                  from
> /home/sachin/linux/tools/testing/selftests/powerpc/include/utils.h:47,
>                  from
> /home/sachin/linux/tools/testing/selftests/powerpc/include/pkeys.h:12,
>                  from pkey_exec_prot.c:18:
> /usr/include/bits/syscall.h:1583: note: this is the location of the previous
> definition
>  # define SYS_pkey_mprotect __NR_pkey_mprotect
>
> commit 128d3d021007 introduced this error.
> selftests/powerpc: Move pkey helpers to headers
>
> Possibly the # defines for sys calls can be retained in pkey_exec_prot.c or
>

I am unable to reproduce this on the latest merge branch (HEAD at
f59195f7faa4). I don't see any redefinitions in pkey_exec_prot.c either.

- Sandipan
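For reference, the redefinition error quoted above is the kind usually avoided by guarding the fallback definition. The sketch below is an assumption about one conventional approach, not a description of what the merge branch actually contains. The value 386 is taken from the quoted build log, and pkey_mprotect_nr() is a hypothetical helper added only so the macro can be inspected.

```c
#include <sys/syscall.h>	/* may already provide SYS_pkey_mprotect */

/*
 * Fallback only. If libc already defines the macro, its definition is
 * kept, so -Werror builds do not trip over a redefinition.
 * 386 is the value shown in the build log quoted in this thread.
 */
#ifndef SYS_pkey_mprotect
#define SYS_pkey_mprotect 386
#endif

/* Hypothetical helper: report whichever definition ended up in effect. */
long pkey_mprotect_nr(void)
{
	return SYS_pkey_mprotect;
}
```

With the guard in place the header compiles the same way on systems whose libc headers know the syscall and on those that do not.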
[PATCH v2 17/17] memblock: use separate iterators for memory and reserved regions
From: Mike Rapoport for_each_memblock() is used to iterate over memblock.memory in a few places that use data from memblock_region rather than the memory ranges. Introduce separate for_each_mem_region() and for_each_reserved_mem_region() to improve encapsulation of memblock internals from its users. Signed-off-by: Mike Rapoport --- .clang-format | 3 ++- arch/arm64/kernel/setup.c | 2 +- arch/arm64/mm/numa.c | 2 +- arch/mips/netlogic/xlp/setup.c | 2 +- arch/x86/mm/numa.c | 2 +- include/linux/memblock.h | 19 --- mm/memblock.c | 4 ++-- mm/page_alloc.c| 8 8 files changed, 28 insertions(+), 14 deletions(-) diff --git a/.clang-format b/.clang-format index e28a849a1c58..cff71d345456 100644 --- a/.clang-format +++ b/.clang-format @@ -201,7 +201,7 @@ ForEachMacros: - 'for_each_matching_node' - 'for_each_matching_node_and_match' - 'for_each_member' - - 'for_each_memblock' + - 'for_each_mem_region' - 'for_each_memblock_type' - 'for_each_memcg_cache_index' - 'for_each_mem_pfn_range' @@ -267,6 +267,7 @@ ForEachMacros: - 'for_each_property_of_node' - 'for_each_registered_fb' - 'for_each_reserved_mem_range' + - 'for_each_reserved_mem_region' - 'for_each_rtd_codec_dais' - 'for_each_rtd_codec_dais_rollback' - 'for_each_rtd_components' diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c index f3aec7244aab..52ea2f1a7184 100644 --- a/arch/arm64/kernel/setup.c +++ b/arch/arm64/kernel/setup.c @@ -217,7 +217,7 @@ static void __init request_standard_resources(void) if (!standard_resources) panic("%s: Failed to allocate %zu bytes\n", __func__, res_size); - for_each_memblock(memory, region) { + for_each_mem_region(region) { res = _resources[i++]; if (memblock_is_nomap(region)) { res->name = "reserved"; diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c index 0cbdbcc885fb..f121e42246a6 100644 --- a/arch/arm64/mm/numa.c +++ b/arch/arm64/mm/numa.c @@ -350,7 +350,7 @@ static int __init numa_register_nodes(void) struct memblock_region *mblk; /* Check that valid nid is set to 
memblks */ - for_each_memblock(memory, mblk) { + for_each_mem_region(mblk) { int mblk_nid = memblock_get_region_node(mblk); if (mblk_nid == NUMA_NO_NODE || mblk_nid >= MAX_NUMNODES) { diff --git a/arch/mips/netlogic/xlp/setup.c b/arch/mips/netlogic/xlp/setup.c index 1a0fc5b62ba4..6e3102bcd2f1 100644 --- a/arch/mips/netlogic/xlp/setup.c +++ b/arch/mips/netlogic/xlp/setup.c @@ -70,7 +70,7 @@ static void nlm_fixup_mem(void) const int pref_backup = 512; struct memblock_region *mem; - for_each_memblock(memory, mem) { + for_each_mem_region(mem) { memblock_remove(mem->base + mem->size - pref_backup, pref_backup); } diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c index 8ee952038c80..fe6ea18d6923 100644 --- a/arch/x86/mm/numa.c +++ b/arch/x86/mm/numa.c @@ -516,7 +516,7 @@ static void __init numa_clear_kernel_node_hotplug(void) * memory ranges, because quirks such as trim_snb_memory() * reserve specific pages for Sandy Bridge graphics. ] */ - for_each_memblock(reserved, mb_region) { + for_each_reserved_mem_region(mb_region) { int nid = memblock_get_region_node(mb_region); if (nid != MAX_NUMNODES) diff --git a/include/linux/memblock.h b/include/linux/memblock.h index 9e51b3fd4134..a6970e058bd7 100644 --- a/include/linux/memblock.h +++ b/include/linux/memblock.h @@ -522,9 +522,22 @@ static inline unsigned long memblock_region_reserved_end_pfn(const struct memblo return PFN_UP(reg->base + reg->size); } -#define for_each_memblock(memblock_type, region) \ - for (region = memblock.memblock_type.regions; \ -region < (memblock.memblock_type.regions + memblock.memblock_type.cnt);\ +/** + * for_each_mem_region - itereate over registered memory regions + * @region: loop variable + */ +#define for_each_mem_region(region)\ + for (region = memblock.memory.regions; \ +region < (memblock.memory.regions + memblock.memory.cnt); \ +region++) + +/** + * for_each_reserved_mem_region - itereate over reserved memory regions + * @region: loop variable + */ +#define 
for_each_reserved_mem_region(region) \ + for (region = memblock.reserved.regions;\ +region < (memblock.reserved.regions + memblock.reserved.cnt); \ region++) extern void *alloc_large_system_hash(const char *tablename, diff --git a/mm/memblock.c b/mm/memblock.c
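The two macros introduced here are plain pointer walks over a region array and its count. A small userspace model of that shape is sketched below. All names (mem_model, for_each_model_mem_region() and so on) and the sample regions are made up for illustration; this is not kernel code and does not use the real memblock structures.

```c
#include <stddef.h>

struct mem_region { unsigned long long base, size; };
struct mem_type { size_t cnt; struct mem_region *regions; };
struct mem_model { struct mem_type memory, reserved; };

/* Sample data, purely illustrative. */
static struct mem_region memory_regions[] = {
	{ 0x0000, 0x4000 }, { 0x8000, 0x2000 },
};
static struct mem_region reserved_regions[] = {
	{ 0x1000, 0x0800 },
};
static struct mem_model model = {
	.memory   = { 2, memory_regions },
	.reserved = { 1, reserved_regions },
};

/* One iterator per region type, so callers never name the type. */
#define for_each_model_mem_region(region)				\
	for (region = model.memory.regions;				\
	     region < (model.memory.regions + model.memory.cnt);	\
	     region++)

#define for_each_model_reserved_region(region)				\
	for (region = model.reserved.regions;				\
	     region < (model.reserved.regions + model.reserved.cnt);	\
	     region++)

/* Sum the sizes of either the memory or the reserved regions. */
unsigned long long total_size(int reserved)
{
	struct mem_region *r;
	unsigned long long sum = 0;

	if (reserved) {
		for_each_model_reserved_region(r)
			sum += r->size;
	} else {
		for_each_model_mem_region(r)
			sum += r->size;
	}
	return sum;
}
```

The point of the split is visible at the call site: the caller passes only the loop variable and cannot reach into an arbitrary member of the container.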
[PATCH v2 16/17] memblock: implement for_each_reserved_mem_region() using __next_mem_region()
From: Mike Rapoport Iteration over memblock.reserved with for_each_reserved_mem_region() used __next_reserved_mem_region() that implemented a subset of __next_mem_region(). Use __for_each_mem_range() and, essentially, __next_mem_region() with appropriate parameters to reduce code duplication. While on it, rename for_each_reserved_mem_region() to for_each_reserved_mem_range() for consistency. Signed-off-by: Mike Rapoport --- .clang-format| 2 +- arch/arm64/kernel/setup.c| 2 +- drivers/irqchip/irq-gic-v3-its.c | 2 +- include/linux/memblock.h | 12 +++-- mm/memblock.c| 46 +++- 5 files changed, 17 insertions(+), 47 deletions(-) diff --git a/.clang-format b/.clang-format index 52ededab25ce..e28a849a1c58 100644 --- a/.clang-format +++ b/.clang-format @@ -266,7 +266,7 @@ ForEachMacros: - 'for_each_process_thread' - 'for_each_property_of_node' - 'for_each_registered_fb' - - 'for_each_reserved_mem_region' + - 'for_each_reserved_mem_range' - 'for_each_rtd_codec_dais' - 'for_each_rtd_codec_dais_rollback' - 'for_each_rtd_components' diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c index 93b3844cf442..f3aec7244aab 100644 --- a/arch/arm64/kernel/setup.c +++ b/arch/arm64/kernel/setup.c @@ -257,7 +257,7 @@ static int __init reserve_memblock_reserved_regions(void) if (!memblock_is_region_reserved(mem->start, mem_size)) continue; - for_each_reserved_mem_region(j, _start, _end) { + for_each_reserved_mem_range(j, _start, _end) { resource_size_t start, end; start = max(PFN_PHYS(PFN_DOWN(r_start)), mem->start); diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index beac4caefad9..9971fd8cf6b6 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -2192,7 +2192,7 @@ static bool gic_check_reserved_range(phys_addr_t addr, unsigned long size) addr_end = addr + size - 1; - for_each_reserved_mem_region(i, , ) { + for_each_reserved_mem_range(i, , ) { if (addr >= start && addr_end <= end) return true; } diff --git 
a/include/linux/memblock.h b/include/linux/memblock.h index ec2fd8f32a19..9e51b3fd4134 100644 --- a/include/linux/memblock.h +++ b/include/linux/memblock.h @@ -136,9 +136,6 @@ void __next_mem_range_rev(u64 *idx, int nid, enum memblock_flags flags, struct memblock_type *type_b, phys_addr_t *out_start, phys_addr_t *out_end, int *out_nid); -void __next_reserved_mem_region(u64 *idx, phys_addr_t *out_start, - phys_addr_t *out_end); - void __memblock_free_late(phys_addr_t base, phys_addr_t size); /** @@ -193,7 +190,7 @@ void __memblock_free_late(phys_addr_t base, phys_addr_t size); MEMBLOCK_NONE, p_start, p_end, NULL) /** - * for_each_reserved_mem_region - iterate over all reserved memblock areas + * for_each_reserved_mem_range - iterate over all reserved memblock areas * @i: u64 used as loop variable * @p_start: ptr to phys_addr_t for start address of the range, can be %NULL * @p_end: ptr to phys_addr_t for end address of the range, can be %NULL @@ -201,10 +198,9 @@ void __memblock_free_late(phys_addr_t base, phys_addr_t size); * Walks over reserved areas of memblock. Available as soon as memblock * is initialized. 
*/ -#define for_each_reserved_mem_region(i, p_start, p_end) \ - for (i = 0UL, __next_reserved_mem_region(, p_start, p_end); \ -i != (u64)ULLONG_MAX; \ -__next_reserved_mem_region(, p_start, p_end)) +#define for_each_reserved_mem_range(i, p_start, p_end) \ + __for_each_mem_range(i, , NULL, NUMA_NO_NODE, \ +MEMBLOCK_NONE, p_start, p_end, NULL) static inline bool memblock_is_hotpluggable(struct memblock_region *m) { diff --git a/mm/memblock.c b/mm/memblock.c index 48d614352b25..dadf579f7c53 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -946,42 +946,16 @@ int __init_memblock memblock_clear_nomap(phys_addr_t base, phys_addr_t size) return memblock_setclr_flag(base, size, 0, MEMBLOCK_NOMAP); } -/** - * __next_reserved_mem_region - next function for for_each_reserved_region() - * @idx: pointer to u64 loop variable - * @out_start: ptr to phys_addr_t for start address of the region, can be %NULL - * @out_end: ptr to phys_addr_t for end address of the region, can be %NULL - * - * Iterate over all reserved memory regions. - */ -void __init_memblock __next_reserved_mem_region(u64 *idx, - phys_addr_t *out_start, - phys_addr_t *out_end) -{ - struct memblock_type
[PATCH v2 15/17] memblock: remove unused memblock_mem_size()
From: Mike Rapoport

The only user of memblock_mem_size() was the x86 setup code. It is gone now,
so the memblock_mem_size() function can be removed.

Signed-off-by: Mike Rapoport
---
 include/linux/memblock.h |  1 -
 mm/memblock.c            | 15 ---
 2 files changed, 16 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index d70c2835e913..ec2fd8f32a19 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -450,7 +450,6 @@ static inline bool memblock_bottom_up(void)
 
 phys_addr_t memblock_phys_mem_size(void);
 phys_addr_t memblock_reserved_size(void);
-phys_addr_t memblock_mem_size(unsigned long limit_pfn);
 phys_addr_t memblock_start_of_DRAM(void);
 phys_addr_t memblock_end_of_DRAM(void);
 void memblock_enforce_memory_limit(phys_addr_t memory_limit);
diff --git a/mm/memblock.c b/mm/memblock.c
index c1a4c8798973..48d614352b25 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1656,21 +1656,6 @@ phys_addr_t __init_memblock memblock_reserved_size(void)
 	return memblock.reserved.total_size;
 }
 
-phys_addr_t __init memblock_mem_size(unsigned long limit_pfn)
-{
-	unsigned long pages = 0;
-	unsigned long start_pfn, end_pfn;
-	int i;
-
-	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, NULL) {
-		start_pfn = min_t(unsigned long, start_pfn, limit_pfn);
-		end_pfn = min_t(unsigned long, end_pfn, limit_pfn);
-		pages += end_pfn - start_pfn;
-	}
-
-	return PFN_PHYS(pages);
-}
-
 /* lowest address */
 phys_addr_t __init_memblock memblock_start_of_DRAM(void)
 {
-- 
2.26.2
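For readers unfamiliar with what the removed helper computed, here is a userspace sketch of the same arithmetic: clamp each PFN range to a limit and add up the pages that remain. The struct pfn_range type and pages_below_limit() are hypothetical names for illustration; the sketch returns a page count and leaves out the final conversion to bytes.

```c
#include <stddef.h>

/* Half-open PFN range: end_pfn is the first PFN after the range. */
struct pfn_range { unsigned long start_pfn, end_pfn; };

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Count the pages of all ranges that lie below limit_pfn. */
unsigned long pages_below_limit(const struct pfn_range *ranges, size_t n,
				unsigned long limit_pfn)
{
	unsigned long pages = 0;

	for (size_t i = 0; i < n; i++) {
		/* Clamping both ends makes ranges above the limit count as 0. */
		unsigned long start = min_ul(ranges[i].start_pfn, limit_pfn);
		unsigned long end = min_ul(ranges[i].end_pfn, limit_pfn);

		pages += end - start;
	}
	return pages;
}
```

For ranges [0, 100), [200, 300) and [400, 500) with a limit of 250, the first range contributes 100 pages, the second 50 and the third none.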
[PATCH v2 14/17] x86/setup: simplify reserve_crashkernel()
From: Mike Rapoport * Replace magic numbers with defines * Replace memblock_find_in_range() + memblock_reserve() with memblock_phys_alloc_range() * Stop checking for low memory size in reserve_crashkernel_low(). The allocation from limited range will anyway fail if there is not enough memory, so there is no need for extra traversal of memblock.memory Signed-off-by: Mike Rapoport --- arch/x86/kernel/setup.c | 40 ++-- 1 file changed, 14 insertions(+), 26 deletions(-) diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index d8de4053c5e8..d7ced6982524 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -419,13 +419,13 @@ static int __init reserve_crashkernel_low(void) { #ifdef CONFIG_X86_64 unsigned long long base, low_base = 0, low_size = 0; - unsigned long total_low_mem; + unsigned long low_mem_limit; int ret; - total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT)); + low_mem_limit = min(memblock_phys_mem_size(), CRASH_ADDR_LOW_MAX); /* crashkernel=Y,low */ - ret = parse_crashkernel_low(boot_command_line, total_low_mem, &low_size, &base); + ret = parse_crashkernel_low(boot_command_line, low_mem_limit, &low_size, &base); if (ret) { /* * two parts from kernel/dma/swiotlb.c: @@ -443,23 +443,17 @@ static int __init reserve_crashkernel_low(void) return 0; } - low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN); + low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX); if (!low_base) { pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n", (unsigned long)(low_size >> 20)); return -ENOMEM; } - ret = memblock_reserve(low_base, low_size); - if (ret) { - pr_err("%s: Error reserving crashkernel low memblock.\n", __func__); - return ret; - } - - pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (System low RAM: %ldMB)\n", + pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (low RAM limit: %ldMB)\n", (unsigned long)(low_size >> 20), (unsigned long)(low_base >> 
20), - (unsigned long)(total_low_mem >> 20)); + (unsigned long)(low_mem_limit >> 20)); crashk_low_res.start = low_base; crashk_low_res.end = low_base + low_size - 1; @@ -503,13 +497,13 @@ static void __init reserve_crashkernel(void) * unless "crashkernel=size[KMG],high" is specified. */ if (!high) - crash_base = memblock_find_in_range(CRASH_ALIGN, - CRASH_ADDR_LOW_MAX, - crash_size, CRASH_ALIGN); + crash_base = memblock_phys_alloc_range(crash_size, + CRASH_ALIGN, CRASH_ALIGN, + CRASH_ADDR_LOW_MAX); if (!crash_base) - crash_base = memblock_find_in_range(CRASH_ALIGN, - CRASH_ADDR_HIGH_MAX, - crash_size, CRASH_ALIGN); + crash_base = memblock_phys_alloc_range(crash_size, + CRASH_ALIGN, CRASH_ALIGN, + CRASH_ADDR_HIGH_MAX); if (!crash_base) { pr_info("crashkernel reservation failed - No suitable area found.\n"); return; @@ -517,19 +511,13 @@ static void __init reserve_crashkernel(void) } else { unsigned long long start; - start = memblock_find_in_range(crash_base, - crash_base + crash_size, - crash_size, 1 << 20); + start = memblock_phys_alloc_range(crash_size, SZ_1M, crash_base, + crash_base + crash_size); if (start != crash_base) { pr_info("crashkernel reservation failed - memory is in use.\n"); return; } } - ret = memblock_reserve(crash_base, crash_size); - if (ret) { - pr_err("%s: Error reserving crashkernel memblock.\n", __func__); - return; - } if (crash_base >= (1ULL << 32) && reserve_crashkernel_low()) { memblock_free(crash_base, crash_size); -- 2.26.2
[PATCH v2 13/17] x86/setup: simplify initrd relocation and reservation
From: Mike Rapoport Currently, initrd image is reserved very early during setup and then it might be relocated and re-reserved after the initial physical memory mapping is created. The "late" reservation of memblock verifies that mapped memory size exceeds the size of initrd, then checks whether relocation is required and, if so, relocates the initrd to new memory allocated from memblock and frees the old location. The check for memory size is excessive as memblock allocation will anyway fail if there is not enough memory. Besides, there is no point to allocate memory from memblock using memblock_find_in_range() + memblock_reserve() when there exists memblock_phys_alloc_range() with required functionality. Remove the redundant check and simplify memblock allocation. Signed-off-by: Mike Rapoport --- arch/x86/kernel/setup.c | 16 +++- 1 file changed, 3 insertions(+), 13 deletions(-) diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index a3767e74c758..d8de4053c5e8 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -262,16 +262,12 @@ static void __init relocate_initrd(void) u64 area_size = PAGE_ALIGN(ramdisk_size); /* We need to move the initrd down into directly mapped mem */ - relocated_ramdisk = memblock_find_in_range(0, PFN_PHYS(max_pfn_mapped), - area_size, PAGE_SIZE); - + relocated_ramdisk = memblock_phys_alloc_range(area_size, PAGE_SIZE, 0, + PFN_PHYS(max_pfn_mapped)); if (!relocated_ramdisk) panic("Cannot find place for new RAMDISK of size %lld\n", ramdisk_size); - /* Note: this includes all the mem currently occupied by - the initrd, we rely on that fact to keep the data intact. 
*/ - memblock_reserve(relocated_ramdisk, area_size); initrd_start = relocated_ramdisk + PAGE_OFFSET; initrd_end = initrd_start + ramdisk_size; printk(KERN_INFO "Allocated new RAMDISK: [mem %#010llx-%#010llx]\n", @@ -298,13 +294,13 @@ static void __init early_reserve_initrd(void) memblock_reserve(ramdisk_image, ramdisk_end - ramdisk_image); } + static void __init reserve_initrd(void) { /* Assume only end is not page aligned */ u64 ramdisk_image = get_ramdisk_image(); u64 ramdisk_size = get_ramdisk_size(); u64 ramdisk_end = PAGE_ALIGN(ramdisk_image + ramdisk_size); - u64 mapped_size; if (!boot_params.hdr.type_of_loader || !ramdisk_image || !ramdisk_size) @@ -312,12 +308,6 @@ static void __init reserve_initrd(void) initrd_start = 0; - mapped_size = memblock_mem_size(max_pfn_mapped); - if (ramdisk_size >= (mapped_size>>1)) - panic("initrd too large to handle, " - "disabling initrd (%lld needed, %lld available)\n", - ramdisk_size, mapped_size>>1); - printk(KERN_INFO "RAMDISK: [mem %#010llx-%#010llx]\n", ramdisk_image, ramdisk_end - 1); -- 2.26.2
[PATCH v2 12/17] arch, drivers: replace for_each_memblock() with for_each_mem_range()
From: Mike Rapoport There are several occurrences of the following pattern: for_each_memblock(memory, reg) { start = __pfn_to_phys(memblock_region_memory_base_pfn(reg); end = __pfn_to_phys(memblock_region_memory_end_pfn(reg)); /* do something with start and end */ } Using for_each_mem_range() iterator is more appropriate in such cases and allows simpler and cleaner code. Signed-off-by: Mike Rapoport --- arch/arm/kernel/setup.c | 18 ++--- arch/arm/mm/mmu.c| 39 ++ arch/arm/mm/pmsa-v7.c| 20 +- arch/arm/mm/pmsa-v8.c| 17 arch/arm/xen/mm.c| 7 ++-- arch/arm64/mm/kasan_init.c | 10 ++--- arch/arm64/mm/mmu.c | 11 ++ arch/c6x/kernel/setup.c | 9 +++-- arch/microblaze/mm/init.c| 9 +++-- arch/mips/cavium-octeon/dma-octeon.c | 12 +++--- arch/mips/kernel/setup.c | 31 +++ arch/openrisc/mm/init.c | 8 ++-- arch/powerpc/kernel/fadump.c | 50 +++- arch/powerpc/mm/book3s64/hash_utils.c| 16 arch/powerpc/mm/book3s64/radix_pgtable.c | 11 +++--- arch/powerpc/mm/kasan/kasan_init_32.c| 8 ++-- arch/powerpc/mm/mem.c| 16 +--- arch/powerpc/mm/pgtable_32.c | 8 ++-- arch/riscv/mm/init.c | 25 +--- arch/riscv/mm/kasan_init.c | 10 ++--- arch/s390/kernel/setup.c | 27 - arch/s390/mm/vmem.c | 16 arch/sparc/mm/init_64.c | 12 ++ drivers/bus/mvebu-mbus.c | 12 +++--- drivers/s390/char/zcore.c| 9 +++-- 25 files changed, 200 insertions(+), 211 deletions(-) diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c index d8e18cdd96d3..3f65d0ac9f63 100644 --- a/arch/arm/kernel/setup.c +++ b/arch/arm/kernel/setup.c @@ -843,19 +843,25 @@ early_param("mem", early_mem); static void __init request_standard_resources(const struct machine_desc *mdesc) { - struct memblock_region *region; + phys_addr_t start, end, res_end; struct resource *res; + u64 i; kernel_code.start = virt_to_phys(_text); kernel_code.end = virt_to_phys(__init_begin - 1); kernel_data.start = virt_to_phys(_sdata); kernel_data.end = virt_to_phys(_end - 1); - for_each_memblock(memory, region) { - phys_addr_t start = 
__pfn_to_phys(memblock_region_memory_base_pfn(region)); - phys_addr_t end = __pfn_to_phys(memblock_region_memory_end_pfn(region)) - 1; + for_each_mem_range(i, , ) { unsigned long boot_alias_start; + /* +* In memblock, end points to the first byte after the +* range while in resourses, end points to the last byte in +* the range. +*/ + res_end = end - 1; + /* * Some systems have a special memory alias which is only * used for booting. We need to advertise this region to @@ -869,7 +875,7 @@ static void __init request_standard_resources(const struct machine_desc *mdesc) __func__, sizeof(*res)); res->name = "System RAM (boot alias)"; res->start = boot_alias_start; - res->end = phys_to_idmap(end); + res->end = phys_to_idmap(res_end); res->flags = IORESOURCE_MEM | IORESOURCE_BUSY; request_resource(_resource, res); } @@ -880,7 +886,7 @@ static void __init request_standard_resources(const struct machine_desc *mdesc) sizeof(*res)); res->name = "System RAM"; res->start = start; - res->end = end; + res->end = res_end; res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; request_resource(_resource, res); diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c index 628028bfbb92..a149d9cb4fdb 100644 --- a/arch/arm/mm/mmu.c +++ b/arch/arm/mm/mmu.c @@ -1155,9 +1155,8 @@ phys_addr_t arm_lowmem_limit __initdata = 0; void __init adjust_lowmem_bounds(void) { - phys_addr_t memblock_limit = 0; - u64 vmalloc_limit; - struct memblock_region *reg; + phys_addr_t block_start, block_end, memblock_limit = 0; + u64 vmalloc_limit, i; phys_addr_t lowmem_limit = 0; /* @@ -1173,26 +1172,18 @@ void __init adjust_lowmem_bounds(void) * The first usable region must be PMD aligned. Mark its start * as MEMBLOCK_NOMAP if it isn't */ -
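The comment added in the arm hunk notes that memblock ranges use an exclusive end while resources use an inclusive one, which is why the patch introduces res_end = end - 1. A tiny sketch of that conversion follows. The struct and function names are hypothetical and only model the two conventions; they are not the kernel's struct resource or memblock types.

```c
/* Half-open range: 'end' is the first byte after the range. */
struct range_excl { unsigned long long start, end; };

/* Closed range: 'end' is the last byte inside the range. */
struct res_incl { unsigned long long start, end; };

/* Convert from the exclusive-end to the inclusive-end convention. */
struct res_incl range_to_resource(struct range_excl r)
{
	struct res_incl res = { r.start, r.end - 1 };
	return res;
}

/* Size of an inclusive-end range; note the +1. */
unsigned long long resource_size_bytes(struct res_incl res)
{
	return res.end - res.start + 1;
}
```

Mixing the two conventions is an easy off-by-one to introduce, so doing the conversion once, at the boundary between the two representations, keeps the rest of the loop body unambiguous.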
[PATCH v2 11/17] arch, mm: replace for_each_memblock() with for_each_mem_pfn_range()
From: Mike Rapoport There are several occurrences of the following pattern: for_each_memblock(memory, reg) { start_pfn = memblock_region_memory_base_pfn(reg); end_pfn = memblock_region_memory_end_pfn(reg); /* do something with start_pfn and end_pfn */ } Rather than iterate over all memblock.memory regions and each time query for their start and end PFNs, use for_each_mem_pfn_range() iterator to get simpler and clearer code. Signed-off-by: Mike Rapoport --- arch/arm/mm/init.c | 11 --- arch/arm64/mm/init.c | 11 --- arch/powerpc/kernel/fadump.c | 11 ++- arch/powerpc/mm/mem.c| 15 --- arch/powerpc/mm/numa.c | 7 ++- arch/s390/mm/page-states.c | 6 ++ arch/sh/mm/init.c| 9 +++-- mm/memblock.c| 6 ++ mm/sparse.c | 10 -- 9 files changed, 35 insertions(+), 51 deletions(-) diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index 626af348eb8f..d630573277d1 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -304,16 +304,14 @@ free_memmap(unsigned long start_pfn, unsigned long end_pfn) */ static void __init free_unused_memmap(void) { - unsigned long start, prev_end = 0; - struct memblock_region *reg; + unsigned long start, end, prev_end = 0; + int i; /* * This relies on each bank being in address order. * The banks are sorted previously in bootmem_init(). */ - for_each_memblock(memory, reg) { - start = memblock_region_memory_base_pfn(reg); - + for_each_mem_pfn_range(i, MAX_NUMNODES, , , NULL) { #ifdef CONFIG_SPARSEMEM /* * Take care not to free memmap entries that don't exist @@ -341,8 +339,7 @@ static void __init free_unused_memmap(void) * memmap entries are valid from the bank end aligned to * MAX_ORDER_NR_PAGES. 
*/ - prev_end = ALIGN(memblock_region_memory_end_pfn(reg), -MAX_ORDER_NR_PAGES); + prev_end = ALIGN(end, MAX_ORDER_NR_PAGES); } #ifdef CONFIG_SPARSEMEM diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index 1e93cfc7c47a..291b5805457d 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -473,12 +473,10 @@ static inline void free_memmap(unsigned long start_pfn, unsigned long end_pfn) */ static void __init free_unused_memmap(void) { - unsigned long start, prev_end = 0; - struct memblock_region *reg; - - for_each_memblock(memory, reg) { - start = __phys_to_pfn(reg->base); + unsigned long start, end, prev_end = 0; + int i; + for_each_mem_pfn_range(i, MAX_NUMNODES, , , NULL) { #ifdef CONFIG_SPARSEMEM /* * Take care not to free memmap entries that don't exist due @@ -498,8 +496,7 @@ static void __init free_unused_memmap(void) * memmap entries are valid from the bank end aligned to * MAX_ORDER_NR_PAGES. */ - prev_end = ALIGN(__phys_to_pfn(reg->base + reg->size), -MAX_ORDER_NR_PAGES); + prev_end = ALIGN(end, MAX_ORDER_NR_PAGES); } #ifdef CONFIG_SPARSEMEM diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c index 78ab9a6ee6ac..fc85cbc66839 100644 --- a/arch/powerpc/kernel/fadump.c +++ b/arch/powerpc/kernel/fadump.c @@ -1216,14 +1216,15 @@ static void fadump_free_reserved_memory(unsigned long start_pfn, */ static void fadump_release_reserved_area(u64 start, u64 end) { - u64 tstart, tend, spfn, epfn; - struct memblock_region *reg; + u64 tstart, tend, spfn, epfn, reg_spfn, reg_epfn, i; spfn = PHYS_PFN(start); epfn = PHYS_PFN(end); - for_each_memblock(memory, reg) { - tstart = max_t(u64, spfn, memblock_region_memory_base_pfn(reg)); - tend = min_t(u64, epfn, memblock_region_memory_end_pfn(reg)); + + for_each_mem_pfn_range(i, MAX_NUMNODES, _spfn, _epfn, NULL) { + tstart = max_t(u64, spfn, reg_spfn); + tend = min_t(u64, epfn, reg_epfn); + if (tstart < tend) { fadump_free_reserved_memory(tstart, tend); diff --git a/arch/powerpc/mm/mem.c 
b/arch/powerpc/mm/mem.c index c2c11eb8dcfc..1364dd532107 100644 --- a/arch/powerpc/mm/mem.c +++ b/arch/powerpc/mm/mem.c @@ -192,15 +192,16 @@ void __init initmem_init(void) /* mark pages that don't exist as nosave */ static int __init mark_nonram_nosave(void) { - struct memblock_region *reg, *prev = NULL; + unsigned long spfn, epfn, prev = 0; + int i; - for_each_memblock(memory, reg) { - if (prev && - memblock_region_memory_end_pfn(prev) <
[PATCH v2 10/17] memblock: reduce number of parameters in for_each_mem_range()
From: Mike Rapoport Currently for_each_mem_range() iterator is the most generic way to traverse memblock regions. As such, it has 8 parameters and it is hardly convenient to users. Most users choose to utilize one of its wrappers and the only user that actually needs most of the parameters outside memblock is s390 crash dump implementation. To avoid yet another naming for memblock iterators, rename the existing for_each_mem_range() to __for_each_mem_range() and add a new for_each_mem_range() wrapper with only index, start and end parameters. The new wrapper nicely fits into init_unavailable_mem() and will be used in upcoming changes to simplify memblock traversals. Signed-off-by: Mike Rapoport Reviewed-by: Baoquan He --- .clang-format | 1 + arch/arm64/kernel/machine_kexec_file.c | 6 ++ arch/s390/kernel/crash_dump.c | 8 include/linux/memblock.h | 18 ++ mm/page_alloc.c| 3 +-- 5 files changed, 22 insertions(+), 14 deletions(-) diff --git a/.clang-format b/.clang-format index a0a96088c74f..52ededab25ce 100644 --- a/.clang-format +++ b/.clang-format @@ -205,6 +205,7 @@ ForEachMacros: - 'for_each_memblock_type' - 'for_each_memcg_cache_index' - 'for_each_mem_pfn_range' + - '__for_each_mem_range' - 'for_each_mem_range' - 'for_each_mem_range_rev' - 'for_each_migratetype_order' diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c index 361a1143e09e..5b0e67b93cdc 100644 --- a/arch/arm64/kernel/machine_kexec_file.c +++ b/arch/arm64/kernel/machine_kexec_file.c @@ -215,8 +215,7 @@ static int prepare_elf_headers(void **addr, unsigned long *sz) phys_addr_t start, end; nr_ranges = 1; /* for exclusion of crashkernel region */ - for_each_mem_range(i, , NULL, NUMA_NO_NODE, - MEMBLOCK_NONE, , , NULL) + for_each_mem_range(i, , ) nr_ranges++; cmem = kmalloc(struct_size(cmem, ranges, nr_ranges), GFP_KERNEL); @@ -225,8 +224,7 @@ static int prepare_elf_headers(void **addr, unsigned long *sz) cmem->max_nr_ranges = nr_ranges; cmem->nr_ranges = 0; - 
for_each_mem_range(i, , NULL, NUMA_NO_NODE, - MEMBLOCK_NONE, , , NULL) { + for_each_mem_range(i, , ) { cmem->ranges[cmem->nr_ranges].start = start; cmem->ranges[cmem->nr_ranges].end = end - 1; cmem->nr_ranges++; diff --git a/arch/s390/kernel/crash_dump.c b/arch/s390/kernel/crash_dump.c index f96a5857bbfd..e28085c725ff 100644 --- a/arch/s390/kernel/crash_dump.c +++ b/arch/s390/kernel/crash_dump.c @@ -549,8 +549,8 @@ static int get_mem_chunk_cnt(void) int cnt = 0; u64 idx; - for_each_mem_range(idx, , _type, NUMA_NO_NODE, - MEMBLOCK_NONE, NULL, NULL, NULL) + __for_each_mem_range(idx, , _type, NUMA_NO_NODE, +MEMBLOCK_NONE, NULL, NULL, NULL) cnt++; return cnt; } @@ -563,8 +563,8 @@ static void loads_init(Elf64_Phdr *phdr, u64 loads_offset) phys_addr_t start, end; u64 idx; - for_each_mem_range(idx, , _type, NUMA_NO_NODE, - MEMBLOCK_NONE, , , NULL) { + __for_each_mem_range(idx, , _type, NUMA_NO_NODE, +MEMBLOCK_NONE, , , NULL) { phdr->p_filesz = end - start; phdr->p_type = PT_LOAD; phdr->p_offset = start; diff --git a/include/linux/memblock.h b/include/linux/memblock.h index e6a23b3db696..d70c2835e913 100644 --- a/include/linux/memblock.h +++ b/include/linux/memblock.h @@ -142,7 +142,7 @@ void __next_reserved_mem_region(u64 *idx, phys_addr_t *out_start, void __memblock_free_late(phys_addr_t base, phys_addr_t size); /** - * for_each_mem_range - iterate through memblock areas from type_a and not + * __for_each_mem_range - iterate through memblock areas from type_a and not * included in type_b. Or just type_a if type_b is NULL. 
* @i: u64 used as loop variable * @type_a: ptr to memblock_type to iterate @@ -153,7 +153,7 @@ void __memblock_free_late(phys_addr_t base, phys_addr_t size); * @p_end: ptr to phys_addr_t for end address of the range, can be %NULL * @p_nid: ptr to int for nid of the range, can be %NULL */ -#define for_each_mem_range(i, type_a, type_b, nid, flags, \ +#define __for_each_mem_range(i, type_a, type_b, nid, flags,\ p_start, p_end, p_nid) \ for (i = 0, __next_mem_range(, nid, flags, type_a, type_b,\ p_start, p_end, p_nid);\ @@ -182,6 +182,16 @@ void __memblock_free_late(phys_addr_t base, phys_addr_t size); __next_mem_range_rev(, nid, flags, type_a, type_b, \
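The renaming described in this patch can be sketched in userspace C. This is a toy model, not the kernel implementation: every name prefixed with toy_ (the table, the flags and the helper) is hypothetical, and the only thing it aims to show is the shape of the change, a fully parameterised iterator plus a thin wrapper that exposes only the index, start and end.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_FLAG_NONE    0u
#define TOY_FLAG_HOTPLUG 1u

/* Toy stand-in for a memblock region; not the kernel's structure. */
struct toy_range {
	uint64_t start;
	uint64_t end;		/* exclusive */
	unsigned int flags;
};

struct toy_table {
	const struct toy_range *ranges;
	size_t cnt;
};

static const struct toy_range toy_ranges[] = {
	{ 0x1000, 0x3000, TOY_FLAG_NONE },
	{ 0x4000, 0x5000, TOY_FLAG_HOTPLUG },
	{ 0x8000, 0xa000, TOY_FLAG_NONE },
};

static const struct toy_table toy_memory = {
	.ranges = toy_ranges,
	.cnt = sizeof(toy_ranges) / sizeof(toy_ranges[0]),
};

/*
 * Find the next range at or after *idx whose flags match, report it through
 * the optional out-pointers and leave *idx just past it. When nothing is
 * left, *idx becomes SIZE_MAX, which terminates the loops below.
 */
static void toy_next_range(size_t *idx, const struct toy_table *tbl,
			   unsigned int flags,
			   uint64_t *p_start, uint64_t *p_end)
{
	for (; *idx < tbl->cnt; (*idx)++) {
		const struct toy_range *r = &tbl->ranges[*idx];

		if (r->flags != flags)
			continue;
		if (p_start)
			*p_start = r->start;
		if (p_end)
			*p_end = r->end;
		(*idx)++;
		return;
	}
	*idx = SIZE_MAX;
}

/* Generic iterator: every knob is a parameter. */
#define toy_for_each_range_full(i, tbl, flags, p_start, p_end)		\
	for (i = 0, toy_next_range(&i, tbl, flags, p_start, p_end);	\
	     i != SIZE_MAX;						\
	     toy_next_range(&i, tbl, flags, p_start, p_end))

/* Thin wrapper for the common case: default table, default flags. */
#define toy_for_each_range(i, p_start, p_end)				\
	toy_for_each_range_full(i, &toy_memory, TOY_FLAG_NONE, p_start, p_end)

/* Typical caller: only the index, start and end are visible. */
static uint64_t toy_total_size(void)
{
	uint64_t start, end, total = 0;
	size_t i;

	toy_for_each_range(i, &start, &end)
		total += end - start;

	return total;
}
```

Callers that need the extra parameters keep using the full form, which is the role __for_each_mem_range() plays for the s390 crash dump code in the patch.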
[PATCH v2 09/17] memblock: make memblock_debug and related functionality private
From: Mike Rapoport

The only user of memblock_dbg() outside memblock was the s390 setup code,
and it is converted to use pr_debug() instead.
This makes it possible to stop exposing memblock_debug and memblock_dbg()
to the rest of the kernel.

Signed-off-by: Mike Rapoport
Reviewed-by: Baoquan He
---
 arch/s390/kernel/setup.c | 4 ++--
 include/linux/memblock.h | 12 +---
 mm/memblock.c            | 13 +++--
 3 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
index 07aa15ba43b3..8b284cf6e199 100644
--- a/arch/s390/kernel/setup.c
+++ b/arch/s390/kernel/setup.c
@@ -776,8 +776,8 @@ static void __init memblock_add_mem_detect_info(void)
 	unsigned long start, end;
 	int i;
 
-	memblock_dbg("physmem info source: %s (%hhd)\n",
-		     get_mem_info_source(), mem_detect.info_source);
+	pr_debug("physmem info source: %s (%hhd)\n",
+		 get_mem_info_source(), mem_detect.info_source);
 	/* keep memblock lists close to the kernel */
 	memblock_set_bottom_up(true);
 	for_each_mem_detect_block(i, &start, &end) {
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 220b5f0dad42..e6a23b3db696 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -90,7 +90,6 @@ struct memblock {
 };
 
 extern struct memblock memblock;
-extern int memblock_debug;
 
 #ifndef CONFIG_ARCH_KEEP_MEMBLOCK
 #define __init_memblock __meminit
@@ -102,9 +101,6 @@ void memblock_discard(void);
 static inline void memblock_discard(void) {}
 #endif
 
-#define memblock_dbg(fmt, ...) \
-	if (memblock_debug) printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
-
 phys_addr_t memblock_find_in_range(phys_addr_t start, phys_addr_t end,
 				   phys_addr_t size, phys_addr_t align);
 void memblock_allow_resize(void);
@@ -456,13 +452,7 @@ bool memblock_is_region_memory(phys_addr_t base, phys_addr_t size);
 bool memblock_is_reserved(phys_addr_t addr);
 bool memblock_is_region_reserved(phys_addr_t base, phys_addr_t size);
 
-extern void __memblock_dump_all(void);
-
-static inline void memblock_dump_all(void)
-{
-	if (memblock_debug)
-		__memblock_dump_all();
-}
+void memblock_dump_all(void);
 
 /**
  * memblock_set_current_limit - Set the current allocation limit to allow
diff --git a/mm/memblock.c b/mm/memblock.c
index a5b9b3df81fc..824938849f6d 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -134,7 +134,10 @@ struct memblock memblock __initdata_memblock = {
 	     i < memblock_type->cnt; \
 	     i++, rgn = &memblock_type->regions[i])
 
-int memblock_debug __initdata_memblock;
+#define memblock_dbg(fmt, ...) \
+	if (memblock_debug) printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
+
+static int memblock_debug __initdata_memblock;
 static bool system_has_some_mirror __initdata_memblock = false;
 static int memblock_can_resize __initdata_memblock;
 static int memblock_memory_in_slab __initdata_memblock = 0;
@@ -1919,7 +1922,7 @@ static void __init_memblock memblock_dump(struct memblock_type *type)
 	}
 }
 
-void __init_memblock __memblock_dump_all(void)
+static void __init_memblock __memblock_dump_all(void)
 {
 	pr_info("MEMBLOCK configuration:\n");
 	pr_info(" memory size = %pa reserved size = %pa\n",
@@ -1933,6 +1936,12 @@ void __init_memblock __memblock_dump_all(void)
 #endif
 }
 
+void __init_memblock memblock_dump_all(void)
+{
+	if (memblock_debug)
+		__memblock_dump_all();
+}
+
 void __init memblock_allow_resize(void)
 {
 	memblock_can_resize = 1;
-- 
2.26.2
[PATCH v2 08/17] memblock: make for_each_memblock_type() iterator private
From: Mike Rapoport

for_each_memblock_type() is not used outside mm/memblock.c, move it there
from include/linux/memblock.h

Signed-off-by: Mike Rapoport
Reviewed-by: Baoquan He
---
 include/linux/memblock.h | 5 -
 mm/memblock.c            | 5 +
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 017fae833d4a..220b5f0dad42 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -532,11 +532,6 @@ static inline unsigned long memblock_region_reserved_end_pfn(const struct memblo
 	     region < (memblock.memblock_type.regions + memblock.memblock_type.cnt);\
 	     region++)
 
-#define for_each_memblock_type(i, memblock_type, rgn) \
-	for (i = 0, rgn = &memblock_type->regions[0]; \
-	     i < memblock_type->cnt; \
-	     i++, rgn = &memblock_type->regions[i])
-
 extern void *alloc_large_system_hash(const char *tablename,
 				     unsigned long bucketsize,
 				     unsigned long numentries,
diff --git a/mm/memblock.c b/mm/memblock.c
index 39aceafc57f6..a5b9b3df81fc 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -129,6 +129,11 @@ struct memblock memblock __initdata_memblock = {
 	.current_limit = MEMBLOCK_ALLOC_ANYWHERE,
 };
 
+#define for_each_memblock_type(i, memblock_type, rgn) \
+	for (i = 0, rgn = &memblock_type->regions[0]; \
+	     i < memblock_type->cnt; \
+	     i++, rgn = &memblock_type->regions[i])
+
 int memblock_debug __initdata_memblock;
 static bool system_has_some_mirror __initdata_memblock = false;
 static int memblock_can_resize __initdata_memblock;
-- 
2.26.2
[PATCH v2 07/17] microblaze: drop unneeded NUMA and sparsemem initializations
From: Mike Rapoport

microblaze supports neither NUMA nor SPARSEMEM, so there is no point in
calling memblock_set_node() and sparse_memory_present_with_active_regions()
during microblaze memory initialization.

Remove these calls and the surrounding code.

Signed-off-by: Mike Rapoport
---
 arch/microblaze/mm/init.c | 17 +
 1 file changed, 1 insertion(+), 16 deletions(-)

diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c
index 521b59ba716c..49e0c241f9b1 100644
--- a/arch/microblaze/mm/init.c
+++ b/arch/microblaze/mm/init.c
@@ -105,9 +105,8 @@ static void __init paging_init(void)
 
 void __init setup_memory(void)
 {
-	struct memblock_region *reg;
-
 #ifndef CONFIG_MMU
+	struct memblock_region *reg;
 	u32 kernel_align_start, kernel_align_size;
 
 	/* Find main memory where is the kernel */
@@ -161,20 +160,6 @@ void __init setup_memory(void)
 	pr_info("%s: max_low_pfn: %#lx\n", __func__, max_low_pfn);
 	pr_info("%s: max_pfn: %#lx\n", __func__, max_pfn);
 
-	/* Add active regions with valid PFNs */
-	for_each_memblock(memory, reg) {
-		unsigned long start_pfn, end_pfn;
-
-		start_pfn = memblock_region_memory_base_pfn(reg);
-		end_pfn = memblock_region_memory_end_pfn(reg);
-		memblock_set_node(start_pfn << PAGE_SHIFT,
-				  (end_pfn - start_pfn) << PAGE_SHIFT,
-				  &memblock.memory, 0);
-	}
-
-	/* XXX need to clip this if using highmem? */
-	sparse_memory_present_with_active_regions(0);
-
 	paging_init();
 }
 
-- 
2.26.2
[PATCH v2 06/17] riscv: drop unneeded node initialization
From: Mike Rapoport

RISC-V does not (yet) support NUMA and for UMA architectures node 0 is
used implicitly during early memory initialization.

There is no need to call memblock_set_node(), remove this call and the
surrounding code.

Signed-off-by: Mike Rapoport
---
 arch/riscv/mm/init.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 79e9d55bdf1a..7440ba2cdaaa 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -191,15 +191,6 @@ void __init setup_bootmem(void)
 	early_init_fdt_scan_reserved_mem();
 	memblock_allow_resize();
 	memblock_dump_all();
-
-	for_each_memblock(memory, reg) {
-		unsigned long start_pfn = memblock_region_memory_base_pfn(reg);
-		unsigned long end_pfn = memblock_region_memory_end_pfn(reg);
-
-		memblock_set_node(PFN_PHYS(start_pfn),
-				  PFN_PHYS(end_pfn - start_pfn),
-				  &memblock.memory, 0);
-	}
 }
 
 #ifdef CONFIG_MMU
-- 
2.26.2
[PATCH v2 05/17] h8300, nds32, openrisc: simplify detection of memory extents
From: Mike Rapoport

Instead of traversing memblock.memory regions to find memory_start and
memory_end, simply query memblock_{start,end}_of_DRAM().

Signed-off-by: Mike Rapoport
Acked-by: Stafford Horne
---
 arch/h8300/kernel/setup.c    | 8 +++-
 arch/nds32/kernel/setup.c    | 8 ++--
 arch/openrisc/kernel/setup.c | 9 ++---
 3 files changed, 7 insertions(+), 18 deletions(-)

diff --git a/arch/h8300/kernel/setup.c b/arch/h8300/kernel/setup.c
index 28ac88358a89..0281f92eea3d 100644
--- a/arch/h8300/kernel/setup.c
+++ b/arch/h8300/kernel/setup.c
@@ -74,17 +74,15 @@ static void __init bootmem_init(void)
 	memory_end = memory_start = 0;
 
 	/* Find main memory where is the kernel */
-	for_each_memblock(memory, region) {
-		memory_start = region->base;
-		memory_end = region->base + region->size;
-	}
+	memory_start = memblock_start_of_DRAM();
+	memory_end = memblock_end_of_DRAM();
 
 	if (!memory_end)
 		panic("No memory!");
 
 	/* setup bootmem globals (we use no_bootmem, but mm still depends on this) */
 	min_low_pfn = PFN_UP(memory_start);
-	max_low_pfn = PFN_DOWN(memblock_end_of_DRAM());
+	max_low_pfn = PFN_DOWN(memory_end);
 	max_pfn = max_low_pfn;
 
 	memblock_reserve(__pa(_stext), _end - _stext);
diff --git a/arch/nds32/kernel/setup.c b/arch/nds32/kernel/setup.c
index a066efbe53c0..c356e484dcab 100644
--- a/arch/nds32/kernel/setup.c
+++ b/arch/nds32/kernel/setup.c
@@ -249,12 +249,8 @@ static void __init setup_memory(void)
 	memory_end = memory_start = 0;
 
 	/* Find main memory where is the kernel */
-	for_each_memblock(memory, region) {
-		memory_start = region->base;
-		memory_end = region->base + region->size;
-		pr_info("%s: Memory: 0x%x-0x%x\n", __func__,
-			memory_start, memory_end);
-	}
+	memory_start = memblock_start_of_DRAM();
+	memory_end = memblock_end_of_DRAM();
 
 	if (!memory_end) {
 		panic("No memory!");
diff --git a/arch/openrisc/kernel/setup.c b/arch/openrisc/kernel/setup.c
index 8aa438e1f51f..c5706153d3b6 100644
--- a/arch/openrisc/kernel/setup.c
+++ b/arch/openrisc/kernel/setup.c
@@ -48,17 +48,12 @@ static void __init setup_memory(void)
 	unsigned long ram_start_pfn;
 	unsigned long ram_end_pfn;
 	phys_addr_t memory_start, memory_end;
-	struct memblock_region *region;
 
 	memory_end = memory_start = 0;
 
 	/* Find main memory where is the kernel, we assume its the only one */
-	for_each_memblock(memory, region) {
-		memory_start = region->base;
-		memory_end = region->base + region->size;
-		printk(KERN_INFO "%s: Memory: 0x%x-0x%x\n", __func__,
-		       memory_start, memory_end);
-	}
+	memory_start = memblock_start_of_DRAM();
+	memory_end = memblock_end_of_DRAM();
 
 	if (!memory_end) {
 		panic("No memory!");
-- 
2.26.2
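A small userspace sketch of what the replacement amounts to. All toy_ names are hypothetical, and the region array is assumed to be sorted by base address. Note that the removed loops overwrite memory_start and memory_end on every iteration, so with more than one region they would report only the last region, while the start/end queries report the whole span; with a single region, as the openrisc comment assumes, both give the same answer.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_region {
	uint64_t base;
	uint64_t size;
};

struct toy_memory {
	const struct toy_region *regions;	/* assumed sorted by base */
	size_t cnt;
};

/* Lowest address covered by any region, or 0 if there is none. */
static uint64_t toy_start_of_dram(const struct toy_memory *mem)
{
	return mem->cnt ? mem->regions[0].base : 0;
}

/* One past the highest address covered by any region, or 0 if none. */
static uint64_t toy_end_of_dram(const struct toy_memory *mem)
{
	const struct toy_region *last;

	if (!mem->cnt)
		return 0;
	last = &mem->regions[mem->cnt - 1];
	return last->base + last->size;
}

/* Shape of the removed code: each iteration overwrites the previous result. */
static void toy_extents_by_loop(const struct toy_memory *mem,
				uint64_t *start, uint64_t *end)
{
	*start = *end = 0;
	for (size_t i = 0; i < mem->cnt; i++) {
		*start = mem->regions[i].base;
		*end = mem->regions[i].base + mem->regions[i].size;
	}
}
```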
[PATCH v2 04/17] arm64: numa: simplify dummy_numa_init()
From: Mike Rapoport

dummy_numa_init() loops over memblock.memory and passes nid=0 to
numa_add_memblk() which essentially wraps memblock_set_node(). However,
memblock_set_node() can cope with the entire memory span itself, so the
loop over memblock.memory regions is redundant.

Using a single call to memblock_set_node() rather than a loop also fixes an
issue with a buggy ACPI firmware in which the SRAT table covers some but
not all of the memory in the EFI memory map.

Jonathan Cameron says:

  This issue can be easily triggered by having an SRAT table which fails
  to cover all elements of the EFI memory map.

  This firmware error is detected and a warning printed. e.g.
  "NUMA: Warning: invalid memblk node 64 [mem 0x24000-0x27fff]"
  At that point we fall back to dummy_numa_init().

  However, the failed ACPI init has left us with our memblocks all broken
  up as we split them when trying to assign them to NUMA nodes.

  We then iterate over the memblocks and add them to node 0.

  numa_add_memblk() calls memblock_set_node() which merges regions that
  were previously split up during the earlier attempt to add them to
  different nodes during parsing of SRAT.

  This means elements are moved in the memblock array and we can end up
  in a different memblock after the call to numa_add_memblk().
  Result is:

  Unable to handle kernel paging request at virtual address 3a40
  Mem abort info:
    ESR = 0x9604
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
  Data abort info:
    ISV = 0, ISS = 0x0004
    CM = 0, WnR = 0
  [3a40] user address but active_mm is swapper
  Internal error: Oops: 9604 [#1] PREEMPT SMP

  ...

  Call trace:
    sparse_init_nid+0x5c/0x2b0
    sparse_init+0x138/0x170
    bootmem_init+0x80/0xe0
    setup_arch+0x2a0/0x5fc
    start_kernel+0x8c/0x648

Replace the loop with a single call to memblock_set_node() that covers the
entire memory.

Signed-off-by: Mike Rapoport
Acked-by: Jonathan Cameron
Acked-by: Catalin Marinas
---
 arch/arm64/mm/numa.c | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
index aafcee3e3f7e..0cbdbcc885fb 100644
--- a/arch/arm64/mm/numa.c
+++ b/arch/arm64/mm/numa.c
@@ -423,19 +423,16 @@ static int __init numa_init(int (*init_func)(void))
  */
 static int __init dummy_numa_init(void)
 {
+	phys_addr_t start = memblock_start_of_DRAM();
+	phys_addr_t end = memblock_end_of_DRAM();
 	int ret;
-	struct memblock_region *mblk;
 
 	if (numa_off)
 		pr_info("NUMA disabled\n"); /* Forced off on command line. */
-	pr_info("Faking a node at [mem %#018Lx-%#018Lx]\n",
-		memblock_start_of_DRAM(), memblock_end_of_DRAM() - 1);
-
-	for_each_memblock(memory, mblk) {
-		ret = numa_add_memblk(0, mblk->base, mblk->base + mblk->size);
-		if (!ret)
-			continue;
+	pr_info("Faking a node at [mem %#018Lx-%#018Lx]\n", start, end - 1);
 
+	ret = numa_add_memblk(0, start, end);
+	if (ret) {
 		pr_err("NUMA init failed\n");
 		return ret;
 	}
-- 
2.26.2
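The hazard and the fix can be illustrated with a toy region array in userspace C. Everything here is a sketch under assumptions: the toy_ types and helpers are hypothetical and only imitate the behaviour described above, where setting a node id merges adjacent regions and therefore shifts array entries. A caller that walks the array while calling such a helper can find its cursor pointing at a different entry after the call; a single call over the whole span never holds a cursor across the merge.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_REGIONS 8

struct toy_region {
	uint64_t base;
	uint64_t size;
	int nid;
};

struct toy_map {
	struct toy_region regions[TOY_MAX_REGIONS];	/* sorted by base */
	size_t cnt;
};

/* Merge neighbours that are physically adjacent and share a node id. */
static void toy_merge(struct toy_map *map)
{
	size_t i = 0;

	while (i + 1 < map->cnt) {
		struct toy_region *cur = &map->regions[i];
		struct toy_region *next = &map->regions[i + 1];

		if (cur->base + cur->size != next->base || cur->nid != next->nid) {
			i++;
			continue;
		}
		cur->size += next->size;
		/* Close the gap: every later entry moves down by one slot. */
		memmove(next, next + 1, (map->cnt - (i + 2)) * sizeof(*next));
		map->cnt--;
	}
}

/* Assign nid to every region fully inside [base, base + size), then merge. */
static void toy_set_node(struct toy_map *map, uint64_t base, uint64_t size,
			 int nid)
{
	for (size_t i = 0; i < map->cnt; i++) {
		struct toy_region *r = &map->regions[i];

		if (r->base >= base && r->base + r->size <= base + size)
			r->nid = nid;
	}
	toy_merge(map);
}

/* One call over the whole span; returns the number of regions left. */
static size_t toy_fake_single_node(struct toy_map *map)
{
	const struct toy_region *last;
	uint64_t start, end;

	if (!map->cnt)
		return 0;
	last = &map->regions[map->cnt - 1];
	start = map->regions[0].base;
	end = last->base + last->size;

	toy_set_node(map, start, end - start, 0);
	return map->cnt;
}
```

Starting from three adjacent regions left on different nodes by an earlier, failed assignment, the single call leaves one merged region on node 0.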
[PATCH v2 03/17] arm, xtensa: simplify initialization of high memory pages
From: Mike Rapoport The function free_highpages() in both arm and xtensa essentially open-code for_each_free_mem_range() loop to detect high memory pages that were not reserved and that should be initialized and passed to the buddy allocator. Replace open-coded implementation of for_each_free_mem_range() with usage of memblock API to simplify the code. Signed-off-by: Mike Rapoport Reviewed-by: Max Filippov # xtensa Tested-by: Max Filippov # xtensa --- arch/arm/mm/init.c| 48 +++-- arch/xtensa/mm/init.c | 55 --- 2 files changed, 18 insertions(+), 85 deletions(-) diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index 01e18e43b174..626af348eb8f 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -352,61 +352,29 @@ static void __init free_unused_memmap(void) #endif } -#ifdef CONFIG_HIGHMEM -static inline void free_area_high(unsigned long pfn, unsigned long end) -{ - for (; pfn < end; pfn++) - free_highmem_page(pfn_to_page(pfn)); -} -#endif - static void __init free_highpages(void) { #ifdef CONFIG_HIGHMEM unsigned long max_low = max_low_pfn; - struct memblock_region *mem, *res; + phys_addr_t range_start, range_end; + u64 i; /* set highmem page free */ - for_each_memblock(memory, mem) { - unsigned long start = memblock_region_memory_base_pfn(mem); - unsigned long end = memblock_region_memory_end_pfn(mem); + for_each_free_mem_range(i, NUMA_NO_NODE, MEMBLOCK_NONE, + _start, _end, NULL) { + unsigned long start = PHYS_PFN(range_start); + unsigned long end = PHYS_PFN(range_end); /* Ignore complete lowmem entries */ if (end <= max_low) continue; - if (memblock_is_nomap(mem)) - continue; - /* Truncate partial highmem entries */ if (start < max_low) start = max_low; - /* Find and exclude any reserved regions */ - for_each_memblock(reserved, res) { - unsigned long res_start, res_end; - - res_start = memblock_region_reserved_base_pfn(res); - res_end = memblock_region_reserved_end_pfn(res); - - if (res_end < start) - continue; - if (res_start < start) - res_start = 
start; - if (res_start > end) - res_start = end; - if (res_end > end) - res_end = end; - if (res_start != start) - free_area_high(start, res_start); - start = res_end; - if (start == end) - break; - } - - /* And now free anything which remains */ - if (start < end) - free_area_high(start, end); + for (; start < end; start++) + free_highmem_page(pfn_to_page(start)); } #endif } diff --git a/arch/xtensa/mm/init.c b/arch/xtensa/mm/init.c index a05b306cf371..ad9d59d93f39 100644 --- a/arch/xtensa/mm/init.c +++ b/arch/xtensa/mm/init.c @@ -79,67 +79,32 @@ void __init zones_init(void) free_area_init(max_zone_pfn); } -#ifdef CONFIG_HIGHMEM -static void __init free_area_high(unsigned long pfn, unsigned long end) -{ - for (; pfn < end; pfn++) - free_highmem_page(pfn_to_page(pfn)); -} - static void __init free_highpages(void) { +#ifdef CONFIG_HIGHMEM unsigned long max_low = max_low_pfn; - struct memblock_region *mem, *res; + phys_addr_t range_start, range_end; + u64 i; - reset_all_zones_managed_pages(); /* set highmem page free */ - for_each_memblock(memory, mem) { - unsigned long start = memblock_region_memory_base_pfn(mem); - unsigned long end = memblock_region_memory_end_pfn(mem); + for_each_free_mem_range(i, NUMA_NO_NODE, MEMBLOCK_NONE, + _start, _end, NULL) { + unsigned long start = PHYS_PFN(range_start); + unsigned long end = PHYS_PFN(range_end); /* Ignore complete lowmem entries */ if (end <= max_low) continue; - if (memblock_is_nomap(mem)) - continue; - /* Truncate partial highmem entries */ if (start < max_low) start = max_low; - /* Find and exclude any reserved regions */ - for_each_memblock(reserved, res) { - unsigned long res_start,
[PATCH v2 02/17] dma-contiguous: simplify cma_early_percent_memory()
From: Mike Rapoport

The memory size calculation in cma_early_percent_memory() traverses
memblock.memory rather than simply calling memblock_phys_mem_size(). The
comment in that function suggests that at some point there had to be a
call to memblock_analyze() before memblock_phys_mem_size() could be used.
As of now, there is no memblock_analyze() at all and
memblock_phys_mem_size() can be used as soon as cold-plug memory is
registered with memblock.

Replace the loop over memblock.memory with a call to
memblock_phys_mem_size().

Signed-off-by: Mike Rapoport
Reviewed-by: Christoph Hellwig
---
 kernel/dma/contiguous.c | 11 +--
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index 15bc5026c485..1992afd8ca7b 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -73,16 +73,7 @@ early_param("cma", early_cma);
 
 static phys_addr_t __init __maybe_unused cma_early_percent_memory(void)
 {
-	struct memblock_region *reg;
-	unsigned long total_pages = 0;
-
-	/*
-	 * We cannot use memblock_phys_mem_size() here, because
-	 * memblock_analyze() has not been called yet.
-	 */
-	for_each_memblock(memory, reg)
-		total_pages += memblock_region_memory_end_pfn(reg) -
-			       memblock_region_memory_base_pfn(reg);
+	unsigned long total_pages = PHYS_PFN(memblock_phys_mem_size());
 
 	return (total_pages * CONFIG_CMA_SIZE_PERCENTAGE / 100) << PAGE_SHIFT;
 }
-- 
2.26.2
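The arithmetic left in cma_early_percent_memory() after this change can be followed with a standalone sketch. The toy_ names and the 4 KiB page size are assumptions made for illustration; the point is that the percentage is taken on a page count, so the result is always a whole number of pages and rounds down.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Bytes to page frames, the role PHYS_PFN() plays in the patch. */
static uint64_t toy_phys_to_pfn(uint64_t bytes)
{
	return bytes >> TOY_PAGE_SHIFT;
}

/*
 * Percentage of physical memory, in bytes. The percentage is applied to the
 * page count, so the result is page aligned and rounded down.
 */
static uint64_t toy_percent_of_memory(uint64_t phys_mem_bytes,
				      unsigned int percent)
{
	uint64_t total_pages = toy_phys_to_pfn(phys_mem_bytes);

	return (total_pages * percent / 100) << TOY_PAGE_SHIFT;
}
```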
[PATCH v2 01/17] KVM: PPC: Book3S HV: simplify kvm_cma_reserve()
From: Mike Rapoport

The memory size calculation in kvm_cma_reserve() traverses memblock.memory
rather than simply calling memblock_phys_mem_size(). The comment in that
function suggests that at some point there had to be a call to
memblock_analyze() before memblock_phys_mem_size() could be used.
As of now, there is no memblock_analyze() at all and
memblock_phys_mem_size() can be used as soon as cold-plug memory is
registered with memblock.

Replace the loop over memblock.memory with a call to
memblock_phys_mem_size().

Signed-off-by: Mike Rapoport
---
 arch/powerpc/kvm/book3s_hv_builtin.c | 11 ++-
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index 7cd3cf3d366b..56ab0d28de2a 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -95,22 +95,15 @@ EXPORT_SYMBOL_GPL(kvm_free_hpt_cma);
 void __init kvm_cma_reserve(void)
 {
 	unsigned long align_size;
-	struct memblock_region *reg;
-	phys_addr_t selected_size = 0;
+	phys_addr_t selected_size;
 
 	/*
 	 * We need CMA reservation only when we are in HV mode
 	 */
 	if (!cpu_has_feature(CPU_FTR_HVMODE))
 		return;
-	/*
-	 * We cannot use memblock_phys_mem_size() here, because
-	 * memblock_analyze() has not been called yet.
-	 */
-	for_each_memblock(memory, reg)
-		selected_size += memblock_region_memory_end_pfn(reg) -
-				 memblock_region_memory_base_pfn(reg);
 
+	selected_size = PHYS_PFN(memblock_phys_mem_size());
 	selected_size = (selected_size * kvm_cma_resv_ratio / 100) << PAGE_SHIFT;
 	if (selected_size) {
 		pr_debug("%s: reserving %ld MiB for global area\n", __func__,
-- 
2.26.2
[PATCH v2 00/17] memblock: seasonal cleaning^w cleanup
From: Mike Rapoport

Hi,

These patches simplify several uses of memblock iterators and hide some of
the memblock implementation details from the rest of the system.

The patches are on top of v5.8-rc7 + cherry-pick of "mm/sparse: cleanup the
code surrounding memory_present()" [1] from mmotm tree.

v2 changes:
* replace for_each_memblock() with two versions, one for memblock.memory
  and another one for memblock.reserved
* fix overzealous cleanup of powerpc fadump: keep the traversal over the
  memblocks, but use better suited iterators
* don't remove traversal over memblock.reserved in x86 numa cleanup but
  replace for_each_memblock() with new for_each_reserved_mem_region()
* simplify ramdisk and crash kernel allocations on x86
* drop more redundant and unused code: __next_reserved_mem_region() and
  memblock_mem_size()
* add description of numa initialization fix on arm64 (thanks Jonathan)
* add Acked and Reviewed tags

[1] http://lkml.kernel.org/r/20200712083130.22919-1-r...@kernel.org

Mike Rapoport (17):
  KVM: PPC: Book3S HV: simplify kvm_cma_reserve()
  dma-contiguous: simplify cma_early_percent_memory()
  arm, xtensa: simplify initialization of high memory pages
  arm64: numa: simplify dummy_numa_init()
  h8300, nds32, openrisc: simplify detection of memory extents
  riscv: drop unneeded node initialization
  microblaze: drop unneeded NUMA and sparsemem initializations
  memblock: make for_each_memblock_type() iterator private
  memblock: make memblock_debug and related functionality private
  memblock: reduce number of parameters in for_each_mem_range()
  arch, mm: replace for_each_memblock() with for_each_mem_pfn_range()
  arch, drivers: replace for_each_memblock() with for_each_mem_range()
  x86/setup: simplify initrd relocation and reservation
  x86/setup: simplify reserve_crashkernel()
  memblock: remove unused memblock_mem_size()
  memblock: implement for_each_reserved_mem_region() using __next_mem_region()
  memblock: use separate iterators for memory and reserved regions

 .clang-format                            | 4 +-
 arch/arm/kernel/setup.c                  | 18 +++--
 arch/arm/mm/init.c                       | 59
 arch/arm/mm/mmu.c                        | 39 ---
 arch/arm/mm/pmsa-v7.c                    | 20 +++---
 arch/arm/mm/pmsa-v8.c                    | 17 +++--
 arch/arm/xen/mm.c                        | 7 +-
 arch/arm64/kernel/machine_kexec_file.c   | 6 +-
 arch/arm64/kernel/setup.c                | 4 +-
 arch/arm64/mm/init.c                     | 11 ++-
 arch/arm64/mm/kasan_init.c               | 10 +--
 arch/arm64/mm/mmu.c                      | 11 +--
 arch/arm64/mm/numa.c                     | 15 ++---
 arch/c6x/kernel/setup.c                  | 9 +--
 arch/h8300/kernel/setup.c                | 8 +--
 arch/microblaze/mm/init.c                | 24 ++-
 arch/mips/cavium-octeon/dma-octeon.c     | 12 ++--
 arch/mips/kernel/setup.c                 | 31 +
 arch/mips/netlogic/xlp/setup.c           | 2 +-
 arch/nds32/kernel/setup.c                | 8 +--
 arch/openrisc/kernel/setup.c             | 9 +--
 arch/openrisc/mm/init.c                  | 8 ++-
 arch/powerpc/kernel/fadump.c             | 57
 arch/powerpc/kvm/book3s_hv_builtin.c     | 11 +--
 arch/powerpc/mm/book3s64/hash_utils.c    | 16 ++---
 arch/powerpc/mm/book3s64/radix_pgtable.c | 11 ++-
 arch/powerpc/mm/kasan/kasan_init_32.c    | 8 +--
 arch/powerpc/mm/mem.c                    | 33 +
 arch/powerpc/mm/numa.c                   | 7 +-
 arch/powerpc/mm/pgtable_32.c             | 8 +--
 arch/riscv/mm/init.c                     | 34 +++---
 arch/riscv/mm/kasan_init.c               | 10 +--
 arch/s390/kernel/crash_dump.c            | 8 +--
 arch/s390/kernel/setup.c                 | 31 +
 arch/s390/mm/page-states.c               | 6 +-
 arch/s390/mm/vmem.c                      | 16 +++--
 arch/sh/mm/init.c                        | 9 +--
 arch/sparc/mm/init_64.c                  | 12 ++--
 arch/x86/kernel/setup.c                  | 56 +---
 arch/x86/mm/numa.c                       | 2 +-
 arch/xtensa/mm/init.c                    | 55 +++
 drivers/bus/mvebu-mbus.c                 | 12 ++--
 drivers/irqchip/irq-gic-v3-its.c         | 2 +-
 drivers/s390/char/zcore.c                | 9 +--
 include/linux/memblock.h                 | 65 +-
 kernel/dma/contiguous.c                  | 11 +--
 mm/memblock.c                            | 85
 mm/page_alloc.c                          | 11 ++-
 mm/sparse.c                              | 10 ++-
 49 files changed, 366 insertions(+), 561 deletions(-)

-- 
2.26.2
Re: [RFC PATCH 1/2] powerpc/numa: Introduce logical numa id
Srikar Dronamraju writes:

> * Aneesh Kumar K.V [2020-07-31 16:49:14]:
>
>> We use ibm,associativity and ibm,associativity-lookup-arrays to derive the numa
>> node numbers. These device tree properties are firmware indicated grouping of
>> resources based on their hierarchy in the platform. These numbers (group id) are
>> not sequential and hypervisor/firmware can follow different numbering schemes.
>> For ex: on powernv platforms, we group them in the below order.
>>
>> * - CCM node ID
>> * - HW card ID
>> * - HW module ID
>> * - Chip ID
>> * - Core ID
>>
>> Based on ibm,associativity-reference-points we use one of the above group ids as
>> Linux NUMA node id. (On PowerNV platform Chip ID is used). This results
>> in Linux reporting non-linear NUMA node id and which also results in Linux
>> reporting empty node 0 NUMA nodes.
>>
>
> If its just to eliminate node 0, then we have 2 other probably better
> solutions.
> 1. Dont mark node 0 as spl (currently still in mm-tree and a result in
> linux-next)
> 2. powerpc specific: explicitly clear node 0 during numa bringup.
>

I am not sure I consider them better. But yes, those patches are good
and also resolve the node 0 initialization when the firmware didn't
indicate the presence of such a node.

In addition, this patch makes sure that we get the same topology report
across reboots on a virtualized partition as long as the cpu/memory
ratio per PowerVM domain remains the same. This should also help to
avoid confusion after an LPM migration once we start applying topology
updates.

>> This can be resolved by mapping the firmware provided group id to a logical Linux
>> NUMA id. In this patch, we do this only for pseries platforms considering the
>
> On PowerVM, as you would know the nid is already a logical or a flattened
> chip-id and not the actual hardware chip-id.

Yes. But then they are derived based on PowerVM resources AKA domains.
Now, based on the available resources on a system, we could end up with
different node numbers for the same topology across reboots. Making it
logical at the OS level prevents that.

>
>> firmware group id is a virtualized entity and users would not have drawn any
>> conclusion based on the Linux Numa Node id.
>>
>> On PowerNV platform since we have historically mapped Chip ID as Linux NUMA node
>> id, we keep the existing Linux NUMA node id numbering.
>>
>> Before Fix:
>> # numactl -H
>> available: 2 nodes (0-1)
>> node 0 cpus:
>> node 0 size: 0 MB
>> node 0 free: 0 MB
>> node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
>> 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
>> 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
>> node 1 size: 50912 MB
>> node 1 free: 45248 MB
>> node distances:
>> node 0 1
>> 0: 10 40
>> 1: 40 10
>>
>> after fix
>> # numactl -H
>> available: 1 nodes (0)
>> node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
>> 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
>> 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
>> node 0 size: 50912 MB
>> node 0 free: 49724 MB
>> node distances:
>> node 0
>> 0: 10
>>
>> Signed-off-by: Aneesh Kumar K.V
>> ---
>> arch/powerpc/include/asm/topology.h | 1 +
>> arch/powerpc/mm/numa.c | 49 ++---
>> 2 files changed, 39 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
>> index f0b6300e7dd3..15b0424a27a8 100644
>> --- a/arch/powerpc/include/asm/topology.h
>> +++ b/arch/powerpc/include/asm/topology.h
>> @@ -118,5 +118,6 @@ int get_physical_package_id(int cpu);
>> #endif
>> #endif
>>
>> +int firmware_group_id_to_nid(int firmware_gid);
>> #endif /* __KERNEL__ */
>> #endif /* _ASM_POWERPC_TOPOLOGY_H */
>> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
>> index e437a9ac4956..6c659aada55b 100644
>> --- a/arch/powerpc/mm/numa.c
>> +++ b/arch/powerpc/mm/numa.c
>> @@ -221,25 +221,51 @@ static void initialize_distance_lookup_table(int nid,
>> }
>> }
>>
>> +static u32 nid_map[MAX_NUMNODES] = {[0 ... MAX_NUMNODES - 1] = NUMA_NO_NODE};
>> +
>> +int firmware_group_id_to_nid(int firmware_gid)
>> +{
>> +	static int last_nid = 0;
>> +
>> +	/*
>> +	 * For PowerNV we don't change the node id. This helps to avoid
>> +	 * confusion w.r.t the expected node ids. On pseries, node numbers
>> +	 * are virtualized. Hence do logical node id for pseries.
>> +	 */
>> +	if (!firmware_has_feature(FW_FEATURE_LPAR))
>> +		return firmware_gid;
>> +
>> +	if (firmware_gid == -1)
>> +		return NUMA_NO_NODE;
>> +
>> +	if (nid_map[firmware_gid] == NUMA_NO_NODE)
>> +		nid_map[firmware_gid] = last_nid++;
>
> How do we ensure 2 simultaneous firmware_group_id_to_nid() calls dont end up
> at this
Re: [PATCH v3] selftests: powerpc: Fix CPU affinity for child process
On Tue, 9 Jun 2020 13:44:23 +0530, Harish wrote:
> On systems with large number of cpus, test fails trying to set
> affinity by calling sched_setaffinity() with smaller size for
> affinity mask. This patch fixes it by making sure that the size of
> allocated affinity mask is dependent on the number of CPUs as
> reported by get_nprocs().

Applied to powerpc/next.

[1/1] selftests/powerpc: Fix CPU affinity for child process
      https://git.kernel.org/powerpc/c/854eb5022be04f81e318765f089f41a57c8e5d83

cheers
Re: [PATCH v4 0/2] powerpc/papr_scm: add support for reporting NVDIMM 'life_used_percentage' metric
On Fri, 31 Jul 2020 12:11:51 +0530, Vaibhav Jain wrote:
> Changes since v3[1]:
>
> * Fixed a rebase issue pointed out by Aneesh in first patch in the series.
>
> [1] https://lore.kernel.org/linux-nvdimm/20200730121303.134230-1-vaib...@linux.ibm.com

Applied to powerpc/next.

[1/2] powerpc/papr_scm: Fetch nvdimm performance stats from PHYP
      https://git.kernel.org/powerpc/c/2d02bf835e5731de632c8a13567905fa7c0da01c
[2/2] powerpc/papr_scm: Add support for fetching nvdimm 'fuel-gauge' metric
      https://git.kernel.org/powerpc/c/af0870c4e75655b1931d0a5ffde2f448a2794362

cheers
Re: [PATCH] powerpc: fix function annotations to avoid section mismatch warnings with gcc-10
On Wed, 29 Jul 2020 15:37:41 +0200, Vladis Dronov wrote: > Certain warnings are emitted for powerpc code when building with a gcc-10 > toolset: > > WARNING: modpost: vmlinux.o(.text.unlikely+0x377c): Section mismatch in > reference from the function remove_pmd_table() to the function > .meminit.text:split_kernel_mapping() > The function remove_pmd_table() references > the function __meminit split_kernel_mapping(). > This is often because remove_pmd_table lacks a __meminit > annotation or the annotation of split_kernel_mapping is wrong. > > [...] Applied to powerpc/next. [1/1] powerpc: fix function annotations to avoid section mismatch warnings with gcc-10 https://git.kernel.org/powerpc/c/aff779515a070df7e23da9e86f1096f7d10d647e cheers
Re: [PATCH v3] selftests: powerpc: Fix online CPU selection
On Thu, 30 Jul 2020 10:38:46 +0530, Sandipan Das wrote: > The size of the CPU affinity mask must be large enough for > systems with a very large number of CPUs. Otherwise, tests > which try to determine the first online CPU by calling > sched_getaffinity() will fail. This makes sure that the size > of the allocated affinity mask is dependent on the number of > CPUs as reported by get_nprocs_conf(). Applied to powerpc/next. [1/1] selftests/powerpc: Fix online CPU selection https://git.kernel.org/powerpc/c/dfa03fff86027e58c8dba5c03ae68150d4e513ad cheers
Re: [PATCH] powerpc/pseries/hotplug-cpu: remove double free in error path
On Thu, 19 Sep 2019 18:16:33 -0500, Nathan Lynch wrote: > In the unlikely event that the device tree lacks a /cpus node, > find_dlpar_cpus_to_add() oddly frees the cpu_drcs buffer it has been > passed before returning an error. Its only caller also frees the > buffer on error. > > Remove the less conventional kfree() of a caller-supplied buffer from > find_dlpar_cpus_to_add(). Applied to powerpc/next. [1/1] powerpc/pseries/hotplug-cpu: Remove double free in error path https://git.kernel.org/powerpc/c/a0ff72f9f5a780341e7ff5e9ba50a0dad5fa1980 cheers
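The ownership rule behind this fix can be shown with a small userspace sketch. The names `find_items()` and `add_items()` are hypothetical stand-ins, and malloc/free replace the kernel allocator; the point is only that the callee never frees a buffer it was handed, so the caller has a single release point.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical callee: fills a caller-supplied buffer. It does not own
 * the buffer, so it must not free it, even when it fails.
 */
static int find_items(int *buf, size_t n, int have_source)
{
	size_t i;

	if (!have_source)
		return -ENOENT;	/* report the error, leave buf alone */

	for (i = 0; i < n; i++)
		buf[i] = (int)i;
	return 0;
}

/* Hypothetical caller: allocates the buffer and frees it exactly once. */
static int add_items(size_t n, int have_source)
{
	int *buf = calloc(n, sizeof(*buf));
	int rc;

	if (!buf)
		return -ENOMEM;

	rc = find_items(buf, n, have_source);
	free(buf);	/* single release point for success and error paths */
	return rc;
}
```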
Re: [PATCH 0/4] cacheinfo instrumentation tweaks
On Thu, 27 Jun 2019 00:15:33 -0500, Nathan Lynch wrote: > A few changes that would have aided debugging this code's interactions > with partition migration, maybe they'll help with the next thing > (hibernation?). > > Nathan Lynch (4): > powerpc/cacheinfo: set pr_fmt > powerpc/cacheinfo: name@unit instead of full DT path in debug messages > powerpc/cacheinfo: improve diagnostics about malformed cache lists > powerpc/cacheinfo: warn if cache object chain becomes unordered > > [...] Applied to powerpc/next. [1/4] powerpc/cacheinfo: Set pr_fmt() https://git.kernel.org/powerpc/c/e2b3c165f27a6bdb197b0dc86683ed36f61c5527 [2/4] powerpc/cacheinfo: Use name@unit instead of full DT path in debug messages https://git.kernel.org/powerpc/c/be6f885e97e9304541057fbf25148685847ef310 [3/4] powerpc/cacheinfo: Improve diagnostics about malformed cache lists https://git.kernel.org/powerpc/c/1b3da8ffaa158e9a95c19b17c14d7259d58bc0cd [4/4] powerpc/cacheinfo: Warn if cache object chain becomes unordered https://git.kernel.org/powerpc/c/6ec54363f198aae9c1343f82ff5b865546944a73 cheers
Re: [PATCH 0/2] migration/prrn instrumentation tweaks
On Thu, 27 Jun 2019 00:30:42 -0500, Nathan Lynch wrote: > Mainly this produces better information about what's happening with > the device tree as a result of LPM or PRRN. > > Nathan Lynch (2): > powerpc/pseries/mobility: set pr_fmt > powerpc/pseries/mobility: add pr_debug for device tree changes > > [...] Applied to powerpc/next. [1/2] powerpc/pseries/mobility: Set pr_fmt() https://git.kernel.org/powerpc/c/494a66f34e00b6a1897b5a1ab150a19265696b17 [2/2] powerpc/pseries/mobility: Add pr_debug() for device tree changes https://git.kernel.org/powerpc/c/5d8b1f9dea17b4bf5e5f088f39eeab32c7e487be cheers
Re: [PATCH v2] hmi: Move hmi irq stat from percpu variable to paca.
On Tue, 23 Jun 2020 15:57:50 +0530, Mahesh Salgaonkar wrote: > With the proposed change in percpu bootmem allocator to use page mapping > [1], the percpu first chunk memory area can come from vmalloc ranges. This > makes hmi handler to crash the kernel whenever percpu variable is accessed > in real mode. This patch fixes this issue by moving the hmi irq stat > inside paca for safe access in realmode. > > [1] > https://lore.kernel.org/linuxppc-dev/20200608070904.387440-1-aneesh.ku...@linux.ibm.com/ Applied to powerpc/next. [1/1] powerpc/64s: Move HMI IRQ stat from percpu variable to paca. https://git.kernel.org/powerpc/c/ada68a66b72687e6b74e35c42efd1783e84b01fd cheers
Re: [PATCH v6 00/11] ppc64: enable kdump support for kexec_file_load syscall
On Wed, 29 Jul 2020 17:08:44 +0530, Hari Bathini wrote: > Sorry! There was a gateway issue on my system while posting v5, due to > which some patches did not make it through. Resending... > > This patch series enables kdump support for kexec_file_load system > call (kexec -s -p) on PPC64. The changes are inspired from kexec-tools > code but heavily modified for kernel consumption. > > [...] Applied to powerpc/next. [01/11] kexec_file: Allow archs to handle special regions while locating memory hole https://git.kernel.org/powerpc/c/f891f19736bdf404845f97d8038054be37160ea8 [02/11] powerpc/kexec_file: Mark PPC64 specific code https://git.kernel.org/powerpc/c/19031275a5881233b4fc31b7dee68bf0b0758bbc [03/11] powerpc/kexec_file: Add helper functions for getting memory ranges https://git.kernel.org/powerpc/c/180adfc532a83c1d74146449f7385f767d4b8059 [04/11] powerpc/kexec_file: Avoid stomping memory used by special regions https://git.kernel.org/powerpc/c/b8e55a3e5c208862eacded5aad822184f89f85d9 [05/11] powerpc/drmem: Make LMB walk a bit more flexible https://git.kernel.org/powerpc/c/adfefc609e55edc5dce18a68d1526af6d70aaf86 [06/11] powerpc/kexec_file: Restrict memory usage of kdump kernel https://git.kernel.org/powerpc/c/7c64e21a1c5a5bcd651d895b8faa68e9cdcc433d [07/11] powerpc/kexec_file: Setup backup region for kdump kernel https://git.kernel.org/powerpc/c/1a1cf93c200581c72a3cd521e1e0a1a3b5d0077d [08/11] powerpc/kexec_file: Prepare elfcore header for crashing kernel https://git.kernel.org/powerpc/c/cb350c1f1f867db16725f1bb06be033ece19e998 [09/11] powerpc/kexec_file: Add appropriate regions for memory reserve map https://git.kernel.org/powerpc/c/6ecd0163d36049b5f2435a8658f1320c9f3f2924 [10/11] powerpc/kexec_file: Fix kexec load failure with lack of memory hole https://git.kernel.org/powerpc/c/b5667d13be8d0928a02b46e0c6f7ab891d32f697 [11/11] powerpc/kexec_file: Enable early kernel OPAL calls https://git.kernel.org/powerpc/c/2e6bd221d96fcfd9bd1eed5cd9c008e7959daed7 cheers
Re: [PATCH] powerpc/fsl/dts: add missing P4080DS I2C devices
On Fri, 21 Sep 2018 01:04:22 +0200, David Lamparter wrote: > This just adds the zl2006 voltage regulators / power monitors and the > onboard I2C eeproms. The ICS9FG108 clock chip doesn't seem to have a > driver, so it is left in the DTS as a comment. And for good measure, > the SPD eeproms are tagged as such. Applied to powerpc/next. [1/1] powerpc/fsl/dts: add missing P4080DS I2C devices https://git.kernel.org/powerpc/c/d3c61954fc1827df571e235b9a98e10108ef5c3d cheers
Re: [PATCH v3 0/3] cpuidle-pseries: Parse extended CEDE information for idle.
On Thu, 30 Jul 2020 11:02:54 +0530, Gautham R. Shenoy wrote: > This is a v3 of the patch series to parse the extended CEDE > information in the pseries-cpuidle driver. > > The previous two versions of the patches can be found here: > > v2: > https://lore.kernel.org/lkml/1596005254-25753-1-git-send-email-...@linux.vnet.ibm.com/ > > [...] Applied to powerpc/next. [1/3] cpuidle: pseries: Set the latency-hint before entering CEDE https://git.kernel.org/powerpc/c/3af0ada7dd98c6da35c1fd7f107af3b9aa5e904c [2/3] cpuidle: pseries: Add function to parse extended CEDE records https://git.kernel.org/powerpc/c/054e44ba99ae36918631fcbf5f034e466c2f1b73 [3/3] cpuidle: pseries: Fixup exit latency for CEDE(0) https://git.kernel.org/powerpc/c/d947fb4c965cdb7242f3f91124ea16079c49fa8b cheers
Re: [PATCH] selftests/powerpc: return skip code for spectre_v2
On Tue, 28 Jul 2020 12:50:39 -0300, Thadeu Lima de Souza Cascardo wrote: > When running under older versions of qemu or under newer versions with old > machine types, some security features will not be reported to the guest. > This will lead the guest OS to consider itself Vulnerable to spectre_v2. > > So, spectre_v2 test fails in such cases when the host is mitigated and > mispredictions cannot be detected as expected by the test. > > [...] Applied to powerpc/next. [1/1] selftests/powerpc: Return skip code for spectre_v2 https://git.kernel.org/powerpc/c/f3054ffd71b5afd44832b2207e6e90267e1cd2d1 cheers
Re: [PATCH v4 0/3] Add support for divde[.] and divdeu[.] instruction emulation
On Tue, 28 Jul 2020 18:33:05 +0530, Balamuruhan S wrote: > This patchset adds support to emulate divde, divde., divdeu and divdeu. > instructions and testcases for it. > > Resend v4: rebased on latest powerpc next branch > > Changes in v4: > - > Fix review comments from Naveen, > * replace TEST_DIVDEU() instead of wrongly used TEST_DIVDEU_DOT() in > divdeu testcase. > * Include `acked-by` tag from Naveen for the series. > * Rebase it on latest mpe's merge tree. > > [...] Applied to powerpc/next. [1/3] powerpc/ppc-opcode: Add divde and divdeu opcodes https://git.kernel.org/powerpc/c/8902c6f96364d1117236948d6c7b9178f428529c [2/3] powerpc/sstep: Add support for divde[.] and divdeu[.] instructions https://git.kernel.org/powerpc/c/151c32bf5ebdd41114267717dc4b53d2632cbd30 [3/3] powerpc/test_emulate_step: Add testcases for divde[.] and divdeu[.] instructions https://git.kernel.org/powerpc/c/b859c95cf4b936b5e8019e7ab68ee2740e609ffd cheers
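For readers unfamiliar with the instruction, here is a rough userspace model of what divdeu computes, based on my reading of the Power ISA description: the dividend is RA followed by 64 zero bits, divided by RB as unsigned values. This is not the kernel's sstep implementation; `divdeu_sketch()` is a hypothetical name, and the code assumes a compiler that provides `unsigned __int128` (GCC and Clang on 64-bit targets).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of divdeu: RT = ((RA << 64) / RB), unsigned.
 * The architecture leaves RT undefined when RB is zero or when the
 * quotient does not fit in 64 bits, which happens exactly when RA >= RB.
 * This sketch reports those cases by returning false.
 */
static bool divdeu_sketch(uint64_t ra, uint64_t rb, uint64_t *rt)
{
	if (rb == 0 || ra >= rb)
		return false;

	*rt = (uint64_t)(((unsigned __int128)ra << 64) / rb);
	return true;
}
```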
Re: [PATCH] powerpc/configs: Add BLK_DEV_NVME to pseries_defconfig
On Wed, 29 Jul 2020 14:08:28 +1000, Anton Blanchard wrote: > I've forgotten to manually enable NVME when building pseries kernels > for machines with NVME adapters. Since it's a reasonably common > configuration, enable it by default. Applied to powerpc/next. [1/1] powerpc/configs: Add BLK_DEV_NVME to pseries_defconfig https://git.kernel.org/powerpc/c/fdaa7ce2016ccd09a538b05bace5f4479662ddcb cheers
Re: [PATCH 0/2] powerpc: OpenCAPI Cleanup
On Wed, 15 Apr 2020 11:23:41 +1000, Alastair D'Silva wrote: > These patches address checkpatch & kernel doc warnings > in the OpenCAPI infrastructure. > > Alastair D'Silva (2): > ocxl: Remove unnecessary externs > ocxl: Address kernel doc errors & warnings > > [...] Applied to powerpc/next. [1/2] ocxl: Remove unnecessary externs https://git.kernel.org/powerpc/c/c75d42e4c768c403f259f6c7f6217c850cf11be9 [2/2] ocxl: Address kernel doc errors & warnings https://git.kernel.org/powerpc/c/3591538a31af37cf6a2d83f1da99e651a822af8b cheers
Re: [PATCH] powerpc/64s/hash: Fix hash_preload running with interrupts enabled
On Mon, 27 Jul 2020 16:09:47 +1000, Nicholas Piggin wrote: > Commit 2f92447f9f96 ("powerpc/book3s64/hash: Use the pte_t address from the > caller") removed the local_irq_disable from hash_preload, but it was > required for more than just the page table walk: the hash pte busy bit is > effectively a lock which may be taken in interrupt context, and the local > update flag test must not be preempted before it's used. > > This solves apparent lockups with perf interrupting __hash_page_64K. If > get_perf_callchain then also takes a hash fault on the same page while it > is already locked, it will loop forever taking hash faults, which looks like > this: > > [...] Applied to powerpc/fixes. [1/1] powerpc/64s/hash: Fix hash_preload running with interrupts enabled https://git.kernel.org/powerpc/c/909adfc66b9a1db21b5e8733e9ebfa6cd5135d74 cheers
Re: [PATCH 06/15] powerpc: fadamp: simplify fadump_reserve_crash_area()
Mike Rapoport writes:
> On Thu, Jul 30, 2020 at 10:15:13PM +1000, Michael Ellerman wrote:
>> Mike Rapoport writes:
>> > From: Mike Rapoport
>> >
>> > fadump_reserve_crash_area() reserves memory from a specified base address
>> > till the end of the RAM.
>> >
>> > Replace iteration through the memblock.memory with a single call to
>> > memblock_reserve() with appropriate that will take care of proper memory
>>                                      ^
>>                                      parameters?
>> > reservation.
>> >
>> > Signed-off-by: Mike Rapoport
>> > ---
>> >  arch/powerpc/kernel/fadump.c | 20 +---
>> >  1 file changed, 1 insertion(+), 19 deletions(-)
>>
>> I think this looks OK to me, but I don't have a setup to test it easily.
>> I've added Hari to Cc who might be able to.
>>
>> But I'll give you an ack in the hope that it works :)
>
> Actually, I did some digging in the git log and the traversal was added
> there on purpose by the commit b71a693d3db3 ("powerpc/fadump: exclude
> memory holes while reserving memory in second kernel")
> Presuming this is still required I'm going to drop this patch and will
> simply replace for_each_memblock() with for_each_mem_range() in v2.

Thanks.

cheers
Re: [PATCH v2 15/16] powerpc/powernv/sriov: Make single PE mode a per-BAR setting
Nathan Chancellor writes:
> On Wed, Jul 22, 2020 at 04:57:14PM +1000, Oliver O'Halloran wrote:
>> Using single PE BARs to map an SR-IOV BAR is really a choice about what
>> strategy to use when mapping a BAR. It doesn't make much sense for this to
>> be a global setting since a device might have one large BAR which needs to
>> be mapped with single PE windows and another smaller BAR that can be mapped
>> with a regular segmented window. Make the segmented vs single decision a
>> per-BAR setting and clean up the logic that decides which mode to use.
>>
>> Signed-off-by: Oliver O'Halloran
>> ---
>> v2: Dropped unused total_vfs variables in pnv_pci_ioda_fixup_iov_resources()
>>     Dropped bar_no from pnv_pci_iov_resource_alignment()
>>     Minor re-wording of comments.
>> ---
>>  arch/powerpc/platforms/powernv/pci-sriov.c | 131 ++---
>>  arch/powerpc/platforms/powernv/pci.h       |  11 +-
>>  2 files changed, 73 insertions(+), 69 deletions(-)
>>
>> diff --git a/arch/powerpc/platforms/powernv/pci-sriov.c b/arch/powerpc/platforms/powernv/pci-sriov.c
>> index ce8ad6851d73..76215d01405b 100644
>> --- a/arch/powerpc/platforms/powernv/pci-sriov.c
>> +++ b/arch/powerpc/platforms/powernv/pci-sriov.c
>> @@ -260,42 +256,40 @@ void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev)
>>  resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
>>  						      int resno)
>>  {
>> -	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
>>  	struct pnv_iov_data *iov = pnv_iov_get(pdev);
>>  	resource_size_t align;
>>
>> +	/*
>> +	 * iov can be null if we have an SR-IOV device with IOV BAR that can't
>> +	 * be placed in the m64 space (i.e. The BAR is 32bit or non-prefetch).
>> +	 * In that case we don't allow VFs to be enabled since one of their
>> +	 * BARs would not be placed in the correct PE.
>> +	 */
>> +	if (!iov)
>> +		return align;
>> +	if (!iov->vfs_expanded)
>> +		return align;
>> +
>> +	align = pci_iov_resource_size(pdev, resno);

That's, oof.

> I am not sure if it has been reported yet but clang points out that
> align is initialized after its use:
>
> arch/powerpc/platforms/powernv/pci-sriov.c:267:10: warning: variable 'align' is uninitialized when used here [-Wuninitialized]
>         return align;
>                ^
> arch/powerpc/platforms/powernv/pci-sriov.c:258:23: note: initialize the variable 'align' to silence this warning
>         resource_size_t align;
>                              ^
>                               = 0
> 1 warning generated.

But I can't get gcc to warn about it? It produces some code, so it's not
like the whole function has been elided or something. I'm confused.

cheers
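To make the problem in the clang warning concrete, here is a self-contained sketch of the general shape such a fix could take: assign `align` before the early returns so every path returns a defined value. All names here are hypothetical stand-ins for the kernel types, the sizes are made up, and the final scaling step is purely illustrative; this is not claimed to be the fix that was applied.

```c
#include <stddef.h>

typedef unsigned long long resource_size_t;	/* stand-in for the kernel type */

/* Hypothetical stand-in for struct pnv_iov_data. */
struct iov_state {
	int vfs_expanded;
};

/* Hypothetical stand-in for pci_iov_resource_size(); sizes are made up. */
static resource_size_t vf_bar_size(int resno)
{
	return (resource_size_t)4096 << resno;
}

static resource_size_t iov_alignment(const struct iov_state *iov, int resno)
{
	/* Initialise first, so the early returns below are well defined. */
	resource_size_t align = vf_bar_size(resno);

	if (!iov)
		return align;
	if (!iov->vfs_expanded)
		return align;

	/* Illustrative only: the real function's later logic is not shown. */
	return align * (resource_size_t)iov->vfs_expanded;
}
```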
Re: [PATCH] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal
Nathan Lynch writes:
> Michael Ellerman writes:
>> Nathan Lynch writes:
>>> Michael Ellerman writes:
>>>> Nathan Lynch writes:
>>>>> Laurent Dufour writes:
>>>>>> On 28/07/2020 at 19:37, Nathan Lynch wrote:
>>>>>>> The drmem lmb list can have hundreds of thousands of entries, and
>>>>>>> unfortunately lookups take the form of linear searches. As long as
>>>>>>> this is the case, traversals have the potential to monopolize the CPU
>>>>>>> and provoke lockup reports, workqueue stalls, and the like unless
>>>>>>> they explicitly yield.
>>>>>>>
>>>>>>> Rather than placing cond_resched() calls within various
>>>>>>> for_each_drmem_lmb() loop blocks in the code, put it in the iteration
>>>>>>> expression of the loop macro itself so users can't omit it.
>>>>>>
>>>>>> Is that not too much to call cond_resched() on every LMB?
>>>>>>
>>>>>> Could that be less frequent, every 10, or 100, I don't really know ?
>>>>>
>>>>> Everything done within for_each_drmem_lmb is relatively heavyweight
>>>>> already. E.g. calling dlpar_remove_lmb()/dlpar_add_lmb() can take dozens
>>>>> of milliseconds. I don't think cond_resched() is an expensive check in
>>>>> this context.
>>>>
>>>> Hmm, mostly.
>>>>
>>>> But there are quite a few cases like drmem_update_dt_v1():
>>>>
>>>> 	for_each_drmem_lmb(lmb) {
>>>> 		dr_cell->base_addr = cpu_to_be64(lmb->base_addr);
>>>> 		dr_cell->drc_index = cpu_to_be32(lmb->drc_index);
>>>> 		dr_cell->aa_index = cpu_to_be32(lmb->aa_index);
>>>> 		dr_cell->flags = cpu_to_be32(drmem_lmb_flags(lmb));
>>>>
>>>> 		dr_cell++;
>>>> 	}
>>>>
>>>> Which will compile to a pretty tight loop at the moment.
>>>>
>>>> Or drmem_update_dt_v2() which has two loops over all lmbs.
>>>>
>>>> And although the actual TIF check is cheap the function call to do it is not
>>>> free.
>>>>
>>>> So I worry this is going to make some of those long loops take even longer.
>>>
>>> That's fair, and I was wrong - some of the loop bodies are relatively
>>> simple, not doing allocations or taking locks, etc.
>>>
>>> One way to deal is to keep for_each_drmem_lmb() as-is and add a new
>>> iterator that can reschedule, e.g. for_each_drmem_lmb_slow().
>>
>> If we did that, how many call-sites would need converting?
>> Is it ~2 or ~20 or ~200?
>
> At a glance I would convert 15-20 out of the 24 users in the tree I'm
> looking at. Let me know if I should do a v2 with that approach.

OK, that's a bunch of churn then, if we're planning to rework the code
significantly in the near future.

One thought, which I possibly should not put in writing, is that we
could use the alignment of the pointer as a poor man's substitute for a
counter, eg:

+static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb)
+{
+	if (lmb % PAGE_SIZE == 0)
+		cond_resched();
+
+	return ++lmb;
+}

I think the lmbs are allocated in a block, so I think that will work.
Maybe PAGE_SIZE is not the right size to use, but you get the idea.

Gross I know, but might be OK as short term solution?

cheers
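The pointer-alignment idea can be tried out in userspace to see how often the check would actually fire. This is a sketch under assumptions: `at_boundary()` and `count_yields()` are hypothetical helpers, the 24-byte entry size is a guess at the structure size, and the cast to `uintptr_t` is there because C does not define `%` on pointers, so the snippet in the mail would need a similar cast to compile.

```c
#include <stddef.h>
#include <stdint.h>

/* The proposed check: does this entry start exactly on a boundary? */
static int at_boundary(const void *p, size_t boundary)
{
	return (uintptr_t)p % boundary == 0;
}

/*
 * Count how many times a walk over n entries of entry_size bytes,
 * starting at address base, would hit the boundary check.
 */
static size_t count_yields(uintptr_t base, size_t entry_size, size_t n,
			   size_t boundary)
{
	size_t i, yields = 0;

	for (i = 0; i < n; i++) {
		const void *entry = (const void *)(uintptr_t)(base + i * entry_size);

		if (at_boundary(entry, boundary))
			yields++;
	}
	return yields;
}
```

One thing the experiment suggests: the check only fires when an entry starts exactly on the boundary, so with a 24-byte entry and a 4096-byte boundary it triggers once every 512 entries rather than once per page, and never if the array base is offset such that no entry lands on a boundary.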
[merge] Build failure selftest/powerpc/mm/pkey_exec_prot
pkey_exec_prot test from linuxppc merge branch (3f68564f1f5a) fails to
build due to the following error:

gcc -std=gnu99 -O2 -Wall -Werror -DGIT_VERSION='"v5.8-rc7-1276-g3f68564f1f5a"' -I/home/sachin/linux/tools/testing/selftests/powerpc/include -m64 pkey_exec_prot.c /home/sachin/linux/tools/testing/selftests/kselftest_harness.h /home/sachin/linux/tools/testing/selftests/kselftest.h ../harness.c ../utils.c -o /home/sachin/linux/tools/testing/selftests/powerpc/mm/pkey_exec_prot
In file included from pkey_exec_prot.c:18:
/home/sachin/linux/tools/testing/selftests/powerpc/include/pkeys.h:34: error: "SYS_pkey_mprotect" redefined [-Werror]
 #define SYS_pkey_mprotect 386

In file included from /usr/include/sys/syscall.h:31,
                 from /home/sachin/linux/tools/testing/selftests/powerpc/include/utils.h:47,
                 from /home/sachin/linux/tools/testing/selftests/powerpc/include/pkeys.h:12,
                 from pkey_exec_prot.c:18:
/usr/include/bits/syscall.h:1583: note: this is the location of the previous definition
 # define SYS_pkey_mprotect __NR_pkey_mprotect

commit 128d3d021007 introduced this error.

    selftests/powerpc: Move pkey helpers to headers

Possibly the # defines for sys calls can be retained in pkey_exec_prot.c or

Thanks
-Sachin
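One common way to avoid this kind of redefinition error, offered here as an assumption and not as the fix that was eventually applied, is to define the fallback number only when the libc headers have not already provided it. The value 386 is taken from the log above; `pkey_mprotect_nr()` is a hypothetical helper added so the snippet has something to call.

```c
#include <sys/syscall.h>

/*
 * Only supply a fallback when the libc headers did not define the
 * syscall number. 386 is the value shown in the build log above.
 */
#ifndef SYS_pkey_mprotect
#define SYS_pkey_mprotect 386
#endif

/* Hypothetical helper: returns whichever definition is in effect. */
static long pkey_mprotect_nr(void)
{
	return SYS_pkey_mprotect;
}
```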
powerpc: build failures in Linus' tree
Hi all,

We are getting build failures in some PowerPC configs for Linus' tree.
See e.g. http://kisskb.ellerman.id.au/kisskb/buildresult/14306515/

In file included from /kisskb/src/arch/powerpc/include/asm/paca.h:18,
                 from /kisskb/src/arch/powerpc/include/asm/percpu.h:13,
                 from /kisskb/src/include/linux/random.h:14,
                 from /kisskb/src/include/linux/net.h:18,
                 from /kisskb/src/net/ipv6/ip6_fib.c:20:
/kisskb/src/arch/powerpc/include/asm/mmu.h:139:22: error: unknown type name 'next_tlbcam_idx'
  139 | DECLARE_PER_CPU(int, next_tlbcam_idx);

I assume this is caused by commit

  1c9df907da83 ("random: fix circular include dependency on arm64 after addition of percpu.h")

But I can't see how, sorry.

-- 
Cheers,
Stephen Rothwell