[PATCH v7 11/11] powerpc/mm: Enable HAVE_MOVE_PMD support
mremap HAVE_MOVE_PMD/PUD optimization time comparison for 1GB region:

1GB mremap - Source PTE-aligned, Destination PTE-aligned
  mremap time: 2292772ns
1GB mremap - Source PMD-aligned, Destination PMD-aligned
  mremap time: 1158928ns
1GB mremap - Source PUD-aligned, Destination PUD-aligned
  mremap time:   63886ns

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/platforms/Kconfig.cputype | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index f998e655b570..be8ceb5bece4 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -101,6 +101,8 @@ config PPC_BOOK3S_64
 	select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
 	select ARCH_SUPPORTS_HUGETLBFS
 	select ARCH_SUPPORTS_NUMA_BALANCING
+	select HAVE_MOVE_PMD
+	select HAVE_MOVE_PUD
 	select IRQ_WORK
 	select PPC_MM_SLICES
 	select PPC_HAVE_KUEP
-- 
2.31.1
[PATCH v7 10/11] powerpc/book3s64/mm: Update flush_tlb_range to flush page walk cache
flush_tlb_range is special in that we don't specify the page size used
for the translation. Hence when flushing the TLB we flush the
translation cache for all possible page sizes. The kernel also uses the
same interface when moving page tables around. Such a move requires us
to flush the page walk cache.

Instead of adding another interface to force a page walk cache flush,
update flush_tlb_range to flush the page walk cache if the range flushed
is more than the PMD range. A page table move will always involve an
invalidate range more than PMD_SIZE.

Running a microbenchmark with mprotect and parallel memory access didn't
show any observable performance impact.

Signed-off-by: Aneesh Kumar K.V
---
 .../include/asm/book3s/64/tlbflush-radix.h   |  2 +
 arch/powerpc/mm/book3s64/radix_hugetlbpage.c |  8 +++-
 arch/powerpc/mm/book3s64/radix_tlb.c         | 44 ---
 3 files changed, 36 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index 8b33601cdb9d..ab9d5e535000 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -60,6 +60,8 @@ extern void radix__flush_hugetlb_tlb_range(struct vm_area_struct *vma,
 					   unsigned long start, unsigned long end);
 extern void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long start,
 					 unsigned long end, int psize);
+void radix__flush_tlb_pwc_range_psize(struct mm_struct *mm, unsigned long start,
+				      unsigned long end, int psize);
 extern void radix__flush_pmd_tlb_range(struct vm_area_struct *vma,
 				       unsigned long start, unsigned long end);
 extern void radix__flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
diff --git a/arch/powerpc/mm/book3s64/radix_hugetlbpage.c b/arch/powerpc/mm/book3s64/radix_hugetlbpage.c
index cb91071eef52..23d3e08911d3 100644
--- a/arch/powerpc/mm/book3s64/radix_hugetlbpage.c
+++ b/arch/powerpc/mm/book3s64/radix_hugetlbpage.c
@@ -32,7 +32,13 @@ void radix__flush_hugetlb_tlb_range(struct vm_area_struct *vma, unsigned long st
 	struct hstate *hstate = hstate_file(vma->vm_file);
 
 	psize = hstate_get_psize(hstate);
-	radix__flush_tlb_range_psize(vma->vm_mm, start, end, psize);
+	/*
+	 * Flush PWC even if we get PUD_SIZE hugetlb invalidate to keep this simpler.
+	 */
+	if (end - start >= PUD_SIZE)
+		radix__flush_tlb_pwc_range_psize(vma->vm_mm, start, end, psize);
+	else
+		radix__flush_tlb_range_psize(vma->vm_mm, start, end, psize);
 }
 
 /*
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index 817a02ef6032..35c5eb23bfaf 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -997,14 +997,13 @@ static unsigned long tlb_local_single_page_flush_ceiling __read_mostly = POWER9_
 
 static inline void __radix__flush_tlb_range(struct mm_struct *mm,
 					    unsigned long start, unsigned long end)
-
 {
 	unsigned long pid;
 	unsigned int page_shift = mmu_psize_defs[mmu_virtual_psize].shift;
 	unsigned long page_size = 1UL << page_shift;
 	unsigned long nr_pages = (end - start) >> page_shift;
 	bool fullmm = (end == TLB_FLUSH_ALL);
-	bool flush_pid;
+	bool flush_pid, flush_pwc = false;
 	enum tlb_flush_type type;
 
 	pid = mm->context.id;
@@ -1023,8 +1022,16 @@ static inline void __radix__flush_tlb_range(struct mm_struct *mm,
 		flush_pid = nr_pages > tlb_single_page_flush_ceiling;
 	else
 		flush_pid = nr_pages > tlb_local_single_page_flush_ceiling;
+	/*
+	 * full pid flush already does the PWC flush. if it is not full pid
+	 * flush check the range is more than PMD and force a pwc flush
+	 * mremap() depends on this behaviour.
+	 */
+	if (!flush_pid && (end - start) >= PMD_SIZE)
+		flush_pwc = true;
 
 	if (!mmu_has_feature(MMU_FTR_GTSE) && type == FLUSH_TYPE_GLOBAL) {
+		unsigned long type = H_RPTI_TYPE_TLB;
 		unsigned long tgt = H_RPTI_TARGET_CMMU;
 		unsigned long pg_sizes = psize_to_rpti_pgsize(mmu_virtual_psize);
 
@@ -1032,19 +1039,20 @@ static inline void __radix__flush_tlb_range(struct mm_struct *mm,
 			pg_sizes |= psize_to_rpti_pgsize(MMU_PAGE_2M);
 		if (atomic_read(&mm->context.copros) > 0)
 			tgt |= H_RPTI_TARGET_NMMU;
-		pseries_rpt_invalidate(pid, tgt, H_RPTI_TYPE_TLB, pg_sizes,
-				       start, end);
+		if (flush_pwc)
+			type |= H_RPTI_TYPE_PWC;
+		pseries_rpt_invalidate(pid,
[PATCH v7 09/11] mm/mremap: Allow arch runtime override
Architectures like ppc64 support faster mremap only with radix
translation. Hence allow a runtime check w.r.t. support for fast mremap.

Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/include/asm/tlb.h |  6 ++
 mm/mremap.c                    | 15 ++-
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h
index 160422a439aa..09a9ae5f3656 100644
--- a/arch/powerpc/include/asm/tlb.h
+++ b/arch/powerpc/include/asm/tlb.h
@@ -83,5 +83,11 @@ static inline int mm_is_thread_local(struct mm_struct *mm)
 }
 #endif
 
+#define arch_supports_page_table_move arch_supports_page_table_move
+static inline bool arch_supports_page_table_move(void)
+{
+	return radix_enabled();
+}
+
 #endif /* __KERNEL__ */
 #endif /* __ASM_POWERPC_TLB_H */
diff --git a/mm/mremap.c b/mm/mremap.c
index dacfa9111ab1..9cd352fb9cf8 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -25,7 +25,7 @@
 #include
 #include
-#include
+#include
 #include
 #include "internal.h"
 
@@ -210,6 +210,15 @@ static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd,
 		drop_rmap_locks(vma);
 }
 
+#ifndef arch_supports_page_table_move
+#define arch_supports_page_table_move arch_supports_page_table_move
+static inline bool arch_supports_page_table_move(void)
+{
+	return IS_ENABLED(CONFIG_HAVE_MOVE_PMD) ||
+		IS_ENABLED(CONFIG_HAVE_MOVE_PUD);
+}
+#endif
+
 #ifdef CONFIG_HAVE_MOVE_PMD
 static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
 		  unsigned long new_addr, pmd_t *old_pmd, pmd_t *new_pmd)
@@ -218,6 +227,8 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
 	struct mm_struct *mm = vma->vm_mm;
 	pmd_t pmd;
 
+	if (!arch_supports_page_table_move())
+		return false;
 	/*
 	 * The destination pmd shouldn't be established, free_pgtables()
 	 * should have released it.
@@ -284,6 +295,8 @@ static bool move_normal_pud(struct vm_area_struct *vma, unsigned long old_addr,
 	struct mm_struct *mm = vma->vm_mm;
 	pud_t pud;
 
+	if (!arch_supports_page_table_move())
+		return false;
 	/*
 	 * The destination pud shouldn't be established, free_pgtables()
 	 * should have released it.
-- 
2.31.1
[PATCH v7 08/11] powerpc/mm/book3s64: Fix possible build error
Update _tlbiel_pid() such that we can avoid build errors like the one
below when using this function in other places.

arch/powerpc/mm/book3s64/radix_tlb.c: In function ‘__radix__flush_tlb_range_psize’:
arch/powerpc/mm/book3s64/radix_tlb.c:114:2: warning: ‘asm’ operand 3 probably does not match constraints
  114 |  asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
      |  ^~~
arch/powerpc/mm/book3s64/radix_tlb.c:114:2: error: impossible constraint in ‘asm’
make[4]: *** [scripts/Makefile.build:271: arch/powerpc/mm/book3s64/radix_tlb.o] Error 1

With this fix, we can also drop the __always_inline in
__radix__flush_tlb_range_psize, which was added by commit e12d6d7d46a6
("powerpc/mm/radix: mark __radix__flush_tlb_range_psize() as
__always_inline").

Reviewed-by: Christophe Leroy
Acked-by: Michael Ellerman
Signed-off-by: Aneesh Kumar K.V
---
 arch/powerpc/mm/book3s64/radix_tlb.c | 26 +-
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index 409e61210789..817a02ef6032 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -291,22 +291,30 @@ static inline void fixup_tlbie_lpid(unsigned long lpid)
 /*
  * We use 128 set in radix mode and 256 set in hpt mode.
  */
-static __always_inline void _tlbiel_pid(unsigned long pid, unsigned long ric)
+static inline void _tlbiel_pid(unsigned long pid, unsigned long ric)
 {
 	int set;
 
 	asm volatile("ptesync": : :"memory");
 
-	/*
-	 * Flush the first set of the TLB, and if we're doing a RIC_FLUSH_ALL,
-	 * also flush the entire Page Walk Cache.
-	 */
-	__tlbiel_pid(pid, 0, ric);
+	switch (ric) {
+	case RIC_FLUSH_PWC:
 
-	/* For PWC, only one flush is needed */
-	if (ric == RIC_FLUSH_PWC) {
+		/* For PWC, only one flush is needed */
+		__tlbiel_pid(pid, 0, RIC_FLUSH_PWC);
 		ppc_after_tlbiel_barrier();
 		return;
+	case RIC_FLUSH_TLB:
+		__tlbiel_pid(pid, 0, RIC_FLUSH_TLB);
+		break;
+	case RIC_FLUSH_ALL:
+	default:
+		/*
+		 * Flush the first set of the TLB, and if
+		 * we're doing a RIC_FLUSH_ALL, also flush
+		 * the entire Page Walk Cache.
+		 */
+		__tlbiel_pid(pid, 0, RIC_FLUSH_ALL);
 	}
 
 	if (!cpu_has_feature(CPU_FTR_ARCH_31)) {
@@ -1176,7 +1184,7 @@ void radix__tlb_flush(struct mmu_gather *tlb)
 	}
 }
 
-static __always_inline void __radix__flush_tlb_range_psize(struct mm_struct *mm,
+static void __radix__flush_tlb_range_psize(struct mm_struct *mm,
 				unsigned long start, unsigned long end,
 				int psize, bool also_pwc)
 {
-- 
2.31.1
[PATCH v7 07/11] mm/mremap: Use pmd/pud_populate to update page table entries
pmd/pud_populate is the right interface to be used to set the respective
page table entries. Some architectures like ppc64 do assume that
set_pmd/pud_at can only be used to set a hugepage PTE. Since we are not
setting up a hugepage PTE here, use the pmd/pud_populate interface.

Signed-off-by: Aneesh Kumar K.V
---
 mm/mremap.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/mm/mremap.c b/mm/mremap.c
index 795a7d628b53..dacfa9111ab1 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -26,6 +26,7 @@
 #include
 #include
+#include
 #include "internal.h"
 
@@ -258,8 +259,7 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
 
 	VM_BUG_ON(!pmd_none(*new_pmd));
 
-	/* Set the new pmd */
-	set_pmd_at(mm, new_addr, new_pmd, pmd);
+	pmd_populate(mm, new_pmd, pmd_pgtable(pmd));
 	flush_tlb_range(vma, old_addr, old_addr + PMD_SIZE);
 	if (new_ptl != old_ptl)
 		spin_unlock(new_ptl);
@@ -306,8 +306,7 @@ static bool move_normal_pud(struct vm_area_struct *vma, unsigned long old_addr,
 
 	VM_BUG_ON(!pud_none(*new_pud));
 
-	/* Set the new pud */
-	set_pud_at(mm, new_addr, new_pud, pud);
+	pud_populate(mm, new_pud, (pmd_t *)pud_page_vaddr(pud));
 	flush_tlb_range(vma, old_addr, old_addr + PUD_SIZE);
 	if (new_ptl != old_ptl)
 		spin_unlock(new_ptl);
-- 
2.31.1
[PATCH v7 06/11] mm/mremap: Don't enable optimized PUD move if page table levels is 2
With a two-level page table, don't enable move_normal_pud.

Signed-off-by: Aneesh Kumar K.V
---
 mm/mremap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/mremap.c b/mm/mremap.c
index 92ab7d24a587..795a7d628b53 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -276,7 +276,7 @@ static inline bool move_normal_pmd(struct vm_area_struct *vma,
 }
 #endif
 
-#ifdef CONFIG_HAVE_MOVE_PUD
+#if CONFIG_PGTABLE_LEVELS > 2 && defined(CONFIG_HAVE_MOVE_PUD)
 static bool move_normal_pud(struct vm_area_struct *vma, unsigned long old_addr,
 		  unsigned long new_addr, pud_t *old_pud, pud_t *new_pud)
 {
-- 
2.31.1
[PATCH v7 05/11] mm/mremap: Convert huge PUD move to separate helper
With TRANSPARENT_HUGEPAGE_PUD enabled the kernel can find huge PUD
entries. Add a helper to move huge PUD entries on mremap(). This will be
used by a later patch to optimize mremap of PUD_SIZE-aligned, level-4
PTE mapped addresses. This also makes sure we support mremap on huge PUD
entries even with CONFIG_HAVE_MOVE_PUD disabled.

Signed-off-by: Aneesh Kumar K.V
---
 mm/mremap.c | 80 -
 1 file changed, 73 insertions(+), 7 deletions(-)

diff --git a/mm/mremap.c b/mm/mremap.c
index 47c255b60150..92ab7d24a587 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -324,10 +324,62 @@ static inline bool move_normal_pud(struct vm_area_struct *vma,
 }
 #endif
 
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE_PUD
+static bool move_huge_pud(struct vm_area_struct *vma, unsigned long old_addr,
+			  unsigned long new_addr, pud_t *old_pud, pud_t *new_pud)
+{
+	spinlock_t *old_ptl, *new_ptl;
+	struct mm_struct *mm = vma->vm_mm;
+	pud_t pud;
+
+	/*
+	 * The destination pud shouldn't be established, free_pgtables()
+	 * should have released it.
+	 */
+	if (WARN_ON_ONCE(!pud_none(*new_pud)))
+		return false;
+
+	/*
+	 * We don't have to worry about the ordering of src and dst
+	 * ptlocks because exclusive mmap_lock prevents deadlock.
+	 */
+	old_ptl = pud_lock(vma->vm_mm, old_pud);
+	new_ptl = pud_lockptr(mm, new_pud);
+	if (new_ptl != old_ptl)
+		spin_lock_nested(new_ptl, SINGLE_DEPTH_NESTING);
+
+	/* Clear the pud */
+	pud = *old_pud;
+	pud_clear(old_pud);
+
+	VM_BUG_ON(!pud_none(*new_pud));
+
+	/* Set the new pud */
+	/* mark soft_dirty when we add pud level soft dirty support */
+	set_pud_at(mm, new_addr, new_pud, pud);
+	flush_pud_tlb_range(vma, old_addr, old_addr + HPAGE_PUD_SIZE);
+	if (new_ptl != old_ptl)
+		spin_unlock(new_ptl);
+	spin_unlock(old_ptl);
+
+	return true;
+}
+#else
+static bool move_huge_pud(struct vm_area_struct *vma, unsigned long old_addr,
+			  unsigned long new_addr, pud_t *old_pud, pud_t *new_pud)
+{
+	WARN_ON_ONCE(1);
+	return false;
+
+}
+#endif
+
 enum pgt_entry {
 	NORMAL_PMD,
 	HPAGE_PMD,
 	NORMAL_PUD,
+	HPAGE_PUD,
 };
 
 /*
@@ -347,6 +399,7 @@ static __always_inline unsigned long get_extent(enum pgt_entry entry,
 		mask = PMD_MASK;
 		size = PMD_SIZE;
 		break;
+	case HPAGE_PUD:
 	case NORMAL_PUD:
 		mask = PUD_MASK;
 		size = PUD_SIZE;
@@ -395,6 +448,11 @@ static bool move_pgt_entry(enum pgt_entry entry, struct vm_area_struct *vma,
 			move_huge_pmd(vma, old_addr, new_addr, old_entry,
 				      new_entry);
 		break;
+	case HPAGE_PUD:
+		moved = move_huge_pud(vma, old_addr, new_addr, old_entry,
+				      new_entry);
+		break;
+
 	default:
 		WARN_ON_ONCE(1);
 		break;
@@ -414,6 +472,7 @@ unsigned long move_page_tables(struct vm_area_struct *vma,
 	unsigned long extent, old_end;
 	struct mmu_notifier_range range;
 	pmd_t *old_pmd, *new_pmd;
+	pud_t *old_pud, *new_pud;
 
 	old_end = old_addr + len;
 	flush_cache_range(vma, old_addr, old_end);
@@ -429,15 +488,22 @@ unsigned long move_page_tables(struct vm_area_struct *vma,
 		 * PUD level if possible.
 		 */
 		extent = get_extent(NORMAL_PUD, old_addr, old_end, new_addr);
-		if (IS_ENABLED(CONFIG_HAVE_MOVE_PUD) && extent == PUD_SIZE) {
-			pud_t *old_pud, *new_pud;
 
-			old_pud = get_old_pud(vma->vm_mm, old_addr);
-			if (!old_pud)
+		old_pud = get_old_pud(vma->vm_mm, old_addr);
+		if (!old_pud)
+			continue;
+		new_pud = alloc_new_pud(vma->vm_mm, vma, new_addr);
+		if (!new_pud)
+			break;
+		if (pud_trans_huge(*old_pud) || pud_devmap(*old_pud)) {
+			if (extent == HPAGE_PUD_SIZE) {
+				move_pgt_entry(HPAGE_PUD, vma, old_addr, new_addr,
+					       old_pud, new_pud, need_rmap_locks);
+				/* We ignore and continue on error? */
 				continue;
-			new_pud = alloc_new_pud(vma->vm_mm, vma, new_addr);
-			if (!new_pud)
-				break;
+			}
+		} else if (IS_ENABLED(CONFIG_HAVE_MOVE_PUD) && extent == PUD_SIZE) {
+			if (move_pgt_entry(NORMAL_PUD, vma, old_addr, new_addr,
[PATCH v7 04/11] selftest/mremap_test: Avoid crash with static build
With a large mmap map size, we can overlap with the text area and using
MAP_FIXED results in unmapping that area. Switch to MAP_FIXED_NOREPLACE
and handle the EEXIST error.

Reviewed-by: Kalesh Singh
Signed-off-by: Aneesh Kumar K.V
---
 tools/testing/selftests/vm/mremap_test.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/vm/mremap_test.c b/tools/testing/selftests/vm/mremap_test.c
index c9a5461eb786..0624d1bd71b5 100644
--- a/tools/testing/selftests/vm/mremap_test.c
+++ b/tools/testing/selftests/vm/mremap_test.c
@@ -75,9 +75,10 @@ static void *get_source_mapping(struct config c)
 retry:
 	addr += c.src_alignment;
 	src_addr = mmap((void *) addr, c.region_size, PROT_READ | PROT_WRITE,
-			MAP_FIXED | MAP_ANONYMOUS | MAP_SHARED, -1, 0);
+			MAP_FIXED_NOREPLACE | MAP_ANONYMOUS | MAP_SHARED,
+			-1, 0);
 	if (src_addr == MAP_FAILED) {
-		if (errno == EPERM)
+		if (errno == EPERM || errno == EEXIST)
 			goto retry;
 		goto error;
 	}
-- 
2.31.1
[PATCH v7 03/11] selftest/mremap_test: Update the test to handle pagesize other than 4K
Instead of hardcoding the 4K page size, fetch it using sysconf(). For
the performance measurements, the test still assumes 2M and 1G are
hugepage sizes.

Reviewed-by: Kalesh Singh
Signed-off-by: Aneesh Kumar K.V
---
 tools/testing/selftests/vm/mremap_test.c | 113 ---
 1 file changed, 61 insertions(+), 52 deletions(-)

diff --git a/tools/testing/selftests/vm/mremap_test.c b/tools/testing/selftests/vm/mremap_test.c
index 9c391d016922..c9a5461eb786 100644
--- a/tools/testing/selftests/vm/mremap_test.c
+++ b/tools/testing/selftests/vm/mremap_test.c
@@ -45,14 +45,15 @@ enum {
 	_4MB = 4ULL << 20,
 	_1GB = 1ULL << 30,
 	_2GB = 2ULL << 30,
-	PTE = _4KB,
 	PMD = _2MB,
 	PUD = _1GB,
 };
 
+#define PTE page_size
+
 #define MAKE_TEST(source_align, destination_align, size,	\
 		  overlaps, should_fail, test_name)		\
-{								\
+(struct test){							\
 	.name = test_name,					\
 	.config = {						\
 		.src_alignment = source_align,			\
@@ -252,12 +253,17 @@ static int parse_args(int argc, char **argv, unsigned int *threshold_mb,
 	return 0;
 }
 
+#define MAX_TEST 13
+#define MAX_PERF_TEST 3
 int main(int argc, char **argv)
 {
 	int failures = 0;
 	int i, run_perf_tests;
 	unsigned int threshold_mb = VALIDATION_DEFAULT_THRESHOLD;
 	unsigned int pattern_seed;
+	struct test test_cases[MAX_TEST];
+	struct test perf_test_cases[MAX_PERF_TEST];
+	int page_size;
 	time_t t;
 
 	pattern_seed = (unsigned int) time(&t);
@@ -268,56 +274,59 @@ int main(int argc, char **argv)
 	ksft_print_msg("Test configs:\n\tthreshold_mb=%u\n\tpattern_seed=%u\n\n",
 		       threshold_mb, pattern_seed);
 
-	struct test test_cases[] = {
-		/* Expected mremap failures */
-		MAKE_TEST(_4KB, _4KB, _4KB, OVERLAPPING, EXPECT_FAILURE,
-		  "mremap - Source and Destination Regions Overlapping"),
-		MAKE_TEST(_4KB, _1KB, _4KB, NON_OVERLAPPING, EXPECT_FAILURE,
-		  "mremap - Destination Address Misaligned (1KB-aligned)"),
-		MAKE_TEST(_1KB, _4KB, _4KB, NON_OVERLAPPING, EXPECT_FAILURE,
-		  "mremap - Source Address Misaligned (1KB-aligned)"),
-
-		/* Src addr PTE aligned */
-		MAKE_TEST(PTE, PTE, _8KB, NON_OVERLAPPING, EXPECT_SUCCESS,
-		  "8KB mremap - Source PTE-aligned, Destination PTE-aligned"),
-
-		/* Src addr 1MB aligned */
-		MAKE_TEST(_1MB, PTE, _2MB, NON_OVERLAPPING, EXPECT_SUCCESS,
-		  "2MB mremap - Source 1MB-aligned, Destination PTE-aligned"),
-		MAKE_TEST(_1MB, _1MB, _2MB, NON_OVERLAPPING, EXPECT_SUCCESS,
-		  "2MB mremap - Source 1MB-aligned, Destination 1MB-aligned"),
-
-		/* Src addr PMD aligned */
-		MAKE_TEST(PMD, PTE, _4MB, NON_OVERLAPPING, EXPECT_SUCCESS,
-		  "4MB mremap - Source PMD-aligned, Destination PTE-aligned"),
-		MAKE_TEST(PMD, _1MB, _4MB, NON_OVERLAPPING, EXPECT_SUCCESS,
-		  "4MB mremap - Source PMD-aligned, Destination 1MB-aligned"),
-		MAKE_TEST(PMD, PMD, _4MB, NON_OVERLAPPING, EXPECT_SUCCESS,
-		  "4MB mremap - Source PMD-aligned, Destination PMD-aligned"),
-
-		/* Src addr PUD aligned */
-		MAKE_TEST(PUD, PTE, _2GB, NON_OVERLAPPING, EXPECT_SUCCESS,
-		  "2GB mremap - Source PUD-aligned, Destination PTE-aligned"),
-		MAKE_TEST(PUD, _1MB, _2GB, NON_OVERLAPPING, EXPECT_SUCCESS,
-		  "2GB mremap - Source PUD-aligned, Destination 1MB-aligned"),
-		MAKE_TEST(PUD, PMD, _2GB, NON_OVERLAPPING, EXPECT_SUCCESS,
-		  "2GB mremap - Source PUD-aligned, Destination PMD-aligned"),
-		MAKE_TEST(PUD, PUD, _2GB, NON_OVERLAPPING, EXPECT_SUCCESS,
-		  "2GB mremap - Source PUD-aligned, Destination PUD-aligned"),
-	};
-
-	struct test perf_test_cases[] = {
-		/*
-		 * mremap 1GB region - Page table level aligned time
-		 * comparison.
-		 */
-		MAKE_TEST(PTE, PTE, _1GB, NON_OVERLAPPING, EXPECT_SUCCESS,
-		  "1GB mremap - Source PTE-aligned, Destination PTE-aligned"),
-		MAKE_TEST(PMD, PMD, _1GB, NON_OVERLAPPING, EXPECT_SUCCESS,
-		  "1GB mremap - Source PMD-aligned, Destination PMD-aligned"),
-		MAKE_TEST(PUD, PUD, _1GB, NON_OVERLAPPING, EXPECT_SUCCESS,
-		  "1GB mremap - Source PUD-aligned, Destination PUD-aligned"),
-	};
+	page_size = sysconf(_SC_PAGESIZE);
+
+	/* Expected mremap failures */
+	test_cases[0] =
[PATCH v7 01/11] mm/mremap: Fix race between MOVE_PMD mremap and pageout
CPU 1				CPU 2				CPU 3

mremap(old_addr, new_addr)	page_shrinker/try_to_unmap_one

mmap_write_lock_killable()

				addr = old_addr
				lock(pte_ptl)
lock(pmd_ptl)
pmd = *old_pmd
pmd_clear(old_pmd)
flush_tlb_range(old_addr)

*new_pmd = pmd
							*new_addr = 10; and fills
							TLB with new addr and
							old pfn

unlock(pmd_ptl)
				ptep_clear_flush()
				old pfn is free.
							Stale TLB entry

Fix this race by holding the pmd lock in pageout. This still doesn't
handle the race between MOVE_PUD and pageout.

Fixes: 2c91bd4a4e2e ("mm: speed up mremap by 20x on large regions")
Link: https://lore.kernel.org/linux-mm/CAHk-=wgxvr04ebntxqfevontwnp6fdm+oj5vauqxp3s-huw...@mail.gmail.com
Signed-off-by: Aneesh Kumar K.V
---
 include/linux/rmap.h |  9 ++---
 mm/page_vma_mapped.c | 36 ++--
 2 files changed, 24 insertions(+), 21 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index def5c62c93b3..272ab0c2b60b 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -207,7 +207,8 @@ struct page_vma_mapped_walk {
 	unsigned long address;
 	pmd_t *pmd;
 	pte_t *pte;
-	spinlock_t *ptl;
+	spinlock_t *pte_ptl;
+	spinlock_t *pmd_ptl;
 	unsigned int flags;
 };
 
@@ -216,8 +217,10 @@ static inline void page_vma_mapped_walk_done(struct page_vma_mapped_walk *pvmw)
 	/* HugeTLB pte is set to the relevant page table entry without pte_mapped.
 	 */
 	if (pvmw->pte && !PageHuge(pvmw->page))
 		pte_unmap(pvmw->pte);
-	if (pvmw->ptl)
-		spin_unlock(pvmw->ptl);
+	if (pvmw->pte_ptl)
+		spin_unlock(pvmw->pte_ptl);
+	if (pvmw->pmd_ptl)
+		spin_unlock(pvmw->pmd_ptl);
 }
 
 bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw);
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 2cf01d933f13..87a2c94c7e27 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -47,8 +47,10 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw)
 			return false;
 		}
 	}
-	pvmw->ptl = pte_lockptr(pvmw->vma->vm_mm, pvmw->pmd);
-	spin_lock(pvmw->ptl);
+	if (USE_SPLIT_PTE_PTLOCKS) {
+		pvmw->pte_ptl = pte_lockptr(pvmw->vma->vm_mm, pvmw->pmd);
+		spin_lock(pvmw->pte_ptl);
+	}
 	return true;
 }
 
@@ -162,8 +164,8 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 		if (!pvmw->pte)
 			return false;
 
-		pvmw->ptl = huge_pte_lockptr(page_hstate(page), mm, pvmw->pte);
-		spin_lock(pvmw->ptl);
+		pvmw->pte_ptl = huge_pte_lockptr(page_hstate(page), mm, pvmw->pte);
+		spin_lock(pvmw->pte_ptl);
 		if (!check_pte(pvmw))
 			return not_found(pvmw);
 		return true;
@@ -179,6 +181,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 	if (!pud_present(*pud))
 		return false;
 	pvmw->pmd = pmd_offset(pud, pvmw->address);
+	pvmw->pmd_ptl = pmd_lock(mm, pvmw->pmd);
 	/*
 	 * Make sure the pmd value isn't cached in a register by the
 	 * compiler and used as a stale value after we've observed a
@@ -186,7 +189,6 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 	 */
 	pmde = READ_ONCE(*pvmw->pmd);
 	if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde)) {
-		pvmw->ptl = pmd_lock(mm, pvmw->pmd);
 		if (likely(pmd_trans_huge(*pvmw->pmd))) {
 			if (pvmw->flags & PVMW_MIGRATION)
 				return not_found(pvmw);
@@ -206,14 +208,10 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 				}
 			}
 			return not_found(pvmw);
-		} else {
-			/* THP pmd was split under us: handle on pte level */
-			spin_unlock(pvmw->ptl);
-			pvmw->ptl = NULL;
 		}
-	} else if (!pmd_present(pmde)) {
-		return false;
-	}
+	} else if (!pmd_present(pmde))
+		return not_found(pvmw);
+
 	if (!map_pte(pvmw))
 		goto next_pte;
 	while (1) {
@@ -233,19 +231,21 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 		/* Did we cross page table boundary? */
 		if (pvmw->address %
[PATCH v7 02/11] mm/mremap: Fix race between MOVE_PUD mremap and pageout
CPU 1				CPU 2				CPU 3

mremap(old_addr, new_addr)	page_shrinker/try_to_unmap_one

mmap_write_lock_killable()

				addr = old_addr
				lock(pte_ptl)
lock(pud_ptl)
pud = *old_pud
pud_clear(old_pud)
flush_tlb_range(old_addr)

*new_pud = pud
							*new_addr = 10; and fills
							TLB with new addr and
							old pfn

unlock(pud_ptl)
				ptep_clear_flush()
				old pfn is free.
							Stale TLB entry

Fix this race by holding the pud lock in pageout.

Fixes: c49dd3401802 ("mm: speedup mremap on 1GB or larger regions")
Signed-off-by: Aneesh Kumar K.V
---
 include/linux/rmap.h |  4 
 mm/page_vma_mapped.c | 13 ++---
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 272ab0c2b60b..491c65ce1d46 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -209,6 +209,7 @@ struct page_vma_mapped_walk {
 	pte_t *pte;
 	spinlock_t *pte_ptl;
 	spinlock_t *pmd_ptl;
+	spinlock_t *pud_ptl;
 	unsigned int flags;
 };
 
@@ -221,6 +222,9 @@ static inline void page_vma_mapped_walk_done(struct page_vma_mapped_walk *pvmw)
 		spin_unlock(pvmw->pte_ptl);
 	if (pvmw->pmd_ptl)
 		spin_unlock(pvmw->pmd_ptl);
+	if (pvmw->pud_ptl)
+		spin_unlock(pvmw->pud_ptl);
+
 }
 
 bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw);
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 87a2c94c7e27..c913bc34b1d3 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -180,8 +180,11 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 	pud = pud_offset(p4d, pvmw->address);
 	if (!pud_present(*pud))
 		return false;
+
+	pvmw->pud_ptl = pud_lock(mm, pud);
 	pvmw->pmd = pmd_offset(pud, pvmw->address);
-	pvmw->pmd_ptl = pmd_lock(mm, pvmw->pmd);
+	if (USE_SPLIT_PMD_PTLOCKS)
+		pvmw->pmd_ptl = pmd_lock(mm, pvmw->pmd);
 	/*
 	 * Make sure the pmd value isn't cached in a register by the
 	 * compiler and used as a stale value after we've observed a
@@ -235,8 +238,12 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 			spin_unlock(pvmw->pte_ptl);
 			pvmw->pte_ptl = NULL;
 		}
-		spin_unlock(pvmw->pmd_ptl);
-		pvmw->pmd_ptl = NULL;
+		if (pvmw->pmd_ptl) {
+			spin_unlock(pvmw->pmd_ptl);
+			pvmw->pmd_ptl = NULL;
+		}
+		spin_unlock(pvmw->pud_ptl);
+		pvmw->pud_ptl = NULL;
 		goto restart;
 	} else {
 		pvmw->pte++;
-- 
2.31.1
[PATCH v7 00/11] Speedup mremap on ppc64
This patchset enables MOVE_PMD/MOVE_PUD support on power. This requires
the platform to support updating higher-level page tables without
updating page table entries. This also needs to invalidate the Page Walk
Cache on architectures supporting the same.

Changes from v6:
* Update ppc64 flush_tlb_range to invalidate page walk cache.
* Add patches to fix race between mremap and page out
* Add patch to fix build error with page table levels 2

Changes from v5:
* Drop patch mm/mremap: Move TLB flush outside page table lock
* Add fixes for race between optimized mremap and page out

Changes from v4:
* Change function name and arguments based on review feedback.

Changes from v3:
* Fix build error reported by kernel test robot
* Address review feedback.

Changes from v2:
* switch from using mmu_gather to flush_pte_tlb_pwc_range()

Changes from v1:
* Rebase to recent upstream
* Fix build issues with tlb_gather_mmu changes

Aneesh Kumar K.V (11):
  mm/mremap: Fix race between MOVE_PMD mremap and pageout
  mm/mremap: Fix race between MOVE_PUD mremap and pageout
  selftest/mremap_test: Update the test to handle pagesize other than 4K
  selftest/mremap_test: Avoid crash with static build
  mm/mremap: Convert huge PUD move to separate helper
  mm/mremap: Don't enable optimized PUD move if page table levels is 2
  mm/mremap: Use pmd/pud_populate to update page table entries
  powerpc/mm/book3s64: Fix possible build error
  mm/mremap: Allow arch runtime override
  powerpc/book3s64/mm: Update flush_tlb_range to flush page walk cache
  powerpc/mm: Enable HAVE_MOVE_PMD support

 .../include/asm/book3s/64/tlbflush-radix.h   |   2 +
 arch/powerpc/include/asm/tlb.h               |   6 +
 arch/powerpc/mm/book3s64/radix_hugetlbpage.c |   8 +-
 arch/powerpc/mm/book3s64/radix_tlb.c         |  70 +++
 arch/powerpc/platforms/Kconfig.cputype       |   2 +
 include/linux/rmap.h                         |  13 +-
 mm/mremap.c                                  | 104 +--
 mm/page_vma_mapped.c                         |  43 ---
 tools/testing/selftests/vm/mremap_test.c     | 118 ++
 9 files changed, 251 insertions(+), 115 deletions(-)

-- 
2.31.1
Re: [PATCH v2 00/15] init_mm: cleanup ARCH's text/data/brk setup code
Hi Kefeng,

On 07/06/2021 at 02:55, Kefeng Wang wrote:
> On 2021/6/7 5:29, Mike Rapoport wrote:
>> Hello Kefeng,
>>
>> On Fri, Jun 04, 2021 at 03:06:18PM +0800, Kefeng Wang wrote:
>>> Add setup_initial_init_mm() helper, then use it to cleanup the text,
>>> data and brk setup code.
>>>
>>> v2:
>>> - change argument from "char *" to "void *" setup_initial_init_mm()
>>>   suggested by Geert Uytterhoeven
>>> - use NULL instead of (void *)0 on h8300 and m68k
>>> - collect ACKs
>>>
>>> Cc: linux-snps-...@lists.infradead.org
>>> Cc: linux-arm-ker...@lists.infradead.org
>>> Cc: linux-c...@vger.kernel.org
>>> Cc: uclinux-h8-de...@lists.sourceforge.jp
>>> Cc: linux-m...@lists.linux-m68k.org
>>> Cc: openr...@lists.librecores.org
>>> Cc: linuxppc-dev@lists.ozlabs.org
>>> Cc: linux-ri...@lists.infradead.org
>>> Cc: linux...@vger.kernel.org
>>> Cc: linux-s...@vger.kernel.org
>>>
>>> Kefeng Wang (15):
>>>   mm: add setup_initial_init_mm() helper
>>>   arc: convert to setup_initial_init_mm()
>>>   arm: convert to setup_initial_init_mm()
>>>   arm64: convert to setup_initial_init_mm()
>>>   csky: convert to setup_initial_init_mm()
>>>   h8300: convert to setup_initial_init_mm()
>>>   m68k: convert to setup_initial_init_mm()
>>>   nds32: convert to setup_initial_init_mm()
>>>   nios2: convert to setup_initial_init_mm()
>>>   openrisc: convert to setup_initial_init_mm()
>>>   powerpc: convert to setup_initial_init_mm()
>>>   riscv: convert to setup_initial_init_mm()
>>>   s390: convert to setup_initial_init_mm()
>>>   sh: convert to setup_initial_init_mm()
>>>   x86: convert to setup_initial_init_mm()
>>
>> I might be missing something, but AFAIU the init_mm.start_code and other
>> fields are not used really early so the new setup_initial_init_mm()
>> function can be called in the generic code outside setup_arch(), e.g in
>> mm_init().
>
> Hi Mike, each architecture has their own value, not the same, e.g. m68k
> and h8300. Also the name of the text/code/brk is different in some
> arch, so I keep it unchanged.

What you could do is to define a __weak function that architectures can
override, and call that function from mm_init() as suggested by Mike.
Something like:

void __weak setup_initial_init_mm(void)
{
	init_mm.start_code = (unsigned long)_stext;
	init_mm.end_code = (unsigned long)_etext;
	init_mm.end_data = (unsigned long)_edata;
	init_mm.brk = (unsigned long)_end;
}

Then only the few architectures that are different would override it.

I see a few architectures are using PAGE_OFFSET to set .start_code, but
it is likely that this is equivalent to _stext.

Christophe
Re: [PATCH v2 3/3] powerpc/mm/hash: Avoid multiple HPT resize-downs on memory hotunplug
On Fri, Apr 30, 2021 at 11:36:10AM -0300, Leonardo Bras wrote: > During memory hotunplug, after each LMB is removed, the HPT may be > resized-down if it would map a max of 4 times the current amount of memory. > (2 shifts, due to introduced histeresis) > > It usually is not an issue, but it can take a lot of time if HPT > resizing-down fails. This happens because resize-down failures > usually repeat at each LMB removal, until there are no more bolted entries > conflict, which can take a while to happen. > > This can be solved by doing a single HPT resize at the end of memory > hotunplug, after all requested entries are removed. > > To make this happen, it's necessary to temporarily disable all HPT > resize-downs before hotunplug, re-enable them after hotunplug ends, > and then resize-down HPT to the current memory size. > > As an example, hotunplugging 256GB from a 385GB guest took 621s without > this patch, and 100s after applied. > > Signed-off-by: Leonardo Bras Hrm. This looks correct, but it seems overly complicated. AFAICT, the resize calls that this adds should in practice be the *only* times we call resize, all the calls from the lower level code should be suppressed. In which case can't we just remove those calls entirely, and not deal with the clunky locking and exclusion here. That should also remove the need for the 'shrinking' parameter in 1/3. 
> --- > arch/powerpc/include/asm/book3s/64/hash.h | 2 + > arch/powerpc/mm/book3s64/hash_utils.c | 45 +-- > .../platforms/pseries/hotplug-memory.c| 26 +++ > 3 files changed, 70 insertions(+), 3 deletions(-) > > diff --git a/arch/powerpc/include/asm/book3s/64/hash.h > b/arch/powerpc/include/asm/book3s/64/hash.h > index fad4af8b8543..6cd66e7e98c9 100644 > --- a/arch/powerpc/include/asm/book3s/64/hash.h > +++ b/arch/powerpc/include/asm/book3s/64/hash.h > @@ -256,6 +256,8 @@ int hash__create_section_mapping(unsigned long start, > unsigned long end, > int hash__remove_section_mapping(unsigned long start, unsigned long end); > > void hash_batch_expand_prepare(unsigned long newsize); > +void hash_batch_shrink_begin(void); > +void hash_batch_shrink_end(void); > > #endif /* !__ASSEMBLY__ */ > #endif /* __KERNEL__ */ > diff --git a/arch/powerpc/mm/book3s64/hash_utils.c > b/arch/powerpc/mm/book3s64/hash_utils.c > index 3fa395b3fe57..73ecd0f61acd 100644 > --- a/arch/powerpc/mm/book3s64/hash_utils.c > +++ b/arch/powerpc/mm/book3s64/hash_utils.c > @@ -795,6 +795,9 @@ static unsigned long __init htab_get_table_size(void) > } > > #ifdef CONFIG_MEMORY_HOTPLUG > + > +static DEFINE_MUTEX(hpt_resize_down_lock); > + > static int resize_hpt_for_hotplug(unsigned long new_mem_size, bool shrinking) > { > unsigned target_hpt_shift; > @@ -805,7 +808,7 @@ static int resize_hpt_for_hotplug(unsigned long > new_mem_size, bool shrinking) > target_hpt_shift = htab_shift_for_mem_size(new_mem_size); > > if (shrinking) { > - > + int ret; > /* >* To avoid lots of HPT resizes if memory size is fluctuating >* across a boundary, we deliberately have some hysterisis > @@ -818,10 +821,20 @@ static int resize_hpt_for_hotplug(unsigned long > new_mem_size, bool shrinking) > if (target_hpt_shift >= ppc64_pft_size - 1) > return 0; > > - } else if (target_hpt_shift <= ppc64_pft_size) { > - return 0; > + /* When batch removing entries, only resizes HPT at the end. 
*/ > + > + if (!mutex_trylock(&hpt_resize_down_lock)) > + return 0; > + > + ret = mmu_hash_ops.resize_hpt(target_hpt_shift); > + > + mutex_unlock(&hpt_resize_down_lock); > + return ret; > } > > + if (target_hpt_shift <= ppc64_pft_size) > + return 0; > + > return mmu_hash_ops.resize_hpt(target_hpt_shift); > } > > @@ -879,6 +892,32 @@ void hash_batch_expand_prepare(unsigned long newsize) > break; > } > } > + > +void hash_batch_shrink_begin(void) > +{ > + /* Disable HPT resize-down during hot-unplug */ > + mutex_lock(&hpt_resize_down_lock); > +} > + > +void hash_batch_shrink_end(void) > +{ > + const u64 starting_size = ppc64_pft_size; > + unsigned long newsize; > + > + newsize = memblock_phys_mem_size(); > + /* Resize to smallest SHIFT possible */ > + while (resize_hpt_for_hotplug(newsize, true) == -ENOSPC) { > + newsize *= 2; > + pr_warn("Hash collision while resizing HPT\n"); > + > + /* Do not try to resize to the starting size, or bigger value */ > + if (htab_shift_for_mem_size(newsize) >= starting_size) > + break; > + } > + > + /* Re-enables HPT resize-down after hot-unplug */ > + mutex_unlock(&hpt_resize_down_lock); > +} > #endif /* CONFIG_MEMORY_HOTPLUG */ > > static void __init
Re: [PATCH v2 2/3] powerpc/mm/hash: Avoid multiple HPT resize-ups on memory hotplug
On Fri, Apr 30, 2021 at 11:36:08AM -0300, Leonardo Bras wrote: > Every time a memory hotplug happens, and the memory limit crosses a 2^n > value, it may be necessary to perform HPT resizing-up, which can take > some time (over 100ms in my tests). > > It usually is not an issue, but it can take some time if a lot of memory > is added to a guest with little starting memory: > Adding 256G to a 2GB guest, for example will require 8 HPT resizes. > > Perform an HPT resize before memory hotplug, updating HPT to its > final size (considering a successful hotplug), taking the number of > HPT resizes to at most one per memory hotplug action. > > Signed-off-by: Leonardo Bras Reviewed-by: David Gibson > --- > arch/powerpc/include/asm/book3s/64/hash.h | 2 ++ > arch/powerpc/mm/book3s64/hash_utils.c | 20 +++ > .../platforms/pseries/hotplug-memory.c| 9 + > 3 files changed, 31 insertions(+) > > diff --git a/arch/powerpc/include/asm/book3s/64/hash.h > b/arch/powerpc/include/asm/book3s/64/hash.h > index d959b0195ad9..fad4af8b8543 100644 > --- a/arch/powerpc/include/asm/book3s/64/hash.h > +++ b/arch/powerpc/include/asm/book3s/64/hash.h > @@ -255,6 +255,8 @@ int hash__create_section_mapping(unsigned long start, > unsigned long end, >int nid, pgprot_t prot); > int hash__remove_section_mapping(unsigned long start, unsigned long end); > > +void hash_batch_expand_prepare(unsigned long newsize); > + > #endif /* !__ASSEMBLY__ */ > #endif /* __KERNEL__ */ > #endif /* _ASM_POWERPC_BOOK3S_64_HASH_H */ > diff --git a/arch/powerpc/mm/book3s64/hash_utils.c > b/arch/powerpc/mm/book3s64/hash_utils.c > index 608e4ed397a9..3fa395b3fe57 100644 > --- a/arch/powerpc/mm/book3s64/hash_utils.c > +++ b/arch/powerpc/mm/book3s64/hash_utils.c > @@ -859,6 +859,26 @@ int hash__remove_section_mapping(unsigned long start, > unsigned long end) > > return rc; > } > + > +void hash_batch_expand_prepare(unsigned long newsize) > +{ > + const u64 starting_size = ppc64_pft_size; > + > + /* > + * Resizing-up HPT should 
never fail, but there are some cases system > starts with higher > + * SHIFT than required, and we go through the funny case of resizing > HPT down while > + * adding memory > + */ > + > + while (resize_hpt_for_hotplug(newsize, false) == -ENOSPC) { > + newsize *= 2; > + pr_warn("Hash collision while resizing HPT\n"); > + > + /* Do not try to resize to the starting size, or bigger value */ > + if (htab_shift_for_mem_size(newsize) >= starting_size) > + break; > + } > +} > #endif /* CONFIG_MEMORY_HOTPLUG */ > > static void __init hash_init_partition_table(phys_addr_t hash_table, > diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c > b/arch/powerpc/platforms/pseries/hotplug-memory.c > index 8377f1f7c78e..48b2cfe4ce69 100644 > --- a/arch/powerpc/platforms/pseries/hotplug-memory.c > +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c > @@ -13,6 +13,7 @@ > #include > #include > #include > +#include > > #include > #include > @@ -671,6 +672,10 @@ static int dlpar_memory_add_by_count(u32 lmbs_to_add) > if (lmbs_available < lmbs_to_add) > return -EINVAL; > > + if (!radix_enabled()) > + hash_batch_expand_prepare(memblock_phys_mem_size() + > + lmbs_to_add * > drmem_lmb_size()); > + > for_each_drmem_lmb(lmb) { > if (lmb->flags & DRCONF_MEM_ASSIGNED) > continue; > @@ -788,6 +793,10 @@ static int dlpar_memory_add_by_ic(u32 lmbs_to_add, u32 > drc_index) > if (lmbs_available < lmbs_to_add) > return -EINVAL; > > + if (!radix_enabled()) > + hash_batch_expand_prepare(memblock_phys_mem_size() + > + lmbs_to_add * drmem_lmb_size()); > + > for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) { > if (lmb->flags & DRCONF_MEM_ASSIGNED) > continue; -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [PATCH v2 1/3] powerpc/mm/hash: Avoid resizing-down HPT on first memory hotplug
On Fri, Apr 30, 2021 at 11:36:06AM -0300, Leonardo Bras wrote: > Because hypervisors may need to create HPTs without knowing the guest > page size, the smallest used page-size (4k) may be chosen, resulting in > a HPT that is possibly bigger than needed. > > On a guest with bigger page-sizes, the amount of entries for HTP may be > too high, causing the guest to ask for a HPT resize-down on the first > hotplug. > > This becomes a problem when HPT resize-down fails, and causes the > HPT resize to be performed on every LMB added, until HPT size is > compatible to guest memory size, causing a major slowdown. > > So, avoiding HPT resizing-down on hot-add significantly improves memory > hotplug times. > > As an example, hotplugging 256GB on a 129GB guest took 710s without this > patch, and 21s after applied. > > Signed-off-by: Leonardo Bras Sorry it's taken me so long to look at these I don't love the extra statefulness that the 'shrinking' parameter adds, but I can't see an elegant way to avoid it, so: Reviewed-by: David Gibson > --- > arch/powerpc/mm/book3s64/hash_utils.c | 36 --- > 1 file changed, 21 insertions(+), 15 deletions(-) > > diff --git a/arch/powerpc/mm/book3s64/hash_utils.c > b/arch/powerpc/mm/book3s64/hash_utils.c > index 581b20a2feaf..608e4ed397a9 100644 > --- a/arch/powerpc/mm/book3s64/hash_utils.c > +++ b/arch/powerpc/mm/book3s64/hash_utils.c > @@ -795,7 +795,7 @@ static unsigned long __init htab_get_table_size(void) > } > > #ifdef CONFIG_MEMORY_HOTPLUG > -static int resize_hpt_for_hotplug(unsigned long new_mem_size) > +static int resize_hpt_for_hotplug(unsigned long new_mem_size, bool shrinking) > { > unsigned target_hpt_shift; > > @@ -804,19 +804,25 @@ static int resize_hpt_for_hotplug(unsigned long > new_mem_size) > > target_hpt_shift = htab_shift_for_mem_size(new_mem_size); > > - /* > - * To avoid lots of HPT resizes if memory size is fluctuating > - * across a boundary, we deliberately have some hysterisis > - * here: we immediately increase the HPT 
size if the target > - * shift exceeds the current shift, but we won't attempt to > - * reduce unless the target shift is at least 2 below the > - * current shift > - */ > - if (target_hpt_shift > ppc64_pft_size || > - target_hpt_shift < ppc64_pft_size - 1) > - return mmu_hash_ops.resize_hpt(target_hpt_shift); > + if (shrinking) { > > - return 0; > + /* > + * To avoid lots of HPT resizes if memory size is fluctuating > + * across a boundary, we deliberately have some hysterisis > + * here: we immediately increase the HPT size if the target > + * shift exceeds the current shift, but we won't attempt to > + * reduce unless the target shift is at least 2 below the > + * current shift > + */ > + > + if (target_hpt_shift >= ppc64_pft_size - 1) > + return 0; > + > + } else if (target_hpt_shift <= ppc64_pft_size) { > + return 0; > + } > + > + return mmu_hash_ops.resize_hpt(target_hpt_shift); > } > > int hash__create_section_mapping(unsigned long start, unsigned long end, > @@ -829,7 +835,7 @@ int hash__create_section_mapping(unsigned long start, > unsigned long end, > return -1; > } > > - resize_hpt_for_hotplug(memblock_phys_mem_size()); > + resize_hpt_for_hotplug(memblock_phys_mem_size(), false); > > rc = htab_bolt_mapping(start, end, __pa(start), > pgprot_val(prot), mmu_linear_psize, > @@ -848,7 +854,7 @@ int hash__remove_section_mapping(unsigned long start, > unsigned long end) > int rc = htab_remove_mapping(start, end, mmu_linear_psize, >mmu_kernel_ssize); > > - if (resize_hpt_for_hotplug(memblock_phys_mem_size()) == -ENOSPC) > + if (resize_hpt_for_hotplug(memblock_phys_mem_size(), true) == -ENOSPC) > pr_warn("Hash collision while resizing HPT\n"); > > return rc; -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [PATCH 17/26] nvdimm-pmem: convert to blk_alloc_disk/blk_cleanup_disk
[ add Sachin who reported this commit in -next ] On Thu, May 20, 2021 at 10:52 PM Christoph Hellwig wrote: > > Convert the nvdimm-pmem driver to use the blk_alloc_disk and > blk_cleanup_disk helpers to simplify gendisk and request_queue > allocation. > > Signed-off-by: Christoph Hellwig > --- > drivers/nvdimm/pmem.c | 15 +-- > 1 file changed, 5 insertions(+), 10 deletions(-) > > diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c > index 968b8483c763..9fcd05084564 100644 > --- a/drivers/nvdimm/pmem.c > +++ b/drivers/nvdimm/pmem.c > @@ -338,7 +338,7 @@ static void pmem_pagemap_cleanup(struct dev_pagemap > *pgmap) > struct request_queue *q = > container_of(pgmap->ref, struct request_queue, > q_usage_counter); > > - blk_cleanup_queue(q); > + blk_cleanup_disk(queue_to_disk(q)); This is broken. This comes after del_gendisk() which means the queue device is no longer associated with its disk parent. Perhaps @pmem could be stashed in pgmap->owner and then this can use pmem->disk? Not see any other readily available ways to get back to the disk from here after del_gendisk().
[Bug 213079] [bisected] IRQ problems and crashes on a PowerMac G5 with 5.12.3
https://bugzilla.kernel.org/show_bug.cgi?id=213079 --- Comment #5 from Oliver O'Halloran (ooh...@gmail.com) --- Hmm, it's pretty weird to see an NVMe drive using LSIs. Not too sure what to make of that. I figure there's something screwy going on with interrupt routing, but I don't have any g5 hardware to replicate this with. Could you add "debug" to the kernel command line and post the dmesg output for a boot with the patch applied and reverted? -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug.
Re: [PATCH v8 00/15] Restricted DMA
On Sat, Jun 5, 2021 at 1:48 AM Will Deacon wrote: > > Hi Claire, > > On Thu, May 27, 2021 at 08:58:30PM +0800, Claire Chang wrote: > > This series implements mitigations for lack of DMA access control on > > systems without an IOMMU, which could result in the DMA accessing the > > system memory at unexpected times and/or unexpected addresses, possibly > > leading to data leakage or corruption. > > > > For example, we plan to use the PCI-e bus for Wi-Fi and that PCI-e bus is > > not behind an IOMMU. As PCI-e, by design, gives the device full access to > > system memory, a vulnerability in the Wi-Fi firmware could easily escalate > > to a full system exploit (remote wifi exploits: [1a], [1b] that shows a > > full chain of exploits; [2], [3]). > > > > To mitigate the security concerns, we introduce restricted DMA. Restricted > > DMA utilizes the existing swiotlb to bounce streaming DMA in and out of a > > specially allocated region and does memory allocation from the same region. > > The feature on its own provides a basic level of protection against the DMA > > overwriting buffer contents at unexpected times. However, to protect > > against general data leakage and system memory corruption, the system needs > > to provide a way to restrict the DMA to a predefined memory region (this is > > usually done at firmware level, e.g. MPU in ATF on some ARM platforms [4]). > > > > [1a] > > https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_4.html > > [1b] > > https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_11.html > > [2] https://blade.tencent.com/en/advisories/qualpwn/ > > [3] > > https://www.bleepingcomputer.com/news/security/vulnerabilities-found-in-highly-popular-firmware-for-wifi-chips/ > > [4] > > https://github.com/ARM-software/arm-trusted-firmware/blob/master/plat/mediatek/mt8183/drivers/emi_mpu/emi_mpu.c#L132 > > > > v8: > > - Fix reserved-memory.txt and add the reg property in example. 
> > - Fix sizeof for of_property_count_elems_of_size in > > drivers/of/address.c#of_dma_set_restricted_buffer. > > - Apply Will's suggestion to try the OF node having DMA configuration in > > drivers/of/address.c#of_dma_set_restricted_buffer. > > - Fix typo in the comment of > > drivers/of/address.c#of_dma_set_restricted_buffer. > > - Add error message for PageHighMem in > > kernel/dma/swiotlb.c#rmem_swiotlb_device_init and move it to > > rmem_swiotlb_setup. > > - Fix the message string in rmem_swiotlb_setup. > > Thanks for the v8. It works for me out of the box on arm64 under KVM, so: > > Tested-by: Will Deacon > > Note that something seems to have gone wrong with the mail threading, so > the last 5 patches ended up as a separate thread for me. Probably worth > posting again with all the patches in one place, if you can. Thanks for testing. Christoph also added some comments in v7, so I'll prepare v9. > > Cheers, > > Will
[PATCH -next] macintosh: Use for_each_child_of_node() macro
Use for_each_child_of_node() macro instead of open coding it. Reported-by: Hulk Robot Signed-off-by: Zou Wei --- drivers/macintosh/macio_asic.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/macintosh/macio_asic.c b/drivers/macintosh/macio_asic.c index 49af60b..f552c7c 100644 --- a/drivers/macintosh/macio_asic.c +++ b/drivers/macintosh/macio_asic.c @@ -474,7 +474,7 @@ static void macio_pci_add_devices(struct macio_chip *chip) root_res = >resource[0]; /* First scan 1st level */ - for (np = NULL; (np = of_get_next_child(pnode, np)) != NULL;) { + for_each_child_of_node(pnode, np) { if (macio_skip_device(np)) continue; of_node_get(np); @@ -491,7 +491,7 @@ static void macio_pci_add_devices(struct macio_chip *chip) /* Add media bay devices if any */ if (mbdev) { pnode = mbdev->ofdev.dev.of_node; - for (np = NULL; (np = of_get_next_child(pnode, np)) != NULL;) { + for_each_child_of_node(pnode, np) { if (macio_skip_device(np)) continue; of_node_get(np); @@ -504,7 +504,7 @@ static void macio_pci_add_devices(struct macio_chip *chip) /* Add serial ports if any */ if (sdev) { pnode = sdev->ofdev.dev.of_node; - for (np = NULL; (np = of_get_next_child(pnode, np)) != NULL;) { + for_each_child_of_node(pnode, np) { if (macio_skip_device(np)) continue; of_node_get(np); -- 2.6.2
[PATCH v3] mm: add setup_initial_init_mm() helper
Add setup_initial_init_mm() helper to setup kernel text, data and brk. Cc: linux-snps-...@lists.infradead.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-c...@vger.kernel.org Cc: uclinux-h8-de...@lists.sourceforge.jp Cc: linux-m...@lists.linux-m68k.org Cc: openr...@lists.librecores.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-ri...@lists.infradead.org Cc: linux...@vger.kernel.org Cc: linux-s...@vger.kernel.org Cc: x...@kernel.org Signed-off-by: Kefeng Wang --- v3: declaration in mm.h, implemention in init-mm.c include/linux/mm.h | 3 +++ mm/init-mm.c | 9 + 2 files changed, 12 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index c274f75efcf9..02aa057540b7 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -244,6 +244,9 @@ int __add_to_page_cache_locked(struct page *page, struct address_space *mapping, #define lru_to_page(head) (list_entry((head)->prev, struct page, lru)) +void setup_initial_init_mm(void *start_code, void *end_code, + void *end_data, void *brk); + /* * Linux kernel virtual memory manager primitives. * The idea being to have a "virtual" mm in the same way diff --git a/mm/init-mm.c b/mm/init-mm.c index 153162669f80..b4a6f38fb51d 100644 --- a/mm/init-mm.c +++ b/mm/init-mm.c @@ -40,3 +40,12 @@ struct mm_struct init_mm = { .cpu_bitmap = CPU_BITS_NONE, INIT_MM_CONTEXT(init_mm) }; + +void setup_initial_init_mm(void *start_code, void *end_code, + void *end_data, void *brk) +{ + init_mm.start_code = (unsigned long)start_code; + init_mm.end_code = (unsigned long)end_code; + init_mm.end_data = (unsigned long)end_data; + init_mm.brk = (unsigned long)brk; +} -- 2.26.2
Re: [PATCH v2 01/15] mm: add setup_initial_init_mm() helper
On 2021/6/7 5:31, Mike Rapoport wrote: Hello Kefeng, On Fri, Jun 04, 2021 at 03:06:19PM +0800, Kefeng Wang wrote: Add setup_initial_init_mm() helper to setup kernel text, data and brk. Cc: linux-snps-...@lists.infradead.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-c...@vger.kernel.org Cc: uclinux-h8-de...@lists.sourceforge.jp Cc: linux-m...@lists.linux-m68k.org Cc: openr...@lists.librecores.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-ri...@lists.infradead.org Cc: linux...@vger.kernel.org Cc: linux-s...@vger.kernel.org Cc: x...@kernel.org Signed-off-by: Kefeng Wang --- include/linux/mm_types.h | 8 1 file changed, 8 insertions(+) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 5aacc1c10a45..e1d2429089a4 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -572,6 +572,14 @@ struct mm_struct { }; extern struct mm_struct init_mm; +static inline void setup_initial_init_mm(void *start_code, void *end_code, +void *end_data, void *brk) I think it's not that performance sensitive to make it inline. It can be placed in mm/init-mm.c with a forward declaration in mm.h Ok, I will send a update one with this change. +{ + init_mm.start_code = (unsigned long)start_code; + init_mm.end_code = (unsigned long)end_code; + init_mm.end_data = (unsigned long)end_data; + init_mm.brk = (unsigned long)brk; +} /* Pointer magic because the dynamic array size confuses some compilers. */ static inline void mm_init_cpumask(struct mm_struct *mm) -- 2.26.2 ___ linux-riscv mailing list linux-ri...@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv
Re: [PATCH] watchdog: Remove MV64x60 watchdog driver
Guenter Roeck writes: > On 5/17/21 4:17 AM, Michael Ellerman wrote: >> Guenter Roeck writes: >>> On 3/18/21 10:25 AM, Christophe Leroy wrote: Commit 92c8c16f3457 ("powerpc/embedded6xx: Remove C2K board support") removed the last selector of CONFIG_MV64X60. Therefore CONFIG_MV64X60_WDT cannot be selected anymore and can be removed. Signed-off-by: Christophe Leroy >>> >>> Reviewed-by: Guenter Roeck >>> --- drivers/watchdog/Kconfig | 4 - drivers/watchdog/Makefile | 1 - drivers/watchdog/mv64x60_wdt.c | 324 - include/linux/mv643xx.h| 8 - 4 files changed, 337 deletions(-) delete mode 100644 drivers/watchdog/mv64x60_wdt.c >> >> I assumed this would go via the watchdog tree, but seems like I >> misinterpreted. >> > > Wim didn't send a pull request this time around. > > Guenter > >> Should I take this via the powerpc tree for v5.14 ? I still don't see this in the watchdog tree, should I take it? cheers
Re: [PATCH v2 00/15] init_mm: cleanup ARCH's text/data/brk setup code
On 2021/6/7 5:29, Mike Rapoport wrote: Hello Kefeng, On Fri, Jun 04, 2021 at 03:06:18PM +0800, Kefeng Wang wrote: Add setup_initial_init_mm() helper, then use it to cleanup the text, data and brk setup code. v2: - change argument from "char *" to "void *" setup_initial_init_mm() suggested by Geert Uytterhoeven - use NULL instead of (void *)0 on h8300 and m68k - collect ACKs Cc: linux-snps-...@lists.infradead.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-c...@vger.kernel.org Cc: uclinux-h8-de...@lists.sourceforge.jp Cc: linux-m...@lists.linux-m68k.org Cc: openr...@lists.librecores.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-ri...@lists.infradead.org Cc: linux...@vger.kernel.org Cc: linux-s...@vger.kernel.org Kefeng Wang (15): mm: add setup_initial_init_mm() helper arc: convert to setup_initial_init_mm() arm: convert to setup_initial_init_mm() arm64: convert to setup_initial_init_mm() csky: convert to setup_initial_init_mm() h8300: convert to setup_initial_init_mm() m68k: convert to setup_initial_init_mm() nds32: convert to setup_initial_init_mm() nios2: convert to setup_initial_init_mm() openrisc: convert to setup_initial_init_mm() powerpc: convert to setup_initial_init_mm() riscv: convert to setup_initial_init_mm() s390: convert to setup_initial_init_mm() sh: convert to setup_initial_init_mm() x86: convert to setup_initial_init_mm() I might be missing something, but AFAIU the init_mm.start_code and other fields are not used really early so the new setup_initial_init_mm() function can be called in the generic code outside setup_arch(), e.g in mm_init(). Hi Mike, each architecture has their own value, not the same, eg m68K and h8300, also the name of the text/code/brk is different in some arch, so I keep unchanged. 
arch/arc/mm/init.c | 5 + arch/arm/kernel/setup.c| 5 + arch/arm64/kernel/setup.c | 5 + arch/csky/kernel/setup.c | 5 + arch/h8300/kernel/setup.c | 5 + arch/m68k/kernel/setup_mm.c| 5 + arch/m68k/kernel/setup_no.c| 5 + arch/nds32/kernel/setup.c | 5 + arch/nios2/kernel/setup.c | 5 + arch/openrisc/kernel/setup.c | 5 + arch/powerpc/kernel/setup-common.c | 5 + arch/riscv/kernel/setup.c | 5 + arch/s390/kernel/setup.c | 5 + arch/sh/kernel/setup.c | 5 + arch/x86/kernel/setup.c| 5 + include/linux/mm_types.h | 8 16 files changed, 23 insertions(+), 60 deletions(-) -- 2.26.2 ___ linux-riscv mailing list linux-ri...@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv
[powerpc:next-test] BUILD REGRESSION 88d03cc0a992227ea2aa51bf78404670a2f6f2a6
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test branch HEAD: 88d03cc0a992227ea2aa51bf78404670a2f6f2a6 selftests/powerpc: Remove the repeated declaration Error/Warning in current branch: ERROR: modpost: "disable_kuap_key" [drivers/fsi/fsi-scom.ko] undefined! ERROR: modpost: "disable_kuap_key" [drivers/input/evdev.ko] undefined! ERROR: modpost: "disable_kuap_key" [drivers/input/joydev.ko] undefined! ERROR: modpost: "disable_kuap_key" [drivers/input/mousedev.ko] undefined! ERROR: modpost: "disable_kuap_key" [drivers/input/serio/serio_raw.ko] undefined! ERROR: modpost: "disable_kuap_key" [drivers/tee/tee.ko] undefined! ERROR: modpost: "disable_kuap_key" [drivers/watchdog/mv64x60_wdt.ko] undefined! ERROR: modpost: "disable_kuap_key" [net/decnet/decnet.ko] undefined! ERROR: modpost: "disable_kuap_key" [net/phonet/phonet.ko] undefined! ERROR: modpost: "disable_kuap_key" [net/phonet/pn_pep.ko] undefined! arch/powerpc/kernel/rtasd.c:366:30: error: unused variable 'rtas_log_proc_ops' [-Werror,-Wunused-const-variable] arch/powerpc/kernel/time.c:169:29: error: unused function 'read_spurr' [-Werror,-Wunused-function] arch/powerpc/platforms/4xx/pci.c:47:19: error: unused function 'ppc440spe_revA' [-Werror,-Wunused-function] arch/powerpc/sysdev/fsl_msi.c:574:37: error: unused variable 'vmpic_msi_feature' [-Werror,-Wunused-const-variable] arch/powerpc/sysdev/grackle.c:26:20: error: unused function 'grackle_set_stg' [-Werror,-Wunused-function] possible Error/Warning in current branch: arch/powerpc/platforms/pseries/hotplug-memory.c:605:19: error: unused function 'pseries_remove_memblock' [-Werror,-Wunused-function] arch/powerpc/platforms/pseries/vas.c:266:13: error: no previous prototype for function 'pseries_vas_fault_thread_fn' [-Werror,-Wmissing-prototypes] Error/Warning ids grouped by kconfigs: gcc_recent_errors `-- powerpc64-randconfig-r003-20210606 |-- ERROR:disable_kuap_key-drivers-fsi-fsi-scom.ko-undefined |-- 
ERROR:disable_kuap_key-drivers-input-evdev.ko-undefined |-- ERROR:disable_kuap_key-drivers-input-joydev.ko-undefined |-- ERROR:disable_kuap_key-drivers-input-mousedev.ko-undefined |-- ERROR:disable_kuap_key-drivers-input-serio-serio_raw.ko-undefined |-- ERROR:disable_kuap_key-drivers-tee-tee.ko-undefined |-- ERROR:disable_kuap_key-drivers-watchdog-mv64x60_wdt.ko-undefined |-- ERROR:disable_kuap_key-net-decnet-decnet.ko-undefined |-- ERROR:disable_kuap_key-net-phonet-phonet.ko-undefined `-- ERROR:disable_kuap_key-net-phonet-pn_pep.ko-undefined clang_recent_errors |-- powerpc-randconfig-r011-20210606 | `-- arch-powerpc-platforms-4xx-pci.c:error:unused-function-ppc440spe_revA-Werror-Wunused-function |-- powerpc64-randconfig-r011-20210606 | |-- arch-powerpc-platforms-pseries-hotplug-memory.c:error:unused-function-pseries_remove_memblock-Werror-Wunused-function | `-- arch-powerpc-platforms-pseries-vas.c:error:no-previous-prototype-for-function-pseries_vas_fault_thread_fn-Werror-Wmissing-prototypes `-- powerpc64-randconfig-r016-20210606 |-- arch-powerpc-kernel-rtasd.c:error:unused-variable-rtas_log_proc_ops-Werror-Wunused-const-variable |-- arch-powerpc-kernel-time.c:error:unused-function-read_spurr-Werror-Wunused-function |-- arch-powerpc-sysdev-fsl_msi.c:error:unused-variable-vmpic_msi_feature-Werror-Wunused-const-variable `-- arch-powerpc-sysdev-grackle.c:error:unused-function-grackle_set_stg-Werror-Wunused-function elapsed time: 728m configs tested: 198 configs skipped: 2 gcc tested configs: arm defconfig arm64allyesconfig arm64 defconfig arm allyesconfig arm allmodconfig nds32alldefconfig powerpc pcm030_defconfig sh se7343_defconfig armmps2_defconfig arm sunxi_defconfig powerpc storcenter_defconfig powerpc sequoia_defconfig sh shx3_defconfig powerpc kmeter1_defconfig microblaze defconfig openrisc simple_smp_defconfig powerpc maple_defconfig powerpc bamboo_defconfig powerpc ppc64e_defconfig arm mv78xx0_defconfig xtensa iss_defconfig m68kdefconfig sh r7785rp_defconfig 
arm badge4_defconfig armmulti_v5_defconfig m68km5272c3_defconfig powerpc kilauea_defconfig powerpc chrp32_defconfi
[powerpc:merge] BUILD SUCCESS c53db722ec7ab3ebf29ecf61e922820f31e5284b
currituck_defconfig openriscdefconfig powerpccell_defconfig arm simpad_defconfig powerpc makalu_defconfig powerpc tqm5200_defconfig arm stm32_defconfig mips malta_defconfig powerpcmpc7448_hpc2_defconfig armmulti_v7_defconfig s390 debug_defconfig sh se7206_defconfig alphaalldefconfig mipsbcm47xx_defconfig riscv rv32_defconfig arm imx_v4_v5_defconfig powerpc mpc832x_mds_defconfig mipsgpr_defconfig shedosk7705_defconfig arm nhk8815_defconfig ia64generic_defconfig arm am200epdkit_defconfig openriscor1ksim_defconfig sh se7751_defconfig csky alldefconfig x86_64allnoconfig armneponset_defconfig mips ip22_defconfig powerpcklondike_defconfig mips bmips_be_defconfig arm exynos_defconfig powerpc allyesconfig m68kmvme16x_defconfig xtensa defconfig ia64 allmodconfig m68k allmodconfig m68k allyesconfig nios2 defconfig arc allyesconfig nds32 defconfig nios2allyesconfig cskydefconfig alpha defconfig alphaallyesconfig xtensa allyesconfig h8300allyesconfig arc defconfig sh allmodconfig parisc defconfig s390 allmodconfig parisc allyesconfig s390defconfig i386 allyesconfig sparcallyesconfig sparc defconfig i386defconfig mips allyesconfig powerpc allmodconfig powerpc allnoconfig x86_64 randconfig-a002-20210606 x86_64 randconfig-a004-20210606 x86_64 randconfig-a003-20210606 x86_64 randconfig-a006-20210606 x86_64 randconfig-a005-20210606 x86_64 randconfig-a001-20210606 i386 randconfig-a003-20210606 i386 randconfig-a006-20210606 i386 randconfig-a004-20210606 i386 randconfig-a001-20210606 i386 randconfig-a005-20210606 i386 randconfig-a002-20210606 i386 randconfig-a015-20210606 i386 randconfig-a013-20210606 i386 randconfig-a016-20210606 i386 randconfig-a011-20210606 i386 randconfig-a014-20210606 i386 randconfig-a012-20210606 i386 randconfig-a015-20210607 i386 randconfig-a013-20210607 i386 randconfig-a011-20210607 i386 randconfig-a016-20210607 i386 randconfig-a014-20210607 i386 randconfig-a012-20210607 riscvnommu_k210_defconfig riscvallyesconfig riscvnommu_virt_defconfig riscv allnoconfig 
riscv defconfig riscvallmodconfig um x86_64_defconfig um i386_defconfig umkunit_defconfig x86_64 allyesconfig x86_64rhel-8.3-kselftests x86_64 rhel-8.3 x86_64 rhel-8.3-kbuiltin x86_64 kexec clang tested configs: x86_64 randconfig-b001-20210607 x86_64 randconfig-b001-20210606 x86_64 randconfig-a015-20210606 x86_64 randconfig-a011-20210606 x86_64 randconfig-a014-20210606 x86_64 randconfig-a012-20210606 x86_64 randconfig-a016-20210606 x86_64 randconfig-a013-20210606 x86_64 randconfig-a002-20210607 x86_64 randconfig-a004-20210607 x86_64 randconfig-a003-20210607 x86_64
[powerpc:fixes-test] BUILD SUCCESS 8e11d62e2e8769fe29d1ae98b44b23c7233eb8a2
gcc tested configs: a large matrix of defconfig, allnoconfig, allmodconfig, allyesconfig and randconfig builds across powerpc, arm, x86_64, i386, riscv, mips, s390, sh, m68k, ia64, alpha, xtensa, sparc, parisc, csky, nios2, nds32, arc, h8300, openrisc, um and related platforms (full per-architecture configuration list elided)

clang tested configs: x86_64 randconfig-b001-20210606, x86_64 randconfig-b001-20210607, x86_64 randconfig-a015-20210606, x86_64 randconfig-a011-20210606, x86_64 randconfig-a014-20210606, x86_64 randconfig-a012-20210606, x86_64 randconfig-a016-20210606, x86_64 randconfig-a013-20210606

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
Re: [PATCH v2 01/15] mm: add setup_initial_init_mm() helper
Hello Kefeng, On Fri, Jun 04, 2021 at 03:06:19PM +0800, Kefeng Wang wrote: > Add setup_initial_init_mm() helper to setup kernel text, > data and brk. > > Cc: linux-snps-...@lists.infradead.org > Cc: linux-arm-ker...@lists.infradead.org > Cc: linux-c...@vger.kernel.org > Cc: uclinux-h8-de...@lists.sourceforge.jp > Cc: linux-m...@lists.linux-m68k.org > Cc: openr...@lists.librecores.org > Cc: linuxppc-dev@lists.ozlabs.org > Cc: linux-ri...@lists.infradead.org > Cc: linux...@vger.kernel.org > Cc: linux-s...@vger.kernel.org > Cc: x...@kernel.org > Signed-off-by: Kefeng Wang > --- > include/linux/mm_types.h | 8 > 1 file changed, 8 insertions(+) > > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h > index 5aacc1c10a45..e1d2429089a4 100644 > --- a/include/linux/mm_types.h > +++ b/include/linux/mm_types.h > @@ -572,6 +572,14 @@ struct mm_struct { > }; > > extern struct mm_struct init_mm; > +static inline void setup_initial_init_mm(void *start_code, void *end_code, > + void *end_data, void *brk) I think it's not that performance sensitive to make it inline. It can be placed in mm/init-mm.c with a forward declaration in mm.h > +{ > + init_mm.start_code = (unsigned long)start_code; > + init_mm.end_code = (unsigned long)end_code; > + init_mm.end_data = (unsigned long)end_data; > + init_mm.brk = (unsigned long)brk; > +} > /* Pointer magic because the dynamic array size confuses some compilers. */ > static inline void mm_init_cpumask(struct mm_struct *mm) > -- > 2.26.2 > > > ___ > linux-riscv mailing list > linux-ri...@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv -- Sincerely yours, Mike.
Re: [PATCH v2 00/15] init_mm: cleanup ARCH's text/data/brk setup code
Hello Kefeng, On Fri, Jun 04, 2021 at 03:06:18PM +0800, Kefeng Wang wrote: > Add setup_initial_init_mm() helper, then use it > to cleanup the text, data and brk setup code. > > v2: > - change argument from "char *" to "void *" setup_initial_init_mm() > suggested by Geert Uytterhoeven > - use NULL instead of (void *)0 on h8300 and m68k > - collect ACKs > > Cc: linux-snps-...@lists.infradead.org > Cc: linux-arm-ker...@lists.infradead.org > Cc: linux-c...@vger.kernel.org > Cc: uclinux-h8-de...@lists.sourceforge.jp > Cc: linux-m...@lists.linux-m68k.org > Cc: openr...@lists.librecores.org > Cc: linuxppc-dev@lists.ozlabs.org > Cc: linux-ri...@lists.infradead.org > Cc: linux...@vger.kernel.org > Cc: linux-s...@vger.kernel.org > Kefeng Wang (15): > mm: add setup_initial_init_mm() helper > arc: convert to setup_initial_init_mm() > arm: convert to setup_initial_init_mm() > arm64: convert to setup_initial_init_mm() > csky: convert to setup_initial_init_mm() > h8300: convert to setup_initial_init_mm() > m68k: convert to setup_initial_init_mm() > nds32: convert to setup_initial_init_mm() > nios2: convert to setup_initial_init_mm() > openrisc: convert to setup_initial_init_mm() > powerpc: convert to setup_initial_init_mm() > riscv: convert to setup_initial_init_mm() > s390: convert to setup_initial_init_mm() > sh: convert to setup_initial_init_mm() > x86: convert to setup_initial_init_mm() I might be missing something, but AFAIU the init_mm.start_code and other fields are not used really early so the new setup_initial_init_mm() function can be called in the generic code outside setup_arch(), e.g in mm_init(). 
> arch/arc/mm/init.c | 5 + > arch/arm/kernel/setup.c| 5 + > arch/arm64/kernel/setup.c | 5 + > arch/csky/kernel/setup.c | 5 + > arch/h8300/kernel/setup.c | 5 + > arch/m68k/kernel/setup_mm.c| 5 + > arch/m68k/kernel/setup_no.c| 5 + > arch/nds32/kernel/setup.c | 5 + > arch/nios2/kernel/setup.c | 5 + > arch/openrisc/kernel/setup.c | 5 + > arch/powerpc/kernel/setup-common.c | 5 + > arch/riscv/kernel/setup.c | 5 + > arch/s390/kernel/setup.c | 5 + > arch/sh/kernel/setup.c | 5 + > arch/x86/kernel/setup.c| 5 + > include/linux/mm_types.h | 8 > 16 files changed, 23 insertions(+), 60 deletions(-) > > -- > 2.26.2 > > > ___ > linux-riscv mailing list > linux-ri...@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv -- Sincerely yours, Mike.
Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.13-5 tag
The pull request you sent on Sun, 06 Jun 2021 22:44:24 +1000: > https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git > tags/powerpc-5.13-5 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/bd7b12aa6081c3755b693755d608f58e13798a60 Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
[Bug 213079] [bisected] IRQ problems and crashes on a PowerMac G5 with 5.12.3
https://bugzilla.kernel.org/show_bug.cgi?id=213079

--- Comment #4 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 297191
  --> https://bugzilla.kernel.org/attachment.cgi?id=297191&action=edit
bisect.log

Turns out the problem was introduced between v5.11 and v5.12 by the following commit:

# git bisect good
fbbefb320214db14c3e740fce98e2c95c9d0669b is the first bad commit
commit fbbefb320214db14c3e740fce98e2c95c9d0669b
Author: Oliver O'Halloran
Date:   Tue Nov 3 15:35:07 2020 +1100

    powerpc/pci: Move PHB discovery for PCI_DN using platforms

    Make powernv, pseries, powermac and maple use ppc_md.discover_phbs.
    These platforms need to be done together because they all depend on
    pci_dn's being created from the DT. The pci_dn contains a pointer to
    the relevant pci_controller so they need to be created after the
    pci_controller structures are available, but before PCI devices are
    scanned. Currently this ordering is provided by initcalls and the
    sequence is:

    1. PHBs are discovered (setup_arch) (early boot, pre-initcalls)
    2. pci_dn are created from the unflattened DT (core initcall)
    3. PHBs are scanned pcibios_init() (subsys initcall)

    The new ppc_md.discover_phbs() function is also a core_initcall so we
    can't guarantee ordering between the creation of pci_controllers and
    the creation of pci_dn's which require a pci_controller. We could use
    the postcore, or core_sync initcall levels, but it's cleaner to just
    move the pci_dn setup into the per-PHB inits which occur inside of
    .discover_phb() for these platforms. This brings the boot-time path
    in line with the PHB hotplug path that is used for pseries DLPAR
    operations too.

    Signed-off-by: Oliver O'Halloran
    [mpe: Squash powermac & maple in to avoid breakage on those platforms,
     convert memblock allocs to use kmalloc to avoid warnings]
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20201103043523.916109-2-ooh...@gmail.com

--
You may reply to this email to add a comment.
You are receiving this mail because: You are watching the assignee of the bug.
Re: [PATCH] Fixup for "[v2] powerpc/8xx: Allow disabling KUAP at boot time"
Le 06/06/2021 at 19:43, Christophe Leroy wrote:

Michael, I sent it as a Fixup because it's in next-test, but if you prefer I can send a v3.

Christophe

Kernel test robot reported:

ERROR: modpost: "disable_kuap_key" [net/phonet/pn_pep.ko] undefined!
ERROR: modpost: "disable_kuap_key" [net/phonet/phonet.ko] undefined!
ERROR: modpost: "disable_kuap_key" [net/decnet/decnet.ko] undefined!
ERROR: modpost: "disable_kuap_key" [drivers/tee/tee.ko] undefined!
ERROR: modpost: "disable_kuap_key" [drivers/input/evdev.ko] undefined!
ERROR: modpost: "disable_kuap_key" [drivers/input/joydev.ko] undefined!
ERROR: modpost: "disable_kuap_key" [drivers/input/mousedev.ko] undefined!
ERROR: modpost: "disable_kuap_key" [drivers/input/serio/serio_raw.ko] undefined!
ERROR: modpost: "disable_kuap_key" [drivers/fsi/fsi-scom.ko] undefined!
ERROR: modpost: "disable_kuap_key" [drivers/watchdog/mv64x60_wdt.ko] undefined!
WARNING: modpost: suppressed 13 unresolved symbol warnings because there were too many

disable_kuap_key has to be exported. Use EXPORT_SYMBOL() as userspace
access functions are not exported as GPL today.

Reported-by: kernel test robot
Signed-off-by: Christophe Leroy
---
 arch/powerpc/mm/nohash/8xx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
index a8d44e9342f3..fc663322ba58 100644
--- a/arch/powerpc/mm/nohash/8xx.c
+++ b/arch/powerpc/mm/nohash/8xx.c
@@ -257,6 +257,7 @@ void __init setup_kuep(bool disabled)
 #ifdef CONFIG_PPC_KUAP
 struct static_key_false disable_kuap_key;
+EXPORT_SYMBOL(disable_kuap_key);
 void __init setup_kuap(bool disabled)
 {
[PATCH] Fixup for "[v2] powerpc/8xx: Allow disabling KUAP at boot time"
Kernel test robot reported:

>> ERROR: modpost: "disable_kuap_key" [net/phonet/pn_pep.ko] undefined!
>> ERROR: modpost: "disable_kuap_key" [net/phonet/phonet.ko] undefined!
>> ERROR: modpost: "disable_kuap_key" [net/decnet/decnet.ko] undefined!
>> ERROR: modpost: "disable_kuap_key" [drivers/tee/tee.ko] undefined!
>> ERROR: modpost: "disable_kuap_key" [drivers/input/evdev.ko] undefined!
>> ERROR: modpost: "disable_kuap_key" [drivers/input/joydev.ko] undefined!
>> ERROR: modpost: "disable_kuap_key" [drivers/input/mousedev.ko] undefined!
>> ERROR: modpost: "disable_kuap_key" [drivers/input/serio/serio_raw.ko] undefined!
>> ERROR: modpost: "disable_kuap_key" [drivers/fsi/fsi-scom.ko] undefined!
>> ERROR: modpost: "disable_kuap_key" [drivers/watchdog/mv64x60_wdt.ko] undefined!
   WARNING: modpost: suppressed 13 unresolved symbol warnings because there were too many

disable_kuap_key has to be exported. Use EXPORT_SYMBOL() as userspace
access functions are not exported as GPL today.

Reported-by: kernel test robot
Signed-off-by: Christophe Leroy
---
 arch/powerpc/mm/nohash/8xx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
index a8d44e9342f3..fc663322ba58 100644
--- a/arch/powerpc/mm/nohash/8xx.c
+++ b/arch/powerpc/mm/nohash/8xx.c
@@ -257,6 +257,7 @@ void __init setup_kuep(bool disabled)
 #ifdef CONFIG_PPC_KUAP
 struct static_key_false disable_kuap_key;
+EXPORT_SYMBOL(disable_kuap_key);
 void __init setup_kuap(bool disabled)
 {
-- 
2.25.0
Re: [PATCH 10/30] ps3disk: use blk_mq_alloc_disk
Hi Christoph, On 6/1/21 11:53 PM, Christoph Hellwig wrote: > Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue > allocation. > > drivers/block/ps3disk.c | 36 ++-- > 1 file changed, 14 insertions(+), 22 deletions(-) I tested your alloc_disk-part2 branch on PS3, and it seemed to be working OK. Tested-by: Geoff Levand
[powerpc]next-20210604 - Kernel crash while running pmem tests
Running pmem tests [1] against latest next tree booted on powerpc results in the following crash

[ 1307.124289] Kernel attempted to read user page (330) - exploit attempt? (uid: 0)
[ 1307.124319] BUG: Kernel NULL pointer dereference on read at 0x0330
[ 1307.124328] Faulting instruction address: 0xc0906344
[ 1307.124336] Oops: Kernel access of bad area, sig: 11 [#1]
[ 1307.124343] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[ 1307.124353] Modules linked in: rpadlpar_io rpaphp dm_mod bonding rfkill sunrpc pseries_rng papr_scm uio_pdrv_genirq uio sch_fq_codel ip_tables sd_mod t10_pi sg ibmvscsi scsi_transport_srp ibmveth fuse
[ 1307.124392] CPU: 14 PID: 23553 Comm: lt-ndctl Not tainted 5.13.0-rc4-next-20210604 #1
[ 1307.124403] NIP: c0906344 LR: c04701d4 CTR: c0906320
[ 1307.124411] REGS: c00022cbb720 TRAP: 0300 Not tainted (5.13.0-rc4-next-20210604)
[ 1307.124420] MSR: 8280b033 CR: 48048288 XER: 2004
[ 1307.124441] CFAR: c04701d0 DAR: 0330 DSISR: 4000 IRQMASK: 0
[ 1307.124441] GPR00: c04701d4 c00022cbb9c0 c1b39100 c000220e16a0
[ 1307.124441] GPR04: c0009483a300 c0009483a300 28048282 c001dd30
[ 1307.124441] GPR08: 0001 0001 c1b7e060
[ 1307.124441] GPR12: c0906320 c0167fa21a80 fffa 10050f6c
[ 1307.124441] GPR16: 10050e88 7fffc289f87b 7fffc289a850 7fffc289a6b8
[ 1307.124441] GPR20: 0003 10033600 10050f6a
[ 1307.124441] GPR24: c000268cf810 cbcb2660 c000220e1728
[ 1307.124441] GPR28: 0001 c1bf1978 c000220e16a0 040f1000
[ 1307.124537] NIP [c0906344] pmem_pagemap_cleanup+0x24/0x40
[ 1307.124550] LR [c04701d4] memunmap_pages+0x1b4/0x4b0
[ 1307.124560] Call Trace:
[ 1307.124564] [c00022cbb9c0] [c09063c8] pmem_pagemap_kill+0x28/0x40 (unreliable)
[ 1307.124576] [c00022cbb9e0] [c04701d4] memunmap_pages+0x1b4/0x4b0
[ 1307.124586] [c00022cbba90] [c08b28a0] devm_action_release+0x30/0x50
[ 1307.124597] [c00022cbbab0] [c08b39c8] release_nodes+0x2f8/0x3e0
[ 1307.124607] [c00022cbbb60] [c08ac440] device_release_driver_internal+0x190/0x2b0
[ 1307.124619] [c00022cbbba0] [c08a8450] unbind_store+0x130/0x170
[ 1307.124629] [c00022cbbbe0] [c08a75b4] drv_attr_store+0x44/0x60
[ 1307.124638] [c00022cbbc00] [c0594a08] sysfs_kf_write+0x68/0x80
[ 1307.124648] [c00022cbbc20] [c05930e0] kernfs_fop_write_iter+0x1a0/0x290
[ 1307.124657] [c00022cbbc70] [c047830c] new_sync_write+0x14c/0x1d0
[ 1307.124666] [c00022cbbd10] [c047b8d4] vfs_write+0x224/0x330
[ 1307.124675] [c00022cbbd60] [c047bbbc] ksys_write+0x7c/0x140
[ 1307.124683] [c00022cbbdb0] [c002ecd0] system_call_exception+0x150/0x2d0
[ 1307.124694] [c00022cbbe10] [c000d45c] system_call_common+0xec/0x278
[ 1307.124703] --- interrupt: c00 at 0x7fffa26cbd74
[ 1307.124710] NIP: 7fffa26cbd74 LR: 7fffa28bb6bc CTR:
[ 1307.124717] REGS: c00022cbbe80 TRAP: 0c00 Not tainted (5.13.0-rc4-next-20210604)
[ 1307.124726] MSR: 8280f033 CR: 24048402 XER:
[ 1307.124746] IRQMASK: 0
[ 1307.124746] GPR00: 0004 7fffc289a180 7fffa27c7100 0004
[ 1307.124746] GPR04: 47d24c4c 0007 7fffa28dd1d8
[ 1307.124746] GPR08:
[ 1307.124746] GPR12: 7fffa29a2560 fffa 10050f6c
[ 1307.124746] GPR16: 10050e88 7fffc289f87b 7fffc289a850 7fffc289a6b8
[ 1307.124746] GPR20: 0003 10033600 10050f6a
[ 1307.124746] GPR24: 10050e60 47d28340 100314e8
[ 1307.124746] GPR28: 7fffa299b5c8 0004 0007 47d24c4c
[ 1307.124831] NIP [7fffa26cbd74] 0x7fffa26cbd74
[ 1307.124838] LR [7fffa28bb6bc] 0x7fffa28bb6bc
[ 1307.124844] --- interrupt: c00
[ 1307.124848] Instruction dump:
[ 1307.124854] 7c0803a6 4e800020 6000 3c4c0123 38422de0 7c0802a6 6000 7c0802a6
[ 1307.124870] f8010010 f821ffe1 e9230030 e9290088 4bddbb01 6000 38210020
[ 1307.124886] ---[ end trace 9881d6f8c705bac2 ]---

next-20210601 was good. The code in question, pmem_pagemap_cleanup(), was last changed by commit 87eb73b2ca7 ("nvdimm-pmem: convert to blk_alloc_disk/blk_cleanup_disk"):

static void pmem_pagemap_cleanup(struct dev_pagemap *pgmap)
{
	struct request_queue *q =
		container_of(pgmap->ref, struct request_queue, q_usage_counter);

	blk_cleanup_disk(queue_to_disk(q));   <<
}

Thanks
-Sachin
[GIT PULL] Please pull powerpc/linux.git powerpc-5.13-5 tag
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Hi Linus, Please pull some more powerpc fixes for 5.13: The following changes since commit d72500f992849d31ebae8f821a023660ddd0dcc2: powerpc/64s/syscall: Fix ptrace syscall info with scv syscalls (2021-05-21 00:58:56 +1000) are available in the git repository at: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.13-5 for you to fetch changes up to 59cc84c802eb923805e7bba425976a3df5ce35d8: Revert "powerpc/kernel/iommu: Align size for IOMMU_PAGE_SIZE() to save TCEs" (2021-06-01 11:17:08 +1000) - -- powerpc fixes for 5.13 #5 Fix our KVM reverse map real-mode handling since we enabled huge vmalloc (in some configurations). Revert a recent change to our IOMMU code which broke some devices. Fix KVM handling of FSCR on P7/P8, which could have possibly let a guest crash it's Qemu. Fix kprobes validation of prefixed instructions across page boundary. Thanks to: Alexey Kardashevskiy, Christophe Leroy, Fabiano Rosas, Frederic Barrat, Naveen N. Rao, Nicholas Piggin. - -- Frederic Barrat (1): Revert "powerpc/kernel/iommu: Align size for IOMMU_PAGE_SIZE() to save TCEs" Naveen N. 
Rao (1): powerpc/kprobes: Fix validation of prefixed instructions across page boundary Nicholas Piggin (2): powerpc: Fix reverse map real-mode address lookup with huge vmalloc KVM: PPC: Book3S HV: Save host FSCR in the P7/8 path arch/powerpc/include/asm/pte-walk.h | 29 arch/powerpc/kernel/eeh.c | 23 +--- arch/powerpc/kernel/io-workarounds.c| 16 ++- arch/powerpc/kernel/iommu.c | 11 arch/powerpc/kernel/kprobes.c | 4 +-- arch/powerpc/kvm/book3s_hv.c| 1 - arch/powerpc/kvm/book3s_hv_rm_mmu.c | 15 ++ arch/powerpc/kvm/book3s_hv_rmhandlers.S | 7 + 8 files changed, 49 insertions(+), 57 deletions(-) -BEGIN PGP SIGNATURE- iQIzBAEBCAAdFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmC8wv4ACgkQUevqPMjh pYDyIw/8C5X6YwKf+jM18S08wUfAy9BdnH79SNM4uANdYkpgduv8joMyjpoDs4Tv I0LH0vx1QybGO8sJwOZJfGTrb5iQtZIBoZUtWoKZWFLsX8s6ltxh7Cv6skP3GfgK YFrQirUtnOoi7xbgILkofklKCriRYdy0ww5+VqoNRk6WqWecRGhXtr17z3KNluTs 9Mt/7uWT275/XFd1IUzHFJfV/vkGnWTQD5N5sx/K9YxlIye/LdGb2o3FzLGr2jyB SMHSS7cevNyl4chM5AaFAGs7WZygLFZmScDdR0jEh9oipk77puQnGvwTk9GcZVQL Vy5tneHjWiKg0PbgWmuPWk3XfsgtoBGrpqsk2Guj23qOWolxhZ1DlpgO5+MRXXVm 0GLOJzbzR3Tf5NgsRxaGN2kjFuexyxawVJc1w8cM98QPAPYBIHIdHSjX9LIq/iW0 mXYWag1/etDQGmWgkKpun0aVRU2VH3pLejyRqwRT2ZZYm1Zo8Lsz21eDBoD+8jCV pwOcB44F0jz4+13cjtyYcWfln34I4ex0pumrc0pGVF+6tfDPWJI8JXihORNrcUhn KwZKkCZaqAMskiaNDyFS/45vWsYAevQdh74rVYG1Ad1yTXz9naY2t4ryOFB01DLh H5pwoan6sdHt97C9SCI35oTC+W0cv/qEVv0fyJl0oH8U/QlKSJo= =K09P -END PGP SIGNATURE-
Re: [PATCH v7 01/32] KVM: PPC: Book3S 64: move KVM interrupt entry to a common entry point
Nicholas Piggin writes: > Rather than bifurcate the call depending on whether or not HV is > possible, and have the HV entry test for PR, just make a single > common point which does the demultiplexing. This makes it simpler > to add another type of exit handler. > > Acked-by: Paul Mackerras > Reviewed-by: Daniel Axtens > Reviewed-by: Fabiano Rosas > Signed-off-by: Nicholas Piggin > --- > arch/powerpc/kernel/exceptions-64s.S| 8 +- > arch/powerpc/kvm/Makefile | 3 +++ > arch/powerpc/kvm/book3s_64_entry.S | 36 + > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 11 ++-- > 4 files changed, 42 insertions(+), 16 deletions(-) > create mode 100644 arch/powerpc/kvm/book3s_64_entry.S > > diff --git a/arch/powerpc/kernel/exceptions-64s.S > b/arch/powerpc/kernel/exceptions-64s.S > index fa8e52a0239e..868077f7a96f 100644 > --- a/arch/powerpc/kernel/exceptions-64s.S > +++ b/arch/powerpc/kernel/exceptions-64s.S > @@ -208,7 +208,6 @@ do_define_int n > .endm > > #ifdef CONFIG_KVM_BOOK3S_64_HANDLER > -#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE > /* > * All interrupts which set HSRR registers, as well as SRESET and MCE and > * syscall when invoked with "sc 1" switch to MSR[HV]=1 (HVMODE) to be taken, > @@ -238,13 +237,8 @@ do_define_int n > > /* > * If an interrupt is taken while a guest is running, it is immediately > routed > - * to KVM to handle. If both HV and PR KVM arepossible, KVM interrupts go > first > - * to kvmppc_interrupt_hv, which handles the PR guest case. > + * to KVM to handle. 
> */
> -#define kvmppc_interrupt kvmppc_interrupt_hv
> -#else
> -#define kvmppc_interrupt kvmppc_interrupt_pr
> -#endif
>
>  .macro KVMTEST name
>  	lbz	r10,HSTATE_IN_GUEST(r13)
> diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
> index 2bfeaa13befb..cdd119028f64 100644
> --- a/arch/powerpc/kvm/Makefile
> +++ b/arch/powerpc/kvm/Makefile
> @@ -59,6 +59,9 @@ kvm-pr-y := \
>  kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HANDLER) += \
>  	tm.o
>
> +kvm-book3s_64-builtin-objs-y += \
> +	book3s_64_entry.o
> +

Further down we have:

  obj-y += $(kvm-book3s_64-builtin-objs-y)

Which means book3s_64_entry.S ends up getting built for BOOKE, which
breaks. I think instead we want to add it to the preceding entry, eg:

diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index 91eb67bb91e1..ab241317481c 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -57,11 +57,9 @@ kvm-pr-y := \
 	book3s_32_mmu.o
 
 kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HANDLER) += \
+	book3s_64_entry.o \
 	tm.o
 
-kvm-book3s_64-builtin-objs-y += \
-	book3s_64_entry.o
-
 ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
 kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HANDLER) += \
 	book3s_rmhandlers.o

cheers
Re: [PATCH -next] powerpc/pseries/memhotplug: Remove unused inline function dlpar_memory_remove()
On Fri, 14 May 2021 15:10:41 +0800, YueHaibing wrote: > dlpar_memory_remove() is never used, so can be removed. Applied to powerpc/next. [1/1] powerpc/pseries/memhotplug: Remove unused inline function dlpar_memory_remove() https://git.kernel.org/powerpc/c/9b373899e9606d252364191ce2b385043a8808bc cheers
Re: [PATCH v3] powerpc/papr_scm: Reduce error severity if nvdimm stats inaccessible
On Sat, 8 May 2021 10:06:42 +0530, Vaibhav Jain wrote:
> Currently drc_pmem_query_stats() generates a dev_err in case
> "Enable Performance Information Collection" feature is disabled from
> HMC or performance stats are not available for an nvdimm. The error is
> of the form below:
>
> papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Failed to query
> performance stats, Err:-10
>
> [...]

Applied to powerpc/next.

[1/1] powerpc/papr_scm: Reduce error severity if nvdimm stats inaccessible
      https://git.kernel.org/powerpc/c/f3f6d18417eb44ef393b23570d384d2778ef22dc

cheers
Re: [PATCH 1/1] powerpc/pseries/ras: Delete a redundant condition branch
On Mon, 10 May 2021 21:19:24 +0800, Zhen Lei wrote: > The statement of the last "if (xxx)" branch is the same as the "else" > branch. Delete it to simplify code. > > No functional change. Applied to powerpc/next. [1/1] powerpc/pseries/ras: Delete a redundant condition branch https://git.kernel.org/powerpc/c/ad06bcfd5b8f989690053e6026cf742886ba9f90 cheers
Re: [PATCH v2 0/2] powerpc/sstep: Add emulation support and tests for 'setb' instruction
On Tue, 11 May 2021 07:18:31 -0500, Sathvika Vasireddy wrote: > This patchset adds emulation support and tests for setb instruction. > Test cases are written to test different CR fields with different > bits set in each field. > > v1->v2: > - Extract all the bits of the CR field (bfa) and check if the > LT, GT bits of that CR field (bfa) are set. > - Place 'setb' emulation code after 'mfcr' instruction emulation. > - Add 'cpu_feature' in the selftests patch to restrict them to ISA v3.0 > > [...] Applied to powerpc/next. [1/2] powerpc/sstep: Add emulation support for ‘setb’ instruction https://git.kernel.org/powerpc/c/5b75bd763d369e43e6d09e85eaea22fde37c0e89 [2/2] powerpc/sstep: Add tests for setb instruction https://git.kernel.org/powerpc/c/60060d704c55a9450208b8f0bc5026df9d4ab1d6 cheers
Re: [PATCH v2] powerpc/powernv/pci: fix header guard
On Tue, 18 May 2021 13:40:41 -0700, Nick Desaulniers wrote: > While looking at -Wundef warnings, the #if CONFIG_EEH stood out as a > possible candidate to convert to #ifdef CONFIG_EEH. > > It seems that based on Kconfig dependencies it's not possible to build > this file without CONFIG_EEH enabled, but based on upstream discussion, > it's not clear yet that CONFIG_EEH should be enabled by default. > > [...] Applied to powerpc/next. [1/1] powerpc/powernv/pci: fix header guard https://git.kernel.org/powerpc/c/73e6e4e01134c9ee97433ad1f470c71b0748b08f cheers
Re: [PATCH] powerpc: Kconfig: disable CONFIG_COMPAT for clang < 12
On Tue, 18 May 2021 13:58:57 -0700, Nick Desaulniers wrote: > Until clang-12, clang would attempt to assemble 32b powerpc assembler in > 64b emulation mode when using a 64b target triple with -m32, leading to > errors during the build of the compat VDSO. Simply disable all of > CONFIG_COMPAT; users should upgrade to the latest release of clang for > proper support. Applied to powerpc/next. [1/1] powerpc: Kconfig: disable CONFIG_COMPAT for clang < 12 https://git.kernel.org/powerpc/c/6fcb574125e673f33ff058caa54b4e65629f3a08 cheers
Re: [PATCH] powerpc/xmon: make dumping log buffer contents more reliable
On Fri, 14 May 2021 11:24:20 -0500, Nathan Lynch wrote:
> Log buffer entries that are too long for dump_log_buf()'s small
> local buffer are:
>
> * silently discarded when a single-line entry is too long;
>   kmsg_dump_get_line() returns true but sets the returned length to 0.
> * silently truncated to the last fitting newline when a multi-line
>   entry is too long, e.g. register dumps from __show_regs(); this
>   seems undetectable via the kmsg_dump API.
>
> [...]

Applied to powerpc/next.

[1/1] powerpc/xmon: make dumping log buffer contents more reliable
      https://git.kernel.org/powerpc/c/2cec178e35baf57d307c0982fd2e53055bd1e9bb

cheers
Re: [PATCH] powerpc/udbg_hvc: retry putc on -EAGAIN
On Fri, 14 May 2021 16:44:22 -0500, Nathan Lynch wrote: > hvterm_raw_put_chars() calls hvc_put_chars(), which may return -EAGAIN > when the underlying hcall returns a "busy" status, but udbg_hvc_putc() > doesn't handle this. When using xmon on a PowerVM guest, this can > result in incomplete or garbled output when printing relatively large > amounts of data quickly, such as when dumping the kernel log buffer. > > Call again on -EAGAIN. Applied to powerpc/next. [1/1] powerpc/udbg_hvc: retry putc on -EAGAIN https://git.kernel.org/powerpc/c/027f55e87c3094270a3223f7331d033fe15a2b3f cheers
Re: [PATCH] selftests/powerpc: Add test of mitigation patching
On Fri, 7 May 2021 16:42:25 +1000, Michael Ellerman wrote: > We recently discovered some of our mitigation patching was not safe > against other CPUs running concurrently. > > Add a test which enable/disables all mitigations in a tight loop while > also running some stress load. On an unpatched system this almost always > leads to an oops and panic/reboot, but we also check if the kernel > becomes tainted in case we have a non-fatal oops. Applied to powerpc/next. [1/1] selftests/powerpc: Add test of mitigation patching https://git.kernel.org/powerpc/c/34f7f79827ec4db30cff9001dfba19f496473e8d cheers
Re: [PATCH] powerpc/Makefile: Add ppc32/ppc64_randconfig targets
On Wed, 28 Apr 2021 23:27:00 +1000, Michael Ellerman wrote: > Make it easier to generate a 32 or 64-bit specific randconfig. Applied to powerpc/next. [1/1] powerpc/Makefile: Add ppc32/ppc64_randconfig targets https://git.kernel.org/powerpc/c/f259fb893c69d60ac1c7192f1974635c554fd716 cheers
Re: [PATCH] selftests/powerpc: Fix duplicate included pthread.h
On Thu, 13 May 2021 19:03:40 +0800, Jiapeng Chong wrote: > Clean up the following includecheck warning: > > ./tools/testing/selftests/powerpc/tm/tm-vmx-unavail.c: pthread.h is > included more than once. > > No functional change. Applied to powerpc/next. [1/1] selftests/powerpc: Fix duplicate included pthread.h https://git.kernel.org/powerpc/c/c67454615cf95160cb806f7a471158a901eb261d cheers
Re: [PATCH v2 0/4] Unisolate LMBs DRC on removal error + cleanups
On Wed, 12 May 2021 17:28:05 -0300, Daniel Henrique Barboza wrote: > changes from v1: > - patch 1: added David's r-b > - patch 2: > * removed the DRCONF_MEM_RESERVED assumption for > dlpar_memory_remove_by_ic() > * reworded the commit msg > - patch 3: dropped. the differences between dlpar_memory_remove_by_ic() > and dlpar_memory_remove_by_count() makes a helper function too complex > to handle both cases > - (new) patch 3 and 4: minor enhancements > > [...] Applied to powerpc/next. [1/4] powerpc/pseries: Set UNISOLATE on dlpar_memory_remove_by_ic() error https://git.kernel.org/powerpc/c/feb0e079f43dee055701c1a294785d37341d6f97 [2/4] powerpc/pseries: check DRCONF_MEM_RESERVED in lmb_is_removable() https://git.kernel.org/powerpc/c/2ad216b4d6ff0f92fc645c1bd8838f04fbf09b9d [3/4] powerpc/pseries: break early in dlpar_memory_remove_by_count() loops https://git.kernel.org/powerpc/c/163e7921750f6cd965666f472c21de056c63dcbc [4/4] powerpc/pseries: minor enhancements in dlpar_memory_remove_by_ic() https://git.kernel.org/powerpc/c/40999b041e03b32434b2f4a951668e9865a3cb6b cheers
Re: [PATCH v2 1/2] powerpc/asm-offset: Remove unused items
On Wed, 5 May 2021 14:02:12 + (UTC), Christophe Leroy wrote: > Following PACA related items are not used anymore by ASM code: > PACA_SIZE, PACACONTEXTID, PACALOWSLICESPSIZE, PACAHIGHSLICEPSIZE, > PACA_SLB_ADDR_LIMIT, MMUPSIZEDEFSIZE, PACASLBCACHE, PACASLBCACHEPTR, > PACASTABRR, PACAVMALLOCSLLP, MMUPSIZESLLP, PACACONTEXTSLLP, > PACALPPACAPTR, LPPACA_DTLIDX and PACA_DTL_RIDX. > > Following items are also not used anymore: > SIGSEGV, NMI_MASK, THREAD_DBCR0, KUAP, TI_FLAGS, TI_PREEMPT, > DCACHEL1BLOCKSPERPAGE, ICACHEL1BLOCKSIZE, ICACHEL1LOGBLOCKSIZE, > ICACHEL1BLOCKSPERPAGE, STACK_REGS_KUAP, KVM_NEED_FLUSH, KVM_FWNMI, > VCPU_DEC, VCPU_SPMC, HSTATE_XICS_PHYS, HSTATE_SAVED_XIRR and > PPC_DBELL_MSGTYPE. > > [...] Applied to powerpc/next. [1/2] powerpc/asm-offset: Remove unused items https://git.kernel.org/powerpc/c/1a3c6ceed2533121e857d9b38559839487d31f33 [2/2] powerpc/paca: Remove mm_ctx_id and mm_ctx_slb_addr_limit https://git.kernel.org/powerpc/c/13c7dad95176d6abaf9608e4d34b2f512689775e cheers
Re: [PATCH v2 1/2] kprobes: Allow architectures to override optinsn page allocation
On Thu, 13 May 2021 09:07:51 + (UTC), Christophe Leroy wrote: > Some architectures like powerpc require a non standard > allocation of optinsn page, because module pages are > too far from the kernel for direct branches. > > Define weak alloc_optinsn_page() and free_optinsn_page(), that > fall back on alloc_insn_page() and free_insn_page() when not > overriden by the architecture. Applied to powerpc/next. [1/2] kprobes: Allow architectures to override optinsn page allocation https://git.kernel.org/powerpc/c/7ee3e97e00a3893e354c3993c3f7d9dc127e9c5e [2/2] powerpc/kprobes: Replace ppc_optinsn by common optinsn https://git.kernel.org/powerpc/c/b73c8cccd72ac28beaf262fd6ef4b91411fc8335 cheers
Re: [PATCH] powerpc: Only pad struct pt_regs when needed
On Thu, 6 May 2021 13:30:51 + (UTC), Christophe Leroy wrote: > If neither KUAP nor PPC64 is selected, there is nothing in the second > union of struct pt_regs, so the alignment padding is waste of memory. Applied to powerpc/next. [1/1] powerpc: Only pad struct pt_regs when needed https://git.kernel.org/powerpc/c/b09049c516af90d4b6643b5d0d2549cd01539086 cheers
Re: [PATCH] powerpc/mmu: Remove leftover CONFIG_E200
On Fri, 7 May 2021 06:00:58 + (UTC), Christophe Leroy wrote: > Commit 39c8bf2b3cc1 ("powerpc: Retire e200 core (mpc555x processor)") > removed CONFIG_E200. > Commit f9158d58a4e1 ("powerpc/mm: Add mask of always present MMU > features") was merged in the same cycle and added a new use of > CONFIG_E200. > > Remove that use. Applied to powerpc/next. [1/1] powerpc/mmu: Remove leftover CONFIG_E200 https://git.kernel.org/powerpc/c/0441729e16379649ea0f393a5be68a19ba384d94 cheers
Re: [PATCH] powerpc/mmu: Don't duplicate radix_enabled()
On Fri, 7 May 2021 05:45:52 + (UTC), Christophe Leroy wrote: > mmu_has_feature(MMU_FTR_TYPE_RADIX) can be evaluated regardless of > CONFIG_PPC_RADIX_MMU. > > When CONFIG_PPC_RADIX_MMU is not set, mmu_has_feature(MMU_FTR_TYPE_RADIX) > will evaluate to 'false' at build time because MMU_FTR_TYPE_RADIX > wont be included in MMU_FTRS_POSSIBLE. Applied to powerpc/next. [1/1] powerpc/mmu: Don't duplicate radix_enabled() https://git.kernel.org/powerpc/c/fe3dc333d2ed50c9764d281869d87bae0d795ce5 cheers
Re: [PATCH] powerpc: Define NR_CPUS all the time
On Fri, 7 May 2021 09:24:43 + (UTC), Christophe Leroy wrote: > include/linux/threads.h defines a default value for CONFIG_NR_CPUS > but suggests the architectures should always define it. > > So do it, when CONFIG_SMP is not selected set CONFIG_NR_CPUS to 1. Applied to powerpc/next. [1/1] powerpc: Define NR_CPUS all the time https://git.kernel.org/powerpc/c/c176c3d58a3ed623e8917acaafe240245e700e33 cheers
Re: [PATCH] powerpc/8xx: Update mpc885_ads_defconfig to improve CI
On Fri, 7 May 2021 16:37:53 + (UTC), Christophe Leroy wrote: > mpc885_ads_defconfig is used by several CI robots. > > A few functionnalities are specific to 8xx and are not covered > by other default configuration, so improve build test coverage > by adding them to mpc885_ads_defconfig. > > 8xx is the only platform supporting 16k page size in addition > to 4k page size. Considering that 4k page size is widely tested > by other configurations, lets make 16k pages the selection for > 8xx, as it has demonstrated in the past to be a weakness. > > [...] Applied to powerpc/next. [1/1] powerpc/8xx: Update mpc885_ads_defconfig to improve CI https://git.kernel.org/powerpc/c/9a1762a4a4ff3cc096c605212165f59481e84503 cheers
Re: [PATCH] powerpc/603: Avoid a pile of NOPs when not using SW LRU in TLB exceptions
On Fri, 7 May 2021 05:02:02 +0000 (UTC), Christophe Leroy wrote:
> The SW LRU is in an MMU feature section. When not used, that's a
> dozen NOPs to fetch for nothing.
>
> Define an ALT section that does the few remaining operations.
>
> That also avoids a double read on SRR1 in the SW LRU case.

Applied to powerpc/next.

[1/1] powerpc/603: Avoid a pile of NOPs when not using SW LRU in TLB exceptions
      https://git.kernel.org/powerpc/c/70d6ebf82bd0cfddaebb54e861fc15e9945a5fc6

cheers
Re: [PATCH] powerpc/32s: Speed up likely path of kuap_update_sr()
On Thu, 6 May 2021 13:27:31 +0000 (UTC), Christophe Leroy wrote:
> In most cases, kuap_update_sr() will update a single segment
> register.
>
> We know the first update will always be done: if there is no
> segment register to update at all, kuap_update_sr() is not
> called.
>
> [...]

Applied to powerpc/next.

[1/1] powerpc/32s: Speed up likely path of kuap_update_sr()
      https://git.kernel.org/powerpc/c/8af8d72dc58e90dc945ca627b24968400e0f21b6

cheers
Re: [PATCH] powerpc/32s: Remove m8260_gorom()
On Thu, 6 May 2021 09:10:01 +0000 (UTC), Christophe Leroy wrote:
> Commit 917f0af9e5a9 ("powerpc: Remove arch/ppc and include/asm-ppc")
> removed the last user of m8260_gorom().
>
> In fact m8260_gorom() was ported to arch/powerpc/ but the
> platform using it died with arch/ppc/
>
> [...]

Applied to powerpc/next.

[1/1] powerpc/32s: Remove m8260_gorom()
      https://git.kernel.org/powerpc/c/3a5988b884a33cb3e4ab427b08a020ce32f3b3ba

cheers
Re: [PATCH -next] ppc: boot: Fix a typo in the file decompress.c
On Mon, 10 May 2021 15:51:34 +0800, Zhang Jianhua wrote:
> s/input buffer/output buffer/
> s/length of the input buffer/length of the output buffer/

Applied to powerpc/next.

[1/1] ppc: boot: Fix a typo in the file decompress.c
      https://git.kernel.org/powerpc/c/930a77c3ad79c30ce9ba8cbad9eded5bc5805343

cheers
Re: [PATCH] powerpc/32s: Remove asm/book3s/32/hash.h
On Thu, 6 May 2021 13:32:18 +0000 (UTC), Christophe Leroy wrote:
> Move the PAGE bits into pgtable.h to be more similar to book3s/64.

Applied to powerpc/next.

[1/1] powerpc/32s: Remove asm/book3s/32/hash.h
      https://git.kernel.org/powerpc/c/ca8cc36901e9bdd01d371f6236faf9f61d1325d1

cheers
Re: [PATCH v2] KVM: PPC: Book3S HV: Fix reverse map real-mode address lookup with huge vmalloc
On Wed, 26 May 2021 22:00:05 +1000, Nicholas Piggin wrote:
> real_vmalloc_addr() does not currently work for huge vmalloc, which is
> what the reverse map can be allocated with for radix host, hash guest.
>
> Extract the hugepage aware equivalent from eeh code into a helper, and
> convert existing sites including this one to use it.

Applied to powerpc/fixes.

[1/1] KVM: PPC: Book3S HV: Fix reverse map real-mode address lookup with huge vmalloc
      https://git.kernel.org/powerpc/c/5362a4b6ee6136018558ef6b2c4701aa15ebc602

cheers
Re: [PATCH] Revert "powerpc/kernel/iommu: Align size for IOMMU_PAGE_SIZE() to save TCEs"
On Wed, 26 May 2021 16:45:40 +0200, Frederic Barrat wrote:
> This reverts commit 3c0468d4451eb6b4f6604370639f163f9637a479.
>
> That commit was breaking alignment guarantees for the DMA address when
> allocating coherent mappings, as described in
> Documentation/core-api/dma-api-howto.rst
>
> It was also noticed by Mellanox' driver:
> [ 1515.763621] mlx5_core c002:01:00.0: mlx5_frag_buf_alloc_node:146:(pid
> 13402): unexpected map alignment: 0x08c61000, page_shift=16
> [ 1515.763635] mlx5_core c002:01:00.0: mlx5_cqwq_create:181:(pid
> 13402): mlx5_frag_buf_alloc_node() failed, -12

Applied to powerpc/fixes.

[1/1] Revert "powerpc/kernel/iommu: Align size for IOMMU_PAGE_SIZE() to save TCEs"
      https://git.kernel.org/powerpc/c/59cc84c802eb923805e7bba425976a3df5ce35d8

cheers
Re: [PATCH v2] KVM: PPC: Book3S HV: Save host FSCR in the P7/8 path
On Wed, 26 May 2021 22:58:51 +1000, Nicholas Piggin wrote:
> Similar to commit 25edcc50d76c ("KVM: PPC: Book3S HV: Save and restore
> FSCR in the P9 path"), ensure the P7/8 path saves and restores the host
> FSCR. The logic explained in that patch applies to the old path as
> well: a context switch can be made before kvmppc_vcpu_run_hv restores
> the host FSCR and returns.
>
> Now that both the P9 and the P7/8 paths save and restore their FSCR, it
> no longer needs to be restored at the end of kvmppc_vcpu_run_hv.

Applied to powerpc/fixes.

[1/1] KVM: PPC: Book3S HV: Save host FSCR in the P7/8 path
      https://git.kernel.org/powerpc/c/1438709e6328925ef496dafd467dbd0353137434

cheers
Re: [PATCH 0/5] powerpc/kprobes: fixes and cleanups
On Wed, 19 May 2021 16:17:16 +0530, Naveen N. Rao wrote:
> Various fixes and some code refactoring for kprobes on powerpc. The
> first patch fixes an invalid access if probing the first instruction in
> a kernel module. The rest are small cleanups. More details in the
> individual patches.
>
> - Naveen
>
> [...]

Patch 1 applied to powerpc/fixes.

[1/5] powerpc/kprobes: Fix validation of prefixed instructions across page boundary
      https://git.kernel.org/powerpc/c/82123a3d1d5a306fdf50c968a474cc60fe43a80f

cheers