build regression from c153693: Simplify module TOC handling
Hi Alan,

Your patch "powerpc: Simplify module TOC handling" is causing the Fedora ppc64le kernel to fail to build with depmod failures. Reverting the commit fixes it for us on rawhide. We're getting the output below; full logs at [1]. Let me know if you have any other queries.

Regards,
Peter

[1] http://ppc.koji.fedoraproject.org/kojifiles/work/tasks/5115/3125115/build.log

+ depmod -b . -aeF ./System.map 4.5.0-0.rc2.git0.1.fc24.ppc64le
Depmod failure
+ '[' -s depmod.out ']'
+ echo 'Depmod failure'
+ cat depmod.out
depmod: WARNING: /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/platforms/powernv/opal-prd.ko needs unknown symbol .TOC.
depmod: WARNING: /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/platforms/pseries/pseries_energy.ko needs unknown symbol .TOC.
depmod: WARNING: /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/platforms/pseries/hvcserver.ko needs unknown symbol .TOC.
depmod: WARNING: /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/kvm/kvm.ko needs unknown symbol .TOC.
depmod: WARNING: /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/kvm/kvm-pr.ko needs unknown symbol .TOC.
depmod: WARNING: /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/kvm/kvm-hv.ko needs unknown symbol .TOC.
depmod: WARNING: /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/kernel/rcu/rcutorture.ko needs unknown symbol .TOC.
depmod: WARNING: /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/kernel/trace/ring_buffer_benchmark.ko needs unknown symbol .TOC.
depmod: WARNING: /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/kernel/torture.ko needs unknown symbol .TOC.
depmod: WARNING: /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/fs/nfs_common/nfs_acl.ko needs unknown symbol .TOC.
depmod: WARNING: /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/fs/nfs_common/grace.ko needs unknown symbol .TOC.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v7 02/23] powerpc/8xx: Map linear kernel RAM with 8M pages
On a live running system (a VoIP gateway for Air Traffic Control), over a 10 minute period (with 277s idle), we get 87 million DTLB misses and approximately 35 seconds are spent in the DTLB handler. This represents 5.8% of the overall time and even 10.8% of the non-idle time. Among those 87 million DTLB misses, 15% are on user addresses and 85% are on kernel addresses. And within the kernel addresses, 93% are on addresses from the linear address space and only 7% are on addresses from the virtual address space.

The MPC8xx has no BATs, but it has an 8M page size. This patch implements mapping of kernel RAM using 8M pages, on the same model as what is done on the 40x.

In 4k pages mode, each PGD entry maps a 4M area: we map every two entries to the same 8M physical page. In each second entry, we add 4M to the page physical address to ease the life of the FixupDAR routine. This is just ignored by the HW.

In 16k pages mode, each PGD entry maps a 64M area: each PGD entry will point to the first page of the area. The DTLB handler adds the 3 bits from the EPN to map the correct page.

With this patch applied, we now get only 13 million TLB misses during the 10 minute period. The idle time has increased to 313s and the overall time spent in the DTLB miss handler is 6.3s, which represents 1% of the overall time and 2.2% of the non-idle time.
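The 4k-mode PGD layout described above (two consecutive 4M entries pointing at the same 8M page, the second carrying a +4M physical offset for FixupDAR) can be sketched in host C. Everything here is hypothetical and for illustration only — the entry encoding, the `map_linear_8m` name and the "valid" bit do not match the real hardware format; the actual code is in the head_8xx.S and 8xx_mmu.c hunks of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define PGD_SPAN   (4u << 20)   /* each PGD entry covers 4M in 4k pages mode */
#define LARGE_PAGE (8u << 20)   /* one 8M large page */

/*
 * Hypothetical sketch: fill the PGD so that every pair of 4M slots
 * points at the same 8M physical page.  The second entry of each pair
 * carries a +4M physical offset, which the HW ignores but which eases
 * the FixupDAR routine.
 */
static void map_linear_8m(uint32_t *pgd, uint32_t phys, uint32_t size)
{
	uint32_t va;

	for (va = 0; va < size; va += PGD_SPAN) {
		uint32_t pa = phys + (va & ~(LARGE_PAGE - 1)); /* 8M-aligned base */

		pa += va & (LARGE_PAGE - 1);   /* +4M on the second entry of the pair */
		pgd[va / PGD_SPAN] = pa | 1u;  /* hypothetical "valid large page" bit */
	}
}
```

With 16M of RAM at physical 0, entries 0/1 share the first 8M page and entries 2/3 the second, the odd entries being offset by 4M.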
Signed-off-by: Christophe Leroy
---
v2: using bt instead of bgt and named the label explicitly
v3: no change
v4: no change
v5: removed use of pmd_val() as L-value
v6: no change
v7: no change

 arch/powerpc/kernel/head_8xx.S | 35 +-
 arch/powerpc/mm/8xx_mmu.c      | 83 ++
 arch/powerpc/mm/Makefile       |  1 +
 arch/powerpc/mm/mmu_decl.h     | 15 ++--
 4 files changed, 120 insertions(+), 14 deletions(-)
 create mode 100644 arch/powerpc/mm/8xx_mmu.c

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index a89492e..87d1f5f 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -398,11 +398,13 @@ DataStoreTLBMiss:
 	BRANCH_UNLESS_KERNEL(3f)
 	lis	r11, (swapper_pg_dir-PAGE_OFFSET)@ha
 3:
-	mtcr	r3
 	/* Insert level 1 index */
 	rlwimi	r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 29
 	lwz	r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11)	/* Get the level 1 entry */
+	mtcr	r11
+	bt-	28,DTLBMiss8M	/* bit 28 = Large page (8M) */
+	mtcr	r3
 
 	/* We have a pte table, so load fetch the pte from the table.
 	 */
@@ -455,6 +457,29 @@ DataStoreTLBMiss:
 	EXCEPTION_EPILOG_0
 	rfi
 
+DTLBMiss8M:
+	mtcr	r3
+	ori	r11, r11, MD_SVALID
+	MTSPR_CPU6(SPRN_MD_TWC, r11, r3)
+#ifdef CONFIG_PPC_16K_PAGES
+	/*
+	 * In 16k pages mode, each PGD entry defines a 64M block.
+	 * Here we select the 8M page within the block.
+	 */
+	rlwimi	r11, r10, 0, 0x0380
+#endif
+	rlwinm	r10, r11, 0, 0xff80
+	ori	r10, r10, 0xf0 | MD_SPS16K | _PAGE_SHARED | _PAGE_DIRTY | \
+			  _PAGE_PRESENT
+	MTSPR_CPU6(SPRN_MD_RPN, r10, r3)	/* Update TLB entry */
+
+	li	r11, RPN_PATTERN
+	mfspr	r3, SPRN_SPRG_SCRATCH2
+	mtspr	SPRN_DAR, r11	/* Tag DAR */
+	EXCEPTION_EPILOG_0
+	rfi
+
+
 /* This is an instruction TLB error on the MPC8xx. This could be due
  * to many reasons, such as executing guarded memory or illegal instruction
  * addresses. There is nothing to do but handle a big time error fault.
@@ -532,13 +557,15 @@ FixupDAR:/* Entry point for dcbx workaround.
 	 */
 	/* Insert level 1 index */
 3:	rlwimi	r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 29
 	lwz	r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11)	/* Get the level 1 entry */
+	mtcr	r11
+	bt	28,200f	/* bit 28 = Large page (8M) */
 	rlwinm	r11, r11,0,0,19	/* Extract page descriptor page address */
 	/* Insert level 2 index */
 	rlwimi	r11, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29
 	lwz	r11, 0(r11)	/* Get the pte */
 	/* concat physical page address(r11) and page offset(r10) */
 	rlwimi	r11, r10, 0, 32 - PAGE_SHIFT, 31
-	lwz	r11,0(r11)
+201:	lwz	r11,0(r11)
 	/* Check if it really is a dcbx instruction. */
 	/* dcbt and dcbtst does not generate DTLB Misses/Errors,
 	 * no need to include them here */
@@ -557,6 +584,10 @@ FixupDAR:/* Entry point for dcbx workaround. */
 141:	mfspr	r10,SPRN_SPRG_SCRATCH2
 	b	DARFixed	/* Nope, go back to normal TLB processing */
 
+	/* concat physical page address(r11) and page offset(r10) */
+200:	rlwimi	r11, r10, 0, 32 - (PAGE_SHIFT << 1), 31
+	b	201b
+
 144:	mfspr	r10, SPRN_DSISR
 	rlwinm	r10, r10,0,7,5	/* Clear store bit for buggy dcbst insn */
 	mtspr
[PATCH v7 03/23] powerpc: Update documentation for noltlbs kernel parameter
Now the noltlbs kernel parameter is also applicable to PPC8xx.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 Documentation/kernel-parameters.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 59e1515..c3e420b 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2592,7 +2592,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	nolapic_timer	[X86-32,APIC] Do not use the local APIC timer.
 
 	noltlbs		[PPC] Do not use large page/tlb entries for kernel
-			lowmem mapping on PPC40x.
+			lowmem mapping on PPC40x and PPC8xx
 
 	nomca		[IA-64] Disable machine check abort handling
--
2.1.0
[PATCH v7 10/23] powerpc/8xx: map more RAM at startup when needed
On recent kernels, with some debug options like for instance CONFIG_LOCKDEP, the BSS requires more than 8M of memory, although the kernel code fits in the first 8M. Today, it is necessary to activate CONFIG_PIN_TLB to get more than 8M at startup, although pinning the TLB is not necessary for that.

This patch adds more pages (up to 24M) to the initial mapping if possible/needed, in order to have the necessary mappings regardless of CONFIG_PIN_TLB. We could have mapped 16M or 24M unconditionally, but since some platforms only have 8M of memory, we need something a bit more elaborate. Therefore, if the bootloader is compliant with the ePAPR standard, we use r7 to know how much memory was mapped by the bootloader. Otherwise, we try to determine the required memory size by looking at the _end symbol and the address of the device tree.

This patch does not modify the behaviour when CONFIG_PIN_TLB is selected.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: Automatic detection of available/needed memory instead of allocating 16M for all.
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/kernel/head_8xx.S | 56 +++---
 arch/powerpc/mm/8xx_mmu.c      | 10 +++-
 2 files changed, 56 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index ae721a1..a268cf4 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -72,6 +72,9 @@
 #define RPN_PATTERN	0x00f0
 #endif
 
+/* ePAPR magic value for non BOOK III-E CPUs */
+#define EPAPR_SMAGIC	0x65504150
+
 	__HEAD
 _ENTRY(_stext);
 _ENTRY(_start);
@@ -101,6 +104,38 @@ _ENTRY(_start);
  */
 	.globl	__start
 __start:
+/*
+ * Determine initial RAM size
+ *
+ * If the bootloader is ePAPR compliant, the size is given in r7,
+ * otherwise, we have to determine how much is needed. For that, we have to
+ * check whether _end of kernel and device tree are within the first 8M.
+ */
+	lis	r30, 0x0080@h	/* 8M by default */
+
+	lis	r8, EPAPR_SMAGIC@h
+	ori	r8, r8, EPAPR_SMAGIC@l
+	cmplw	cr0, r8, r6
+	bne	1f
+	lis	r30, 0x0180@h	/* 24M max */
+	cmplw	cr0, r7, r30
+	bgt	2f
+	mr	r30, r7	/* save initial ram size */
+	b	2f
+1:
+	/* is kernel _end or DTB in the first 8M ? if not map 16M */
+	lis	r8, (_end - PAGE_OFFSET)@h
+	ori	r8, r8, (_end - PAGE_OFFSET)@l
+	addi	r8, r8, -1
+	or	r8, r8, r3
+	cmplw	cr0, r8, r30
+	blt	2f
+	lis	r30, 0x0100@h	/* 16M */
+	/* is kernel _end or DTB in the first 16M ? if not map 24M */
+	cmplw	cr0, r8, r30
+	blt	2f
+	lis	r30, 0x0180@h	/* 24M */
+2:
 	mr	r31,r3	/* save device tree ptr */
 
 	/* We have to turn on the MMU right away so we get cache modes
@@ -737,6 +772,8 @@ start_here:
 	/*
 	 * Decide what sort of machine this is and initialize the MMU.
 	 */
+	lis	r3, initial_memory_size@ha
+	stw	r30, initial_memory_size@l(r3)
 	li	r3,0
 	mr	r4,r31
 	bl	machine_init
@@ -868,10 +905,15 @@ initial_mmu:
 	mtspr	SPRN_MD_RPN, r8
 
 #ifdef CONFIG_PIN_TLB
-	/* Map two more 8M kernel data pages.
-	 */
+	/* Map one more 8M kernel data page. */
 	addi	r10, r10, 0x0100
 	mtspr	SPRN_MD_CTR, r10
+#else
+	/* Map one more 8M kernel data page if needed */
+	lis	r10, 0x0080@h
+	cmplw	cr0, r30, r10
+	ble	1f
+#endif
 
 	lis	r8, KERNELBASE@h	/* Create vaddr for TLB */
 	addis	r8, r8, 0x0080	/* Add 8M */
@@ -884,20 +926,28 @@ initial_mmu:
 	addis	r11, r11, 0x0080	/* Add 8M */
 	mtspr	SPRN_MD_RPN, r11
 
+#ifdef CONFIG_PIN_TLB
+	/* Map one more 8M kernel data page. */
 	addi	r10, r10, 0x0100
 	mtspr	SPRN_MD_CTR, r10
+#else
+	/* Map one more 8M kernel data page if needed */
+	lis	r10, 0x0100@h
+	cmplw	cr0, r30, r10
+	ble	1f
+#endif
 	addis	r8, r8, 0x0080	/* Add 8M */
 	mtspr	SPRN_MD_EPN, r8
 	mtspr	SPRN_MD_TWC, r9
 	addis	r11, r11, 0x0080	/* Add 8M */
 	mtspr	SPRN_MD_RPN, r11
-#endif
 
 	/* Since the cache is enabled according to the information we
 	 * just loaded into the TLB, invalidate and enable the caches here.
 	 * We should probably check/set other modes later.
 	 */
+1:
 	lis	r8, IDC_INVALL@h
 	mtspr	SPRN_IC_CST, r8
 	mtspr	SPRN_DC_CST, r8
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index f37d5ec..50f17d2 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -20,6 +20,7 @@
 #define IMMR_SIZE
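The RAM-size selection implemented in the assembly above can be rendered as a small C function. This is a sketch: the function name and passing the boot registers as plain arguments are illustrative only, and like the assembly it combines the kernel end and DTB addresses with a cheap OR rather than a true maximum.

```c
#include <assert.h>

#define EPAPR_SMAGIC 0x65504150ul

/*
 * Sketch of the boot-time decision: r6/r7 are the ePAPR magic and
 * bootloader-mapped size, kernel_end is (_end - PAGE_OFFSET) and dtb is
 * the device tree pointer (r3).  Returns the initial mapping size.
 */
static unsigned long initial_ram_size(unsigned long r6, unsigned long r7,
				      unsigned long kernel_end,
				      unsigned long dtb)
{
	unsigned long top;

	if (r6 == EPAPR_SMAGIC)	/* ePAPR bootloader: size handed over in r7 */
		return r7 > 0x01800000ul ? 0x01800000ul : r7; /* cap at 24M */

	/* cheap upper bound on the highest address we must cover,
	 * combining _end and the DTB address with OR as the asm does */
	top = (kernel_end - 1) | dtb;
	if (top < 0x00800000ul)
		return 0x00800000ul;	/* 8M is enough */
	if (top < 0x01000000ul)
		return 0x01000000ul;	/* 16M */
	return 0x01800000ul;		/* 24M */
}
```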
[PATCH v7 17/23] powerpc/8xx: rewrite flush_instruction_cache() in C
On PPC8xx, flushing the instruction cache is performed by writing in the SPRN_IC_CST register. This register suffers from the CPU6 errata. This patch rewrites the function in C so that the CPU6 errata will be handled transparently.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/kernel/misc_32.S | 10 --
 arch/powerpc/mm/8xx_mmu.c     |  7 +++
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index be8edd6..7d1284f 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -296,12 +296,9 @@ _GLOBAL(real_writeb)
  * Flush instruction cache.
  * This is a no-op on the 601.
  */
+#ifndef CONFIG_PPC_8xx
 _GLOBAL(flush_instruction_cache)
-#if defined(CONFIG_8xx)
-	isync
-	lis	r5, IDC_INVALL@h
-	mtspr	SPRN_IC_CST, r5
-#elif defined(CONFIG_4xx)
+#if defined(CONFIG_4xx)
 #ifdef CONFIG_403GCX
 	li	r3, 512
 	mtctr	r3
@@ -334,9 +331,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_UNIFIED_ID_CACHE)
 	mfspr	r3,SPRN_HID0
 	ori	r3,r3,HID0_ICFI
 	mtspr	SPRN_HID0,r3
-#endif /* CONFIG_8xx/4xx */
+#endif /* CONFIG_4xx */
 	isync
 	blr
+#endif /* CONFIG_PPC_8xx */
 
 /*
  * Write any modified data cache blocks out to memory
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index b75c461..e2ce480 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -181,3 +181,10 @@ void set_context(unsigned long id, pgd_t *pgd)
 	/* sync */
 	mb();
 }
+
+void flush_instruction_cache(void)
+{
+	isync();
+	mtspr(SPRN_IC_CST, IDC_INVALL);
+	isync();
+}
--
2.1.0
[PATCH][v3] mpc85xx/lbc: modify suspend/resume entry sequence
Modify the platform driver suspend/resume to syscore suspend/resume. This is because p1022ds needs to use the localbus when entering PCIE resume.

Signed-off-by: Raghav Dogra
---
Changes for v3: rebased to linux.git main branch

 arch/powerpc/sysdev/Makefile  |  2 +-
 arch/powerpc/sysdev/fsl_lbc.c | 49 +--
 2 files changed, 39 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/sysdev/Makefile b/arch/powerpc/sysdev/Makefile
index bd6bd72..ee972aa 100644
--- a/arch/powerpc/sysdev/Makefile
+++ b/arch/powerpc/sysdev/Makefile
@@ -18,9 +18,9 @@ obj-$(CONFIG_PPC_PMI)	+= pmi.o
 obj-$(CONFIG_U3_DART)	+= dart_iommu.o
 obj-$(CONFIG_MMIO_NVRAM)	+= mmio_nvram.o
 obj-$(CONFIG_FSL_SOC)	+= fsl_soc.o fsl_mpic_err.o
+obj-$(CONFIG_FSL_LBC)	+= fsl_lbc.o
 obj-$(CONFIG_FSL_PCI)	+= fsl_pci.o $(fsl-msi-obj-y)
 obj-$(CONFIG_FSL_PMC)	+= fsl_pmc.o
-obj-$(CONFIG_FSL_LBC)	+= fsl_lbc.o
 obj-$(CONFIG_FSL_GTM)	+= fsl_gtm.o
 obj-$(CONFIG_FSL_85XX_CACHE_SRAM)	+= fsl_85xx_l2ctlr.o fsl_85xx_cache_sram.o
 obj-$(CONFIG_SIMPLE_GPIO)	+= simple_gpio.o
diff --git a/arch/powerpc/sysdev/fsl_lbc.c b/arch/powerpc/sysdev/fsl_lbc.c
index 47f7810..424b67f 100644
--- a/arch/powerpc/sysdev/fsl_lbc.c
+++ b/arch/powerpc/sysdev/fsl_lbc.c
@@ -27,6 +27,7 @@
 #include
 #include
 #include
+#include <linux/syscore_ops.h>
 #include
 #include
@@ -352,24 +353,42 @@ err:
 
 #ifdef CONFIG_SUSPEND
 /* save lbc registers */
-static int fsl_lbc_suspend(struct platform_device *pdev, pm_message_t state)
+static int fsl_lbc_syscore_suspend(void)
 {
-	struct fsl_lbc_ctrl *ctrl = dev_get_drvdata(&pdev->dev);
-	struct fsl_lbc_regs __iomem *lbc = ctrl->regs;
+	struct fsl_lbc_ctrl *ctrl;
+	struct fsl_lbc_regs __iomem *lbc;
+
+	ctrl = fsl_lbc_ctrl_dev;
+	if (!ctrl)
+		goto out;
+
+	lbc = ctrl->regs;
+	if (!lbc)
+		goto out;
 
 	ctrl->saved_regs = kmalloc(sizeof(struct fsl_lbc_regs), GFP_KERNEL);
 	if (!ctrl->saved_regs)
 		return -ENOMEM;
 
 	_memcpy_fromio(ctrl->saved_regs, lbc, sizeof(struct fsl_lbc_regs));
+
+out:
 	return 0;
 }
 
 /* restore lbc registers */
-static int fsl_lbc_resume(struct platform_device *pdev)
+static void fsl_lbc_syscore_resume(void)
 {
-	struct fsl_lbc_ctrl *ctrl = dev_get_drvdata(&pdev->dev);
-	struct fsl_lbc_regs __iomem *lbc = ctrl->regs;
+	struct fsl_lbc_ctrl *ctrl;
+	struct fsl_lbc_regs __iomem *lbc;
+
+	ctrl = fsl_lbc_ctrl_dev;
+	if (!ctrl)
+		goto out;
+
+	lbc = ctrl->regs;
+	if (!lbc)
+		goto out;
 
 	if (ctrl->saved_regs) {
 		_memcpy_toio(lbc, ctrl->saved_regs,
@@ -377,7 +396,9 @@ static int fsl_lbc_resume(struct platform_device *pdev)
 		kfree(ctrl->saved_regs);
 		ctrl->saved_regs = NULL;
 	}
-	return 0;
+
+out:
+	return;
 }
 #endif /* CONFIG_SUSPEND */
 
@@ -389,20 +410,26 @@ static const struct of_device_id fsl_lbc_match[] = {
 	{},
 };
 
+#ifdef CONFIG_SUSPEND
+static struct syscore_ops lbc_syscore_pm_ops = {
+	.suspend = fsl_lbc_syscore_suspend,
+	.resume = fsl_lbc_syscore_resume,
+};
+#endif
+
 static struct platform_driver fsl_lbc_ctrl_driver = {
 	.driver = {
 		.name = "fsl-lbc",
 		.of_match_table = fsl_lbc_match,
 	},
 	.probe = fsl_lbc_ctrl_probe,
-#ifdef CONFIG_SUSPEND
-	.suspend = fsl_lbc_suspend,
-	.resume = fsl_lbc_resume,
-#endif
 };
 
 static int __init fsl_lbc_init(void)
 {
+#ifdef CONFIG_SUSPEND
+	register_syscore_ops(&lbc_syscore_pm_ops);
+#endif
 	return platform_driver_register(&fsl_lbc_ctrl_driver);
 }
 subsys_initcall(fsl_lbc_init);
--
1.9.1
[PATCH v7 08/23] powerpc/8xx: Map IMMR area with 512k page at a fixed address
Once the linear memory space has been mapped with 8M pages, as seen in the related commit, we get 11 million DTLB misses during the reference 600s period. 77% of the misses are on user addresses and 23% are on kernel addresses (one fourth for the linear address space and three fourths for the virtual address space).

Traditionally, each driver manages one computer board which has its own components with its own memory maps. But on embedded chips like the MPC8xx, the SOC has all registers located in the same IO area.

When looking at ioremaps done during startup, we see that many drivers are re-mapping small parts of the IMMR for their own use, and all those small pieces get their own 4k page, amplifying the number of TLB misses: in our system we get 0xff00 mapped 31 times and 0xff003000 mapped 9 times. Even if each part of the IMMR was mapped only once with 4k pages, it would still be several small mappings towards the linear area.

With this patch, on the same principle as what was done for the RAM, the IMMR gets mapped by a 512k page.

In 4k pages mode, we reserve a 4M area for mapping the IMMR. The TLB miss handler checks that we are within the first 512k and bails out with the page not marked valid if we are outside.

In 16k pages mode, it is not realistic to reserve a 64M area, so we do a standard mapping of the 512k area using 32 pages of 16k. The CPM will be mapped via the first two pages, and the SEC engine will be mapped via the 16th and 17th pages. As the pages are marked guarded, there will be no speculative accesses.

With this patch applied, the number of DTLB misses during the 10 min period is reduced to 11.8 million for a duration of 5.8s, which represents 2% of the non-idle time, hence yet another 10% reduction.

Signed-off-by: Christophe Leroy
---
v2:
- using bt instead of blt/bgt
- reorganised in order to have only one taken branch for both 512k and 8M instead of a first branch for both 8M and 512k then a second branch for 512k
v3:
- using fixmap
- using the new x_block_mapped() functions
v4: no change
v5: no change
v6: removed use of pmd_val() as L-value
v7: no change

 arch/powerpc/include/asm/fixmap.h |  9 ++-
 arch/powerpc/kernel/head_8xx.S    | 36 +-
 arch/powerpc/mm/8xx_mmu.c         | 53 +++
 arch/powerpc/mm/mmu_decl.h        |  3 ++-
 4 files changed, 98 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/fixmap.h b/arch/powerpc/include/asm/fixmap.h
index d7dd8fb..b954dc3 100644
--- a/arch/powerpc/include/asm/fixmap.h
+++ b/arch/powerpc/include/asm/fixmap.h
@@ -52,12 +52,19 @@ enum fixed_addresses {
 	FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1,
 #endif
 #ifdef CONFIG_PPC_8xx
-	/* For IMMR we need an aligned 512K area */
 	FIX_IMMR_START,
+#ifdef CONFIG_PPC_4K_PAGES
+	/* For IMMR we need an aligned 4M area (full PGD entry) */
+	FIX_IMMR_TOP = (FIX_IMMR_START - 1 + ((4 * 1024 * 1024) / PAGE_SIZE)) &
+		       ~(((4 * 1024 * 1024) / PAGE_SIZE) - 1),
+	FIX_IMMR_BASE = FIX_IMMR_TOP - 1 + ((4 * 1024 * 1024) / PAGE_SIZE),
+#else
+	/* For IMMR we need an aligned 512K area */
 	FIX_IMMR_TOP = (FIX_IMMR_START - 1 + ((512 * 1024) / PAGE_SIZE)) &
		       ~(((512 * 1024) / PAGE_SIZE) - 1),
 	FIX_IMMR_BASE = FIX_IMMR_TOP - 1 + ((512 * 1024) / PAGE_SIZE),
 #endif
+#endif
 	/* FIX_PCIE_MCFG, */
 	__end_of_fixed_addresses
 };
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 09173ae..ae721a1 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -254,6 +254,37 @@ DataAccess:
 	. = 0x400
 InstructionAccess:
 
+/*
+ * Bottom part of DTLBMiss handler for 512k pages
+ * not enough space in the primary location
+ */
+#ifdef CONFIG_PPC_4K_PAGES
+/*
+ * 512k pages are only used for mapping IMMR area in 4K pages mode.
+ * Only map the first 512k page of the 4M area covered by the PGD entry.
+ * This should not happen, but if we are called for another page of that
+ * area, don't mark it valid
+ *
+ * In 16k pages mode, IMMR is directly mapped with 16k pages
+ */
+DTLBMiss512k:
+	rlwinm.	r10, r10, 0, 0x0038
+	bne-	1f
+	ori	r11, r11, MD_SVALID
+1:	mtcr	r3
+	MTSPR_CPU6(SPRN_MD_TWC, r11, r3)
+	rlwinm	r10, r11, 0, 0xffc0
+	ori	r10, r10, 0xf0 | MD_SPS16K | _PAGE_SHARED | _PAGE_DIRTY | \
+			  _PAGE_PRESENT | _PAGE_NO_CACHE
+	MTSPR_CPU6(SPRN_MD_RPN, r10, r3)	/* Update TLB entry */
+
+	li	r11, RPN_PATTERN
+	mfspr	r3, SPRN_SPRG_SCRATCH2
+	mtspr	SPRN_DAR, r11	/* Tag DAR */
+	EXCEPTION_EPILOG_0
+	rfi
+#endif
+
 /* External interrupt */
 	EXCEPTION(0x500, HardwareInterrupt, do_IRQ, EXC_XFER_LITE)
 
@@ -405,6 +436,9 @@ DataStoreTLBMiss:
 	lwz	r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11)	/* Get the level 1 entry
[PATCH v7 21/23] powerpc: Simplify test in __dma_sync()
This simplification helps the compiler. We now have only one test instead of two, so it reduces the number of branches.

Signed-off-by: Christophe Leroy
---
v2: new
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/mm/dma-noncoherent.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/dma-noncoherent.c b/arch/powerpc/mm/dma-noncoherent.c
index 169aba4..2dc74e5 100644
--- a/arch/powerpc/mm/dma-noncoherent.c
+++ b/arch/powerpc/mm/dma-noncoherent.c
@@ -327,7 +327,7 @@ void __dma_sync(void *vaddr, size_t size, int direction)
 		 * invalidate only when cache-line aligned otherwise there is
 		 * the potential for discarding uncommitted data from the cache
 		 */
-		if ((start & (L1_CACHE_BYTES - 1)) || (size & (L1_CACHE_BYTES - 1)))
+		if ((start | end) & (L1_CACHE_BYTES - 1))
 			flush_dcache_range(start, end);
 		else
 			invalidate_dcache_range(start, end);
--
2.1.0
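A quick host-side check of why the folded test is safe: since end = start + size, `(start | end)` has low bits set exactly when either the start or the size is misaligned, so the two forms always agree (the L1_CACHE_BYTES value here is illustrative):

```c
#include <assert.h>

#define L1_CACHE_BYTES 32u	/* illustrative line size */

/* Old form: two separate alignment tests */
static int old_test(unsigned long start, unsigned long size)
{
	return (start & (L1_CACHE_BYTES - 1)) || (size & (L1_CACHE_BYTES - 1));
}

/* New form: one test, folding start and end together */
static int new_test(unsigned long start, unsigned long size)
{
	unsigned long end = start + size;

	return ((start | end) & (L1_CACHE_BYTES - 1)) != 0;
}
```

Exhaustively comparing both forms over a couple of cache lines' worth of offsets confirms the equivalence.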
Re: [PATCH v3 0/2] Consolidate redundant register/stack access code
On Tue, 2016-02-09 at 00:38 -0500, David Long wrote:
> From: "David A. Long"
>
> Move duplicate and functionally equivalent code for accessing registers
> and stack (CONFIG_HAVE_REGS_AND_STACK_ACCESS_API) from arch subdirs into
> common kernel files.
>
> I'm sending this out again (with updated distribution list) because v2
> just never got pulled in, even though I don't think there were any
> outstanding issues.

A big cross arch patch like this would often get taken by Andrew Morton, but AFAICS you didn't CC him - so I just added him, perhaps he'll pick it up for us :D

cheers
[PATCH v7 05/23] powerpc32: Fix pte_offset_kernel() to return NULL for bad pages
The fixmap related functions try to map kernel pages that are already mapped through large TLBs (LTLBs). pte_offset_kernel() has to return NULL for LTLBs, otherwise the caller will try to access a level 2 table which doesn't exist.

Signed-off-by: Christophe Leroy
---
v3: new
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/include/asm/nohash/32/pgtable.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index c82cbf5..e201600 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -309,7 +309,8 @@ static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
 #define pte_index(address)	\
 	(((address) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
 #define pte_offset_kernel(dir, addr)	\
-	((pte_t *) pmd_page_vaddr(*(dir)) + pte_index(addr))
+	(pmd_bad(*(dir)) ? NULL : (pte_t *)pmd_page_vaddr(*(dir)) + \
+				  pte_index(addr))
 #define pte_offset_map(dir, addr)	\
 	((pte_t *) kmap_atomic(pmd_page(*(dir))) + pte_index(addr))
 #define pte_unmap(pte)	kunmap_atomic(pte)
--
2.1.0
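A minimal host-side model of the guard being added. The types, the flag bit and the helper bodies below are hypothetical stand-ins; only the NULL-on-bad-pmd behaviour mirrors the patched macro:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTRS_PER_PTE 1024u
#define PAGE_SHIFT   12u
#define PMD_LARGE    0x1u	/* hypothetical "large page / bad" flag bit */

typedef uintptr_t pmd_t;

static int pmd_bad(pmd_t pmd)
{
	return pmd & PMD_LARGE;	/* large-page entry: no level 2 table */
}

static uint32_t *pmd_page_vaddr(pmd_t pmd)
{
	return (uint32_t *)(pmd & ~(uintptr_t)PMD_LARGE);
}

/* Mirrors the patched macro: bail out with NULL instead of computing a
 * bogus pointer into a level 2 table that doesn't exist. */
static uint32_t *pte_offset_kernel(pmd_t *dir, uintptr_t addr)
{
	return pmd_bad(*dir) ? NULL :
		pmd_page_vaddr(*dir) + ((addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1));
}
```

Callers like the fixmap code can then simply test the returned pointer for NULL before dereferencing it.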
[PATCH v7 13/23] powerpc/8xx: Add missing SPRN defines into reg_8xx.h
Add missing SPRN defines into reg_8xx.h.

Some of them are defined in mmu-8xx.h, so we include mmu-8xx.h in reg_8xx.h. For that, we remove references to PAGE_SHIFT in mmu-8xx.h to have it self sufficient, as includers of reg_8xx.h don't all include asm/page.h.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: We just add missing ones, don't move anymore the ones from mmu-8xx.h
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/include/asm/mmu-8xx.h |  4 ++--
 arch/powerpc/include/asm/reg_8xx.h | 11 +++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-8xx.h b/arch/powerpc/include/asm/mmu-8xx.h
index f05500a..0a566f1 100644
--- a/arch/powerpc/include/asm/mmu-8xx.h
+++ b/arch/powerpc/include/asm/mmu-8xx.h
@@ -171,9 +171,9 @@ typedef struct {
 } mm_context_t;
 #endif /* !__ASSEMBLY__ */
 
-#if (PAGE_SHIFT == 12)
+#if defined(CONFIG_PPC_4K_PAGES)
 #define mmu_virtual_psize	MMU_PAGE_4K
-#elif (PAGE_SHIFT == 14)
+#elif defined(CONFIG_PPC_16K_PAGES)
 #define mmu_virtual_psize	MMU_PAGE_16K
 #else
 #error "Unsupported PAGE_SIZE"
diff --git a/arch/powerpc/include/asm/reg_8xx.h b/arch/powerpc/include/asm/reg_8xx.h
index e8ea346..0f71c81 100644
--- a/arch/powerpc/include/asm/reg_8xx.h
+++ b/arch/powerpc/include/asm/reg_8xx.h
@@ -4,6 +4,8 @@
 #ifndef _ASM_POWERPC_REG_8xx_H
 #define _ASM_POWERPC_REG_8xx_H
 
+#include <asm/mmu-8xx.h>
+
 /* Cache control on the MPC8xx is provided through some additional
  * special purpose registers.
  */
@@ -14,6 +16,15 @@
 #define SPRN_DC_ADR	569	/* Address needed for some commands */
 #define SPRN_DC_DAT	570	/* Read-only data register */
 
+/* Misc Debug */
+#define SPRN_DPDR	630
+#define SPRN_MI_CAM	816
+#define SPRN_MI_RAM0	817
+#define SPRN_MI_RAM1	818
+#define SPRN_MD_CAM	824
+#define SPRN_MD_RAM0	825
+#define SPRN_MD_RAM1	826
+
 /* Commands. Only the first few are available to the instruction cache.
  */
 #define	IDC_ENABLE	0x0200	/* Cache enable */
--
2.1.0
[PATCH v7 19/23] powerpc32: Remove clear_pages() and define clear_page() inline
clear_pages() is never used except by clear_page(), and PPC32 is the only architecture (still) having this function. Neither PPC64 nor any other architecture has it.

This patch removes clear_pages() and moves the clear_page() function inline (same as PPC64), as it is only a few instructions.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/include/asm/page_32.h | 17 ++---
 arch/powerpc/kernel/misc_32.S      | 16 
 arch/powerpc/kernel/ppc_ksyms_32.c |  1 -
 3 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/page_32.h b/arch/powerpc/include/asm/page_32.h
index 68d73b2..6a8e179 100644
--- a/arch/powerpc/include/asm/page_32.h
+++ b/arch/powerpc/include/asm/page_32.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_POWERPC_PAGE_32_H
 #define _ASM_POWERPC_PAGE_32_H
 
+#include
+
 #if defined(CONFIG_PHYSICAL_ALIGN) && (CONFIG_PHYSICAL_START != 0)
 #if (CONFIG_PHYSICAL_START % CONFIG_PHYSICAL_ALIGN) != 0
 #error "CONFIG_PHYSICAL_START must be a multiple of CONFIG_PHYSICAL_ALIGN"
@@ -36,9 +38,18 @@ typedef unsigned long long pte_basic_t;
 typedef unsigned long pte_basic_t;
 #endif
 
-struct page;
-extern void clear_pages(void *page, int order);
-static inline void clear_page(void *page) { clear_pages(page, 0); }
+/*
+ * Clear page using the dcbz instruction, which doesn't cause any
+ * memory traffic (except to write out any cache lines which get
+ * displaced). This only works on cacheable memory.
+ */
+static inline void clear_page(void *addr)
+{
+	unsigned int i;
+
+	for (i = 0; i < PAGE_SIZE / L1_CACHE_BYTES; i++, addr += L1_CACHE_BYTES)
+		dcbz(addr);
+}
 extern void copy_page(void *to, void *from);
 
 #include
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 7d1284f..181afc1 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -517,22 +517,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
 #endif /* CONFIG_BOOKE */
 
 /*
- * Clear pages using the dcbz instruction, which doesn't cause any
- * memory traffic (except to write out any cache lines which get
- * displaced). This only works on cacheable memory.
- *
- * void clear_pages(void *page, int order) ;
- */
-_GLOBAL(clear_pages)
-	li	r0,PAGE_SIZE/L1_CACHE_BYTES
-	slw	r0,r0,r4
-	mtctr	r0
-1:	dcbz	0,r3
-	addi	r3,r3,L1_CACHE_BYTES
-	bdnz	1b
-	blr
-
-/*
  * Copy a whole page. We use the dcbz instruction on the destination
  * to reduce memory traffic (it eliminates the unnecessary reads of
  * the destination into cache). This requires that the destination
diff --git a/arch/powerpc/kernel/ppc_ksyms_32.c b/arch/powerpc/kernel/ppc_ksyms_32.c
index 30ddd8a..2bfaafe 100644
--- a/arch/powerpc/kernel/ppc_ksyms_32.c
+++ b/arch/powerpc/kernel/ppc_ksyms_32.c
@@ -10,7 +10,6 @@
 #include
 #include
 
-EXPORT_SYMBOL(clear_pages);
 EXPORT_SYMBOL(ISA_DMA_THRESHOLD);
 EXPORT_SYMBOL(DMA_MODE_READ);
 EXPORT_SYMBOL(DMA_MODE_WRITE);
--
2.1.0
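The loop structure of the new inline clear_page() can be modelled on the host by standing in a memset for dcbz (the sizes are illustrative; the real dcbz zeroes one cache line in the data cache without reading memory first):

```c
#include <assert.h>
#include <string.h>

#define PAGE_SIZE      4096u
#define L1_CACHE_BYTES 32u	/* illustrative line size */

/* Host-side stand-in for dcbz: zero one cache line */
static void dcbz_model(void *addr)
{
	memset(addr, 0, L1_CACHE_BYTES);
}

/* Same loop structure as the new inline clear_page() */
static void clear_page_model(void *addr)
{
	unsigned int i;
	char *p = addr;

	for (i = 0; i < PAGE_SIZE / L1_CACHE_BYTES; i++, p += L1_CACHE_BYTES)
		dcbz_model(p);
}
```

The loop runs PAGE_SIZE / L1_CACHE_BYTES times, touching each cache line of the page exactly once.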
Re: [PATCH v3 0/2] Consolidate redundant register/stack access code
* Michael Ellerman wrote:

> On Tue, 2016-02-09 at 00:38 -0500, David Long wrote:
> > From: "David A. Long"
> >
> > Move duplicate and functionally equivalent code for accessing registers
> > and stack (CONFIG_HAVE_REGS_AND_STACK_ACCESS_API) from arch subdirs into
> > common kernel files.
> >
> > I'm sending this out again (with updated distribution list) because v2
> > just never got pulled in, even though I don't think there were any
> > outstanding issues.
>
> A big cross arch patch like this would often get taken by Andrew Morton, but
> AFAICS you didn't CC him - so I just added him, perhaps he'll pick it up for
> us :D

The other problem is that the second patch is commingling changes to 6 separate architectures:

 16 files changed, 106 insertions(+), 343 deletions(-)

that should probably be 6 separate patches. Easier to review, easier to bisect to, easier to revert, etc.

Thanks,

	Ingo
[PATCH v7 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvements
The main purpose of this patchset is to dramatically reduce the time spent in the DTLB miss handler. This is achieved by:
1/ Mapping RAM with 8M pages
2/ Mapping IMMR with a fixed 512K page

On a live running system (a VoIP gateway for Air Traffic Control), over a 10 minute period (with 277s idle), we get 87 million DTLB misses and approximately 35 seconds are spent in the DTLB handler. This represents 5.8% of the overall time and even 10.8% of the non-idle time. Among those 87 million DTLB misses, 15% are on user addresses and 85% are on kernel addresses. And within the kernel addresses, 93% are on addresses from the linear address space and only 7% are on addresses from the virtual address space.

Once the full patchset is applied, the number of DTLB misses during the period is reduced to 11.8 million for a duration of 5.8s, which represents 2% of the non-idle time.

This patchset also includes other miscellaneous improvements:
1/ Handling of the CPU6 errata directly in the mtspr() C macro to reduce code specific to PPC8xx
2/ Rewrite of a few non critical ASM functions in C
3/ Removal of some unused items

See related patches for details.

Main changes in v3:
* Using fixmap instead of fix address for mapping IMMR

Change in v4:
* Fix of a wrong #if notified by kbuild robot in 07/23

Change in v5:
* Removed use of pmd_val() as L-value
* Adapted to match the new include files layout in Linux 4.5

Change in v6:
* Removed remaining use of pmd_val() as L-value (reported by kbuild test robot)

Change in v7:
* Don't include x_block_mapped() from compilation in arch/powerpc/mm/fsl_booke_mmu.c when CONFIG_FSL_BOOKE is not set (reported by kbuild test robot)

Christophe Leroy (23):
  powerpc/8xx: Save r3 all the time in DTLB miss handler
  powerpc/8xx: Map linear kernel RAM with 8M pages
  powerpc: Update documentation for noltlbs kernel parameter
  powerpc/8xx: move setup_initial_memory_limit() into 8xx_mmu.c
  powerpc32: Fix pte_offset_kernel() to return NULL for bad pages
  powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam() together
  powerpc/8xx: Fix vaddr for IMMR early remap
  powerpc/8xx: Map IMMR area with 512k page at a fixed address
  powerpc/8xx: CONFIG_PIN_TLB unneeded for CONFIG_PPC_EARLY_DEBUG_CPM
  powerpc/8xx: map more RAM at startup when needed
  powerpc32: Remove useless/wrong MMU:setio progress message
  powerpc32: remove ioremap_base
  powerpc/8xx: Add missing SPRN defines into reg_8xx.h
  powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macro
  powerpc/8xx: remove special handling of CPU6 errata in set_dec()
  powerpc/8xx: rewrite set_context() in C
  powerpc/8xx: rewrite flush_instruction_cache() in C
  powerpc: add inline functions for cache related instructions
  powerpc32: Remove clear_pages() and define clear_page() inline
  powerpc32: move x_dcache_range() functions inline
  powerpc: Simplify test in __dma_sync()
  powerpc32: small optimisation in flush_icache_range()
  powerpc32: Remove one insn in mulhdu

 Documentation/kernel-parameters.txt          |   2 +-
 arch/powerpc/Kconfig.debug                   |   1 -
 arch/powerpc/include/asm/cache.h             |  19 +++
 arch/powerpc/include/asm/cacheflush.h        |  52 ++-
 arch/powerpc/include/asm/fixmap.h            |  14 ++
 arch/powerpc/include/asm/mmu-8xx.h           |   4 +-
 arch/powerpc/include/asm/nohash/32/pgtable.h |   5 +-
 arch/powerpc/include/asm/page_32.h           |  17 ++-
 arch/powerpc/include/asm/reg.h               |   2 +
 arch/powerpc/include/asm/reg_8xx.h           |  93 
 arch/powerpc/include/asm/time.h              |   6 +-
 arch/powerpc/kernel/asm-offsets.c            |   8 ++
 arch/powerpc/kernel/head_8xx.S               | 207 +--
 arch/powerpc/kernel/misc_32.S                | 107 ++
 arch/powerpc/kernel/ppc_ksyms.c              |   2 +
 arch/powerpc/kernel/ppc_ksyms_32.c           |   1 -
 arch/powerpc/mm/8xx_mmu.c                    | 190 
 arch/powerpc/mm/Makefile                     |   1 +
 arch/powerpc/mm/dma-noncoherent.c            |   2 +-
 arch/powerpc/mm/fsl_booke_mmu.c              |   4 +-
 arch/powerpc/mm/init_32.c                    |  23 ---
 arch/powerpc/mm/mmu_decl.h                   |  34 +++--
 arch/powerpc/mm/pgtable_32.c                 |  47 +-
 arch/powerpc/mm/ppc_mmu_32.c                 |   4 +-
 arch/powerpc/platforms/embedded6xx/mpc10x.h  |  10 --
 arch/powerpc/sysdev/cpm_common.c             |  15 +-
 26 files changed, 583 insertions(+), 287 deletions(-)
 create mode 100644 arch/powerpc/mm/8xx_mmu.c

--
2.1.0
[PATCH v7 07/23] powerpc/8xx: Fix vaddr for IMMR early remap
Memory: 124428K/131072K available (3748K kernel code, 188K rwdata, 648K rodata, 508K init, 290K bss, 6644K reserved)
Kernel virtual memory layout:
  * 0xfffdf000..0xf000 : fixmap
  * 0xfde0..0xfe00 : consistent mem
  * 0xfddf6000..0xfde0 : early ioremap
  * 0xc900..0xfddf6000 : vmalloc & ioremap
SLUB: HWalign=16, Order=0-3, MinObjects=0, CPUs=1, Nodes=1

Today, IMMR is mapped 1:1 at startup.

Mapping IMMR 1:1 is just wrong because it may overlap with another
area. On most mpc8xx boards it is OK because IMMR is set to 0xff00,
but for instance on the EP88xC board, IMMR is at 0xfa20, which
overlaps with the VM ioremap area.

This patch fixes the virtual address for remapping IMMR by using the
fixmap, regardless of the value of IMMR.

The size of the IMMR area is 256 kbytes (CPM at offset 0, security
engine at offset 128k), so a 512k page is enough.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: Using fixmap instead of fixed address
v4: Fix a wrong #if notified by kbuild robot
v5: no change
v6: no change
v7: no change

 arch/powerpc/include/asm/fixmap.h |  7 +++
 arch/powerpc/kernel/asm-offsets.c |  8
 arch/powerpc/kernel/head_8xx.S    | 11 ++-
 arch/powerpc/mm/mmu_decl.h        |  7 +++
 arch/powerpc/sysdev/cpm_common.c  | 15 ---
 5 files changed, 40 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/fixmap.h b/arch/powerpc/include/asm/fixmap.h
index 90f604b..d7dd8fb 100644
--- a/arch/powerpc/include/asm/fixmap.h
+++ b/arch/powerpc/include/asm/fixmap.h
@@ -51,6 +51,13 @@ enum fixed_addresses {
 	FIX_KMAP_BEGIN,	/* reserved pte's for temporary kernel mappings */
 	FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1,
 #endif
+#ifdef CONFIG_PPC_8xx
+	/* For IMMR we need an aligned 512K area */
+	FIX_IMMR_START,
+	FIX_IMMR_TOP = (FIX_IMMR_START - 1 + ((512 * 1024) / PAGE_SIZE)) &
+		       ~(((512 * 1024) / PAGE_SIZE) - 1),
+	FIX_IMMR_BASE = FIX_IMMR_TOP - 1 + ((512 * 1024) / PAGE_SIZE),
+#endif
 	/* FIX_PCIE_MCFG, */
 	__end_of_fixed_addresses
 };
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 07cebc3..9724ff8 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -68,6 +68,10 @@
 #include "../mm/mmu_decl.h"
 #endif

+#ifdef CONFIG_PPC_8xx
+#include
+#endif
+
 int main(void)
 {
 	DEFINE(THREAD, offsetof(struct task_struct, thread));
@@ -772,5 +776,9 @@ int main(void)

 	DEFINE(PPC_DBELL_SERVER, PPC_DBELL_SERVER);

+#ifdef CONFIG_PPC_8xx
+	DEFINE(VIRT_IMMR_BASE, __fix_to_virt(FIX_IMMR_BASE));
+#endif
+
 	return 0;
 }
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 87d1f5f..09173ae 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include

 /* Macro to make the code more readable. */
 #ifdef CONFIG_8xx_CPU6
@@ -763,7 +764,7 @@ start_here:
  * virtual to physical. Also, set the cache mode since that is defined
  * by TLB entries and perform any additional mapping (like of the IMMR).
  * If configured to pin some TLBs, we pin the first 8 Mbytes of kernel,
- * 24 Mbytes of data, and the 8M IMMR space. Anything not covered by
+ * 24 Mbytes of data, and the 512k IMMR space. Anything not covered by
  * these mappings is mapped by page tables.
  */
 initial_mmu:
@@ -812,7 +813,7 @@ initial_mmu:
 	ori	r8, r8, MD_APG_INIT@l
 	mtspr	SPRN_MD_AP, r8

-	/* Map another 8 MByte at the IMMR to get the processor
+	/* Map a 512k page for the IMMR to get the processor
 	 * internal registers (among other things).
 	 */
 #ifdef CONFIG_PIN_TLB
@@ -820,12 +821,12 @@ initial_mmu:
 	mtspr	SPRN_MD_CTR, r10
 #endif
 	mfspr	r9, 638			/* Get current IMMR */
-	andis.	r9, r9, 0xff80		/* Get 8Mbyte boundary */
+	andis.	r9, r9, 0xfff8		/* Get 512 kbytes boundary */

-	mr	r8, r9			/* Create vaddr for TLB */
+	lis	r8, VIRT_IMMR_BASE@h	/* Create vaddr for TLB */
 	ori	r8, r8, MD_EVALID	/* Mark it valid */
 	mtspr	SPRN_MD_EPN, r8
-	li	r8, MD_PS8MEG		/* Set 8M byte page */
+	li	r8, MD_PS512K | MD_GUARDED	/* Set 512k byte page */
 	ori	r8, r8, MD_SVALID	/* Make it valid */
 	mtspr	SPRN_MD_TWC, r8
 	mr	r8, r9			/* Create paddr for TLB */
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 40dd5d3..e7228b7 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -107,6 +107,13 @@ struct hash_pte;
 extern struct hash_pte *Hash, *Hash_end;
 extern unsigned long Hash_size, Hash_mask;

+#define PHYS_IMMR_BASE (mfspr(SPRN_IMMR) & 0xfff8)
+#ifdef CONFIG_PPC_8xx
+#define VIRT_IMMR_BASE
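The FIX_IMMR_TOP expression in the fixmap.h hunk above is an align-up of a
fixmap index to a 512K boundary (128 slots with 4K pages), so that a
512K-sized, 512K-aligned window of fixmap slots is available for the IMMR.
The arithmetic can be checked in isolation with a small host-side sketch
(the PAGE_SIZE value and the immr_top() name are assumptions for
illustration, not kernel code):

```c
#include <assert.h>

#define PAGE_SIZE  4096UL
#define IMMR_PAGES ((512 * 1024) / PAGE_SIZE)	/* 128 fixmap slots */

/* Round a fixmap index up to the next multiple of IMMR_PAGES — the same
 * expression the patch uses for FIX_IMMR_TOP. */
unsigned long immr_top(unsigned long start)
{
	return (start - 1 + IMMR_PAGES) & ~(IMMR_PAGES - 1);
}
```

An index already on a boundary stays put; anything past it rounds up to the
next 128-slot boundary.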
[PATCH v7 06/23] powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam() together
x_mapped_by_bats() and x_mapped_by_tlbcam() serve the same kind of
purpose, and are never defined at the same time. So rename them
x_block_mapped() and define them in the relevant places.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: Functions are mutually exclusive so renamed iaw Scott comment
    instead of grouping into a single function
v4: no change
v5: no change
v6: no change
v7: Don't include x_block_mapped() from compilation in
    arch/powerpc/mm/fsl_booke_mmu.c when CONFIG_FSL_BOOKE is not set
    (problem reported by kbuild robot with a configuration having
    CONFIG_FSL_BOOK3E and not CONFIG_FSL_BOOKE)

 arch/powerpc/mm/fsl_booke_mmu.c |  4 ++--
 arch/powerpc/mm/mmu_decl.h      | 10 ++
 arch/powerpc/mm/pgtable_32.c    | 44 ++---
 arch/powerpc/mm/ppc_mmu_32.c    |  4 ++--
 4 files changed, 20 insertions(+), 42 deletions(-)

diff --git a/arch/powerpc/mm/fsl_booke_mmu.c b/arch/powerpc/mm/fsl_booke_mmu.c
index f3afe3d..5d45341 100644
--- a/arch/powerpc/mm/fsl_booke_mmu.c
+++ b/arch/powerpc/mm/fsl_booke_mmu.c
@@ -75,7 +75,7 @@ unsigned long tlbcam_sz(int idx)
 /*
  * Return PA for this VA if it is mapped by a CAM, or 0
  */
-phys_addr_t v_mapped_by_tlbcam(unsigned long va)
+phys_addr_t v_block_mapped(unsigned long va)
 {
 	int b;
 	for (b = 0; b < tlbcam_index; ++b)
@@ -87,7 +87,7 @@ phys_addr_t v_mapped_by_tlbcam(unsigned long va)
 /*
  * Return VA for a given PA or 0 if not mapped
  */
-unsigned long p_mapped_by_tlbcam(phys_addr_t pa)
+unsigned long p_block_mapped(phys_addr_t pa)
 {
 	int b;
 	for (b = 0; b < tlbcam_index; ++b)
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 7faeb9f..40dd5d3 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -158,3 +158,13 @@ struct tlbcam {
 	u32	MAS7;
 };
 #endif
+
+#if defined(CONFIG_6xx) || defined(CONFIG_FSL_BOOKE)
+/* 6xx have BATS */
+/* FSL_BOOKE have TLBCAM */
+phys_addr_t v_block_mapped(unsigned long va);
+unsigned long p_block_mapped(phys_addr_t pa);
+#else
+static inline phys_addr_t v_block_mapped(unsigned long va) { return 0; }
+static inline unsigned long p_block_mapped(phys_addr_t pa) { return 0; }
+#endif
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 7692d1b..db0d35e 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -41,32 +41,8 @@ unsigned long ioremap_base;
 unsigned long ioremap_bot;
 EXPORT_SYMBOL(ioremap_bot);	/* aka VMALLOC_END */

-#ifdef CONFIG_6xx
-#define HAVE_BATS	1
-#endif
-
-#if defined(CONFIG_FSL_BOOKE)
-#define HAVE_TLBCAM	1
-#endif
-
 extern char etext[], _stext[];

-#ifdef HAVE_BATS
-extern phys_addr_t v_mapped_by_bats(unsigned long va);
-extern unsigned long p_mapped_by_bats(phys_addr_t pa);
-#else /* !HAVE_BATS */
-#define v_mapped_by_bats(x)	(0UL)
-#define p_mapped_by_bats(x)	(0UL)
-#endif /* HAVE_BATS */
-
-#ifdef HAVE_TLBCAM
-extern phys_addr_t v_mapped_by_tlbcam(unsigned long va);
-extern unsigned long p_mapped_by_tlbcam(phys_addr_t pa);
-#else /* !HAVE_TLBCAM */
-#define v_mapped_by_tlbcam(x)	(0UL)
-#define p_mapped_by_tlbcam(x)	(0UL)
-#endif /* HAVE_TLBCAM */
-
 #define PGDIR_ORDER	(32 + PGD_T_LOG2 - PGDIR_SHIFT)

 #ifndef CONFIG_PPC_4K_PAGES
@@ -228,19 +204,10 @@ __ioremap_caller(phys_addr_t addr, unsigned long size, unsigned long flags,

 	/*
 	 * Is it already mapped?  Perhaps overlapped by a previous
-	 * BAT mapping.  If the whole area is mapped then we're done,
-	 * otherwise remap it since we want to keep the virt addrs for
-	 * each request contiguous.
-	 *
-	 * We make the assumption here that if the bottom and top
-	 * of the range we want are mapped then it's mapped to the
-	 * same virt address (and this is contiguous).
-	 *  -- Cort
+	 * mapping.
 	 */
-	if ((v = p_mapped_by_bats(p)) /*&& p_mapped_by_bats(p+size-1)*/ )
-		goto out;
-
-	if ((v = p_mapped_by_tlbcam(p)))
+	v = p_block_mapped(p);
+	if (v)
 		goto out;

 	if (slab_is_available()) {
@@ -278,7 +245,8 @@ void iounmap(volatile void __iomem *addr)
 	 * If mapped by BATs then there is nothing to do.
 	 * Calling vfree() generates a benign warning.
 	 */
-	if (v_mapped_by_bats((unsigned long)addr)) return;
+	if (v_block_mapped((unsigned long)addr))
+		return;

 	if (addr > high_memory && (unsigned long) addr < ioremap_bot)
 		vunmap((void *) (PAGE_MASK & (unsigned long)addr));
@@ -403,7 +371,7 @@ static int __change_page_attr(struct page *page, pgprot_t prot)
 	BUG_ON(PageHighMem(page));
 	address = (unsigned long)page_address(page);

-	if (v_mapped_by_bats(address) || v_mapped_by_tlbcam(address))
+	if (v_block_mapped(address))
 		return 0;
 	if (!get_pteptr(_mm, address, , ))
[PATCH v7 15/23] powerpc/8xx: remove special handling of CPU6 errata in set_dec()
The CPU6 errata is now handled directly in mtspr(), so we can use the
standard set_dec() function in all cases.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/include/asm/time.h |  6 +-
 arch/powerpc/kernel/head_8xx.S  | 18 --
 2 files changed, 1 insertion(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 2d7109a..1092fdd 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -31,8 +31,6 @@ extern void tick_broadcast_ipi_handler(void);

 extern void generic_calibrate_decr(void);

-extern void set_dec_cpu6(unsigned int val);
-
 /* Some sane defaults: 125 MHz timebase, 1GHz processor */
 extern unsigned long ppc_proc_freq;
 #define DEFAULT_PROC_FREQ	(DEFAULT_TB_FREQ * 8)
@@ -166,14 +164,12 @@ static inline void set_dec(int val)
 {
 #if defined(CONFIG_40x)
 	mtspr(SPRN_PIT, val);
-#elif defined(CONFIG_8xx_CPU6)
-	set_dec_cpu6(val - 1);
 #else
 #ifndef CONFIG_BOOKE
 	--val;
 #endif
 	mtspr(SPRN_DEC, val);
-#endif /* not 40x or 8xx_CPU6 */
+#endif /* not 40x */
 }

 static inline unsigned long tb_ticks_since(unsigned long tstamp)
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index a268cf4..637f8e9 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -1011,24 +1011,6 @@ _GLOBAL(set_context)
 	SYNC
 	blr

-#ifdef CONFIG_8xx_CPU6
-/* It's here because it is unique to the 8xx.
- * It is important we get called with interrupts disabled. I used to
- * do that, but it appears that all code that calls this already had
- * interrupt disabled.
- */
-	.globl	set_dec_cpu6
-set_dec_cpu6:
-	lis	r7, cpu6_errata_word@h
-	ori	r7, r7, cpu6_errata_word@l
-	li	r4, 0x2c00
-	stw	r4, 8(r7)
-	lwz	r4, 8(r7)
-	mtspr	22, r3		/* Update Decrementer */
-	SYNC
-	blr
-#endif
-
 /*
  * We put a few things here that have to be page-aligned.
  * This stuff goes at the beginning of the data segment,
-- 
2.1.0
[PATCH v7 16/23] powerpc/8xx: rewrite set_context() in C
There is no real need to have set_context() in assembly. Now that we
have mtspr() handling the CPU6 errata directly, we can rewrite
set_context() in C for easier maintenance.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/kernel/head_8xx.S | 44 --
 arch/powerpc/mm/8xx_mmu.c      | 34
 2 files changed, 34 insertions(+), 44 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 637f8e9..bb2b657 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -968,50 +968,6 @@ initial_mmu:

 /*
- * Set up to use a given MMU context.
- * r3 is context number, r4 is PGD pointer.
- *
- * We place the physical address of the new task page directory loaded
- * into the MMU base register, and set the ASID compare register with
- * the new "context."
- */
-_GLOBAL(set_context)
-
-#ifdef CONFIG_BDI_SWITCH
-	/* Context switch the PTE pointer for the Abatron BDI2000.
-	 * The PGDIR is passed as second argument.
-	 */
-	lis	r5, KERNELBASE@h
-	lwz	r5, 0xf0(r5)
-	stw	r4, 0x4(r5)
-#endif
-
-	/* Register M_TW will contain base address of level 1 table minus the
-	 * lower part of the kernel PGDIR base address, so that all accesses to
-	 * level 1 table are done relative to lower part of kernel PGDIR base
-	 * address.
-	 */
-	li	r5, (swapper_pg_dir-PAGE_OFFSET)@l
-	sub	r4, r4, r5
-	tophys	(r4, r4)
-#ifdef CONFIG_8xx_CPU6
-	lis	r6, cpu6_errata_word@h
-	ori	r6, r6, cpu6_errata_word@l
-	li	r7, 0x3f80
-	stw	r7, 12(r6)
-	lwz	r7, 12(r6)
-#endif
-	mtspr	SPRN_M_TW, r4		/* Update pointeur to level 1 table */
-#ifdef CONFIG_8xx_CPU6
-	li	r7, 0x3380
-	stw	r7, 12(r6)
-	lwz	r7, 12(r6)
-#endif
-	mtspr	SPRN_M_CASID, r3	/* Update context */
-	SYNC
-	blr
-
-/*
  * We put a few things here that have to be page-aligned.
  * This stuff goes at the beginning of the data segment,
  * which is page-aligned.
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index 50f17d2..b75c461 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -147,3 +147,37 @@ void setup_initial_memory_limit(phys_addr_t first_memblock_base,
 	memblock_set_current_limit(min_t(u64, first_memblock_size,
 					 initial_memory_size));
 }
+
+/*
+ * Set up to use a given MMU context.
+ * id is context number, pgd is PGD pointer.
+ *
+ * We place the physical address of the new task page directory loaded
+ * into the MMU base register, and set the ASID compare register with
+ * the new "context."
+ */
+void set_context(unsigned long id, pgd_t *pgd)
+{
+	s16 offset = (s16)(__pa(swapper_pg_dir));
+
+#ifdef CONFIG_BDI_SWITCH
+	pgd_t **ptr = *(pgd_t ***)(KERNELBASE + 0xf0);
+
+	/* Context switch the PTE pointer for the Abatron BDI2000.
+	 * The PGDIR is passed as second argument.
+	 */
+	*(ptr + 1) = pgd;
+#endif
+
+	/* Register M_TW will contain base address of level 1 table minus the
+	 * lower part of the kernel PGDIR base address, so that all accesses to
+	 * level 1 table are done relative to lower part of kernel PGDIR base
+	 * address.
+	 */
+	mtspr(SPRN_M_TW, __pa(pgd) - offset);
+
+	/* Update context */
+	mtspr(SPRN_M_CASID, id);
+	/* sync */
+	mb();
+}
-- 
2.1.0
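The s16 offset trick in the C version of set_context() above can be
sanity-checked in isolation. The sketch below models only the arithmetic
(m_tw_value() and the sample physical addresses are invented for
illustration): M_TW is loaded with the new pgd's physical address minus
the sign-extended low 16 bits of swapper_pg_dir's physical address, so a
walk that adds that low part back lands exactly on the new task's pgd.

```c
#include <assert.h>
#include <stdint.h>

/* Value set_context() writes to SPRN_M_TW: base address of the new
 * level 1 table minus the lower part of the kernel PGDIR base address. */
uint32_t m_tw_value(uint32_t pa_pgd, uint32_t pa_swapper_pg_dir)
{
	int16_t offset = (int16_t)pa_swapper_pg_dir;	/* low 16 bits, sign-extended */

	return pa_pgd - (uint32_t)(int32_t)offset;
}
```

Adding the low 16 bits of the kernel PGDIR base back to the stored value
recovers the new pgd's physical address, for both positive and negative
(sign-extended) low parts.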
Re: [PATCH v3 2/2] powerpc: tracing: don't trace hcalls on offline CPUs
On Fri, 2016-02-05 at 09:36 -0500, Steven Rostedt wrote:
> On Fri, 5 Feb 2016 14:20:17 +0300
> Denis Kirjanov wrote:
>
> > > > Signed-off-by: Denis Kirjanov
> > >
> > > Hi Steven,
> > >
> > > please apply with Michael's acked-by tag.
> >
> > ping
>
> Actually, can you take this through the ppc tree? The
> TRACE_EVENT_FN_COND is already in mainline.
>
> You can add my:
>
> Acked-by: Steven Rostedt

Thanks, will do. I tidied up the change log a bit:

  powerpc/pseries: Don't trace hcalls on offline CPUs

  If a cpu is hotplugged while the hcall trace points are active, it's
  possible to hit a warning from RCU due to the trace points calling into
  RCU from an offline cpu, eg:

    RCU used illegally from offline CPU!
    rcu_scheduler_active = 1, debug_locks = 1

  Make the hypervisor tracepoints conditional by using
  TRACE_EVENT_FN_COND.

  Acked-by: Steven Rostedt
  Signed-off-by: Denis Kirjanov
  Signed-off-by: Michael Ellerman

cheers
Re: powerpc/perf/hv-gpci: Increase request buffer size
On Tue, 2016-09-02 at 03:08:30 UTC, Sukadev Bhattiprolu wrote:
> >From 31edd352fb7c2a72913f1977fa1bf168109089ad Mon Sep 17 00:00:00 2001
> From: Sukadev Bhattiprolu
> Date: Tue, 9 Feb 2016 02:47:45 -0500
> Subject: [PATCH] powerpc/perf/hv-gpci: Increase request buffer size
>
> The GPCI hcall allows for a 4K buffer but we limit the buffer
> to 1K. The problem with a 1K buffer is if a request results in
> returning more values than can be accomodated in the 1K buffer
> the request will fail.
>
> The buffer we are using is currently allocated on the stack and
> hence limited in size. Instead use a per-CPU 4K buffer like we do
> with 24x7 counters (hv-24x7.c).
>
> diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
> index 856fe6e..e6fad73 100644
> --- a/arch/powerpc/perf/hv-gpci.c
> +++ b/arch/powerpc/perf/hv-gpci.c
> @@ -127,8 +127,16 @@ static const struct attribute_group *attr_groups[] = {
>  	NULL,
>  };
>
> +#define HGPCI_REQ_BUFFER_SIZE	4096
>  #define GPCI_MAX_DATA_BYTES \
> -	(1024 - sizeof(struct hv_get_perf_counter_info_params))
> +	(HGPCI_REQ_BUFFER_SIZE - sizeof(struct hv_get_perf_counter_info_params))
> +
> +DEFINE_PER_CPU(char, hv_gpci_reqb[HGPCI_REQ_BUFFER_SIZE]) __aligned(sizeof(uint64_t));
> +
> +struct hv_gpci_request_buffer {
> +	struct hv_get_perf_counter_info_params params;
> +	uint8_t bytes[1];

bytes is 1 byte long, but ..

> @@ -163,9 +168,11 @@ static unsigned long single_gpci_request(u32 req, u32 starting_index,
>  	 */
>  	count = 0;
>  	for (i = offset; i < offset + length; i++)
> -		count |= arg.bytes[i] << (i - offset);
> +		count |= arg->bytes[i] << (i - offset);

Here you read from bytes[i] where i can be > 1 (AFAICS). That's fishy at
best, and newer GCCs just don't allow it.

I think you could do this and it would work, but untested:

	struct hv_gpci_request_buffer {
		struct hv_get_perf_counter_info_params params;
		uint8_t bytes[4096 - sizeof(struct hv_get_perf_counter_info_params)];
	};

cheers
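Michael's suggestion amounts to deriving the flexible part's size from the
fixed buffer size, so that indexing bytes[] up to the hcall's 4K limit
stays within the declared array. A host-side sketch of the layout (the
params struct here is a same-size stand-in, not the real
hv_get_perf_counter_info_params):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HGPCI_REQ_BUFFER_SIZE 4096

/* Stand-in for struct hv_get_perf_counter_info_params; only its size
 * matters for this sketch. */
struct params {
	uint64_t words[4];	/* 32 bytes */
};

/* bytes[] is sized so the whole buffer is exactly HGPCI_REQ_BUFFER_SIZE,
 * which is what makes arg->bytes[i] legal for any in-buffer offset. */
struct hv_gpci_request_buffer {
	struct params params;
	uint8_t bytes[HGPCI_REQ_BUFFER_SIZE - sizeof(struct params)];
};
```

With this shape the struct itself occupies the full 4K request buffer, so
no index computed from a valid hcall response can step past the array.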
[PATCH v7 01/23] powerpc/8xx: Save r3 all the time in DTLB miss handler
We are spending between 40 and 160 cycles, with a mean of 65 cycles, in
the DTLB handling routine (measured with mftbl), so make it simpler,
although it adds one instruction. With this modification, we get three
registers available at all times, which will help with the following
patch.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/kernel/head_8xx.S | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index e629e28..a89492e 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -385,23 +385,20 @@ InstructionTLBMiss:

 	. = 0x1200
 DataStoreTLBMiss:
-#ifdef CONFIG_8xx_CPU6
 	mtspr	SPRN_SPRG_SCRATCH2, r3
-#endif
 	EXCEPTION_PROLOG_0
-	mfcr	r10
+	mfcr	r3

 	/* If we are faulting a kernel address, we have to use the
 	 * kernel page tables.
 	 */
-	mfspr	r11, SPRN_MD_EPN
-	IS_KERNEL(r11, r11)
+	mfspr	r10, SPRN_MD_EPN
+	IS_KERNEL(r11, r10)
 	mfspr	r11, SPRN_M_TW	/* Get level 1 table */
 	BRANCH_UNLESS_KERNEL(3f)
 	lis	r11, (swapper_pg_dir-PAGE_OFFSET)@ha
 3:
-	mtcr	r10
-	mfspr	r10, SPRN_MD_EPN
+	mtcr	r3

 	/* Insert level 1 index */
 	rlwimi	r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 29
@@ -453,9 +450,7 @@ DataStoreTLBMiss:
 	MTSPR_CPU6(SPRN_MD_RPN, r10, r3)	/* Update TLB entry */

 	/* Restore registers */
-#ifdef CONFIG_8xx_CPU6
 	mfspr	r3, SPRN_SPRG_SCRATCH2
-#endif
 	mtspr	SPRN_DAR, r11	/* Tag DAR */
 	EXCEPTION_EPILOG_0
 	rfi
-- 
2.1.0
[PATCH v7 09/23] powerpc/8xx: CONFIG_PIN_TLB unneeded for CONFIG_PPC_EARLY_DEBUG_CPM
IMMR is now mapped by page tables, so it is no longer necessary to pin
TLBs.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/Kconfig.debug | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
index 638f9ce..136b09c 100644
--- a/arch/powerpc/Kconfig.debug
+++ b/arch/powerpc/Kconfig.debug
@@ -220,7 +220,6 @@ config PPC_EARLY_DEBUG_40x
 config PPC_EARLY_DEBUG_CPM
 	bool "Early serial debugging for Freescale CPM-based serial ports"
 	depends on SERIAL_CPM
-	select PIN_TLB if PPC_8xx
 	help
 	  Select this to enable early debugging for Freescale chips
 	  using a CPM-based serial port. This assumes that the bootwrapper
-- 
2.1.0
Re: powerpc/xmon: Add xmon command to dump process/task similar to ps(1)
On Mon, 2015-23-11 at 15:01:15 UTC, Douglas Miller wrote:
> Add 'P' command with optional task_struct address to dump all/one task's
> information: task pointer, kernel stack pointer, PID, PPID, state
> (interpreted), CPU where (last) running, and command.
>
> Introduce XMON_PROTECT macro to standardize memory-access-fault
> protection (setjmp). Initially used only by the 'P' command.

Hi Doug,

Sorry this has taken a while, it keeps getting preempted by more
important patches.

I'm also not a big fan of the protect macro. It works for this case,
but it's already a bit ugly calling for_each_process() inside the
macro, and it would be even worse for multi-line logic. I think I'd
rather just open code it, and hopefully we can come up with a better
solution for catching errors in the long run.

I also renamed the routines to use "task", because "proc" in xmon is
already used to mean "procedure", and the struct is task_struct after
all.

How does this look?

cheers

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 47e195d66a9a..942796fa4767 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -163,6 +163,7 @@ static int cpu_cmd(void);
 static void csum(void);
 static void bootcmds(void);
 static void proccall(void);
+static void show_tasks(void);
 void dump_segments(void);
 static void symbol_lookup(void);
 static void xmon_show_stack(unsigned long sp, unsigned long lr,
@@ -238,6 +239,7 @@ Commands:\n\
   mz	zero a block of memory\n\
   mi	show information about memory allocation\n\
   p	call a procedure\n\
+  P	list processes/tasks\n\
   r	print registers\n\
   s	single step\n"
 #ifdef CONFIG_SPU_BASE
@@ -967,6 +969,9 @@ cmds(struct pt_regs *excp)
 		case 'p':
 			proccall();
 			break;
+		case 'P':
+			show_tasks();
+			break;
 #ifdef CONFIG_PPC_STD_MMU
 		case 'u':
 			dump_segments();
@@ -2566,6 +2571,61 @@ memzcan(void)
 		printf("%.8x\n", a - mskip);
 }

+static void show_task(struct task_struct *tsk)
+{
+	char state;
+
+	/*
+	 * Cloned from kdb_task_state_char(), which is not entirely
+	 * appropriate for calling from xmon. This could be moved
+	 * to a common, generic, routine used by both.
+	 */
+	state = (tsk->state == 0) ? 'R' :
+		(tsk->state < 0) ? 'U' :
+		(tsk->state & TASK_UNINTERRUPTIBLE) ? 'D' :
+		(tsk->state & TASK_STOPPED) ? 'T' :
+		(tsk->state & TASK_TRACED) ? 'C' :
+		(tsk->exit_state & EXIT_ZOMBIE) ? 'Z' :
+		(tsk->exit_state & EXIT_DEAD) ? 'E' :
+		(tsk->state & TASK_INTERRUPTIBLE) ? 'S' : '?';
+
+	printf("%p %016lx %6d %6d %c %2d %s\n", tsk,
+		tsk->thread.ksp,
+		tsk->pid, tsk->parent->pid,
+		state, task_thread_info(tsk)->cpu,
+		tsk->comm);
+}
+
+static void show_tasks(void)
+{
+	unsigned long tskv;
+	struct task_struct *tsk = NULL;
+
+	printf(" task_struct ->thread.kspPID PPID S P CMD\n");
+
+	if (scanhex())
+		tsk = (struct task_struct *)tskv;
+
+	if (setjmp(bus_error_jmp) != 0) {
+		catch_memory_errors = 0;
+		printf("*** Error dumping task %p\n", tsk);
+		return;
+	}
+
+	catch_memory_errors = 1;
+	sync();
+
+	if (tsk)
+		show_task(tsk);
+	else
+		for_each_process(tsk)
+			show_task(tsk);
+
+	sync();
+	__delay(200);
+	catch_memory_errors = 0;
+}
+
 static void
 proccall(void)
 {
 	unsigned long args[8];
Re: [PATCH v7 04/10] ppc64 ftrace_with_regs configuration variables
On Mon, Feb 08, 2016 at 10:49:28AM -0500, Steven Rostedt wrote:
> On Mon, 8 Feb 2016 16:23:06 +0100
> Petr Mladek wrote:
>
> > >From 2b0fcb678d7720d03f9c9f233b61ed9ed4d420b3 Mon Sep 17 00:00:00 2001
> > From: Petr Mladek
> > Date: Mon, 8 Feb 2016 16:03:03 +0100
> > Subject: [PATCH] ftrace: Allow to explicitly disable the build of the
> >  dynamic ftrace with regs
> >
> > This patch allows to explicitly disable
> > CONFIG_DYNAMIC_FTRACE_WITH_REGS. We will need to do so on
> > PPC with a broken gcc. This situation will be detected at
> > buildtime and could not be handled by Kbuild automatically.
>
> Wait. Can it be detected at build time? That is, does it cause a build

Yes, I wrote a test to detect it at build time. It is similar to
"asm goto" and part of the v7 patch set.

> error? If so, then you can have Kbuild automatically detect this and
> set the proper value. We do this with 'asm goto'. There's tricks in the
> build system that can change the configs based on if a compiler is
> broken or not.

Please clarify. All I could find is Makefile magic that does it.
AFAICS this runs _after_ Kconfig. But what I'd like to see is to offer
the user the full choice, where possible, e.g.

  Kernel Tracing ...
    0) none
    1) static FTRACE
    2) DYNAMIC_FTRACE
    3) DYNAMIC_FTRACE_WITH_REGS

Can such a test be used to simply reduce these options?

With Petr's patch, it comes quite close to the above, and if you select
"3" and your compiler is broken, compilation will fail. For "2", it
will just do the right thing (fall back to plain "-pg").

Without Petr's patch you have *no* choice between "2" and "3".
(That's what I'd call a bug :)

So, the question is, can such a test be used to provide _input_ to
"make config"? I can see the "env=" mechanism, but it seems not to be
used very heavily. That would then be a prerequisite to all
"make *config". Even if it can provide this input, you can still not
choose between 2 and 3 where both are available.

Torsten
[PATCH v7 04/23] powerpc/8xx: move setup_initial_memory_limit() into 8xx_mmu.c
Now that we have an 8xx-specific .c file, put setup_initial_memory_limit()
in there, as the other powerpc variants do.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/mm/8xx_mmu.c | 17 +
 arch/powerpc/mm/init_32.c | 19 ---
 2 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index 2d42745..a84f5eb 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -81,3 +81,20 @@ unsigned long __init mmu_mapin_ram(unsigned long top)

 	return mapped;
 }
+
+void setup_initial_memory_limit(phys_addr_t first_memblock_base,
+				phys_addr_t first_memblock_size)
+{
+	/* We don't currently support the first MEMBLOCK not mapping 0
+	 * physical on those processors
+	 */
+	BUG_ON(first_memblock_base != 0);
+
+#ifdef CONFIG_PIN_TLB
+	/* 8xx can only access 24MB at the moment */
+	memblock_set_current_limit(min_t(u64, first_memblock_size, 0x0180));
+#else
+	/* 8xx can only access 8MB at the moment */
+	memblock_set_current_limit(min_t(u64, first_memblock_size, 0x0080));
+#endif
+}
diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
index a10be66..1a18e4b 100644
--- a/arch/powerpc/mm/init_32.c
+++ b/arch/powerpc/mm/init_32.c
@@ -193,22 +193,3 @@ void __init MMU_init(void)
 	/* Shortly after that, the entire linear mapping will be available */
 	memblock_set_current_limit(lowmem_end_addr);
 }
-
-#ifdef CONFIG_8xx /* No 8xx specific .c file to put that in ... */
-void setup_initial_memory_limit(phys_addr_t first_memblock_base,
-				phys_addr_t first_memblock_size)
-{
-	/* We don't currently support the first MEMBLOCK not mapping 0
-	 * physical on those processors
-	 */
-	BUG_ON(first_memblock_base != 0);
-
-#ifdef CONFIG_PIN_TLB
-	/* 8xx can only access 24MB at the moment */
-	memblock_set_current_limit(min_t(u64, first_memblock_size, 0x0180));
-#else
-	/* 8xx can only access 8MB at the moment */
-	memblock_set_current_limit(min_t(u64, first_memblock_size, 0x0080));
-#endif
-}
-#endif /* CONFIG_8xx */
-- 
2.1.0
[PATCH v7 14/23] powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macro
MPC8xx has an erratum on the use of mtspr() for some registers. This patch includes the erratum handling directly in the mtspr() macro, so that mtspr() users don't need to bother about it.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/include/asm/reg.h     |  2 +
 arch/powerpc/include/asm/reg_8xx.h | 82 ++
 2 files changed, 84 insertions(+)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index c4cb2ff..7b5d97f 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -1211,9 +1211,11 @@ static inline void mtmsr_isync(unsigned long val)
 #define mfspr(rn)	({unsigned long rval; \
 			asm volatile("mfspr %0," __stringify(rn) \
 				: "=r" (rval)); rval;})
+#ifndef mtspr
 #define mtspr(rn, v)	asm volatile("mtspr " __stringify(rn) ",%0" : \
 				     : "r" ((unsigned long)(v)) \
 				     : "memory")
+#endif
 
 extern void msr_check_and_set(unsigned long bits);
 extern bool strict_msr_control;
diff --git a/arch/powerpc/include/asm/reg_8xx.h b/arch/powerpc/include/asm/reg_8xx.h
index 0f71c81..d41412c 100644
--- a/arch/powerpc/include/asm/reg_8xx.h
+++ b/arch/powerpc/include/asm/reg_8xx.h
@@ -50,4 +50,86 @@
 #define DC_DFWT	0x4000	/* Data cache is forced write through */
 #define DC_LES	0x2000	/* Caches are little endian mode */
 
+#ifdef CONFIG_8xx_CPU6
+#define do_mtspr_cpu6(rn, rn_addr, v)	\
+	do {	\
+		int _reg_cpu6 = rn_addr, _tmp_cpu6[1];	\
+		asm volatile("stw %0, %1;"	\
+			     "lwz %0, %1;"	\
+			     "mtspr " __stringify(rn) ",%2" :	\
+			     : "r" (_reg_cpu6), "m" (_tmp_cpu6),	\
+			       "r" ((unsigned long)(v))	\
+			     : "memory");	\
+	} while (0)
+
+#define do_mtspr(rn, v)	asm volatile("mtspr " __stringify(rn) ",%0" : \
+				     : "r" ((unsigned long)(v)) \
+				     : "memory")
+#define mtspr(rn, v) \
+	do {	\
+		if (rn == SPRN_IMMR)	\
+			do_mtspr_cpu6(rn, 0x3d30, v);	\
+		else if (rn == SPRN_IC_CST)	\
+			do_mtspr_cpu6(rn, 0x2110, v);	\
+		else if (rn == SPRN_IC_ADR)	\
+			do_mtspr_cpu6(rn, 0x2310, v);	\
+		else if (rn == SPRN_IC_DAT)	\
+			do_mtspr_cpu6(rn, 0x2510, v);	\
+		else if (rn == SPRN_DC_CST)	\
+			do_mtspr_cpu6(rn, 0x3110, v);	\
+		else if (rn == SPRN_DC_ADR)	\
+			do_mtspr_cpu6(rn, 0x3310, v);	\
+		else if (rn == SPRN_DC_DAT)	\
+			do_mtspr_cpu6(rn, 0x3510, v);	\
+		else if (rn == SPRN_MI_CTR)	\
+			do_mtspr_cpu6(rn, 0x2180, v);	\
+		else if (rn == SPRN_MI_AP)	\
+			do_mtspr_cpu6(rn, 0x2580, v);	\
+		else if (rn == SPRN_MI_EPN)	\
+			do_mtspr_cpu6(rn, 0x2780, v);	\
+		else if (rn == SPRN_MI_TWC)	\
+			do_mtspr_cpu6(rn, 0x2b80, v);	\
+		else if (rn == SPRN_MI_RPN)	\
+			do_mtspr_cpu6(rn, 0x2d80, v);	\
+		else if (rn == SPRN_MI_CAM)	\
+			do_mtspr_cpu6(rn, 0x2190, v);	\
+		else if (rn == SPRN_MI_RAM0)	\
+			do_mtspr_cpu6(rn, 0x2390, v);	\
+		else if (rn == SPRN_MI_RAM1)	\
+			do_mtspr_cpu6(rn, 0x2590, v);	\
+		else if (rn == SPRN_MD_CTR)	\
+			do_mtspr_cpu6(rn, 0x3180, v);	\
+		else if (rn == SPRN_M_CASID)
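The trick in this patch is that the SPR number passed to mtspr() is always a compile-time constant, so the compiler folds the whole if/else chain down to the single branch that matches, and non-errata registers cost nothing extra. A minimal portable sketch of that dispatch pattern, using hypothetical register numbers and mock write functions in place of the real inline assembly, looks like this:

```c
#include <assert.h>

/* Hypothetical register numbers standing in for the SPRN_* constants. */
#define SPRN_A 1
#define SPRN_B 2

static int last_shadow_addr; /* records which errata shadow address was used */

/* Mock of the errata write path (real code issues stw/lwz/mtspr). */
static void do_mtspr_cpu6(int rn, int rn_addr, unsigned long v)
{
	(void)rn; (void)v;
	last_shadow_addr = rn_addr;
}

/* Mock of the plain write path (real code issues a bare mtspr). */
static void do_mtspr(int rn, unsigned long v)
{
	(void)rn; (void)v;
	last_shadow_addr = 0;
}

/* With rn a compile-time constant, the compiler keeps exactly one branch. */
#define mtspr(rn, v)						\
	do {							\
		if ((rn) == SPRN_A)				\
			do_mtspr_cpu6(rn, 0x3d30, v);		\
		else if ((rn) == SPRN_B)			\
			do_mtspr_cpu6(rn, 0x2110, v);		\
		else						\
			do_mtspr(rn, v);			\
	} while (0)
```

The shadow addresses and names here are illustrative only; the real table is the one in the patch above.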
[PATCH v7 22/23] powerpc32: small optimisation in flush_icache_range()
Inlining of the _dcache_range() functions has shown that the compiler does the same thing a bit better, with one instruction less.

Signed-off-by: Christophe Leroy
---
v2: new
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/kernel/misc_32.S | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 09e1e5d..3ec5a22 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -348,10 +348,9 @@ BEGIN_FTR_SECTION
 	PURGE_PREFETCHED_INS
 	blr	/* for 601, do nothing */
 END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
-	li	r5,L1_CACHE_BYTES-1
-	andc	r3,r3,r5
+	rlwinm	r3,r3,0,0,31 - L1_CACHE_SHIFT
 	subf	r4,r3,r4
-	add	r4,r4,r5
+	addi	r4,r4,L1_CACHE_BYTES - 1
 	srwi.	r4,r4,L1_CACHE_SHIFT
 	beqlr
 	mtctr	r4
-- 
2.1.0
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
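The saving comes from `rlwinm r3,r3,0,0,31 - L1_CACHE_SHIFT`: with a rotate of 0 and a mask covering big-endian bits 0 through 31-L1_CACHE_SHIFT, it clears the low L1_CACHE_SHIFT bits in one instruction, which is exactly what the two-instruction `li` + `andc` pair did. A small C sketch (assuming a 32-byte cache line, i.e. L1_CACHE_SHIFT = 5) verifies the two forms compute the same alignment:

```c
#include <assert.h>

#define L1_CACHE_SHIFT 5	/* assumed: 32-byte cache lines */
#define L1_CACHE_BYTES (1u << L1_CACHE_SHIFT)

/* Old sequence: li r5,L1_CACHE_BYTES-1 ; andc r3,r3,r5 */
static unsigned int align_andc(unsigned int addr)
{
	unsigned int mask = L1_CACHE_BYTES - 1;
	return addr & ~mask;
}

/* New sequence: rlwinm r3,r3,0,0,31-L1_CACHE_SHIFT keeps the top
 * 32 - L1_CACHE_SHIFT bits, i.e. clears the low bits in one insn. */
static unsigned int align_rlwinm(unsigned int addr)
{
	return addr & ~(unsigned int)(L1_CACHE_BYTES - 1);
}
```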
Re: Fix kgdb on little endian ppc64le
On Mon, 2016-01-02 at 06:03:25 UTC, Balbir Singh wrote:
> From: Balbir Singh
>
> I spent some time trying to use kgdb and debugged my inability to
> resume from kgdb_handle_breakpoint(). NIP is not incremented
> and that leads to a loop in the debugger.
>
> I've tested this lightly on a virtual instance with KDB enabled.
> After the patch, I am able to get the "go" command to work as
> expected

The test suite isn't working for me (I think?), so I think maybe we need something more?

KGDB: Registered I/O driver kgdbts
kgdbts:RUN plant and detach test
Entering kdb (current=0xc001fefc, pid 1) on processor 12 due to Keyboard Entry
[12]kdb>
kgdbts:RUN sw breakpoint test
kgdbts: BP mismatch c018e2cc expected c061c510
KGDB: re-enter exception: ALL breakpoints killed
CPU: 12 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc3-6-g893791ee8b01 #5
Call Trace:
[c001fb082e10] [c09c4608] dump_stack+0xb0/0xf0 (unreliable)
[c001fb082e50] [c018fa8c] kgdb_handle_exception+0x2ac/0x2c0
[c001fb082f20] [c00405f4] kgdb_handle_breakpoint+0x64/0xa0
[c001fb082f50] [c09bcee4] program_check_exception+0x144/0x370
[c001fb082fc0] [c0006244] program_check_common+0x144/0x180
--- interrupt: 700 at check_and_rewind_pc+0x100/0x130
    LR = check_and_rewind_pc+0xfc/0x130
[c001fb083340] [c061bfc0] validate_simple_test+0x60/0x170
[c001fb083370] [c061c784] run_simple_test+0x194/0x3c0
[c001fb0833f0] [c061c17c] kgdbts_put_char+0x4c/0x70
[c001fb083420] [c01902e0] put_packet+0x130/0x210
[c001fb083470] [c0191338] gdb_serial_stub+0x478/0x1110
[c001fb083560] [c018f19c] kgdb_cpu_enter+0x3fc/0x800
[c001fb083660] [c018f970] kgdb_handle_exception+0x190/0x2c0
[c001fb083730] [c00405f4] kgdb_handle_breakpoint+0x64/0xa0
[c001fb083760] [c09bcee4] program_check_exception+0x144/0x370
[c001fb0837d0] [c0006244] program_check_common+0x144/0x180
--- interrupt: 700 at kgdb_breakpoint+0x3c/0x70
    LR = run_breakpoint_test+0xa4/0x120
[c001fb083ac0] [] (null) (unreliable)
[c001fb083ae0] [c061d714] run_breakpoint_test+0xa4/0x120
[c001fb083b50] [c061dcb4] configure_kgdbts+0x2c4/0x6e0
[c001fb083c30] [c000b3d0] do_one_initcall+0xd0/0x250
[c001fb083d00] [c0cc42f8] kernel_init_freeable+0x270/0x350
[c001fb083dc0] [c000bd3c] kernel_init+0x2c/0x150
[c001fb083e30] [c00095b0] ret_from_kernel_thread+0x5c/0xac
Kernel panic - not syncing: Recursive entry to debugger
CPU: 12 PID: 1 Comm: swapper/0 Not tainted 4.5.0-rc3-6-g893791ee8b01 #5
Call Trace:
[c001fb082d70] [c09c4608] dump_stack+0xb0/0xf0 (unreliable)
[c001fb082db0] [c09c2edc] panic+0x138/0x300
[c001fb082e50] [c018fa9c] kgdb_handle_exception+0x2bc/0x2c0
[c001fb082f20] [c00405f4] kgdb_handle_breakpoint+0x64/0xa0
[c001fb082f50] [c09bcee4] program_check_exception+0x144/0x370
[c001fb082fc0] [c0006244] program_check_common+0x144/0x180
--- interrupt: 700 at check_and_rewind_pc+0x100/0x130
    LR = check_and_rewind_pc+0xfc/0x130
[c001fb083340] [c061bfc0] validate_simple_test+0x60/0x170
[c001fb083370] [c061c784] run_simple_test+0x194/0x3c0
[c001fb0833f0] [c061c17c] kgdbts_put_char+0x4c/0x70
[c001fb083420] [c01902e0] put_packet+0x130/0x210
[c001fb083470] [c0191338] gdb_serial_stub+0x478/0x1110
[c001fb083560] [c018f19c] kgdb_cpu_enter+0x3fc/0x800
[c001fb083660] [c018f970] kgdb_handle_exception+0x190/0x2c0
[c001fb083730] [c00405f4] kgdb_handle_breakpoint+0x64/0xa0
[c001fb083760] [c09bcee4] program_check_exception+0x144/0x370
[c001fb0837d0] [c0006244] program_check_common+0x144/0x180
--- interrupt: 700 at kgdb_breakpoint+0x3c/0x70
    LR = run_breakpoint_test+0xa4/0x120
[c001fb083ac0] [] (null) (unreliable)
[c001fb083ae0] [c061d714] run_breakpoint_test+0xa4/0x120
[c001fb083b50] [c061dcb4] configure_kgdbts+0x2c4/0x6e0
[c001fb083c30] [c000b3d0] do_one_initcall+0xd0/0x250
[c001fb083d00] [c0cc42f8] kernel_init_freeable+0x270/0x350
[c001fb083dc0] [c000bd3c] kernel_init+0x2c/0x150
[c001fb083e30] [c00095b0] ret_from_kernel_thread+0x5c/0xac
Rebooting in 10 seconds..

cheers
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [1/2] powerpc/powernv: new function to access OPAL msglog
On Tue, 2016-02-09 at 16:29 +1100, Andrew Donnellan wrote: > On 08/02/16 22:31, Michael Ellerman wrote: > > Pulling the memcons out of the bin_attr here is not all that nice. This > > routine > > should really stand on its own without reference to the bin_attr. In theory > > I > > might want to disable building sysfs but still have this routine available. > > Yeah it's a bit ugly, though does disabling sysfs actually break it? Probably not, it looks like bin_attribute is still defined even when sysfs is disabled. And the build would break in other places too. > I can separate it out anyway - there's no reason for the memcons to be > tied to the sysfs entry. Yeah that was more my point. > > It's also a bit fishy if it's called before the bin_attr is initialised or > > when > > the memcons initialisation fails. In both cases it should be OK, because the > > structs in question are static and so the private pointer will be NULL, but > > that's a bit fragile. > > > > I think the solution is simply to create a: > > > >static struct memcons *opal_memcons; > > > > And use that in opal_msglog_copy() and so on. > > Will respin. Thanks. cheers ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v7 12/23] powerpc32: remove ioremap_base
ioremap_base is not initialised and is nowhere used, so remove it.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: fix comment as well
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/include/asm/nohash/32/pgtable.h |  2 +-
 arch/powerpc/mm/mmu_decl.h                   |  1 -
 arch/powerpc/mm/pgtable_32.c                 |  3 +--
 arch/powerpc/platforms/embedded6xx/mpc10x.h  | 10 --
 4 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index e201600..7808475 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -86,7 +86,7 @@ extern int icache_44x_need_flush;
  * We no longer map larger than phys RAM with the BATs so we don't have
  * to worry about the VMALLOC_OFFSET causing problems. We do have to worry
  * about clashes between our early calls to ioremap() that start growing down
- * from ioremap_base being run into the VM area allocations (growing upwards
+ * from IOREMAP_TOP being run into the VM area allocations (growing upwards
  * from VMALLOC_START). For this reason we have ioremap_bot to check when
  * we actually run into our mappings setup in the early boot with the VM
  * system. This really does become a problem for machines with good amounts
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index 3872332..53564a3 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -100,7 +100,6 @@ extern void setbat(int index, unsigned long virt, phys_addr_t phys,
 
 extern int __map_without_bats;
 extern int __allow_ioremap_reserved;
-extern unsigned long ioremap_base;
 extern unsigned int rtas_data, rtas_size;
 
 struct hash_pte;
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index db0d35e..815ccd7 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -37,7 +37,6 @@
 
 #include "mmu_decl.h"
 
-unsigned long ioremap_base;
 unsigned long ioremap_bot;
 EXPORT_SYMBOL(ioremap_bot);	/* aka VMALLOC_END */
 
@@ -173,7 +172,7 @@ __ioremap_caller(phys_addr_t addr, unsigned long size, unsigned long flags,
 	/*
 	 * Choose an address to map it to.
 	 * Once the vmalloc system is running, we use it.
-	 * Before then, we use space going down from ioremap_base
+	 * Before then, we use space going down from IOREMAP_TOP
 	 * (ioremap_bot records where we're up to).
 	 */
 	p = addr & PAGE_MASK;
diff --git a/arch/powerpc/platforms/embedded6xx/mpc10x.h b/arch/powerpc/platforms/embedded6xx/mpc10x.h
index b290b63..5ad1202 100644
--- a/arch/powerpc/platforms/embedded6xx/mpc10x.h
+++ b/arch/powerpc/platforms/embedded6xx/mpc10x.h
@@ -24,13 +24,11 @@
  * Processor: 0x8000 - 0x807f -> PCI I/O: 0x - 0x007f
  * Processor: 0xc000 - 0xdfff -> PCI MEM: 0x - 0x1fff
  * PCI MEM: 0x8000 -> Processor System Memory: 0x -
- * EUMB mapped to: ioremap_base - 0x0010 (ioremap_base - 1 MB)
  *
  * MAP B (CHRP Map)
  * Processor: 0xfe00 - 0xfebf -> PCI I/O: 0x - 0x00bf
  * Processor: 0x8000 - 0xbfff -> PCI MEM: 0x8000 - 0xbfff
  * PCI MEM: 0x -> Processor System Memory: 0x -
- * EUMB mapped to: ioremap_base - 0x0010 (ioremap_base - 1 MB)
  */
 
 /*
@@ -138,14 +136,6 @@
 #define MPC10X_EUMB_WP_OFFSET	0x000ff000	/* Data path diagnostic, watchpoint reg offset */
 #define MPC10X_EUMB_WP_SIZE	0x1000	/* Data path diagnostic, watchpoint reg size */
 
-/*
- * Define some recommended places to put the EUMB regs.
- * For both maps, recommend putting the EUMB from 0xeff0 to 0xefff.
- */
-extern unsigned long ioremap_base;
-#define	MPC10X_MAPA_EUMB_BASE	(ioremap_base - MPC10X_EUMB_SIZE)
-#define	MPC10X_MAPB_EUMB_BASE	MPC10X_MAPA_EUMB_BASE
-
 enum ppc_sys_devices {
 	MPC10X_IIC1,
 	MPC10X_DMA0,
-- 
2.1.0
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v7 11/23] powerpc32: Remove useless/wrong MMU:setio progress message
Commit 771168494719 ("[POWERPC] Remove unused machine call outs") removed the call to setup_io_mappings(), so remove the associated progress line message.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/mm/init_32.c | 4
 1 file changed, 4 deletions(-)

diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
index 1a18e4b..4eb1b8f 100644
--- a/arch/powerpc/mm/init_32.c
+++ b/arch/powerpc/mm/init_32.c
@@ -178,10 +178,6 @@ void __init MMU_init(void)
 	/* Initialize early top-down ioremap allocator */
 	ioremap_bot = IOREMAP_TOP;
 
-	/* Map in I/O resources */
-	if (ppc_md.progress)
-		ppc_md.progress("MMU:setio", 0x302);
-
 	if (ppc_md.progress)
 		ppc_md.progress("MMU:exit", 0x211);
-- 
2.1.0
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v7 18/23] powerpc: add inline functions for cache related instructions
This patch adds inline functions to use dcbz, dcbi, dcbf and dcbst from C functions.

Signed-off-by: Christophe Leroy
---
v2: new
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/include/asm/cache.h | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index 5f8229e..ffbafbf 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -69,6 +69,25 @@ extern void _set_L3CR(unsigned long);
 #define _set_L3CR(val)	do { } while(0)
 #endif
 
+static inline void dcbz(void *addr)
+{
+	__asm__ __volatile__ ("dcbz 0, %0" : : "r"(addr) : "memory");
+}
+
+static inline void dcbi(void *addr)
+{
+	__asm__ __volatile__ ("dcbi 0, %0" : : "r"(addr) : "memory");
+}
+
+static inline void dcbf(void *addr)
+{
+	__asm__ __volatile__ ("dcbf 0, %0" : : "r"(addr) : "memory");
+}
+
+static inline void dcbst(void *addr)
+{
+	__asm__ __volatile__ ("dcbst 0, %0" : : "r"(addr) : "memory");
+}
 #endif /* !__ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_CACHE_H */
-- 
2.1.0
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
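Each of these helpers wraps a single cache-block instruction, so callers walk a region one cache line at a time. A portable sketch of that caller pattern, using a hypothetical memset()-based stand-in for dcbz and an assumed 32-byte line size (the names `dcbz_mock` and `clear_region` are illustrative, not from the patch):

```c
#include <assert.h>
#include <string.h>

#define L1_CACHE_BYTES 32	/* assumed line size for this sketch */

/* Stand-in for the real dcbz() helper: the actual instruction zeroes the
 * whole cache block containing addr; here we zero one block starting at
 * addr, which the caller keeps line-aligned. */
static void dcbz_mock(void *addr)
{
	memset(addr, 0, L1_CACHE_BYTES);
}

/* Typical caller pattern: clear a line-aligned region one cache block at
 * a time, the way clear_page()-style code uses dcbz. */
static void clear_region(unsigned char *start, unsigned long size)
{
	unsigned long off;

	for (off = 0; off < size; off += L1_CACHE_BYTES)
		dcbz_mock(start + off);
}
```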
[PATCH v7 20/23] powerpc32: move xxxxx_dcache_range() functions inline
The flush/clean/invalidate _dcache_range() functions are all very similar and quite short. They are mainly used in __dma_sync(); perf_event locates them in the top 3 consuming functions during heavy ethernet activity. They are good candidates for inlining, as __dma_sync() does almost nothing but call them.

Signed-off-by: Christophe Leroy
---
v2: new
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/include/asm/cacheflush.h | 52 ++--
 arch/powerpc/kernel/misc_32.S         | 65 ---
 arch/powerpc/kernel/ppc_ksyms.c       |  2 ++
 3 files changed, 51 insertions(+), 68 deletions(-)

diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
index 6229e6b..97c9978 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -47,12 +47,58 @@ static inline void __flush_dcache_icache_phys(unsigned long physaddr)
 }
 #endif
 
-extern void flush_dcache_range(unsigned long start, unsigned long stop);
 #ifdef CONFIG_PPC32
-extern void clean_dcache_range(unsigned long start, unsigned long stop);
-extern void invalidate_dcache_range(unsigned long start, unsigned long stop);
+/*
+ * Write any modified data cache blocks out to memory and invalidate them.
+ * Does not invalidate the corresponding instruction cache blocks.
+ */
+static inline void flush_dcache_range(unsigned long start, unsigned long stop)
+{
+	void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1));
+	unsigned long size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1);
+	unsigned long i;
+
+	for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES)
+		dcbf(addr);
+	mb();	/* sync */
+}
+
+/*
+ * Write any modified data cache blocks out to memory.
+ * Does not invalidate the corresponding cache lines (especially for
+ * any corresponding instruction cache).
+ */
+static inline void clean_dcache_range(unsigned long start, unsigned long stop)
+{
+	void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1));
+	unsigned long size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1);
+	unsigned long i;
+
+	for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES)
+		dcbst(addr);
+	mb();	/* sync */
+}
+
+/*
+ * Like above, but invalidate the D-cache. This is used by the 8xx
+ * to invalidate the cache so the PPC core doesn't get stale data
+ * from the CPM (no cache snooping here :-).
+ */
+static inline void invalidate_dcache_range(unsigned long start,
+					   unsigned long stop)
+{
+	void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1));
+	unsigned long size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1);
+	unsigned long i;
+
+	for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES)
+		dcbi(addr);
+	mb();	/* sync */
+}
+
 #endif /* CONFIG_PPC32 */
 #ifdef CONFIG_PPC64
+extern void flush_dcache_range(unsigned long start, unsigned long stop);
 extern void flush_inval_dcache_range(unsigned long start, unsigned long stop);
 extern void flush_dcache_phys_range(unsigned long start, unsigned long stop);
 #endif
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 181afc1..09e1e5d 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -375,71 +375,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
 	isync
 	blr
-/*
- * Write any modified data cache blocks out to memory.
- * Does not invalidate the corresponding cache lines (especially for
- * any corresponding instruction cache).
- *
- * clean_dcache_range(unsigned long start, unsigned long stop)
- */
-_GLOBAL(clean_dcache_range)
-	li	r5,L1_CACHE_BYTES-1
-	andc	r3,r3,r5
-	subf	r4,r3,r4
-	add	r4,r4,r5
-	srwi.	r4,r4,L1_CACHE_SHIFT
-	beqlr
-	mtctr	r4
-
-1:	dcbst	0,r3
-	addi	r3,r3,L1_CACHE_BYTES
-	bdnz	1b
-	sync	/* wait for dcbst's to get to ram */
-	blr
-
-/*
- * Write any modified data cache blocks out to memory and invalidate them.
- * Does not invalidate the corresponding instruction cache blocks.
- *
- * flush_dcache_range(unsigned long start, unsigned long stop)
- */
-_GLOBAL(flush_dcache_range)
-	li	r5,L1_CACHE_BYTES-1
-	andc	r3,r3,r5
-	subf	r4,r3,r4
-	add	r4,r4,r5
-	srwi.	r4,r4,L1_CACHE_SHIFT
-	beqlr
-	mtctr	r4
-
-1:	dcbf	0,r3
-	addi	r3,r3,L1_CACHE_BYTES
-	bdnz	1b
-	sync	/* wait for dcbst's to get to ram */
-	blr
-
-/*
- * Like above, but invalidate the D-cache. This is used by the 8xx
- * to invalidate the cache so the PPC core doesn't get stale data
- * from the CPM (no cache snooping here :-).
- *
- *
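The arithmetic shared by the three inlined helpers (round `start` down to a line boundary, then count how many whole cache lines the range touches) can be checked in isolation. A small sketch mirroring that computation, assuming a 32-byte line:

```c
#include <assert.h>

#define L1_CACHE_SHIFT 5	/* assumed: 32-byte cache lines */
#define L1_CACHE_BYTES (1ul << L1_CACHE_SHIFT)

/* Mirrors the inlined helpers: align `start` down to a cache line, then
 * compute how many lines the [start, stop) range covers. */
static unsigned long dcache_line_count(unsigned long start, unsigned long stop)
{
	unsigned long addr = start & ~(L1_CACHE_BYTES - 1);
	unsigned long size = stop - addr + (L1_CACHE_BYTES - 1);

	return size >> L1_CACHE_SHIFT;
}
```

Rounding the size up by L1_CACHE_BYTES - 1 before shifting is what guarantees a partially covered trailing line is still flushed.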
[PATCH v7 23/23] powerpc32: Remove one insn in mulhdu
Remove one instruction in mulhdu.

Signed-off-by: Christophe Leroy
---
v2: new
v3: no change
v4: no change
v5: no change
v6: no change
v7: no change

 arch/powerpc/kernel/misc_32.S | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 3ec5a22..bf5160f 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -91,17 +91,16 @@ _GLOBAL(mulhdu)
 	addc	r7,r0,r7
 	addze	r4,r4
 1:	beqlr	cr1	/* all done if high part of A is 0 */
-	mr	r10,r3
 	mullw	r9,r3,r5
-	mulhwu	r3,r3,r5
+	mulhwu	r10,r3,r5
 	beq	2f
-	mullw	r0,r10,r6
-	mulhwu	r8,r10,r6
+	mullw	r0,r3,r6
+	mulhwu	r8,r3,r6
 	addc	r7,r0,r7
 	adde	r4,r4,r8
-	addze	r3,r3
+	addze	r10,r10
 2:	addc	r4,r4,r9
-	addze	r3,r3
+	addze	r3,r10
 	blr
-- 
2.1.0
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
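For context, mulhdu computes the high 64 bits of an unsigned 64x64-bit multiply out of 32-bit partial products, which is why the routine juggles so many registers. A C sketch of that decomposition (not a transcription of the assembly, just the same structure of partial products and carries):

```c
#include <assert.h>
#include <stdint.h>

/* High 64 bits of a 64x64 -> 128-bit unsigned multiply, built from
 * 32-bit partial products the way a mulhdu-style helper must. */
static uint64_t mulhdu_c(uint64_t a, uint64_t b)
{
	uint64_t ah = a >> 32, al = a & 0xffffffffu;
	uint64_t bh = b >> 32, bl = b & 0xffffffffu;

	uint64_t lo   = al * bl;	/* contributes only a carry upward */
	uint64_t mid1 = al * bh;
	uint64_t mid2 = ah * bl;
	uint64_t hi   = ah * bh;

	/* Carry out of the middle 32-bit column into the high word. */
	uint64_t carry = ((lo >> 32) + (mid1 & 0xffffffffu)
			  + (mid2 & 0xffffffffu)) >> 32;

	return hi + (mid1 >> 32) + (mid2 >> 32) + carry;
}
```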
Re: [RFC PATCH kernel] powerpc/ioda: Set "read" permission when "write" is set
We finally got the chance to test it at the end of last week; I forgot to update everyone Monday. By all appearances, the patch fixes the problem. We did not see any new issues with the patch (vs. the same test scenarios without it). I'll also update the bugzilla.

Thanks,
Doug

On 02/08/2016 07:37 PM, Alexey Kardashevskiy wrote:
On 01/20/2016 06:01 AM, Douglas Miller wrote:
On 01/18/2016 09:52 PM, Alexey Kardashevskiy wrote:
On 01/13/2016 01:24 PM, Douglas Miller wrote:
On 01/12/2016 05:07 PM, Benjamin Herrenschmidt wrote:
On Tue, 2016-01-12 at 15:40 +1100, Alexey Kardashevskiy wrote:

Quite often drivers set only "write" permission, assuming that this includes "read" permission as well, and this works on plenty of platforms. However IODA2 is strict about this and produces an EEH when "read" permission is not set and reading happens. This adds a workaround in IODA code to always add the "read" bit when the "write" bit is set.

Cc: Benjamin Herrenschmidt
Signed-off-by: Alexey Kardashevskiy
---

Ben, what was the driver which did not set "read" and caused EEH?

aacraid

Cheers,
Ben.

Just to be precise, the driver wasn't responsible for setting READ. The driver called scsi_dma_map() and the scsicmd was set (by the scsi layer) as DMA_FROM_DEVICE, so the current code would set the permissions to WRITE-ONLY. Previously, and on other architectures, this scsicmd would have resulted in READ+WRITE permissions on the DMA map.

Does the patch fix the issue? Thanks.
---
 arch/powerpc/platforms/powernv/pci.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index f2dd772..c7dcae5 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -601,6 +601,9 @@ int pnv_tce_build(struct iommu_table *tbl, long index, long npages,
 	u64 rpn = __pa(uaddr) >> tbl->it_page_shift;
 	long i;
 
+	if (proto_tce & TCE_PCI_WRITE)
+		proto_tce |= TCE_PCI_READ;
+
 	for (i = 0; i < npages; i++) {
 		unsigned long newtce = proto_tce |
 			((rpn + i) << tbl->it_page_shift);
@@ -622,6 +625,9 @@ int pnv_tce_xchg(struct iommu_table *tbl, long index,
 	BUG_ON(*hpa & ~IOMMU_PAGE_MASK(tbl));
 
+	if (newtce & TCE_PCI_WRITE)
+		newtce |= TCE_PCI_READ;
+
 	oldtce = xchg(pnv_tce(tbl, idx), cpu_to_be64(newtce));
 	*hpa = be64_to_cpu(oldtce) & ~(TCE_PCI_READ | TCE_PCI_WRITE);
 	*direction = iommu_tce_direction(oldtce);
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

I am still working on getting a machine to try this on. From code inspection, it looks like it should work. The problem is a shortage of machines, and machines tied up by Test.

Any progress here? Thanks.
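The workaround itself is a one-line permission widening applied wherever a TCE is built or exchanged: any entry granting write permission also gets read permission, so a write-only mapping that the device happens to read from no longer triggers an EEH on IODA2. A standalone sketch of that rule (the TCE_PCI_* bit values here are assumed for illustration; the kernel defines the real ones in its TCE headers):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit values for this sketch, standing in for the kernel's
 * TCE_PCI_READ / TCE_PCI_WRITE definitions. */
#define TCE_PCI_READ  0x1ull
#define TCE_PCI_WRITE 0x2ull

/* IODA2 faults when a write-only TCE is read, so widen any entry that
 * grants write permission to read+write. */
static uint64_t fixup_tce_perms(uint64_t tce)
{
	if (tce & TCE_PCI_WRITE)
		tce |= TCE_PCI_READ;
	return tce;
}
```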
RE: PowerPC agpmode issues
Mike and Gerhard,

I don't think the situation of PCIe PowerPC is better. But compared with past years, with the new kernels and the latest Xorg, on a Radeon HD 4650 I have a performance increase of about 250x... For example, QuakeSpasm was giving 157 fps at 640x480 on the Radeon 4650; now it is 380 fps. Yes, compared with the old Nvidia 7800GTX on OS X (450 fps) these results are lower, but for sure better than before. The worst thing in this last period: I'm facing r600 radeon ring test errors and I can't use GPU acceleration any more (only fbdev) on the 5450 and 6570 that were working perfectly before.

Luigi

> From: gerhard_pirc...@gmx.net
> To: michael.hel...@gmail.com
> Subject: Re: PowerPC agpmode issues
> Date: Tue, 9 Feb 2016 12:52:15 +0100
> CC: aneesh.ku...@linux.vnet.ibm.com; mic...@daenzer.net;
> linuxppc-dev@lists.ozlabs.org; reinhard.bo...@googlemail.com;
> bobby.pr...@gmail.com
>
> > On 9 Feb 2016 03:27, "Mike" wrote:
> > Ok, so its quirks to be added then? Something not implemented in KMS
> > that was in UMS?
> > Reports are that the same issue exists on PPC AmigaOnes with a VIA
> > chipset, and the Pegasos 2 with the Articia S chipset; i posted a
> > mail detailing that.
> Just to avoid some confusion:
> Old long story short: the issues for AmigaOnes and the Pegasos _1_ with
> ArticiaS northbridge and VIA southbridge are that:
> 1. the AGP controller corrupts data transfers in AGP mode (also depending
> on the AGP HW request queue size). So there is no official AGP driver that
> would require radeon.agpmode=-1. The microA1 is supposed to have a fix
> for this HW data corruption, but I yet have to dig out my ArticiaS AGP
> driver code for some test runs...
> 2. At least the AmigaOne with ArticiaS chip needs non-coherent DMA
> allocations and/or proper cache flushes to avoid corrupted DMA transfers.
>
> Nonetheless I had DRI1 working _only_ on my A1SE under Debian Squeeze (i.e.
> glxgears could run on the desktop with hardware acceleration), but DRI2
> with its very dynamic GART mapping is a no-go on every first-gen AmigaOne
> machine, even if the GART driver test (radeon.test=1) runs through in
> PCIGART mode (could it be that it uses a more or less static GART mapping
> for the test?).
>
> > Sure that might be it, but i get different results trying agpmode=1-2-4,
> > 2 gave a noisy screen before the hard crash. i find it rather impossible
> > to debug at all as the crash happens so fast no logs seem to be written..
> > I think i would need serial...
> > I'd personally love nothing more than to see support restored, and a
> > default as-expected working condition ought to be the minimum requirement.
> > I use a powerbook a1106, 5,6. With a 5,8 on the way. Those are the last
> > two revision powerbooks in the 15" series. In swrast they become useless,
> > impossible to use for any productivity. Most people trying to use linux
> > on ppc for personal use come in macs, with the exception of the Amiga PPC
> > crowd now running their amcc 440/460ex or e600 based x500/5000, all of
> > which have of course pci-e, more cores and more threads. Yet they struggle
> > even with regressions left and right to keep up with the single core
> > performance of the G4's. Sure it's pushing 10 years, but it's the only
> > alternative if one wishes to remain mobile.
> swrast definitely isn't fun on 10 year old PPC machines. Current Firefox
> is already slow enough on these machines... :-)
>
> > On 9 Feb 2016 02:41, "Michel Dänzer" wrote:
> > > On 08.02.2016 22:28, Mike wrote:
> > > Certainly 750~800 fps in glxgears vs 3000+ in debian squeeze, i cant
> > > bring myself to say that it's an acceptable situation no matter how
> > > tired i am of the problem knowing how well the setup could do. It's
> > > clear that the implementation is broken for everything but x86, [...]
> >
> > Why is that? It was working fine on my last-gen PowerBook. AFAIK Darwin
> > / OS X never used anything but a static AGP GART mapping though, so it
> > seems very likely that the issues with older UniNorth revisions are
> > simply due to the hardware being unable to support the usage patterns of
> > modern GPU drivers.
> >
> > That said, if you guys have specific suggestions for a "proper"
> > solution, nobody's standing in your way.
> I have to admit that I lack the knowledge of the inner workings of the
> TTM/radeon code (and its TTM AGP backend) to do any useful work here.
> I was hoping that the TTM DMA allocator could be of any help at least for
> non-cache-coherent machines, given that (IIRC) ARM is using it together
> with the nouveau driver on the TEGRA platform, but I guess that would need
> some modifications also on the powerpc architecture side (maybe a new
> non-coherent DMA allocator that is not limited to 2M virtual address space
> for mappings). Thus I guess a lot of things could be improved/fixed, but
> nowadays Linux code doesn't seem to be something for the "occasional hobby
> hacker". :-)
>
> regards,
> Gerhard
>
[powerpc:topic/math-emu 6/9] arch/sh/math-emu/math.c:129:1: warning: ISO C90 forbids mixed declarations and code
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git topic/math-emu
head:   0d351023a638b7c82abdd8d66ebf5b5b3d6cb169
commit: 3b76bfd2f01f3b9101f040878e2636cdb7f3bbdd [6/9] sh/math-emu: Move sh from math-emu-old to math-emu
config: sh-allyesconfig (attached as .config)
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout 3b76bfd2f01f3b9101f040878e2636cdb7f3bbdd
        # save the attached .config to linux build tree
        make.cross ARCH=sh

All warnings (new ones prefixed by >>):

   In file included from arch/sh/math-emu/math.c:23:0:
   include/math-emu/single.h:76:21: warning: "__BIG_ENDIAN" is not defined [-Wundef]
   In file included from arch/sh/math-emu/math.c:24:0:
   include/math-emu/double.h:81:22: warning: "__BIG_ENDIAN" is not defined [-Wundef]
   arch/sh/math-emu/math.c:54:0: warning: "WRITE" redefined [enabled by default]
   include/linux/fs.h:199:0: note: this is the location of the previous definition
   arch/sh/math-emu/math.c:55:0: warning: "READ" redefined [enabled by default]
   include/linux/fs.h:198:0: note: this is the location of the previous definition
   arch/sh/math-emu/math.c: In function 'fadd':
>> arch/sh/math-emu/math.c:129:1: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
>> arch/sh/math-emu/math.c:129:1: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c: In function 'fsub':
   arch/sh/math-emu/math.c:136:1: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c:136:1: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c: In function 'fmac':
   arch/sh/math-emu/math.c:165:1: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c:165:1: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c:165:1: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c:165:1: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c:165:1: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c: In function 'ffloat':
   arch/sh/math-emu/math.c:313:1: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c:315:1: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
   arch/sh/math-emu/math.c: At top level:
   arch/sh/math-emu/math.c:524:12: warning: 'ieee_fpe_handler' defined but not used [-Wunused-function]

vim +129 arch/sh/math-emu/math.c

    17	#include
    18	#include
    19	#include
    20	
    21	#include "sfp-util.h"
    22	#include
  > 23	#include
    24	#include
    25	
    26	#define FPUL	(fregs->fpul)
    27	#define FPSCR	(fregs->fpscr)
    28	#define FPSCR_RM	(FPSCR&3)
    29	#define FPSCR_DN	((FPSCR>>18)&1)
    30	#define FPSCR_PR	((FPSCR>>19)&1)
    31	#define FPSCR_SZ	((FPSCR>>20)&1)
    32	#define FPSCR_FR	((FPSCR>>21)&1)
    33	#define FPSCR_MASK	0x003fUL
    34	
    35	#define BANK(n)	(n^(FPSCR_FR?16:0))
    36	#define FR	((unsigned long*)(fregs->fp_regs))
    37	#define FR0	(FR[BANK(0)])
    38	#define FRn	(FR[BANK(n)])
    39	#define FRm	(FR[BANK(m)])
    40	#define DR	((unsigned long long*)(fregs->fp_regs))
    41	#define DRn	(DR[BANK(n)/2])
    42	#define DRm	(DR[BANK(m)/2])
    43	
    44	#define XREG(n)	(n^16)
    45	#define XFn	(FR[BANK(XREG(n))])
    46	#define XFm	(FR[BANK(XREG(m))])
    47	#define XDn	(DR[BANK(XREG(n))/2])
    48	#define XDm	(DR[BANK(XREG(m))/2])
    49	
    50	#define R0	(regs->regs[0])
    51	#define Rn	(regs->regs[n])
    52	#define Rm	(regs->regs[m])
    53	
    54	#define WRITE(d,a)	({if(put_user(d, (typeof (d)*)a)) return -EFAULT;})
    55	#define READ(d,a)	({if(get_user(d, (typeof (d)*)a)) return -EFAULT;})
    56	
    57	#define PACK_S(r,f)	FP_PACK_SP(,f)
    58	#define PACK_SEMIRAW_S(r,f)	FP_PACK_SEMIRAW_SP(,f)
    59	#define PACK_RAW_S(r,f)	FP_PACK_RAW_SP(,f)
    60	#define UNPACK_S(f,r)	FP_UNPACK_SP(f,)
    61	#define UNPACK_SEMIRAW_S(f,r)	FP_UNPACK_SEMIRAW_SP(f,)
    62	#define UNPACK_RAW_S(f,r)	FP_UNPACK_RAW_SP(f,)
    63	#define PACK_D(r,f) \
    64	{u32 t[2]; FP_PACK_DP(t,f); ((u32*))[0]=t[1]; ((u32*))[1]=t[0];}
    65	#define PACK_SEMIRAW_D(r,f) \
    66	{u32 t[2]; FP_PACK_SEMIRAW_DP(t,f); ((u32*))[0]=t[1]; \
    67	((u32*))[1]=t[0];}
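For reference, the -Wdeclaration-after-statement warning flagged above fires whenever a declaration appears after a statement in the same block, which C90 forbids (C99 and later allow it; the kernel builds with the warning enabled). A minimal, hypothetical example of the pattern and its fix:

```c
#include <assert.h>

/* C90-clean version: every declaration precedes the first statement of
 * the block. The warned-about form would instead write a statement like
 * `total = a + b;` and only then declare `int scaled;`. */
static int sum_then_scale(int a, int b)
{
	int total;	/* declared before any statement */
	int scaled;

	total = a + b;
	scaled = total * 2;
	return scaled;
}
```

Hoisting the declarations to the top of the block, as above, is the usual fix for this class of warning.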
enable kdump capture kernel functionality
Hi, I am trying to rebuild the kernel for one of our powerpc devices to include kdump capture kernel functionality, but I'm having trouble getting it to load for use on panic. I'm running Linux 3.14.60 on e500v2 (COMX-P2020 module). I've tried following the documentation at https://www.kernel.org/doc/Documentation/kdump/kdump.txt: CONFIG_KEXEC=y CONFIG_CRASH_DUMP=y CONFIG_NONSTATIC_KERNEL=y CONFIG_PROC_VMCORE=y CONFIG_RELOCATABLE=y CONFIG_RELOCATABLE_PPC32=y These are the parameters I'm using to boot the kernel: > cat /proc/cmdline root=/dev/mmcblk0p2 rw rootdelay=15 ip=10.215.181.92::10.215.180.1:255.255.254.0:XSTREAM-DEV2:eth0 loglevel=0 mfgstring=1:XF40-LB-SS-MMDDYY console=ttyS0,115200 ramdisk_size=70 cache-sram-size=0x1 crashkernel=0x0400@0x0400 slub_debug=FPZ Running the following command: > kexec -l /boot/uImage -t uImage-ppc --dtb=/boot/comx.dtb > --initrd=/boot/initramfs_data.cpio --command-line="root=/dev/mmcblk0p2 3 > maxcpus=1 irqpoll noirqdistrib reset_devices rw rootdelay=15 > ip=10.215.181.92::10.215.180.1:255.255.254.0:XSTREAM-DEV2:eth0 loglevel=0 > mfgstring=1:XF40-LB-SS-MMDDYY console=ttyS0,115200 ramdisk_size=70 > cache-sram-size=0x1" Returns: Can't add kernel to addr 0x len 0 Cannot load /boot/uImage I'm not sure what mechanism is used to load the kernel and where this can be configured. I assumed that the crashkernel parameter from the cmdline is the actual setting. Where should I set this value? Also, how can I ensure that the two kernel images don't overlap? Thanks in advance, Cosmin ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v3 13/18] cxl: sysfs support for guests
On 08/02/2016 04:02, Stewart Smith wrote:
> Frederic Barrat writes:
>> --- a/Documentation/ABI/testing/sysfs-class-cxl
>> +++ b/Documentation/ABI/testing/sysfs-class-cxl
>> @@ -183,7 +183,7 @@ Description:	read only
>>		Identifies the revision level of the PSL.
>> Users:	https://github.com/ibm-capi/libcxl
>> -What: /sys/class/cxl//base_image
>> +What: /sys/class/cxl//base_image (not in a guest)
>
> Is this going to be the case for KVM guest as well as PowerVM guest?

That's too early to say. The entries were removed because the information is filtered by pHyp and not available to the OS: some of it because nobody thought it would be useful, some of it because it's not meant to be seen by the OS. For KVM, if the card can be shared between guests, I would expect the same kind of restrictions.

Fred
___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: powerpc/xmon: Add xmon command to dump process/task similar to ps(1)
That looks fine to me. Thanks!

On 02/09/2016 04:58 AM, Michael Ellerman wrote:

On Mon, 2015-23-11 at 15:01:15 UTC, Douglas Miller wrote:
> Add 'P' command with optional task_struct address to dump all/one task's
> information: task pointer, kernel stack pointer, PID, PPID, state
> (interpreted), CPU where (last) running, and command.
>
> Introduce XMON_PROTECT macro to standardize memory-access-fault
> protection (setjmp). Initially used only by the 'P' command.

Hi Doug,

Sorry this has taken a while; it keeps getting preempted by more important patches.

I'm also not a big fan of the protect macro. It works for this case, but it's already a bit ugly calling for_each_process() inside the macro, and it would be even worse for multi-line logic. I think I'd rather just open code it, and hopefully we can come up with a better solution for catching errors in the long run.

I also renamed the routines to use "task", because "proc" in xmon is already used to mean "procedure", and the struct is task_struct after all.

How does this look?
cheers

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 47e195d66a9a..942796fa4767 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -163,6 +163,7 @@ static int cpu_cmd(void);
 static void csum(void);
 static void bootcmds(void);
 static void proccall(void);
+static void show_tasks(void);
 void dump_segments(void);
 static void symbol_lookup(void);
 static void xmon_show_stack(unsigned long sp, unsigned long lr,
@@ -238,6 +239,7 @@ Commands:\n\
   mz	zero a block of memory\n\
   mi	show information about memory allocation\n\
   p 	call a procedure\n\
+  P 	list processes/tasks\n\
   r	print registers\n\
   s	single step\n"
#ifdef CONFIG_SPU_BASE
@@ -967,6 +969,9 @@ cmds(struct pt_regs *excp)
 	case 'p':
 		proccall();
 		break;
+	case 'P':
+		show_tasks();
+		break;
 #ifdef CONFIG_PPC_STD_MMU
 	case 'u':
 		dump_segments();
@@ -2566,6 +2571,61 @@ memzcan(void)
 	printf("%.8x\n", a - mskip);
 }
 
+static void show_task(struct task_struct *tsk)
+{
+	char state;
+
+	/*
+	 * Cloned from kdb_task_state_char(), which is not entirely
+	 * appropriate for calling from xmon. This could be moved
+	 * to a common, generic, routine used by both.
+	 */
+	state = (tsk->state == 0) ? 'R' :
+		(tsk->state < 0) ? 'U' :
+		(tsk->state & TASK_UNINTERRUPTIBLE) ? 'D' :
+		(tsk->state & TASK_STOPPED) ? 'T' :
+		(tsk->state & TASK_TRACED) ? 'C' :
+		(tsk->exit_state & EXIT_ZOMBIE) ? 'Z' :
+		(tsk->exit_state & EXIT_DEAD) ? 'E' :
+		(tsk->state & TASK_INTERRUPTIBLE) ? 'S' : '?';
+
+	printf("%p %016lx %6d %6d %c %2d %s\n", tsk,
+		tsk->thread.ksp,
+		tsk->pid, tsk->parent->pid,
+		state, task_thread_info(tsk)->cpu,
+		tsk->comm);
+}
+
+static void show_tasks(void)
+{
+	unsigned long tskv;
+	struct task_struct *tsk = NULL;
+
+	printf("     task_struct     ->thread.ksp    PID   PPID S  P CMD\n");
+
+	if (scanhex(&tskv))
+		tsk = (struct task_struct *)tskv;
+
+	if (setjmp(bus_error_jmp) != 0) {
+		catch_memory_errors = 0;
+		printf("*** Error dumping task %p\n", tsk);
+		return;
+	}
+
+	catch_memory_errors = 1;
+	sync();
+
+	if (tsk)
+		show_task(tsk);
+	else
+		for_each_process(tsk)
+			show_task(tsk);
+
+	sync();
+	__delay(200);
+	catch_memory_errors = 0;
+}
+
 static void
 proccall(void)
 {
 	unsigned long args[8];
___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH V2 00/29] Book3s abstraction in preparation for new MMU model
Hi Scott, I missed adding you on CC. Can you take a look at this and make sure we are not breaking anything on Freescale?

"Aneesh Kumar K.V" writes:
> Hello,
>
> This is a large series, mostly consisting of code movement. No new features
> are added in this series. The changes are done to accommodate the upcoming
> new memory model in future powerpc chips. The details of the new MMU model
> can be found at
>
> http://ibm.biz/power-isa3 (needs registration). I am including a summary
> of the changes below.
>
> ISA 3.0 adds support for the radix tree style of MMU with full
> virtualization and related control mechanisms that manage its
> coexistence with the HPT. Radix-using operating systems will
> manage their own translation tables instead of relying on hcalls.
>
> The radix style MMU model requires us to do a 4 level page table
> with 64K and 4K page sizes. The table index size for each level and
> page size is listed below:
>
> PGD -> 13 bits
> PUD -> 9 (1G hugepage)
> PMD -> 9 (2M huge page)
> PTE -> 5 (for 64k), 9 (for 4k)
>
> We also require the page table to be in big endian format.
>
> The changes proposed in this series enable us to support both
> hash page table and radix tree style MMU using a single kernel
> with limited impact. The idea is to change core page table
> accessors to static inline functions and later hotpatch them
> to switch to hash or radix tree functions. For example:
>
> static inline int pte_write(pte_t pte)
> {
>	if (radix_enabled())
>		return rpte_write(pte);
>	return hlpte_write(pte);
> }
>
> On boot we will hotpatch the code so as to avoid the conditional operation.
>
> The other two major changes proposed in this series are to switch the hash
> linux page table to a 4 level table, and to use big endian format. This is
> done so that functions like pte_val() and pud_populate() don't need
> hotpatching, thereby limiting the runtime impact of the changes.
>
> I didn't include the radix related changes in this series. You can
> find them at https://github.com/kvaneesh/linux/commits/radix-mmu-v1
>
> Changes from V1:
> * move patches adding helpers to the next series

Thanks
-aneesh
___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: PowerPC agpmode issues
> On 9 Feb 2016 03:27, "Mike" wrote:
> Ok, so it's quirks to be added then? Something not implemented in KMS
> that was in UMS?
> Reports are that the same issue exists on PPC Amiga Ones with a VIA
> chipset, and the Pegasos 2 with the ArticiaS chipset; I posted a
> mail detailing that.

Just to avoid some confusion, the old long story short: the issues for AmigaOnes and the Pegasos _1_ with ArticiaS northbridge and VIA southbridge are that:

1. The AGP controller corrupts data transfers in AGP mode (also depending on the AGP HW request queue size), so there is no official AGP driver, which would require radeon.agpmode=-1. The microA1 is supposed to have a fix for this HW data corruption, but I have yet to dig out my ArticiaS AGP driver code for some test runs...

2. At least the AmigaOne with the ArticiaS chip needs non-coherent DMA allocations and/or proper cache flushes to avoid corrupted DMA transfers.

Nonetheless I had DRI1 working _only_ on my A1SE under Debian Squeeze (i.e. glxgears could run on the desktop with hardware acceleration), but DRI2 with its very dynamic GART mapping is a no-go on every first-gen AmigaOne machine, even if the GART driver test (radeon.test=1) runs through in PCIGART mode (could it be that it uses a more or less static GART mapping for the test?).

> Sure that might be it, but I get different results trying agpmode=1-2-4;
> 2 gave a noisy screen before the hard crash. I find it rather impossible
> to debug at all as the crash happens so fast no logs seem to be written...
> I think I would need serial...
> I'd personally love nothing more than to see support restored, and a
> default as-expected working condition ought to be the minimum requirement.
> I use a PowerBook A1106, 5,6. With a 5,8 on the way. Those are the last
> two revision PowerBooks in the 15" series. In swrast they become useless,
> impossible to use for any productivity. Most people trying to use Linux
> on PPC for personal use come in Macs, with the exception of the Amiga PPC
> crowd now running their AMCC 440/460ex or e600 based x500/5000, all of
> which have of course PCIe, more cores and more threads. Yet they struggle even
> with regressions left and right to keep up with the single core performance
> of the G4's. Sure it's pushing 10 years, but it's the only alternative
> if one wishes to remain mobile.

swrast definitely isn't fun on 10 year old PPC machines. Current Firefox is already slow enough on these machines... :-)

> On 9 Feb 2016 02:41, "Michel Dänzer" wrote:
> > On 08.02.2016 22:28, Mike wrote:
> > Certainly 750~800 fps in glxgears vs 3000+ in Debian Squeeze, I can't
> > bring myself to say that it's an acceptable situation no matter how
> > tired I am of the problem, knowing how well the setup could do. It's
> > clear that the implementation is broken for everything but x86, [...]
>
> Why is that? It was working fine on my last-gen PowerBook. AFAIK Darwin
> / OS X never used anything but a static AGP GART mapping though, so it
> seems very likely that the issues with older UniNorth revisions are
> simply due to the hardware being unable to support the usage patterns of
> modern GPU drivers.
>
> That said, if you guys have specific suggestions for a "proper"
> solution, nobody's standing in your way.

I have to admit that I lack the knowledge of the inner workings of the TTM/radeon code (and its TTM AGP backend) to do any useful work here. I was hoping that the TTM DMA allocator could be of some help, at least for non-cache-coherent machines, given that (IIRC) ARM is using it together with the nouveau driver on the Tegra platform, but I guess that would need some modifications also on the powerpc architecture side (maybe a new non-coherent DMA allocator that is not limited to 2M of virtual address space for mappings).

Thus I guess a lot of things could be improved/fixed, but nowadays Linux code doesn't seem to be something for the "occasional hobby hacker". :-)

regards,
Gerhard
___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [V3] powerpc/mm: Fix Multi hit ERAT cause by recent THP update
On Tue, 2016-09-02 at 01:20:31 UTC, "Aneesh Kumar K.V" wrote:
> With ppc64 we use the deposited pgtable_t to store the hash pte slot
> information. We should not withdraw the deposited pgtable_t without
> marking the pmd none. This ensures that low level hash fault handling
> will skip this huge pte and we will handle them at upper levels.
>
> A recent change to pmd splitting changed the above in order to handle the
> race between pmd split and exit_mmap. The race is explained below.
>
> Consider the following race:
>
> CPU0					CPU1
> shrink_page_list()
>   add_to_swap()
>     split_huge_page_to_list()
>       __split_huge_pmd_locked()
>         pmdp_huge_clear_flush_notify()
>         // pmd_none() == true
>					exit_mmap()
>					  unmap_vmas()
>					    zap_pmd_range()
>					    // no action on pmd since
>					    // pmd_none() == true
>         pmd_populate()
>
> As a result the THP will not be freed. The leak is detected by check_mm():
>
> BUG: Bad rss-counter state mm:880058d2e580 idx:1 val:512
>
> The above required us to not mark the pmd none during a pmd split.
>
> The fix for ppc is to clear the huge pte of _PAGE_USER, so that the low
> level fault handling code skips this pte. At a higher level we do take the
> ptl lock. That should serialize us against the pmd split. Once the lock is
> acquired we do check the pmd again using pmd_same. That should always
> return false for us and hence we should retry the access. We do the
> pmd_same check in all cases after taking the ptl with
> THP (do_huge_pmd_wp_page, do_huge_pmd_numa_page and
> huge_pmd_set_accessed).
>
> Also make sure we wait for the irq disable section in other cpus to finish
> before flipping a huge pte entry with a regular pmd entry. Code paths
> like find_linux_pte_or_hugepte depend on irq disable to get
> a stable pte_t pointer. A parallel thp split needs to make sure we
> don't convert a pmd pte to a regular pmd entry without waiting for the
> irq disable section to finish.
>
> Acked-by: Kirill A. Shutemov
> Signed-off-by: Aneesh Kumar K.V

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/9db4cd6c21535a4846b38808f3

cheers
___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [v3,2/2] powerpc: tracing: don't trace hcalls on offline CPUs
On Mon, 2015-14-12 at 20:18:06 UTC, Denis Kirjanov wrote:
> ./drmgr -c cpu -a -r gives the following warning:
>
> ...

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/168a20bb35122539682671d15c

cheers
___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: powerpc: fix dedotify for binutils >= 2.26
On Fri, 2016-05-02 at 18:50:03 UTC, Andreas Schwab wrote:
> Since binutils 2.26 BFD is doing suffix merging on STRTAB sections. But
> dedotify modifies the symbol names in place, which can also modify
> unrelated symbols with a name that matches a suffix of a dotted name. To
> remove the leading dot of a symbol name we can just increment the pointer
> into the STRTAB section instead.
>
> Signed-off-by: Andreas Schwab

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/f15838e9cac8f78f0cc506529b

cheers
___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v7 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments
On 09/02/2016 11:23, Christophe Leroy wrote:
> The main purpose of this patchset is to dramatically reduce the time
> spent in the DTLB miss handler. This is achieved by:
> 1/ Mapping RAM with 8M pages
> 2/ Mapping IMMR with a fixed 512K page
>
> Change in v7:
> * Don't include x_block_mapped() from compilation in
>   arch/powerpc/mm/fsl_booke_mmu.c when CONFIG_FSL_BOOKE is not set
>   (reported by kbuild test robot)

Please don't apply it; for some reason the modification supposed to be in v7 didn't get in. I will submit v8. Sorry for the noise.

Christophe
___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v8 02/23] powerpc/8xx: Map linear kernel RAM with 8M pages
On a live running system (a VoIP gateway for Air Traffic Control), over a 10 minute period (with 277s idle), we get 87 million DTLB misses and approximately 35 seconds are spent in the DTLB handler. This represents 5.8% of the overall time and even 10.8% of the non-idle time. Among those 87 million DTLB misses, 15% are on user addresses and 85% are on kernel addresses. And within the kernel addresses, 93% are on addresses from the linear address space and only 7% are on addresses from the virtual address space.

MPC8xx has no BATs but it has an 8Mb page size. This patch implements mapping of kernel RAM using 8Mb pages, on the same model as what is done on the 40x.

In 4k pages mode, each PGD entry maps a 4Mb area: we map every two entries to the same 8Mb physical page. In each second entry, we add 4Mb to the page physical address to ease the life of the FixupDAR routine. This is just ignored by the HW.

In 16k pages mode, each PGD entry maps a 64Mb area: the entry points to the first page of the area. The DTLB handler adds the 3 bits from the EPN to map the correct page.

With this patch applied, we now get only 13 million TLB misses during the 10 minute period. The idle time has increased to 313s and the overall time spent in the DTLB miss handler is 6.3s, which represents 1% of the overall time and 2.2% of the non-idle time.
Signed-off-by: Christophe Leroy
---
v2: using bt instead of bgt and named the label explicitly
v3: no change
v4: no change
v5: removed use of pmd_val() as L-value
v6: no change
v8: no change

 arch/powerpc/kernel/head_8xx.S | 35 +-
 arch/powerpc/mm/8xx_mmu.c      | 83 ++
 arch/powerpc/mm/Makefile       |  1 +
 arch/powerpc/mm/mmu_decl.h     | 15 ++--
 4 files changed, 120 insertions(+), 14 deletions(-)
 create mode 100644 arch/powerpc/mm/8xx_mmu.c

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index a89492e..87d1f5f 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -398,11 +398,13 @@ DataStoreTLBMiss:
 	BRANCH_UNLESS_KERNEL(3f)
 	lis	r11, (swapper_pg_dir-PAGE_OFFSET)@ha
 3:
-	mtcr	r3
 	/* Insert level 1 index */
 	rlwimi	r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 29
 	lwz	r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11)	/* Get the level 1 entry */
+	mtcr	r11
+	bt-	28,DTLBMiss8M	/* bit 28 = Large page (8M) */
+	mtcr	r3
 
 	/* We have a pte table, so load fetch the pte from the table.
 	 */
@@ -455,6 +457,29 @@ DataStoreTLBMiss:
 	EXCEPTION_EPILOG_0
 	rfi
 
+DTLBMiss8M:
+	mtcr	r3
+	ori	r11, r11, MD_SVALID
+	MTSPR_CPU6(SPRN_MD_TWC, r11, r3)
+#ifdef CONFIG_PPC_16K_PAGES
+	/*
+	 * In 16k pages mode, each PGD entry defines a 64M block.
+	 * Here we select the 8M page within the block.
+	 */
+	rlwimi	r11, r10, 0, 0x0380
+#endif
+	rlwinm	r10, r11, 0, 0xff80
+	ori	r10, r10, 0xf0 | MD_SPS16K | _PAGE_SHARED | _PAGE_DIRTY | \
+		  _PAGE_PRESENT
+	MTSPR_CPU6(SPRN_MD_RPN, r10, r3)	/* Update TLB entry */
+
+	li	r11, RPN_PATTERN
+	mfspr	r3, SPRN_SPRG_SCRATCH2
+	mtspr	SPRN_DAR, r11	/* Tag DAR */
+	EXCEPTION_EPILOG_0
+	rfi
+
+
 /* This is an instruction TLB error on the MPC8xx.  This could be due
  * to many reasons, such as executing guarded memory or illegal instruction
  * addresses.  There is nothing to do but handle a big time error fault.
@@ -532,13 +557,15 @@ FixupDAR:/* Entry point for dcbx workaround. */
 	/* Insert level 1 index */
 3:	rlwimi	r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 29
 	lwz	r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11)	/* Get the level 1 entry */
+	mtcr	r11
+	bt	28,200f	/* bit 28 = Large page (8M) */
 	rlwinm	r11, r11,0,0,19	/* Extract page descriptor page address */
 	/* Insert level 2 index */
 	rlwimi	r11, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29
 	lwz	r11, 0(r11)	/* Get the pte */
 	/* concat physical page address(r11) and page offset(r10) */
 	rlwimi	r11, r10, 0, 32 - PAGE_SHIFT, 31
-	lwz	r11,0(r11)
+201:	lwz	r11,0(r11)
 	/* Check if it really is a dcbx instruction. */
 	/* dcbt and dcbtst does not generate DTLB Misses/Errors,
 	 * no need to include them here */
@@ -557,6 +584,10 @@ FixupDAR:/* Entry point for dcbx workaround. */
 141:	mfspr	r10,SPRN_SPRG_SCRATCH2
 	b	DARFixed/* Nope, go back to normal TLB processing */
 
+	/* concat physical page address(r11) and page offset(r10) */
+200:	rlwimi	r11, r10, 0, 32 - (PAGE_SHIFT << 1), 31
+	b	201b
+
 144:	mfspr	r10, SPRN_DSISR
 	rlwinm	r10, r10,0,7,5	/* Clear store bit for buggy dcbst insn */
 	mtspr
[PATCH v8 08/23] powerpc/8xx: Map IMMR area with 512k page at a fixed address
Once the linear memory space has been mapped with 8Mb pages, as seen in the related commit, we get 11 million DTLB misses during the reference 600s period. 77% of the misses are on user addresses and 23% are on kernel addresses (one fourth for the linear address space and three fourths for the virtual address space).

Traditionally, each driver manages one computer board which has its own components with its own memory maps. But on embedded chips like the MPC8xx, the SoC has all registers located in the same IO area.

When looking at ioremaps done during startup, we see that many drivers are re-mapping small parts of the IMMR for their own use and all those small pieces get their own 4k page, amplifying the number of TLB misses: in our system we get 0xff00 mapped 31 times and 0xff003000 mapped 9 times. Even if each part of the IMMR was mapped only once with 4k pages, it would still be several small mappings towards the linear area.

With the patch, on the same principle as what was done for the RAM, the IMMR gets mapped by a 512k page.

In 4k pages mode, we reserve a 4Mb area for mapping the IMMR. The TLB miss handler checks that we are within the first 512k and bails out with the page not marked valid if we are outside.

In 16k pages mode, it is not realistic to reserve a 64Mb area, so we do a standard mapping of the 512k area using 32 pages of 16k. The CPM will be mapped via the first two pages, and the SEC engine will be mapped via the 16th and 17th pages. As the pages are marked guarded, there will be no speculative accesses.

With this patch applied, the number of DTLB misses during the 10 min period is reduced to 11.8 million for a duration of 5.8s, which represents 2% of the non-idle time, hence yet another 10% reduction.
Signed-off-by: Christophe Leroy
---
v2: - using bt instead of blt/bgt
    - reorganised in order to have only one taken branch for both 512k
      and 8M instead of a first branch for both 8M and 512k then a
      second branch for 512k
v3: - using fixmap
    - using the new x_block_mapped() functions
v4: no change
v5: no change
v6: removed use of pmd_val() as L-value
v8: no change

 arch/powerpc/include/asm/fixmap.h |  9 ++-
 arch/powerpc/kernel/head_8xx.S    | 36 +-
 arch/powerpc/mm/8xx_mmu.c         | 53 +++
 arch/powerpc/mm/mmu_decl.h        |  3 ++-
 4 files changed, 98 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/fixmap.h b/arch/powerpc/include/asm/fixmap.h
index d7dd8fb..b954dc3 100644
--- a/arch/powerpc/include/asm/fixmap.h
+++ b/arch/powerpc/include/asm/fixmap.h
@@ -52,12 +52,19 @@ enum fixed_addresses {
 	FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1,
 #endif
 #ifdef CONFIG_PPC_8xx
-	/* For IMMR we need an aligned 512K area */
 	FIX_IMMR_START,
+#ifdef CONFIG_PPC_4K_PAGES
+	/* For IMMR we need an aligned 4M area (full PGD entry) */
+	FIX_IMMR_TOP = (FIX_IMMR_START - 1 + ((4 * 1024 * 1024) / PAGE_SIZE)) &
+		       ~(((4 * 1024 * 1024) / PAGE_SIZE) - 1),
+	FIX_IMMR_BASE = FIX_IMMR_TOP - 1 + ((4 * 1024 * 1024) / PAGE_SIZE),
+#else
+	/* For IMMR we need an aligned 512K area */
 	FIX_IMMR_TOP = (FIX_IMMR_START - 1 + ((512 * 1024) / PAGE_SIZE)) &
 		       ~(((512 * 1024) / PAGE_SIZE) - 1),
 	FIX_IMMR_BASE = FIX_IMMR_TOP - 1 + ((512 * 1024) / PAGE_SIZE),
 #endif
+#endif
 	/* FIX_PCIE_MCFG, */
 	__end_of_fixed_addresses
 };
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 09173ae..ae721a1 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -254,6 +254,37 @@ DataAccess:
 	. = 0x400
 InstructionAccess:
 
+/*
+ * Bottom part of DTLBMiss handler for 512k pages
+ * not enough space in the primary location
+ */
+#ifdef CONFIG_PPC_4K_PAGES
+/*
+ * 512k pages are only used for mapping IMMR area in 4K pages mode.
+ * Only map the first 512k page of the 4M area covered by the PGD entry.
+ * This should not happen, but if we are called for another page of that
+ * area, don't mark it valid
+ *
+ * In 16k pages mode, IMMR is directly mapped with 16k pages
+ */
+DTLBMiss512k:
+	rlwinm.	r10, r10, 0, 0x0038
+	bne-	1f
+	ori	r11, r11, MD_SVALID
+1:	mtcr	r3
+	MTSPR_CPU6(SPRN_MD_TWC, r11, r3)
+	rlwinm	r10, r11, 0, 0xffc0
+	ori	r10, r10, 0xf0 | MD_SPS16K | _PAGE_SHARED | _PAGE_DIRTY | \
+		  _PAGE_PRESENT | _PAGE_NO_CACHE
+	MTSPR_CPU6(SPRN_MD_RPN, r10, r3)	/* Update TLB entry */
+
+	li	r11, RPN_PATTERN
+	mfspr	r3, SPRN_SPRG_SCRATCH2
+	mtspr	SPRN_DAR, r11	/* Tag DAR */
+	EXCEPTION_EPILOG_0
+	rfi
+#endif
+
 /* External interrupt */
 	EXCEPTION(0x500, HardwareInterrupt, do_IRQ, EXC_XFER_LITE)
 
@@ -405,6 +436,9 @@ DataStoreTLBMiss:
 	lwz	r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11)	/* Get the level 1 entry
[PATCH v8 11/23] powerpc32: Remove useless/wrong MMU:setio progress message
Commit 771168494719 ("[POWERPC] Remove unused machine call outs") removed the call to setup_io_mappings(), so remove the associated progress line message.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/mm/init_32.c | 4
 1 file changed, 4 deletions(-)

diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
index 1a18e4b..4eb1b8f 100644
--- a/arch/powerpc/mm/init_32.c
+++ b/arch/powerpc/mm/init_32.c
@@ -178,10 +178,6 @@ void __init MMU_init(void)
 	/* Initialize early top-down ioremap allocator */
 	ioremap_bot = IOREMAP_TOP;
 
-	/* Map in I/O resources */
-	if (ppc_md.progress)
-		ppc_md.progress("MMU:setio", 0x302);
-
 	if (ppc_md.progress)
 		ppc_md.progress("MMU:exit", 0x211);
-- 
2.1.0
___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v8 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments
The main purpose of this patchset is to dramatically reduce the time spent in the DTLB miss handler. This is achieved by:
1/ Mapping RAM with 8M pages
2/ Mapping IMMR with a fixed 512K page

On a live running system (a VoIP gateway for Air Traffic Control), over a 10 minute period (with 277s idle), we get 87 million DTLB misses and approximately 35 seconds are spent in the DTLB handler. This represents 5.8% of the overall time and even 10.8% of the non-idle time. Among those 87 million DTLB misses, 15% are on user addresses and 85% are on kernel addresses. And within the kernel addresses, 93% are on addresses from the linear address space and only 7% are on addresses from the virtual address space.

Once the full patchset is applied, the number of DTLB misses during the period is reduced to 11.8 million for a duration of 5.8s, which represents 2% of the non-idle time.

This patchset also includes other miscellaneous improvements:
1/ Handling of the CPU6 ERRATA directly in the mtspr() C macro to reduce code specific to the PPC8xx
2/ Rewrite of a few non critical ASM functions in C
3/ Removal of some unused items

See the related patches for details.

Main changes in v3:
* Using fixmap instead of a fixed address for mapping IMMR

Change in v4:
* Fix of a wrong #if notified by kbuild robot in 07/23

Change in v5:
* Removed use of pmd_val() as L-value
* Adapted to match the new include file layout in Linux 4.5

Change in v6:
* Removed remaining use of pmd_val() as L-value (reported by kbuild test robot)

Change in v7:
* No change (commit error)

Change in v8:
* Don't include x_block_mapped() from compilation in arch/powerpc/mm/fsl_booke_mmu.c when CONFIG_FSL_BOOKE is not set (reported by kbuild test robot)

Christophe Leroy (23):
  powerpc/8xx: Save r3 all the time in DTLB miss handler
  powerpc/8xx: Map linear kernel RAM with 8M pages
  powerpc: Update documentation for noltlbs kernel parameter
  powerpc/8xx: move setup_initial_memory_limit() into 8xx_mmu.c
  powerpc32: Fix pte_offset_kernel() to return NULL for bad pages
  powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam() together
  powerpc/8xx: Fix vaddr for IMMR early remap
  powerpc/8xx: Map IMMR area with 512k page at a fixed address
  powerpc/8xx: CONFIG_PIN_TLB unneeded for CONFIG_PPC_EARLY_DEBUG_CPM
  powerpc/8xx: map more RAM at startup when needed
  powerpc32: Remove useless/wrong MMU:setio progress message
  powerpc32: remove ioremap_base
  powerpc/8xx: Add missing SPRN defines into reg_8xx.h
  powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macro
  powerpc/8xx: remove special handling of CPU6 errata in set_dec()
  powerpc/8xx: rewrite set_context() in C
  powerpc/8xx: rewrite flush_instruction_cache() in C
  powerpc: add inline functions for cache related instructions
  powerpc32: Remove clear_pages() and define clear_page() inline
  powerpc32: move x_dcache_range() functions inline
  powerpc: Simplify test in __dma_sync()
  powerpc32: small optimisation in flush_icache_range()
  powerpc32: Remove one insn in mulhdu

 Documentation/kernel-parameters.txt          |   2 +-
 arch/powerpc/Kconfig.debug                   |   1 -
 arch/powerpc/include/asm/cache.h             |  19 +++
 arch/powerpc/include/asm/cacheflush.h        |  52 ++-
 arch/powerpc/include/asm/fixmap.h            |  14 ++
 arch/powerpc/include/asm/mmu-8xx.h           |   4 +-
 arch/powerpc/include/asm/nohash/32/pgtable.h |   5 +-
 arch/powerpc/include/asm/page_32.h           |  17 ++-
 arch/powerpc/include/asm/reg.h               |   2 +
 arch/powerpc/include/asm/reg_8xx.h           |  93
 arch/powerpc/include/asm/time.h              |   6 +-
 arch/powerpc/kernel/asm-offsets.c            |   8 ++
 arch/powerpc/kernel/head_8xx.S               | 207 +--
 arch/powerpc/kernel/misc_32.S                | 107 ++
 arch/powerpc/kernel/ppc_ksyms.c              |   2 +
 arch/powerpc/kernel/ppc_ksyms_32.c           |   1 -
 arch/powerpc/mm/8xx_mmu.c                    | 190
 arch/powerpc/mm/Makefile                     |   1 +
 arch/powerpc/mm/dma-noncoherent.c            |   2 +-
 arch/powerpc/mm/fsl_booke_mmu.c              |   6 +-
 arch/powerpc/mm/init_32.c                    |  23 ---
 arch/powerpc/mm/mmu_decl.h                   |  34 +++--
 arch/powerpc/mm/pgtable_32.c                 |  47 +-
 arch/powerpc/mm/ppc_mmu_32.c                 |   4 +-
 arch/powerpc/platforms/embedded6xx/mpc10x.h  |  10 --
 arch/powerpc/sysdev/cpm_common.c             |  15 +-
 26 files changed, 585 insertions(+), 287 deletions(-)
 create mode 100644 arch/powerpc/mm/8xx_mmu.c

-- 
2.1.0
___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v8 01/23] powerpc/8xx: Save r3 all the time in DTLB miss handler
We are spending between 40 and 160 cycles, with a mean of 65 cycles, in the DTLB handling routine (measured with mftbl), so make it simpler although it adds one instruction. With this modification, we get three registers available at all times, which will help with the following patch.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/kernel/head_8xx.S | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index e629e28..a89492e 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -385,23 +385,20 @@ InstructionTLBMiss:
 	. = 0x1200
 DataStoreTLBMiss:
-#ifdef CONFIG_8xx_CPU6
 	mtspr	SPRN_SPRG_SCRATCH2, r3
-#endif
 	EXCEPTION_PROLOG_0
-	mfcr	r10
+	mfcr	r3
 
 	/* If we are faulting a kernel address, we have to use the
 	 * kernel page tables.
 	 */
-	mfspr	r11, SPRN_MD_EPN
-	IS_KERNEL(r11, r11)
+	mfspr	r10, SPRN_MD_EPN
+	IS_KERNEL(r11, r10)
 	mfspr	r11, SPRN_M_TW	/* Get level 1 table */
 	BRANCH_UNLESS_KERNEL(3f)
 	lis	r11, (swapper_pg_dir-PAGE_OFFSET)@ha
 3:
-	mtcr	r10
-	mfspr	r10, SPRN_MD_EPN
+	mtcr	r3
 
 	/* Insert level 1 index */
 	rlwimi	r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 29
@@ -453,9 +450,7 @@ DataStoreTLBMiss:
 	MTSPR_CPU6(SPRN_MD_RPN, r10, r3)	/* Update TLB entry */
 
 	/* Restore registers */
-#ifdef CONFIG_8xx_CPU6
 	mfspr	r3, SPRN_SPRG_SCRATCH2
-#endif
 	mtspr	SPRN_DAR, r11	/* Tag DAR */
 	EXCEPTION_EPILOG_0
 	rfi
-- 
2.1.0
___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v8 03/23] powerpc: Update documentation for noltlbs kernel parameter
Now the noltlbs kernel parameter is also applicable to PPC8xx.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 Documentation/kernel-parameters.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 59e1515..c3e420b 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2592,7 +2592,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	nolapic_timer	[X86-32,APIC] Do not use the local APIC timer.
 
 	noltlbs		[PPC] Do not use large page/tlb entries for kernel
-			lowmem mapping on PPC40x.
+			lowmem mapping on PPC40x and PPC8xx
 
 	nomca		[IA-64] Disable machine check abort handling
-- 
2.1.0
___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v8 04/23] powerpc/8xx: move setup_initial_memory_limit() into 8xx_mmu.c
Now we have a 8xx specific .c file for that so put it in there as other powerpc variants do Signed-off-by: Christophe Leroy--- v2: no change v3: no change v4: no change v5: no change v6: no change v8: no change arch/powerpc/mm/8xx_mmu.c | 17 + arch/powerpc/mm/init_32.c | 19 --- 2 files changed, 17 insertions(+), 19 deletions(-) diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c index 2d42745..a84f5eb 100644 --- a/arch/powerpc/mm/8xx_mmu.c +++ b/arch/powerpc/mm/8xx_mmu.c @@ -81,3 +81,20 @@ unsigned long __init mmu_mapin_ram(unsigned long top) return mapped; } + +void setup_initial_memory_limit(phys_addr_t first_memblock_base, + phys_addr_t first_memblock_size) +{ + /* We don't currently support the first MEMBLOCK not mapping 0 +* physical on those processors +*/ + BUG_ON(first_memblock_base != 0); + +#ifdef CONFIG_PIN_TLB + /* 8xx can only access 24MB at the moment */ + memblock_set_current_limit(min_t(u64, first_memblock_size, 0x0180)); +#else + /* 8xx can only access 8MB at the moment */ + memblock_set_current_limit(min_t(u64, first_memblock_size, 0x0080)); +#endif +} diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c index a10be66..1a18e4b 100644 --- a/arch/powerpc/mm/init_32.c +++ b/arch/powerpc/mm/init_32.c @@ -193,22 +193,3 @@ void __init MMU_init(void) /* Shortly after that, the entire linear mapping will be available */ memblock_set_current_limit(lowmem_end_addr); } - -#ifdef CONFIG_8xx /* No 8xx specific .c file to put that in ... 
*/ -void setup_initial_memory_limit(phys_addr_t first_memblock_base, - phys_addr_t first_memblock_size) -{ - /* We don't currently support the first MEMBLOCK not mapping 0 -* physical on those processors -*/ - BUG_ON(first_memblock_base != 0); - -#ifdef CONFIG_PIN_TLB - /* 8xx can only access 24MB at the moment */ - memblock_set_current_limit(min_t(u64, first_memblock_size, 0x0180)); -#else - /* 8xx can only access 8MB at the moment */ - memblock_set_current_limit(min_t(u64, first_memblock_size, 0x0080)); -#endif -} -#endif /* CONFIG_8xx */ -- 2.1.0
[PATCH v8 05/23] powerpc32: Fix pte_offset_kernel() to return NULL for bad pages
The fixmap-related functions try to map kernel pages that are already mapped through large TLBs. pte_offset_kernel() has to return NULL for LTLBs, otherwise the caller will try to access the level 2 table, which doesn't exist. Signed-off-by: Christophe Leroy--- v3: new v4: no change v5: no change v6: no change v8: no change arch/powerpc/include/asm/nohash/32/pgtable.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h index c82cbf5..e201600 100644 --- a/arch/powerpc/include/asm/nohash/32/pgtable.h +++ b/arch/powerpc/include/asm/nohash/32/pgtable.h @@ -309,7 +309,8 @@ static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry) #define pte_index(address) \ (((address) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) #define pte_offset_kernel(dir, addr) \ - ((pte_t *) pmd_page_vaddr(*(dir)) + pte_index(addr)) + (pmd_bad(*(dir)) ? NULL : (pte_t *)pmd_page_vaddr(*(dir)) + \ + pte_index(addr)) #define pte_offset_map(dir, addr) \ ((pte_t *) kmap_atomic(pmd_page(*(dir))) + pte_index(addr)) #define pte_unmap(pte) kunmap_atomic(pte) -- 2.1.0
[PATCH v8 06/23] powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam() together
x_mapped_by_bats() and x_mapped_by_tlbcam() serve the same kind of purpose, and are never defined at the same time. So rename them x_block_mapped() and define them in the relevant places Signed-off-by: Christophe Leroy--- v2: no change v3: Functions are mutually exclusive so renamed iaw Scott comment instead of grouping into a single function v4: no change v5: no change v6: no change v8: Don't include x_block_mapped() from compilation in arch/powerpc/mm/fsl_booke_mmu.c when CONFIG_FSL_BOOKE is not set (problem reported by kbuild robot with a configuration having CONFIG_FSL_BOOK3E and not CONFIG_FSL_BOOKE) arch/powerpc/mm/fsl_booke_mmu.c | 6 -- arch/powerpc/mm/mmu_decl.h | 10 ++ arch/powerpc/mm/pgtable_32.c| 44 ++--- arch/powerpc/mm/ppc_mmu_32.c| 4 ++-- 4 files changed, 22 insertions(+), 42 deletions(-) diff --git a/arch/powerpc/mm/fsl_booke_mmu.c b/arch/powerpc/mm/fsl_booke_mmu.c index f3afe3d..a1b2713 100644 --- a/arch/powerpc/mm/fsl_booke_mmu.c +++ b/arch/powerpc/mm/fsl_booke_mmu.c @@ -72,10 +72,11 @@ unsigned long tlbcam_sz(int idx) return tlbcam_addrs[idx].limit - tlbcam_addrs[idx].start + 1; } +#ifdef CONFIG_FSL_BOOKE /* * Return PA for this VA if it is mapped by a CAM, or 0 */ -phys_addr_t v_mapped_by_tlbcam(unsigned long va) +phys_addr_t v_block_mapped(unsigned long va) { int b; for (b = 0; b < tlbcam_index; ++b) @@ -87,7 +88,7 @@ phys_addr_t v_mapped_by_tlbcam(unsigned long va) /* * Return VA for a given PA or 0 if not mapped */ -unsigned long p_mapped_by_tlbcam(phys_addr_t pa) +unsigned long p_block_mapped(phys_addr_t pa) { int b; for (b = 0; b < tlbcam_index; ++b) @@ -97,6 +98,7 @@ unsigned long p_mapped_by_tlbcam(phys_addr_t pa) return tlbcam_addrs[b].start+(pa-tlbcam_addrs[b].phys); return 0; } +#endif /* * Set up a variable-size TLB entry (tlbcam). 
The parameters are not checked; diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h index 7faeb9f..40dd5d3 100644 --- a/arch/powerpc/mm/mmu_decl.h +++ b/arch/powerpc/mm/mmu_decl.h @@ -158,3 +158,13 @@ struct tlbcam { u32 MAS7; }; #endif + +#if defined(CONFIG_6xx) || defined(CONFIG_FSL_BOOKE) +/* 6xx have BATS */ +/* FSL_BOOKE have TLBCAM */ +phys_addr_t v_block_mapped(unsigned long va); +unsigned long p_block_mapped(phys_addr_t pa); +#else +static inline phys_addr_t v_block_mapped(unsigned long va) { return 0; } +static inline unsigned long p_block_mapped(phys_addr_t pa) { return 0; } +#endif diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c index 7692d1b..db0d35e 100644 --- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -41,32 +41,8 @@ unsigned long ioremap_base; unsigned long ioremap_bot; EXPORT_SYMBOL(ioremap_bot);/* aka VMALLOC_END */ -#ifdef CONFIG_6xx -#define HAVE_BATS 1 -#endif - -#if defined(CONFIG_FSL_BOOKE) -#define HAVE_TLBCAM1 -#endif - extern char etext[], _stext[]; -#ifdef HAVE_BATS -extern phys_addr_t v_mapped_by_bats(unsigned long va); -extern unsigned long p_mapped_by_bats(phys_addr_t pa); -#else /* !HAVE_BATS */ -#define v_mapped_by_bats(x)(0UL) -#define p_mapped_by_bats(x)(0UL) -#endif /* HAVE_BATS */ - -#ifdef HAVE_TLBCAM -extern phys_addr_t v_mapped_by_tlbcam(unsigned long va); -extern unsigned long p_mapped_by_tlbcam(phys_addr_t pa); -#else /* !HAVE_TLBCAM */ -#define v_mapped_by_tlbcam(x) (0UL) -#define p_mapped_by_tlbcam(x) (0UL) -#endif /* HAVE_TLBCAM */ - #define PGDIR_ORDER(32 + PGD_T_LOG2 - PGDIR_SHIFT) #ifndef CONFIG_PPC_4K_PAGES @@ -228,19 +204,10 @@ __ioremap_caller(phys_addr_t addr, unsigned long size, unsigned long flags, /* * Is it already mapped? Perhaps overlapped by a previous -* BAT mapping. If the whole area is mapped then we're done, -* otherwise remap it since we want to keep the virt addrs for -* each request contiguous. 
-* -* We make the assumption here that if the bottom and top -* of the range we want are mapped then it's mapped to the -* same virt address (and this is contiguous). -* -- Cort +* mapping. */ - if ((v = p_mapped_by_bats(p)) /*&& p_mapped_by_bats(p+size-1)*/ ) - goto out; - - if ((v = p_mapped_by_tlbcam(p))) + v = p_block_mapped(p); + if (v) goto out; if (slab_is_available()) { @@ -278,7 +245,8 @@ void iounmap(volatile void __iomem *addr) * If mapped by BATs then there is nothing to do. * Calling vfree() generates a benign warning. */ - if (v_mapped_by_bats((unsigned long)addr)) return; + if (v_block_mapped((unsigned long)addr)) + return; if (addr > high_memory && (unsigned long) addr < ioremap_bot) vunmap((void *) (PAGE_MASK &
[PATCH v8 10/23] powerpc/8xx: map more RAM at startup when needed
On recent kernels, with some debug options like for instance CONFIG_LOCKDEP, the BSS requires more than 8M memory, although the kernel code fits in the first 8M. Today, it is necessary to activate CONFIG_PIN_TLB to get more than 8M at startup, although pinning TLB is not necessary for that. This patch adds more pages (up to 24Mb) to the initial mapping if possible/needed in order to have the necessary mappings regardless of CONFIG_PIN_TLB. We could have mapped 16M or 24M unconditionally, but since some platforms only have 8M memory, we need something a bit more elaborate. Therefore, if the bootloader is compliant with the ePAPR standard, we use r7 to know how much memory was mapped by the bootloader. Otherwise, we try to determine the required memory size by looking at the _end symbol and the address of the device tree. This patch does not modify the behaviour when CONFIG_PIN_TLB is selected. Signed-off-by: Christophe Leroy--- v2: no change v3: Automatic detection of available/needed memory instead of allocating 16M for all. v4: no change v5: no change v6: no change v8: no change arch/powerpc/kernel/head_8xx.S | 56 +++--- arch/powerpc/mm/8xx_mmu.c | 10 +++- 2 files changed, 56 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S index ae721a1..a268cf4 100644 --- a/arch/powerpc/kernel/head_8xx.S +++ b/arch/powerpc/kernel/head_8xx.S @@ -72,6 +72,9 @@ #define RPN_PATTERN0x00f0 #endif +/* ePAPR magic value for non BOOK III-E CPUs */ +#define EPAPR_SMAGIC 0x65504150 + __HEAD _ENTRY(_stext); _ENTRY(_start); @@ -101,6 +104,38 @@ _ENTRY(_start); */ .globl __start __start: +/* + * Determine initial RAM size + * + * If the Bootloader is ePAPR compliant, the size is given in r7 + * otherwise, we have to determine how much is needed. For that, we have to + * check whether _end of kernel and device tree are within the first 8Mb.
+ */ + lis r30, 0x0080@h /* 8Mb by default */ + + lis r8, EPAPR_SMAGIC@h + ori r8, r8, EPAPR_SMAGIC@l + cmplw cr0, r8, r6 + bne 1f + lis r30, 0x0180@h /* 24Mb max */ + cmplw cr0, r7, r30 + bgt 2f + mr r30, r7 /* save initial ram size */ + b 2f +1: + /* is kernel _end or DTB in the first 8M ? if not map 16M */ + lis r8, (_end - PAGE_OFFSET)@h + ori r8, r8, (_end - PAGE_OFFSET)@l + addir8, r8, -1 + or r8, r8, r3 + cmplw cr0, r8, r30 + blt 2f + lis r30, 0x0100@h /* 16Mb */ + /* is kernel _end or DTB in the first 16M ? if not map 24M */ + cmplw cr0, r8, r30 + blt 2f + lis r30, 0x0180@h /* 24Mb */ +2: mr r31,r3 /* save device tree ptr */ /* We have to turn on the MMU right away so we get cache modes @@ -737,6 +772,8 @@ start_here: /* * Decide what sort of machine this is and initialize the MMU. */ + lis r3, initial_memory_size@ha + stw r30, initial_memory_size@l(r3) li r3,0 mr r4,r31 bl machine_init @@ -868,10 +905,15 @@ initial_mmu: mtspr SPRN_MD_RPN, r8 #ifdef CONFIG_PIN_TLB - /* Map two more 8M kernel data pages. - */ + /* Map one more 8M kernel data page. */ addir10, r10, 0x0100 mtspr SPRN_MD_CTR, r10 +#else + /* Map one more 8M kernel data page if needed */ + lis r10, 0x0080@h + cmplw cr0, r30, r10 + ble 1f +#endif lis r8, KERNELBASE@h/* Create vaddr for TLB */ addis r8, r8, 0x0080 /* Add 8M */ @@ -884,20 +926,28 @@ initial_mmu: addis r11, r11, 0x0080/* Add 8M */ mtspr SPRN_MD_RPN, r11 +#ifdef CONFIG_PIN_TLB + /* Map one more 8M kernel data page. */ addir10, r10, 0x0100 mtspr SPRN_MD_CTR, r10 +#else + /* Map one more 8M kernel data page if needed */ + lis r10, 0x0100@h + cmplw cr0, r30, r10 + ble 1f +#endif addis r8, r8, 0x0080 /* Add 8M */ mtspr SPRN_MD_EPN, r8 mtspr SPRN_MD_TWC, r9 addis r11, r11, 0x0080/* Add 8M */ mtspr SPRN_MD_RPN, r11 -#endif /* Since the cache is enabled according to the information we * just loaded into the TLB, invalidate and enable the caches here. * We should probably check/set other modeslater. 
*/ +1: lis r8, IDC_INVALL@h mtspr SPRN_IC_CST, r8 mtspr SPRN_DC_CST, r8 diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c index f37d5ec..50f17d2 100644 --- a/arch/powerpc/mm/8xx_mmu.c +++ b/arch/powerpc/mm/8xx_mmu.c @@ -20,6 +20,7 @@ #define IMMR_SIZE
[PATCH v8 12/23] powerpc32: remove ioremap_base
ioremap_base is not initialised and is nowhere used so remove it Signed-off-by: Christophe Leroy--- v2: no change v3: fix comment as well v4: no change v5: no change v6: no change v8: no change arch/powerpc/include/asm/nohash/32/pgtable.h | 2 +- arch/powerpc/mm/mmu_decl.h | 1 - arch/powerpc/mm/pgtable_32.c | 3 +-- arch/powerpc/platforms/embedded6xx/mpc10x.h | 10 -- 4 files changed, 2 insertions(+), 14 deletions(-) diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h index e201600..7808475 100644 --- a/arch/powerpc/include/asm/nohash/32/pgtable.h +++ b/arch/powerpc/include/asm/nohash/32/pgtable.h @@ -86,7 +86,7 @@ extern int icache_44x_need_flush; * We no longer map larger than phys RAM with the BATs so we don't have * to worry about the VMALLOC_OFFSET causing problems. We do have to worry * about clashes between our early calls to ioremap() that start growing down - * from ioremap_base being run into the VM area allocations (growing upwards + * from IOREMAP_TOP being run into the VM area allocations (growing upwards * from VMALLOC_START). For this reason we have ioremap_bot to check when * we actually run into our mappings setup in the early boot with the VM * system. 
This really does become a problem for machines with good amounts diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h index 3872332..53564a3 100644 --- a/arch/powerpc/mm/mmu_decl.h +++ b/arch/powerpc/mm/mmu_decl.h @@ -100,7 +100,6 @@ extern void setbat(int index, unsigned long virt, phys_addr_t phys, extern int __map_without_bats; extern int __allow_ioremap_reserved; -extern unsigned long ioremap_base; extern unsigned int rtas_data, rtas_size; struct hash_pte; diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c index db0d35e..815ccd7 100644 --- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -37,7 +37,6 @@ #include "mmu_decl.h" -unsigned long ioremap_base; unsigned long ioremap_bot; EXPORT_SYMBOL(ioremap_bot);/* aka VMALLOC_END */ @@ -173,7 +172,7 @@ __ioremap_caller(phys_addr_t addr, unsigned long size, unsigned long flags, /* * Choose an address to map it to. * Once the vmalloc system is running, we use it. -* Before then, we use space going down from ioremap_base +* Before then, we use space going down from IOREMAP_TOP * (ioremap_bot records where we're up to). 
*/ p = addr & PAGE_MASK; diff --git a/arch/powerpc/platforms/embedded6xx/mpc10x.h b/arch/powerpc/platforms/embedded6xx/mpc10x.h index b290b63..5ad1202 100644 --- a/arch/powerpc/platforms/embedded6xx/mpc10x.h +++ b/arch/powerpc/platforms/embedded6xx/mpc10x.h @@ -24,13 +24,11 @@ * Processor: 0x8000 - 0x807f -> PCI I/O: 0x - 0x007f * Processor: 0xc000 - 0xdfff -> PCI MEM: 0x - 0x1fff * PCI MEM: 0x8000 -> Processor System Memory: 0x - * EUMB mapped to: ioremap_base - 0x0010 (ioremap_base - 1 MB) * * MAP B (CHRP Map) * Processor: 0xfe00 - 0xfebf -> PCI I/O: 0x - 0x00bf * Processor: 0x8000 - 0xbfff -> PCI MEM: 0x8000 - 0xbfff * PCI MEM: 0x -> Processor System Memory: 0x - * EUMB mapped to: ioremap_base - 0x0010 (ioremap_base - 1 MB) */ /* @@ -138,14 +136,6 @@ #define MPC10X_EUMB_WP_OFFSET 0x000ff000 /* Data path diagnostic, watchpoint reg offset */ #define MPC10X_EUMB_WP_SIZE0x1000 /* Data path diagnostic, watchpoint reg size */ -/* - * Define some recommended places to put the EUMB regs. - * For both maps, recommend putting the EUMB from 0xeff0 to 0xefff. - */ -extern unsigned long ioremap_base; -#defineMPC10X_MAPA_EUMB_BASE (ioremap_base - MPC10X_EUMB_SIZE) -#defineMPC10X_MAPB_EUMB_BASE MPC10X_MAPA_EUMB_BASE - enum ppc_sys_devices { MPC10X_IIC1, MPC10X_DMA0, -- 2.1.0
[PATCH v8 07/23] powerpc/8xx: Fix vaddr for IMMR early remap
Memory: 124428K/131072K available (3748K kernel code, 188K rwdata, 648K rodata, 508K init, 290K bss, 6644K reserved) Kernel virtual memory layout: * 0xfffdf000..0xf000 : fixmap * 0xfde0..0xfe00 : consistent mem * 0xfddf6000..0xfde0 : early ioremap * 0xc900..0xfddf6000 : vmalloc & ioremap SLUB: HWalign=16, Order=0-3, MinObjects=0, CPUs=1, Nodes=1 Today, IMMR is mapped 1:1 at startup Mapping IMMR 1:1 is just wrong because it may overlap with another area. On most mpc8xx boards it is OK as IMMR is set to 0xff00 but for instance on EP88xC board, IMMR is at 0xfa20 which overlaps with VM ioremap area This patch fixes the virtual address for remapping IMMR with the fixmap regardless of the value of IMMR. The size of IMMR area is 256kbytes (CPM at offset 0, security engine at offset 128k) so a 512k page is enough Signed-off-by: Christophe Leroy--- v2: no change v3: Using fixmap instead of fixed address v4: Fix a wrong #if notified by kbuild robot v5: no change v6: no change v8: no change arch/powerpc/include/asm/fixmap.h | 7 +++ arch/powerpc/kernel/asm-offsets.c | 8 arch/powerpc/kernel/head_8xx.S| 11 ++- arch/powerpc/mm/mmu_decl.h| 7 +++ arch/powerpc/sysdev/cpm_common.c | 15 --- 5 files changed, 40 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/include/asm/fixmap.h b/arch/powerpc/include/asm/fixmap.h index 90f604b..d7dd8fb 100644 --- a/arch/powerpc/include/asm/fixmap.h +++ b/arch/powerpc/include/asm/fixmap.h @@ -51,6 +51,13 @@ enum fixed_addresses { FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */ FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1, #endif +#ifdef CONFIG_PPC_8xx + /* For IMMR we need an aligned 512K area */ + FIX_IMMR_START, + FIX_IMMR_TOP = (FIX_IMMR_START - 1 + ((512 * 1024) / PAGE_SIZE)) & + ~(((512 * 1024) / PAGE_SIZE) - 1), + FIX_IMMR_BASE = FIX_IMMR_TOP - 1 + ((512 * 1024) / PAGE_SIZE), +#endif /* FIX_PCIE_MCFG, */ __end_of_fixed_addresses }; diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c index 07cebc3..9724ff8 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -68,6 +68,10 @@ #include "../mm/mmu_decl.h" #endif +#ifdef CONFIG_PPC_8xx +#include +#endif + int main(void) { DEFINE(THREAD, offsetof(struct task_struct, thread)); @@ -772,5 +776,9 @@ int main(void) DEFINE(PPC_DBELL_SERVER, PPC_DBELL_SERVER); +#ifdef CONFIG_PPC_8xx + DEFINE(VIRT_IMMR_BASE, __fix_to_virt(FIX_IMMR_BASE)); +#endif + return 0; } diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S index 87d1f5f..09173ae 100644 --- a/arch/powerpc/kernel/head_8xx.S +++ b/arch/powerpc/kernel/head_8xx.S @@ -30,6 +30,7 @@ #include #include #include +#include /* Macro to make the code more readable. */ #ifdef CONFIG_8xx_CPU6 @@ -763,7 +764,7 @@ start_here: * virtual to physical. Also, set the cache mode since that is defined * by TLB entries and perform any additional mapping (like of the IMMR). * If configured to pin some TLBs, we pin the first 8 Mbytes of kernel, - * 24 Mbytes of data, and the 8M IMMR space. Anything not covered by + * 24 Mbytes of data, and the 512k IMMR space. Anything not covered by * these mappings is mapped by page tables. */ initial_mmu: @@ -812,7 +813,7 @@ initial_mmu: ori r8, r8, MD_APG_INIT@l mtspr SPRN_MD_AP, r8 - /* Map another 8 MByte at the IMMR to get the processor + /* Map a 512k page for the IMMR to get the processor * internal registers (among other things). */ #ifdef CONFIG_PIN_TLB @@ -820,12 +821,12 @@ initial_mmu: mtspr SPRN_MD_CTR, r10 #endif mfspr r9, 638 /* Get current IMMR */ - andis. r9, r9, 0xff80 /* Get 8Mbyte boundary */ + andis. 
r9, r9, 0xfff8 /* Get 512 kbytes boundary */ - mr r8, r9 /* Create vaddr for TLB */ + lis r8, VIRT_IMMR_BASE@h/* Create vaddr for TLB */ ori r8, r8, MD_EVALID /* Mark it valid */ mtspr SPRN_MD_EPN, r8 - li r8, MD_PS8MEG /* Set 8M byte page */ + li r8, MD_PS512K | MD_GUARDED /* Set 512k byte page */ ori r8, r8, MD_SVALID /* Make it valid */ mtspr SPRN_MD_TWC, r8 mr r8, r9 /* Create paddr for TLB */ diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h index 40dd5d3..e7228b7 100644 --- a/arch/powerpc/mm/mmu_decl.h +++ b/arch/powerpc/mm/mmu_decl.h @@ -107,6 +107,13 @@ struct hash_pte; extern struct hash_pte *Hash, *Hash_end; extern unsigned long Hash_size, Hash_mask; +#define PHYS_IMMR_BASE (mfspr(SPRN_IMMR) & 0xfff8) +#ifdef CONFIG_PPC_8xx +#define VIRT_IMMR_BASE
[PATCH v8 14/23] powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macro
MPC8xx has an ERRATA on the use of mtspr() for some registers This patch includes the ERRATA handling directly into mtspr() macro so that mtspr() users don't need to bother about that errata Signed-off-by: Christophe Leroy--- v2: no change v3: no change v4: no change v5: no change v6: no change v8: no change arch/powerpc/include/asm/reg.h | 2 + arch/powerpc/include/asm/reg_8xx.h | 82 ++ 2 files changed, 84 insertions(+) diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index c4cb2ff..7b5d97f 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -1211,9 +1211,11 @@ static inline void mtmsr_isync(unsigned long val) #define mfspr(rn) ({unsigned long rval; \ asm volatile("mfspr %0," __stringify(rn) \ : "=r" (rval)); rval;}) +#ifndef mtspr #define mtspr(rn, v) asm volatile("mtspr " __stringify(rn) ",%0" : \ : "r" ((unsigned long)(v)) \ : "memory") +#endif extern void msr_check_and_set(unsigned long bits); extern bool strict_msr_control; diff --git a/arch/powerpc/include/asm/reg_8xx.h b/arch/powerpc/include/asm/reg_8xx.h index 0f71c81..d41412c 100644 --- a/arch/powerpc/include/asm/reg_8xx.h +++ b/arch/powerpc/include/asm/reg_8xx.h @@ -50,4 +50,86 @@ #define DC_DFWT0x4000 /* Data cache is forced write through */ #define DC_LES 0x2000 /* Caches are little endian mode */ +#ifdef CONFIG_8xx_CPU6 +#define do_mtspr_cpu6(rn, rn_addr, v) \ + do {\ + int _reg_cpu6 = rn_addr, _tmp_cpu6[1]; \ + asm volatile("stw %0, %1;" \ +"lwz %0, %1;" \ +"mtspr " __stringify(rn) ",%2" : \ +: "r" (_reg_cpu6), "m"(_tmp_cpu6), \ + "r" ((unsigned long)(v)) \ +: "memory"); \ + } while (0) + +#define do_mtspr(rn, v)asm volatile("mtspr " __stringify(rn) ",%0" : \ +: "r" ((unsigned long)(v)) \ +: "memory") +#define mtspr(rn, v) \ + do {\ + if (rn == SPRN_IMMR)\ + do_mtspr_cpu6(rn, 0x3d30, v); \ + else if (rn == SPRN_IC_CST) \ + do_mtspr_cpu6(rn, 0x2110, v); \ + else if (rn == SPRN_IC_ADR) \ + do_mtspr_cpu6(rn, 0x2310, v); \ + else if (rn == 
SPRN_IC_DAT) \ + do_mtspr_cpu6(rn, 0x2510, v); \ + else if (rn == SPRN_DC_CST) \ + do_mtspr_cpu6(rn, 0x3110, v); \ + else if (rn == SPRN_DC_ADR) \ + do_mtspr_cpu6(rn, 0x3310, v); \ + else if (rn == SPRN_DC_DAT) \ + do_mtspr_cpu6(rn, 0x3510, v); \ + else if (rn == SPRN_MI_CTR) \ + do_mtspr_cpu6(rn, 0x2180, v); \ + else if (rn == SPRN_MI_AP) \ + do_mtspr_cpu6(rn, 0x2580, v); \ + else if (rn == SPRN_MI_EPN) \ + do_mtspr_cpu6(rn, 0x2780, v); \ + else if (rn == SPRN_MI_TWC) \ + do_mtspr_cpu6(rn, 0x2b80, v); \ + else if (rn == SPRN_MI_RPN) \ + do_mtspr_cpu6(rn, 0x2d80, v); \ + else if (rn == SPRN_MI_CAM) \ + do_mtspr_cpu6(rn, 0x2190, v); \ + else if (rn == SPRN_MI_RAM0)\ + do_mtspr_cpu6(rn, 0x2390, v); \ + else if (rn == SPRN_MI_RAM1)\ + do_mtspr_cpu6(rn, 0x2590, v); \ + else if (rn == SPRN_MD_CTR) \ + do_mtspr_cpu6(rn, 0x3180, v); \ + else if (rn == SPRN_M_CASID)
[PATCH v8 09/23] powerpc/8xx: CONFIG_PIN_TLB unneeded for CONFIG_PPC_EARLY_DEBUG_CPM
IMMR is now mapped through page tables, so it is no longer necessary to pin TLBs. Signed-off-by: Christophe Leroy--- v2: no change v3: no change v4: no change v5: no change v6: no change v8: no change arch/powerpc/Kconfig.debug | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug index 638f9ce..136b09c 100644 --- a/arch/powerpc/Kconfig.debug +++ b/arch/powerpc/Kconfig.debug @@ -220,7 +220,6 @@ config PPC_EARLY_DEBUG_40x config PPC_EARLY_DEBUG_CPM bool "Early serial debugging for Freescale CPM-based serial ports" depends on SERIAL_CPM - select PIN_TLB if PPC_8xx help Select this to enable early debugging for Freescale chips using a CPM-based serial port. This assumes that the bootwrapper -- 2.1.0
[PATCH v8 18/23] powerpc: add inline functions for cache related instructions
This patch adds inline functions to use dcbz, dcbi, dcbf, dcbst from C functions. Signed-off-by: Christophe Leroy--- v2: new v3: no change v4: no change v5: no change v6: no change v8: no change arch/powerpc/include/asm/cache.h | 19 +++ 1 file changed, 19 insertions(+) diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h index 5f8229e..ffbafbf 100644 --- a/arch/powerpc/include/asm/cache.h +++ b/arch/powerpc/include/asm/cache.h @@ -69,6 +69,25 @@ extern void _set_L3CR(unsigned long); #define _set_L3CR(val) do { } while(0) #endif +static inline void dcbz(void *addr) +{ + __asm__ __volatile__ ("dcbz 0, %0" : : "r"(addr) : "memory"); +} + +static inline void dcbi(void *addr) +{ + __asm__ __volatile__ ("dcbi 0, %0" : : "r"(addr) : "memory"); +} + +static inline void dcbf(void *addr) +{ + __asm__ __volatile__ ("dcbf 0, %0" : : "r"(addr) : "memory"); +} + +static inline void dcbst(void *addr) +{ + __asm__ __volatile__ ("dcbst 0, %0" : : "r"(addr) : "memory"); +} #endif /* !__ASSEMBLY__ */ #endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_CACHE_H */ -- 2.1.0
[PATCH v8 20/23] powerpc32: move xxxxx_dcache_range() functions inline
flush/clean/invalidate _dcache_range() functions are all very similar and quite short. They are mainly used in __dma_sync(); perf_event locates them in the top 3 consuming functions during heavy ethernet activity. They are good candidates for inlining, as __dma_sync() does almost nothing but call them. Signed-off-by: Christophe Leroy--- v2: new v3: no change v4: no change v5: no change v6: no change v8: no change arch/powerpc/include/asm/cacheflush.h | 52 ++-- arch/powerpc/kernel/misc_32.S | 65 --- arch/powerpc/kernel/ppc_ksyms.c | 2 ++ 3 files changed, 51 insertions(+), 68 deletions(-) diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h index 6229e6b..97c9978 100644 --- a/arch/powerpc/include/asm/cacheflush.h +++ b/arch/powerpc/include/asm/cacheflush.h @@ -47,12 +47,58 @@ static inline void __flush_dcache_icache_phys(unsigned long physaddr) } #endif -extern void flush_dcache_range(unsigned long start, unsigned long stop); #ifdef CONFIG_PPC32 -extern void clean_dcache_range(unsigned long start, unsigned long stop); -extern void invalidate_dcache_range(unsigned long start, unsigned long stop); +/* + * Write any modified data cache blocks out to memory and invalidate them. + * Does not invalidate the corresponding instruction cache blocks. + */ +static inline void flush_dcache_range(unsigned long start, unsigned long stop) +{ + void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1)); + unsigned long size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1); + unsigned long i; + + for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES) + dcbf(addr); + mb(); /* sync */ +} + +/* + * Write any modified data cache blocks out to memory. + * Does not invalidate the corresponding cache lines (especially for + * any corresponding instruction cache).
+ */ +static inline void clean_dcache_range(unsigned long start, unsigned long stop) +{ + void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1)); + unsigned long size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1); + unsigned long i; + + for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES) + dcbst(addr); + mb(); /* sync */ +} + +/* + * Like above, but invalidate the D-cache. This is used by the 8xx + * to invalidate the cache so the PPC core doesn't get stale data + * from the CPM (no cache snooping here :-). + */ +static inline void invalidate_dcache_range(unsigned long start, + unsigned long stop) +{ + void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1)); + unsigned long size = stop - (unsigned long)addr + (L1_CACHE_BYTES - 1); + unsigned long i; + + for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr += L1_CACHE_BYTES) + dcbi(addr); + mb(); /* sync */ +} + #endif /* CONFIG_PPC32 */ #ifdef CONFIG_PPC64 +extern void flush_dcache_range(unsigned long start, unsigned long stop); extern void flush_inval_dcache_range(unsigned long start, unsigned long stop); extern void flush_dcache_phys_range(unsigned long start, unsigned long stop); #endif diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S index 181afc1..09e1e5d 100644 --- a/arch/powerpc/kernel/misc_32.S +++ b/arch/powerpc/kernel/misc_32.S @@ -375,71 +375,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) isync blr /* - * Write any modified data cache blocks out to memory. - * Does not invalidate the corresponding cache lines (especially for - * any corresponding instruction cache). - * - * clean_dcache_range(unsigned long start, unsigned long stop) - */ -_GLOBAL(clean_dcache_range) - li r5,L1_CACHE_BYTES-1 - andcr3,r3,r5 - subfr4,r3,r4 - add r4,r4,r5 - srwi. 
r4,r4,L1_CACHE_SHIFT - beqlr - mtctr r4 - -1: dcbst 0,r3 - addir3,r3,L1_CACHE_BYTES - bdnz1b - sync/* wait for dcbst's to get to ram */ - blr - -/* - * Write any modified data cache blocks out to memory and invalidate them. - * Does not invalidate the corresponding instruction cache blocks. - * - * flush_dcache_range(unsigned long start, unsigned long stop) - */ -_GLOBAL(flush_dcache_range) - li r5,L1_CACHE_BYTES-1 - andcr3,r3,r5 - subfr4,r3,r4 - add r4,r4,r5 - srwi. r4,r4,L1_CACHE_SHIFT - beqlr - mtctr r4 - -1: dcbf0,r3 - addir3,r3,L1_CACHE_BYTES - bdnz1b - sync/* wait for dcbst's to get to ram */ - blr - -/* - * Like above, but invalidate the D-cache. This is used by the 8xx - * to invalidate the cache so the PPC core doesn't get stale data - * from the CPM (no cache snooping here :-). - * - *
[PATCH v8 13/23] powerpc/8xx: Add missing SPRN defines into reg_8xx.h
Add missing SPRN defines into reg_8xx.h. Some of them are defined in mmu-8xx.h, so we include mmu-8xx.h in reg_8xx.h. For that, we remove references to PAGE_SHIFT in mmu-8xx.h to make it self-sufficient, as includers of reg_8xx.h don't all include asm/page.h. Signed-off-by: Christophe Leroy--- v2: no change v3: We just add missing ones, don't move anymore the ones from mmu-8xx.h v4: no change v5: no change v6: no change v8: no change arch/powerpc/include/asm/mmu-8xx.h | 4 ++-- arch/powerpc/include/asm/reg_8xx.h | 11 +++ 2 files changed, 13 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/mmu-8xx.h b/arch/powerpc/include/asm/mmu-8xx.h index f05500a..0a566f1 100644 --- a/arch/powerpc/include/asm/mmu-8xx.h +++ b/arch/powerpc/include/asm/mmu-8xx.h @@ -171,9 +171,9 @@ typedef struct { } mm_context_t; #endif /* !__ASSEMBLY__ */ -#if (PAGE_SHIFT == 12) +#if defined(CONFIG_PPC_4K_PAGES) #define mmu_virtual_psize MMU_PAGE_4K -#elif (PAGE_SHIFT == 14) +#elif defined(CONFIG_PPC_16K_PAGES) #define mmu_virtual_psize MMU_PAGE_16K #else #error "Unsupported PAGE_SIZE" diff --git a/arch/powerpc/include/asm/reg_8xx.h b/arch/powerpc/include/asm/reg_8xx.h index e8ea346..0f71c81 100644 --- a/arch/powerpc/include/asm/reg_8xx.h +++ b/arch/powerpc/include/asm/reg_8xx.h @@ -4,6 +4,8 @@ #ifndef _ASM_POWERPC_REG_8xx_H #define _ASM_POWERPC_REG_8xx_H +#include + /* Cache control on the MPC8xx is provided through some additional * special purpose registers. */ @@ -14,6 +16,15 @@ #define SPRN_DC_ADR569 /* Address needed for some commands */ #define SPRN_DC_DAT570 /* Read-only data register */ +/* Misc Debug */ +#define SPRN_DPDR 630 +#define SPRN_MI_CAM816 +#define SPRN_MI_RAM0 817 +#define SPRN_MI_RAM1 818 +#define SPRN_MD_CAM824 +#define SPRN_MD_RAM0 825 +#define SPRN_MD_RAM1 826 + /* Commands. Only the first few are available to the instruction cache.
*/ #defineIDC_ENABLE 0x0200 /* Cache enable */ -- 2.1.0
[PATCH v8 15/23] powerpc/8xx: remove special handling of CPU6 errata in set_dec()
The CPU6 erratum is now handled directly in mtspr(), so we can use the standard set_dec() function in all cases. Signed-off-by: Christophe Leroy--- v2: no change v3: no change v4: no change v5: no change v6: no change v8: no change arch/powerpc/include/asm/time.h | 6 +- arch/powerpc/kernel/head_8xx.S | 18 -- 2 files changed, 1 insertion(+), 23 deletions(-) diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h index 2d7109a..1092fdd 100644 --- a/arch/powerpc/include/asm/time.h +++ b/arch/powerpc/include/asm/time.h @@ -31,8 +31,6 @@ extern void tick_broadcast_ipi_handler(void); extern void generic_calibrate_decr(void); -extern void set_dec_cpu6(unsigned int val); - /* Some sane defaults: 125 MHz timebase, 1GHz processor */ extern unsigned long ppc_proc_freq; #define DEFAULT_PROC_FREQ (DEFAULT_TB_FREQ * 8) @@ -166,14 +164,12 @@ static inline void set_dec(int val) { #if defined(CONFIG_40x) mtspr(SPRN_PIT, val); -#elif defined(CONFIG_8xx_CPU6) - set_dec_cpu6(val - 1); #else #ifndef CONFIG_BOOKE --val; #endif mtspr(SPRN_DEC, val); -#endif /* not 40x or 8xx_CPU6 */ +#endif /* not 40x */ } static inline unsigned long tb_ticks_since(unsigned long tstamp) diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S index a268cf4..637f8e9 100644 --- a/arch/powerpc/kernel/head_8xx.S +++ b/arch/powerpc/kernel/head_8xx.S @@ -1011,24 +1011,6 @@ _GLOBAL(set_context) SYNC blr -#ifdef CONFIG_8xx_CPU6 -/* It's here because it is unique to the 8xx. - * It is important we get called with interrupts disabled. I used to - * do that, but it appears that all code that calls this already had - * interrupt disabled. - */ - .globl set_dec_cpu6 -set_dec_cpu6: - lis r7, cpu6_errata_word@h - ori r7, r7, cpu6_errata_word@l - li r4, 0x2c00 - stw r4, 8(r7) - lwz r4, 8(r7) -mtspr 22, r3 /* Update Decrementer */ - SYNC - blr -#endif - /* * We put a few things here that have to be page-aligned.
* This stuff goes at the beginning of the data segment, -- 2.1.0 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v8 16/23] powerpc/8xx: rewrite set_context() in C
There is no real need to have set_context() in assembly. Now that we have mtspr() handling CPU6 ERRATA directly, we can rewrite set_context() in C language for easier maintenance. Signed-off-by: Christophe Leroy--- v2: no change v3: no change v4: no change v5: no change v6: no change v8: no change arch/powerpc/kernel/head_8xx.S | 44 -- arch/powerpc/mm/8xx_mmu.c | 34 2 files changed, 34 insertions(+), 44 deletions(-) diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S index 637f8e9..bb2b657 100644 --- a/arch/powerpc/kernel/head_8xx.S +++ b/arch/powerpc/kernel/head_8xx.S @@ -968,50 +968,6 @@ initial_mmu: /* - * Set up to use a given MMU context. - * r3 is context number, r4 is PGD pointer. - * - * We place the physical address of the new task page directory loaded - * into the MMU base register, and set the ASID compare register with - * the new "context." - */ -_GLOBAL(set_context) - -#ifdef CONFIG_BDI_SWITCH - /* Context switch the PTE pointer for the Abatron BDI2000. -* The PGDIR is passed as second argument. -*/ - lis r5, KERNELBASE@h - lwz r5, 0xf0(r5) - stw r4, 0x4(r5) -#endif - - /* Register M_TW will contain base address of level 1 table minus the -* lower part of the kernel PGDIR base address, so that all accesses to -* level 1 table are done relative to lower part of kernel PGDIR base -* address. -*/ - li r5, (swapper_pg_dir-PAGE_OFFSET)@l - sub r4, r4, r5 - tophys (r4, r4) -#ifdef CONFIG_8xx_CPU6 - lis r6, cpu6_errata_word@h - ori r6, r6, cpu6_errata_word@l - li r7, 0x3f80 - stw r7, 12(r6) - lwz r7, 12(r6) -#endif - mtspr SPRN_M_TW, r4 /* Update pointeur to level 1 table */ -#ifdef CONFIG_8xx_CPU6 - li r7, 0x3380 - stw r7, 12(r6) - lwz r7, 12(r6) -#endif - mtspr SPRN_M_CASID, r3/* Update context */ - SYNC - blr - -/* * We put a few things here that have to be page-aligned. * This stuff goes at the beginning of the data segment, * which is page-aligned. 
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c index 50f17d2..b75c461 100644 --- a/arch/powerpc/mm/8xx_mmu.c +++ b/arch/powerpc/mm/8xx_mmu.c @@ -147,3 +147,37 @@ void setup_initial_memory_limit(phys_addr_t first_memblock_base, memblock_set_current_limit(min_t(u64, first_memblock_size, initial_memory_size)); } + +/* + * Set up to use a given MMU context. + * id is context number, pgd is PGD pointer. + * + * We place the physical address of the new task page directory loaded + * into the MMU base register, and set the ASID compare register with + * the new "context." + */ +void set_context(unsigned long id, pgd_t *pgd) +{ + s16 offset = (s16)(__pa(swapper_pg_dir)); + +#ifdef CONFIG_BDI_SWITCH + pgd_t **ptr = *(pgd_t ***)(KERNELBASE + 0xf0); + + /* Context switch the PTE pointer for the Abatron BDI2000. +* The PGDIR is passed as second argument. +*/ + *(ptr + 1) = pgd; +#endif + + /* Register M_TW will contain base address of level 1 table minus the +* lower part of the kernel PGDIR base address, so that all accesses to +* level 1 table are done relative to lower part of kernel PGDIR base +* address. +*/ + mtspr(SPRN_M_TW, __pa(pgd) - offset); + + /* Update context */ + mtspr(SPRN_M_CASID, id); + /* sync */ + mb(); +} -- 2.1.0 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v8 17/23] powerpc/8xx: rewrite flush_instruction_cache() in C
On the PPC8xx, flushing the instruction cache is performed by writing to the SPRN_IC_CST register. This register suffers from the CPU6 ERRATA. This patch rewrites the function in C so that the CPU6 ERRATA is handled transparently.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/kernel/misc_32.S | 10 --
 arch/powerpc/mm/8xx_mmu.c | 7 +++
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index be8edd6..7d1284f 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -296,12 +296,9 @@ _GLOBAL(real_writeb)
  * Flush instruction cache.
  * This is a no-op on the 601.
  */
+#ifndef CONFIG_PPC_8xx
 _GLOBAL(flush_instruction_cache)
-#if defined(CONFIG_8xx)
-	isync
-	lis	r5, IDC_INVALL@h
-	mtspr	SPRN_IC_CST, r5
-#elif defined(CONFIG_4xx)
+#if defined(CONFIG_4xx)
 #ifdef CONFIG_403GCX
 	li	r3, 512
 	mtctr	r3
@@ -334,9 +331,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_UNIFIED_ID_CACHE)
 	mfspr	r3,SPRN_HID0
 	ori	r3,r3,HID0_ICFI
 	mtspr	SPRN_HID0,r3
-#endif /* CONFIG_8xx/4xx */
+#endif /* CONFIG_4xx */
 	isync
 	blr
+#endif /* CONFIG_PPC_8xx */
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index b75c461..e2ce480 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -181,3 +181,10 @@ void set_context(unsigned long id, pgd_t *pgd)
 	/* sync */
 	mb();
 }
+
+void flush_instruction_cache(void)
+{
+	isync();
+	mtspr(SPRN_IC_CST, IDC_INVALL);
+	isync();
+}
-- 
2.1.0
[PATCH v8 19/23] powerpc32: Remove clear_pages() and define clear_page() inline
clear_pages() is never used except by clear_page(), and PPC32 is the only architecture (still) having this function. Neither PPC64 nor any other architecture has it. This patch removes clear_pages() and makes clear_page() an inline function (as on PPC64), since it is only a few instructions.

Signed-off-by: Christophe Leroy
---
v2: no change
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/include/asm/page_32.h | 17 ++---
 arch/powerpc/kernel/misc_32.S | 16 
 arch/powerpc/kernel/ppc_ksyms_32.c | 1 -
 3 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/page_32.h b/arch/powerpc/include/asm/page_32.h
index 68d73b2..6a8e179 100644
--- a/arch/powerpc/include/asm/page_32.h
+++ b/arch/powerpc/include/asm/page_32.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_POWERPC_PAGE_32_H
 #define _ASM_POWERPC_PAGE_32_H
 
+#include
+
 #if defined(CONFIG_PHYSICAL_ALIGN) && (CONFIG_PHYSICAL_START != 0)
 #if (CONFIG_PHYSICAL_START % CONFIG_PHYSICAL_ALIGN) != 0
 #error "CONFIG_PHYSICAL_START must be a multiple of CONFIG_PHYSICAL_ALIGN"
@@ -36,9 +38,18 @@ typedef unsigned long long pte_basic_t;
 typedef unsigned long pte_basic_t;
 #endif
 
-struct page;
-extern void clear_pages(void *page, int order);
-static inline void clear_page(void *page) { clear_pages(page, 0); }
+/*
+ * Clear page using the dcbz instruction, which doesn't cause any
+ * memory traffic (except to write out any cache lines which get
+ * displaced). This only works on cacheable memory.
+ */
+static inline void clear_page(void *addr)
+{
+	unsigned int i;
+
+	for (i = 0; i < PAGE_SIZE / L1_CACHE_BYTES; i++, addr += L1_CACHE_BYTES)
+		dcbz(addr);
+}
 
 extern void copy_page(void *to, void *from);
 
 #include
diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 7d1284f..181afc1 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -517,22 +517,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
 #endif /* CONFIG_BOOKE */
 
 /*
- * Clear pages using the dcbz instruction, which doesn't cause any
- * memory traffic (except to write out any cache lines which get
- * displaced). This only works on cacheable memory.
- *
- * void clear_pages(void *page, int order) ;
- */
-_GLOBAL(clear_pages)
-	li	r0,PAGE_SIZE/L1_CACHE_BYTES
-	slw	r0,r0,r4
-	mtctr	r0
-1:	dcbz	0,r3
-	addi	r3,r3,L1_CACHE_BYTES
-	bdnz	1b
-	blr
-
-/*
  * Copy a whole page. We use the dcbz instruction on the destination
  * to reduce memory traffic (it eliminates the unnecessary reads of
  * the destination into cache). This requires that the destination
diff --git a/arch/powerpc/kernel/ppc_ksyms_32.c b/arch/powerpc/kernel/ppc_ksyms_32.c
index 30ddd8a..2bfaafe 100644
--- a/arch/powerpc/kernel/ppc_ksyms_32.c
+++ b/arch/powerpc/kernel/ppc_ksyms_32.c
@@ -10,7 +10,6 @@
 #include
 #include
 
-EXPORT_SYMBOL(clear_pages);
 EXPORT_SYMBOL(ISA_DMA_THRESHOLD);
 EXPORT_SYMBOL(DMA_MODE_READ);
 EXPORT_SYMBOL(DMA_MODE_WRITE);
-- 
2.1.0
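Not part of the patch, but a portable model may help review the new inline: clear_page() issues one dcbz per cache line, zeroing the whole line in the data cache without reading memory first. In the userspace sketch below dcbz is modeled as a cache-line-sized memset, and the PAGE_SIZE and L1_CACHE_BYTES values are assumptions chosen for illustration.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE      4096U   /* assumed for this sketch */
#define L1_CACHE_BYTES 32U     /* assumed for this sketch */

/* Model of the inlined clear_page(): one dcbz per cache line.
 * dcbz zeroes a whole line in the data cache; memset stands in for it. */
static void clear_page_model(void *addr)
{
	unsigned int i;
	char *p = addr;

	for (i = 0; i < PAGE_SIZE / L1_CACHE_BYTES; i++, p += L1_CACHE_BYTES)
		memset(p, 0, L1_CACHE_BYTES);	/* dcbz(p) on real hardware */
}
```

On real hardware the dcbz loop avoids the read-for-ownership traffic a plain store loop would generate, which is why the patch keeps dcbz rather than falling back to memset().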
[PATCH v8 21/23] powerpc: Simplify test in __dma_sync()
This simplification helps the compiler. We now have only one test instead of two, so it reduces the number of branches.

Signed-off-by: Christophe Leroy
---
v2: new
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/mm/dma-noncoherent.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/dma-noncoherent.c b/arch/powerpc/mm/dma-noncoherent.c
index 169aba4..2dc74e5 100644
--- a/arch/powerpc/mm/dma-noncoherent.c
+++ b/arch/powerpc/mm/dma-noncoherent.c
@@ -327,7 +327,7 @@ void __dma_sync(void *vaddr, size_t size, int direction)
 	 * invalidate only when cache-line aligned otherwise there is
 	 * the potential for discarding uncommitted data from the cache
 	 */
-	if ((start & (L1_CACHE_BYTES - 1)) || (size & (L1_CACHE_BYTES - 1)))
+	if ((start | end) & (L1_CACHE_BYTES - 1))
 		flush_dcache_range(start, end);
 	else
 		invalidate_dcache_range(start, end);
-- 
2.1.0
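The two predicates really are equivalent because end = start + size: start and size are both cache-line aligned exactly when start and end are, so OR-ing them and masking once catches the same cases. The userspace sketch below (our own code, with an assumed 32-byte line size) checks this exhaustively over a few cache lines:

```c
#include <stdint.h>

#define L1_CACHE_BYTES 32U	/* assumed line size for the check */

/* Original test: start address or size misaligned. */
static int misaligned_old(uint32_t start, uint32_t size)
{
	return (start & (L1_CACHE_BYTES - 1)) || (size & (L1_CACHE_BYTES - 1));
}

/* Simplified test: one mask check over start and end. */
static int misaligned_new(uint32_t start, uint32_t size)
{
	uint32_t end = start + size;

	return ((start | end) & (L1_CACHE_BYTES - 1)) != 0;
}

/* Exhaustively compare the two predicates over several cache lines. */
static int predicates_agree(void)
{
	uint32_t start, size;

	for (start = 0; start < 4 * L1_CACHE_BYTES; start++)
		for (size = 0; size < 4 * L1_CACHE_BYTES; size++)
			if (!misaligned_old(start, size) !=
			    !misaligned_new(start, size))
				return 0;
	return 1;
}
```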
[PATCH v8 23/23] powerpc32: Remove one insn in mulhdu
Remove one instruction in mulhdu.

Signed-off-by: Christophe Leroy
---
v2: new
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/kernel/misc_32.S | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 3ec5a22..bf5160f 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -91,17 +91,16 @@ _GLOBAL(mulhdu)
 	addc	r7,r0,r7
 	addze	r4,r4
 1:	beqlr	cr1		/* all done if high part of A is 0 */
-	mr	r10,r3
 	mullw	r9,r3,r5
-	mulhwu	r3,r3,r5
+	mulhwu	r10,r3,r5
 	beq	2f
-	mullw	r0,r10,r6
-	mulhwu	r8,r10,r6
+	mullw	r0,r3,r6
+	mulhwu	r8,r3,r6
 	addc	r7,r0,r7
 	adde	r4,r4,r8
-	addze	r3,r3
+	addze	r10,r10
 2:	addc	r4,r4,r9
-	addze	r3,r3
+	addze	r3,r10
 	blr
-- 
2.1.0
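For reviewers unfamiliar with the routine: mulhdu returns the high 64 bits of a 64x64-bit unsigned product, assembled on 32-bit hardware from mullw (low 32 bits of a 32x32 product) and mulhwu (high 32 bits). The C model below reproduces that partial-product carry flow; the names are ours and it is a sketch, not the kernel source.

```c
#include <stdint.h>

/* High 64 bits of a 64x64-bit unsigned multiply, built from 32-bit
 * partial products the way mulhdu builds them from mullw/mulhwu. */
static uint64_t mulhdu_model(uint64_t a, uint64_t b)
{
	uint64_t al = (uint32_t)a, ah = a >> 32;
	uint64_t bl = (uint32_t)b, bh = b >> 32;

	uint64_t ll = al * bl;	/* low  x low  */
	uint64_t lh = al * bh;	/* low  x high */
	uint64_t hl = ah * bl;	/* high x low  */
	uint64_t hh = ah * bh;	/* high x high */

	/* carry out of the middle 64-bit column (the addc/adde/addze chain) */
	uint64_t mid = (ll >> 32) + (uint32_t)lh + (uint32_t)hl;

	return hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
}
```

The patch's saving comes from accumulating the high word in r10 instead of first copying r3 aside with mr, so one move disappears without changing this arithmetic.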
[PATCH v8 22/23] powerpc32: small optimisation in flush_icache_range()
Inlining of the _dcache_range() functions has shown that the compiler does the same thing a bit better, with one instruction fewer.

Signed-off-by: Christophe Leroy
---
v2: new
v3: no change
v4: no change
v5: no change
v6: no change
v8: no change

 arch/powerpc/kernel/misc_32.S | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 09e1e5d..3ec5a22 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -348,10 +348,9 @@ BEGIN_FTR_SECTION
 	PURGE_PREFETCHED_INS
 	blr				/* for 601, do nothing */
 END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
-	li	r5,L1_CACHE_BYTES-1
-	andc	r3,r3,r5
+	rlwinm	r3,r3,0,0,31 - L1_CACHE_SHIFT
 	subf	r4,r3,r4
-	add	r4,r4,r5
+	addi	r4,r4,L1_CACHE_BYTES - 1
 	srwi.	r4,r4,L1_CACHE_SHIFT
 	beqlr
 	mtctr	r4
-- 
2.1.0
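Both the old li/andc pair and the new rlwinm clear the low L1_CACHE_SHIFT bits of the start address; the rest of the sequence then rounds the byte count up to whole cache lines for mtctr. A C model of the computation (the 32-byte line size is an assumption for illustration):

```c
#include <stdint.h>

#define L1_CACHE_SHIFT 5U			/* assumed: 32-byte lines */
#define L1_CACHE_BYTES (1U << L1_CACHE_SHIFT)

/* Number of cache lines covered by [start, stop), as flush_icache_range
 * computes it: align start down, then round the length up. */
static uint32_t lines_to_flush(uint32_t start, uint32_t stop)
{
	/* rlwinm r3,r3,0,0,31 - L1_CACHE_SHIFT keeps the upper bits only,
	 * just like the andc with L1_CACHE_BYTES - 1 that it replaces */
	start &= ~(L1_CACHE_BYTES - 1);

	return (stop - start + L1_CACHE_BYTES - 1) >> L1_CACHE_SHIFT;
}
```

The saving comes from folding the mask into rlwinm's rotate-and-mask and the addition into addi's immediate, so no scratch register (r5) needs to be loaded first.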
Re: [PATCH v3 0/2] Consolidate redundant register/stack access code
On 02/09/2016 04:36 AM, Michael Ellerman wrote: On Tue, 2016-02-09 at 00:38 -0500, David Long wrote: From: "David A. Long"Move duplicate and functionally equivalent code for accessing registers and stack (CONFIG_HAVE_REGS_AND_STACK_ACCESS_API) from arch subdirs into common kernel files. I'm sending this out again (with updated distribution list) because v2 just never got pulled in, even though I don't think there were any outstanding issues. A big cross arch patch like this would often get taken by Andrew Morton, but AFAICS you didn't CC him - so I just added him, perhaps he'll pick it up for us :D cheers Thanks much. -dl ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v3 0/2] Consolidate redundant register/stack access code
On 02/09/2016 04:45 AM, Ingo Molnar wrote: * Michael Ellermanwrote: On Tue, 2016-02-09 at 00:38 -0500, David Long wrote: From: "David A. Long" Move duplicate and functionally equivalent code for accessing registers and stack (CONFIG_HAVE_REGS_AND_STACK_ACCESS_API) from arch subdirs into common kernel files. I'm sending this out again (with updated distribution list) because v2 just never got pulled in, even though I don't think there were any outstanding issues. A big cross arch patch like this would often get taken by Andrew Morton, but AFAICS you didn't CC him - so I just added him, perhaps he'll pick it up for us :D The other problem is that the second patch is commingling changes to 6 separate architectures: 16 files changed, 106 insertions(+), 343 deletions(-) that should probably be 6 separate patches. Easier to review, easier to bisect to, easier to revert, etc. Thanks, Ingo I see your point but I'm not sure it could have been broken into separate successive patches that would each build for all architectures. -dl ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: build regression from c153693: Simplify module TOC handling
On Tue, Feb 9, 2016 at 5:28 PM, Peter Robinsonwrote: > Hi Alan, > > Your patch for "powerpc: Simplify module TOC handling" is causing the > Fedora ppc64le to fail to build with depmod failures. Reverting the > commit fixes it for us on rawhide. Anton's patch [1] fixes it. [1] https://build.opensuse.org/package/view_file/Base:System/kmod/depmod-Ignore_PowerPC64_ABIv2_.TOC.symbol.patch > > We're getting the out put below, full logs at [1]. Let me know if you > have any other queries. > > Regards, > Peter > > [1] > http://ppc.koji.fedoraproject.org/kojifiles/work/tasks/5115/3125115/build.log > > + depmod -b . -aeF ./System.map 4.5.0-0.rc2.git0.1.fc24.ppc64le > Depmod failure > + '[' -s depmod.out ']' > + echo 'Depmod failure' > + cat depmod.out > depmod: WARNING: > /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/platforms/powernv/opal-prd.ko > needs unknown symbol .TOC. > depmod: WARNING: > /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/platforms/pseries/pseries_energy.ko > needs unknown symbol .TOC. > depmod: WARNING: > /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/platforms/pseries/hvcserver.ko > needs unknown symbol .TOC. > depmod: WARNING: > /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/kvm/kvm.ko > needs unknown symbol .TOC. > depmod: WARNING: > /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/kvm/kvm-pr.ko > needs unknown symbol .TOC. > depmod: WARNING: > /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/arch/powerpc/kvm/kvm-hv.ko > needs unknown symbol .TOC. 
> depmod: WARNING: > /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/kernel/rcu/rcutorture.ko > needs unknown symbol .TOC. > depmod: WARNING: > /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/kernel/trace/ring_buffer_benchmark.ko > needs unknown symbol .TOC. > depmod: WARNING: > /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/kernel/torture.ko > needs unknown symbol .TOC. > depmod: WARNING: > /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/fs/nfs_common/nfs_acl.ko > needs unknown symbol .TOC. > depmod: WARNING: > /builddir/build/BUILDROOT/kernel-4.5.0-0.rc2.git0.1.fc24.ppc64le/./lib/modules/4.5.0-0.rc2.git0.1.fc24.ppc64le/kernel/fs/nfs_common/grace.ko > needs unknown symbol .TOC. > ___ > Linuxppc-dev mailing list > Linuxppc-dev@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/linuxppc-dev ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 2/6] ibmvscsi: Add and use enums for valid CRQ header values
On 02/09/2016 09:41 AM, Manoj Kumar wrote: >> Yeah, I can see how that is confusing. Since, all three possible valid >> crq message types have the first bit set I think this was originally a >> cute hack to grab anything that was likely valid. Then in >> ibmvscsi_handle_crq() we explicitly match the full header value in a >> switch statement logging anything that turned out actually invalid. >> >>> >>> If 'valid' will only have one of these four enums defined, would >>> this be better written as: >>> >>> if (crq->valid != VIOSRP_CRQ_FREE) >> >> This definitely would make the logic easier to read and follow. Also, >> this would make sure any crq with an invalid header that doesn't have >> its first bit set will also be logged by the ibmvscsi_handle_crq() >> switch statement default block and not silently ignored. >> >> -Tyrel > > Sounds good, Tyrel. Does this mean I should expect a v2 of this patch > series? > > - Manoj N. Kumar Haven't had a chance to clean up and resubmit, but yes there will be a v2 coming along soon. -Tyrel > > ___ > Linuxppc-dev mailing list > Linuxppc-dev@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/linuxppc-dev > ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: build regression from c153693: Simplify module TOC handling
On Tue, 2016-02-09 at 22:02 +0100, Dinar Valeev wrote: > On Tue, Feb 9, 2016 at 5:28 PM, Peter Robinsonwrote: > > Hi Alan, > > > > Your patch for "powerpc: Simplify module TOC handling" is causing the > > Fedora ppc64le to fail to build with depmod failures. Reverting the > > commit fixes it for us on rawhide. > Anton's patch [1] fixes it. > > [1] > https://build.opensuse.org/package/view_file/Base:System/kmod/depmod-Ignore_PowerPC64_ABIv2_.TOC.symbol.patch Yep, you need an updated depmod. Anton sent a patch to linux-modules, reproduced below for the benefit of the list archive: depmod: Ignore PowerPC64 ABIv2 .TOC. symbol The .TOC. symbol on the PowerPC64 ABIv2 identifies the GOT pointer, similar to how other architectures use _GLOBAL_OFFSET_TABLE_. This is not a symbol that needs relocation, and should be ignored by depmod. --- tools/depmod.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/depmod.c b/tools/depmod.c index 6e9bb4d..a2e07c1 100644 --- a/tools/depmod.c +++ b/tools/depmod.c @@ -2153,6 +2153,8 @@ static void depmod_add_fake_syms(struct depmod *depmod) depmod_symbol_add(depmod, "__this_module", true, 0, NULL); /* On S390, this is faked up too */ depmod_symbol_add(depmod, "_GLOBAL_OFFSET_TABLE_", true, 0, NULL); + /* On PowerPC64 ABIv2, .TOC. is more or less _GLOBAL_OFFSET_TABLE_ */ + depmod_symbol_add(depmod, "TOC.", true, 0, NULL); } static int depmod_load_symvers(struct depmod *depmod, const char *filename) -- 2.5.0 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v7 07/10] ppc64 ftrace: disable profiling for some files
On Mon, 2016-01-25 at 16:31 +0100, Torsten Duwe wrote: > This patch complements the "notrace" attribute for selected functions. > It adds -mprofile-kernel to the cc flags to be stripped from the command > line for code-patching.o and feature-fixups.o, in addition to "-pg" This could probably be folded into patch 5, and the combined patch would be "remove -mprofile-kernel in all the same places we remove -pg and for the same reasons". I can't think of anywhere we would want to disable -pg but not disable -mprofile-kernel? Or vice versa. cheers ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC PATCH kernel] powerpc/ioda: Set "read" permission when "write" is set
On 02/10/2016 01:28 AM, Douglas Miller wrote: We finally got the chance to test it end of last week. I forgot to update everyone Monday. B all appearances, the patch fixes the problem. We did not see any new issues with the patch (vs. same test scenarios without). I'll also update the bugzilla. Thanks. Care to add "Tested-by"? Thanks, Doug On 02/08/2016 07:37 PM, Alexey Kardashevskiy wrote: On 01/20/2016 06:01 AM, Douglas Miller wrote: On 01/18/2016 09:52 PM, Alexey Kardashevskiy wrote: On 01/13/2016 01:24 PM, Douglas Miller wrote: On 01/12/2016 05:07 PM, Benjamin Herrenschmidt wrote: On Tue, 2016-01-12 at 15:40 +1100, Alexey Kardashevskiy wrote: Quite often drivers set only "write" permission assuming that this includes "read" permission as well and this works on plenty platforms. However IODA2 is strict about this and produces an EEH when "read" permission is not and reading happens. This adds a workaround in IODA code to always add the "read" bit when the "write" bit is set. Cc: Benjamin HerrenschmidtSigned-off-by: Alexey Kardashevskiy --- Ben, what was the driver which did not set "read" and caused EEH? aacraid Cheers, Ben. Just to be precise, the driver wasn't responsible for setting READ. The driver called scsi_dma_map() and the scsicmd was set (by scsi layer) as DMA_FROM_DEVICE so the current code would set the permissions to WRITE-ONLY. Previously, and in other architectures, this scsicmd would have resulted in READ+WRITE permissions on the DMA map. Does the patch fix the issue? Thanks. 
--- arch/powerpc/platforms/powernv/pci.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c index f2dd772..c7dcae5 100644 --- a/arch/powerpc/platforms/powernv/pci.c +++ b/arch/powerpc/platforms/powernv/pci.c @@ -601,6 +601,9 @@ int pnv_tce_build(struct iommu_table *tbl, long index, long npages, u64 rpn = __pa(uaddr) >> tbl->it_page_shift; long i; +if (proto_tce & TCE_PCI_WRITE) +proto_tce |= TCE_PCI_READ; + for (i = 0; i < npages; i++) { unsigned long newtce = proto_tce | ((rpn + i) << tbl->it_page_shift); @@ -622,6 +625,9 @@ int pnv_tce_xchg(struct iommu_table *tbl, long index, BUG_ON(*hpa & ~IOMMU_PAGE_MASK(tbl)); +if (newtce & TCE_PCI_WRITE) +newtce |= TCE_PCI_READ; + oldtce = xchg(pnv_tce(tbl, idx), cpu_to_be64(newtce)); *hpa = be64_to_cpu(oldtce) & ~(TCE_PCI_READ | TCE_PCI_WRITE); *direction = iommu_tce_direction(oldtce); ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev I am still working on getting a machine to try this on. From code inspection, it looks like it should work. The problem is shortage of machines and machines tied-up by Test. Any progress here? Thanks. -- Alexey ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: powerpc/perf/hv-gpci: Increase request buffer size
Michael Ellerman [m...@ellerman.id.au] wrote: > Here you read from bytes[i] where i can be > 1 (AFAICS). Yes, buffer is large enough and I thought this construct of array was used in a several places. Maybe they are being changed out now (struct pid has one such usage). > > That's fishy at best, and newer GCCs just don't allow it. Ah, ok. > > I think you could do this and it would work, but untested: > >struct hv_gpci_request_buffer { > struct hv_get_perf_counter_info_params params; > uint8_t bytes[4096 - sizeof(struct hv_get_perf_counter_info_parms)]; There is a macro for this computation in that file. I could have used that. Will change it and repost. Thanks, Sukadev ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH V2 1/1] powerpc/perf/hv-gpci: Increase request buffer size
From f1afe08fbc9797ff63adf03efe564a807a37cfe6 Mon Sep 17 00:00:00 2001
From: Sukadev Bhattiprolu
Date: Tue, 9 Feb 2016 02:47:45 -0500
Subject: [PATCH V2 1/1] powerpc/perf/hv-gpci: Increase request buffer size

The GPCI hcall allows for a 4K buffer, but we limit the buffer to 1K. The problem with a 1K buffer is that if a request returns more values than can be accommodated in the 1K buffer, the request will fail.

The buffer we are using is currently allocated on the stack and hence limited in size. Instead, use a per-CPU 4K buffer like we do with the 24x7 counters (hv-24x7.c).

While here, rename the macro GPCI_MAX_DATA_BYTES to HGPCI_MAX_DATA_BYTES for consistency with the 24x7 counters.

Signed-off-by: Sukadev Bhattiprolu
---
Changelog[v2]:
	- [Michael Ellerman] Specify the exact size of the buffer in the GPCI request rather than use an array element of 1.
---
 arch/powerpc/perf/hv-gpci.c | 43 +--
 1 file changed, 25 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c
index 856fe6e..7aa3723 100644
--- a/arch/powerpc/perf/hv-gpci.c
+++ b/arch/powerpc/perf/hv-gpci.c
@@ -127,8 +127,16 @@ static const struct attribute_group *attr_groups[] = {
 	NULL,
 };
 
-#define GPCI_MAX_DATA_BYTES \
-	(1024 - sizeof(struct hv_get_perf_counter_info_params))
+#define HGPCI_REQ_BUFFER_SIZE	4096
+#define HGPCI_MAX_DATA_BYTES \
+	(HGPCI_REQ_BUFFER_SIZE - sizeof(struct hv_get_perf_counter_info_params))
+
+DEFINE_PER_CPU(char, hv_gpci_reqb[HGPCI_REQ_BUFFER_SIZE]) __aligned(sizeof(uint64_t));
+
+struct hv_gpci_request_buffer {
+	struct hv_get_perf_counter_info_params params;
+	uint8_t bytes[HGPCI_MAX_DATA_BYTES];
+} __packed;
 
 static unsigned long single_gpci_request(u32 req, u32 starting_index,
 	u16 secondary_index, u8 version_in, u32 offset, u8 length,
@@ -137,24 +145,21 @@ static unsigned long single_gpci_request(u32 req, u32 starting_index,
 	unsigned long ret;
 	size_t i;
 	u64 count;
+	struct hv_gpci_request_buffer *arg;
+
+	arg = (void
*)get_cpu_var(hv_gpci_reqb); + memset(arg, 0, HGPCI_REQ_BUFFER_SIZE); - struct { - struct hv_get_perf_counter_info_params params; - uint8_t bytes[GPCI_MAX_DATA_BYTES]; - } __packed __aligned(sizeof(uint64_t)) arg = { - .params = { - .counter_request = cpu_to_be32(req), - .starting_index = cpu_to_be32(starting_index), - .secondary_index = cpu_to_be16(secondary_index), - .counter_info_version_in = version_in, - } - }; + arg->params.counter_request = cpu_to_be32(req); + arg->params.starting_index = cpu_to_be32(starting_index); + arg->params.secondary_index = cpu_to_be16(secondary_index); + arg->params.counter_info_version_in = version_in; ret = plpar_hcall_norets(H_GET_PERF_COUNTER_INFO, - virt_to_phys(), sizeof(arg)); + virt_to_phys(arg), HGPCI_REQ_BUFFER_SIZE); if (ret) { pr_devel("hcall failed: 0x%lx\n", ret); - return ret; + goto out; } /* @@ -163,9 +168,11 @@ static unsigned long single_gpci_request(u32 req, u32 starting_index, */ count = 0; for (i = offset; i < offset + length; i++) - count |= arg.bytes[i] << (i - offset); + count |= arg->bytes[i] << (i - offset); *value = count; +out: + put_cpu_var(hv_gpci_reqb); return ret; } @@ -245,10 +252,10 @@ static int h_gpci_event_init(struct perf_event *event) } /* last byte within the buffer? */ - if ((event_get_offset(event) + length) > GPCI_MAX_DATA_BYTES) { + if ((event_get_offset(event) + length) > HGPCI_MAX_DATA_BYTES) { pr_devel("request outside of buffer: %zu > %zu\n", (size_t)event_get_offset(event) + length, - GPCI_MAX_DATA_BYTES); + HGPCI_MAX_DATA_BYTES); return -EINVAL; } -- 1.8.3.1 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
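For reference, the size relationship the patch relies on can be sketched in userspace. The params struct below is only a stand-in (the real struct hv_get_perf_counter_info_params and its fields come from the hypervisor ABI headers, so both the layout and the field names here are hypothetical); the point is the arithmetic: the usable data area is the 4K buffer minus the header, and h_gpci_event_init() rejects any request whose offset plus length would run past it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct hv_get_perf_counter_info_params;
 * the real layout is defined by the hypervisor ABI headers. */
struct gpci_params_stub {
	uint32_t counter_request;
	uint32_t starting_index;
	uint16_t secondary_index;
	uint8_t  counter_info_version_in;
	uint8_t  counter_info_version_out;	/* assumed field */
};

#define HGPCI_REQ_BUFFER_SIZE	4096
#define HGPCI_MAX_DATA_BYTES \
	(HGPCI_REQ_BUFFER_SIZE - sizeof(struct gpci_params_stub))

/* Mirror of the bounds check in h_gpci_event_init(): a request must not
 * read past the last byte of the (now 4K) request buffer. */
static int request_fits(size_t offset, size_t length)
{
	return offset + length <= HGPCI_MAX_DATA_BYTES;
}
```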
Re: [PATCH v7 06/10] ppc64 ftrace: disable profiling for some functions
On Mon, 2016-01-25 at 16:31 +0100, Torsten Duwe wrote: > At least POWER7/8 have MMUs that don't completely autoload; > a normal, recoverable memory fault might pass through these functions. > If a dynamic tracer function causes such a fault, any of these functions > being traced with -mprofile-kernel may cause an endless recursion. I'm not really happy with this one, still :) At the moment I can trace these without any problems, with either ftrace or kprobes, but obviously it was causing you some trouble. So I'd like to understand why you were having issues when regular tracing doesn't. If it's the case that tracing can work for these functions, but live patching doesn't (for some reason), then maybe these should be blocked by the live patching infrastructure rather than at the ftrace/kprobes level. cheers ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v14 5/9] powerpc/eeh: EEH device for VF
From: Wei YangVFs and their corresponding pdn are created and released dynamically when their PF's SRIOV capability is enabled and disabled. This creates and releases EEH devices for VFs when creating and releasing their pdn instances, which means EEH devices and pdn instances have same life cycle. Also, VF's EEH device is identified by (struct eeh_dev::physfn). Signed-off-by: Wei Yang Acked-by: Gavin Shan --- arch/powerpc/include/asm/eeh.h | 1 + arch/powerpc/kernel/pci_dn.c | 15 +++ 2 files changed, 16 insertions(+) diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h index 867c39b..574ed49a 100644 --- a/arch/powerpc/include/asm/eeh.h +++ b/arch/powerpc/include/asm/eeh.h @@ -141,6 +141,7 @@ struct eeh_dev { struct pci_controller *phb; /* Associated PHB */ struct pci_dn *pdn; /* Associated PCI device node */ struct pci_dev *pdev; /* Associated PCI device*/ + struct pci_dev *physfn; /* Associated SRIOV PF */ struct pci_bus *bus;/* PCI bus for partial hotplug */ }; diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c index b3b4df9..e23bdf7 100644 --- a/arch/powerpc/kernel/pci_dn.c +++ b/arch/powerpc/kernel/pci_dn.c @@ -179,6 +179,7 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev) { #ifdef CONFIG_PCI_IOV struct pci_dn *parent, *pdn; + struct eeh_dev *edev; int i; /* Only support IOV for now */ @@ -204,6 +205,12 @@ struct pci_dn *add_dev_pci_data(struct pci_dev *pdev) __func__, i); return NULL; } + + /* Create the EEH device for the VF */ + eeh_dev_init(pdn, pci_bus_to_host(pdev->bus)); + edev = pdn_to_eeh_dev(pdn); + BUG_ON(!edev); + edev->physfn = pdev; } #endif /* CONFIG_PCI_IOV */ @@ -215,6 +222,7 @@ void remove_dev_pci_data(struct pci_dev *pdev) #ifdef CONFIG_PCI_IOV struct pci_dn *parent; struct pci_dn *pdn, *tmp; + struct eeh_dev *edev; int i; /* @@ -256,6 +264,13 @@ void remove_dev_pci_data(struct pci_dev *pdev) pdn->devfn != pci_iov_virtfn_devfn(pdev, i)) continue; + /* Release EEH device for the VF */ + edev 
= pdn_to_eeh_dev(pdn); + if (edev) { + pdn->edev = NULL; + kfree(edev); + } + if (!list_empty(>list)) list_del(>list); -- 2.1.0 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v14 0/9] EEH Support for SRIOV VFs
This applies to linux-powerpc-next and additional unmerged patches: [v2,1/4] powerpc/eeh: Fix stale cached primary bus powerpc/eeh: fix incorrect function name in comment [V2] powerpc/powernv: Remove support for p5ioc2 [V7,1/6] powerpc/powernv: don't enable SRIOV when VF BAR has non 64bit-prefetchable BAR 92e963f Linux 4.5-rc1 - Linux powerpc next branch This patchset enables EEH on SRIOV VFs. The general idea is to create proper VF edev and VF PE and handle them properly. Different from the Bus PE, VF PE just contain one VF. This introduces the difference of EEH error handling on a VF PE. Generally, it has several differences. First, the VF's removal and re-enumerate rely on its PF. VF has a tight relationship between its PF. This is not proper to enumerate a VF by usual scan procedure. That's why virtfn_add/virtfn_remove are exported in this patch set. Second, the reset/restore of a VF is done in kernel space. FW is not aware of the VF, this means the usual reset function done in FW will not work. One of the patch will imitate the reset/restore function in kernel space. Third, the VF may be removed during the PF's error_detected function. In this case, the original error_detected->slot_reset->resume sequence is not proper to those removed VFs, since they are re-created by PF in a fresh state. A flag in eeh_dev is introduce to mark the eeh_dev is in error state. By doing so, we track whether this device needs to be reset or not. This has been tested both on host and in guest on Power8 with latest kernel version. Changelog = v14: * Rebased to linux-powerpc-next branch, plus additional patches related to powerpc/pci and powerpc/eeh subsystems. 
* Minor code changes * Fix build error on pSeries reported by mpe v13: * move eeh_rmv_data{} to eeh_driver.c v12: * Rephrase some commit logs to make them clearer and more specific * move vf_index assignment in CONFIG_PPC_POWERNV * merge "Cache VF index in pci_dn" with "Support error recovery for VF PE" * check the return value after eeh_dev_init() for VF * initialize the parameter before passing it to read_config() * make pnv_pci_fixup_vf_mps() a dedicated patch, which fixes up and stores the mps value in pci_dn v11: * move vf_index assignment in the macro CONFIG_PPC_POWERNV * merge Patch "Cache VF index in pci_dn" into Patch "Support error recovery for VF PE" v10: * rebased on v4.2 * delete the last patch "powerpc/powernv: compound PE for VFs" since after the redesign of SRIOV, there is no compound PE for VFs now. * add two patches which fix problems found during tests powerpc/eeh: Support error recovery for VF PE powerpc/eeh: Handle hot removed VF when PF is EEH aware v9: * split pcibios_bus_add_device() into a separate patch * Bjorn acked the PCI part and agreed this patch set could be merged via the ppc tree * rebased on mpe/linux.git next branch v8: * fix on checking the return value of pnv_eeh_do_flr() * introduced a weak function pcibios_bus_add_device() to create PE for VFs v7: * fix compile error when PCI_IOV is not set v6: * code / commit log refactor by Gavin v5: * remove the compound field, iterate on Master VF PE instead * some code refinement on PCI config restore and reset on VF, the wait time for assert and deassert, PCI device address format, check on edev->pcie_cap and edev->aer_cap before accessing them v4: * refine the change logs, comment and code style * change pnv_pci_fixup_vf_eeh() to pnv_eeh_vf_final_fixup() and remove the CONFIG_PCI_IOV macro * reorder patch 5/6 to make the logic more reasonable * remove remove_dev_pci_data() * remove the EEH_DEV_VF flag, use edev->physfn to identify a VF EEH DEV and remove related CONFIG_PCI_IOV macro * add the option for VF reset * fix the
pnv_eeh_cfg_blocked() logic * replace pnv_pci_cfg_{read,write} with eeh_ops->{read,write}_config in pnv_eeh_vf_restore_config() * rename pnv_eeh_vf_restore_config() to pnv_eeh_restore_vf_config() * rename pnv_pci_fixup_vf_caps() to pnv_pci_vf_header_fixup() and move it to arch/powerpc/platforms/powernv/pci.c * add a field compound in pnv_ioda_pe to link compound PEs * handle compound PE for VF PEs v3: * add back vf_index in pci_dn to track the VF's index * rename ppdev in eeh_dev to physfn for consistency * move edev->physfn assignment before dev->dev.archdata.edev is set * move pnv_pci_fixup_vf_eeh() and pnv_pci_fixup_vf_caps() to eeh-powernv.c * more clear and detail in commit log and comment in code * merge eeh_rmv_virt_device() with eeh_rmv_device() * move the cfg_blocked check logic from pnv_eeh_read/write_config() to pnv_eeh_cfg_blocked() * move the vf reset/restore logic into its own patch, two patches are created. powerpc/powernv: Support PCI config restore for VFs powerpc/powernv: Support EEH reset for VFs * simplify the vf reset logic v2: * add prefix pci_iov_
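The third difference above — gating the reset/resume callbacks on whether a device actually saw error_detected() — can be sketched as a tiny state machine. This is an illustrative sketch with hypothetical struct and function names, not the kernel's own code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct eeh_dev's in_error flag. */
struct fake_edev {
    bool in_error;
};

/* Set when the device's error_detected() handler is invoked. */
static void report_error(struct fake_edev *edev)
{
    edev->in_error = true;
}

/* A VF re-created fresh by its PF never saw error_detected(),
 * so the slot_reset/resume callbacks are skipped for it. */
static bool should_call_slot_reset(const struct fake_edev *edev)
{
    return edev->in_error;
}
```

The kernel version (patch 9/9 below) embeds the same gate into eeh_report_reset()/eeh_report_resume() via `edev->in_error`.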
[PATCH v14 3/9] powerpc/pci: Remove VFs prior to PF
From: Wei Yang As commit ac205b7bb72f ("PCI: make sriov work with hotplug remove") indicates, VFs, which are on the same PCI bus as their PF, should be removed before the PF. Otherwise, we might run into a kernel crash at PCI unplugging time. This applies the above pattern to the powerpc PCI hotplug path. Signed-off-by: Wei Yang Acked-by: Gavin Shan --- arch/powerpc/kernel/pci-hotplug.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c index 7f9ed0c..59c4361 100644 --- a/arch/powerpc/kernel/pci-hotplug.c +++ b/arch/powerpc/kernel/pci-hotplug.c @@ -55,7 +55,7 @@ void pcibios_remove_pci_devices(struct pci_bus *bus) pr_debug("PCI: Removing devices on bus %04x:%02x\n", pci_domain_nr(bus), bus->number); - list_for_each_entry_safe(dev, tmp, &bus->devices, bus_list) { + list_for_each_entry_safe_reverse(dev, tmp, &bus->devices, bus_list) { pr_debug(" Removing %s...\n", pci_name(dev)); pci_stop_and_remove_bus_device(dev); } -- 2.1.0 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v3 13/18] cxl: sysfs support for guests
Excerpts from Frederic Barrat's message of 2016-02-10 02:21:19 +1100: > > Le 08/02/2016 04:02, Stewart Smith a écrit : > > Frederic Barrat writes: > >> --- a/Documentation/ABI/testing/sysfs-class-cxl > >> +++ b/Documentation/ABI/testing/sysfs-class-cxl > >> @@ -183,7 +183,7 @@ Description:read only > >> Identifies the revision level of the PSL. > >> Users:https://github.com/ibm-capi/libcxl > >> > >> -What: /sys/class/cxl//base_image > >> +What: /sys/class/cxl//base_image (not in a guest) > > > > Is this going to be the case for KVM guest as well as PowerVM guest? > > > That's too early to say. > The entries we've removed are because the information is filtered by > pHyp and not available to the OS. Some of it because nobody thought it > would be useful, some of it because it's not meant to be seen by the OS. > For KVM, if the card can be shared between guests, I would expect the > same kind of restrictions. The OS doesn't particularly care about this - the only people who might even possibly need to know will be whoever is trying to flash their PSL image, and probably not even then. On KVM we are thinking that it will have to be root on the hypervisor responsible for flashing the PSL image (there isn't much other option unless we want to go into signed images and whatnot, but even if we do I'm 100% committed to making that a userspace problem to solve and not trying to do anything fancy in the kernel), so we won't really need it, but I also don't see any harm in exposing it to guests. Cheers, -Ian ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v3 05/18] cxl: Rename some bare-metal specific functions
Excerpts from Frederic Barrat's message of 2016-02-07 00:28:52 +1100: > Rename a few functions, mostly prefixed by 'cxl_', to make clear that > the implementation is 'bare-metal' specific. Patch looks fine to me, though the commit message should probably say that you are changing the 'cxl_' prefix to 'cxl_pci_'. Acked-by: Ian Munsie ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v3 07/18] cxl: Update cxl_irq() prototype
> The context parameter when calling cxl_irq() should be strongly typed. Fair enough ;-) Acked-by: Ian Munsie ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v14 8/9] powerpc/powernv: Support PCI config restore for VFs
From: Wei Yang After a PE reset, the OPAL API opal_pci_reinit() is called on all devices contained in the PE to reinitialize them. Since skiboot is not aware of VFs, we have to implement the function in the kernel to reinitialize VFs after a reset on a VF PE. In this patch, the two functions pnv_pci_fixup_vf_mps() and pnv_eeh_restore_vf_config() both manipulate the MPS of the VF, since a VF has three cases. 1. Normal creation of a VF. In this case, pnv_pci_fixup_vf_mps() is called to make the MPS a proper value compared with its parent. 2. EEH recovery without the VF removed. In this case, the MPS is stored in pci_dn and pnv_eeh_restore_vf_config() is called to restore it and reinitialize the other parts. 3. EEH recovery with the VF removed. In this case, the VF will be removed and then re-created. Both functions are called: first pnv_pci_fixup_vf_mps() is called to store the proper MPS in pci_dn, and then pnv_eeh_restore_vf_config() is called to do the proper thing. This introduces two functions: pnv_pci_fixup_vf_mps(), to fix up the VF's MPS to make sure it is equal to its parent's and store this value in pci_dn for future use; and pnv_eeh_restore_vf_config(), to re-initialize the VF by restoring the MPS, disabling completion timeout, enabling SERR, etc.
Signed-off-by: Wei Yang Acked-by: Gavin Shan --- arch/powerpc/include/asm/pci-bridge.h| 1 + arch/powerpc/platforms/powernv/eeh-powernv.c | 95 +++- 2 files changed, 93 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h index b0b43f5..f4d1758 100644 --- a/arch/powerpc/include/asm/pci-bridge.h +++ b/arch/powerpc/include/asm/pci-bridge.h @@ -220,6 +220,7 @@ struct pci_dn { #define IODA_INVALID_M64(-1) int (*m64_map)[PCI_SRIOV_NUM_BARS]; #endif /* CONFIG_PCI_IOV */ + int mps;/* Maximum Payload Size */ #endif struct list_head child_list; struct list_head list; diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c index e26256b..950b3e5 100644 --- a/arch/powerpc/platforms/powernv/eeh-powernv.c +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c @@ -1588,6 +1588,65 @@ static int pnv_eeh_next_error(struct eeh_pe **pe) return ret; } +static int pnv_eeh_restore_vf_config(struct pci_dn *pdn) +{ + struct eeh_dev *edev = pdn_to_eeh_dev(pdn); + u32 devctl, cmd, cap2, aer_capctl; + int old_mps; + + if (edev->pcie_cap) { + /* Restore MPS */ + old_mps = (ffs(pdn->mps) - 8) << 5; + eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL, +2, &devctl); + devctl &= ~PCI_EXP_DEVCTL_PAYLOAD; + devctl |= old_mps; + eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL, + 2, devctl); + + /* Disable Completion Timeout */ + eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCAP2, +4, &cap2); + if (cap2 & 0x10) { + eeh_ops->read_config(pdn, +edev->pcie_cap + PCI_EXP_DEVCTL2, +4, &cap2); + cap2 |= 0x10; + eeh_ops->write_config(pdn, + edev->pcie_cap + PCI_EXP_DEVCTL2, + 4, cap2); + } + } + + /* Enable SERR and parity checking */ + eeh_ops->read_config(pdn, PCI_COMMAND, 2, &cmd); + cmd |= (PCI_COMMAND_PARITY | PCI_COMMAND_SERR); + eeh_ops->write_config(pdn, PCI_COMMAND, 2, cmd); + + /* Enable report various errors */ + if (edev->pcie_cap) { + eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL, +2, &devctl); + devctl &= ~PCI_EXP_DEVCTL_CERE; + devctl |= (PCI_EXP_DEVCTL_NFERE | + PCI_EXP_DEVCTL_FERE | + PCI_EXP_DEVCTL_URRE); + eeh_ops->write_config(pdn, edev->pcie_cap + PCI_EXP_DEVCTL, + 2, devctl); + } + + /* Enable ECRC generation and check */ + if (edev->pcie_cap && edev->aer_cap) { + eeh_ops->read_config(pdn, edev->aer_cap + PCI_ERR_CAP, +4, &aer_capctl); + aer_capctl |= (PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE); + eeh_ops->write_config(pdn, edev->aer_cap + PCI_ERR_CAP, + 4, aer_capctl); + } + + return 0; +} + static int pnv_eeh_restore_config(struct pci_dn *pdn) { struct eeh_dev *edev = pdn_to_eeh_dev(pdn); @@ -1597,9
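The MPS restore above hinges on the encoding of the DEVCTL Max_Payload_Size field: a payload of 128 << n bytes is encoded as field value n in bits 7:5, which is exactly what the patch's `(ffs(pdn->mps) - 8) << 5` expression computes. A standalone sketch of that conversion (`ffs_int` re-implements ffs() so the snippet is self-contained; not kernel code):

```c
#include <assert.h>

/* Minimal ffs(): index (1-based) of the least significant set bit. */
static int ffs_int(int x)
{
    int n = 1;
    if (!x)
        return 0;
    while (!(x & 1)) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Convert an MPS byte count (128, 256, ... 4096) to the value that
 * goes into the PCI_EXP_DEVCTL_PAYLOAD field (bits 7:5 of DEVCTL). */
static int mps_bytes_to_devctl(int mps_bytes)
{
    return (ffs_int(mps_bytes) - 8) << 5;
}
```

So 128 bytes encodes as 0x00, 256 as 0x20, 512 as 0x40, matching the PCIe Device Control register layout.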
[PATCH v14 7/9] powerpc/powernv: Support EEH reset for VF PE
From: Wei Yang PEs for VFs don't have a primary bus, so they have to have their own reset backend, which is used during EEH recovery. The patch implements the reset backend for a VF PE by issuing an FLR or AF FLR to the VFs which are contained in the PE. Signed-off-by: Wei Yang Acked-by: Gavin Shan --- arch/powerpc/include/asm/eeh.h | 1 + arch/powerpc/kernel/eeh.c| 9 +- arch/powerpc/platforms/powernv/eeh-powernv.c | 127 ++- 3 files changed, 133 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h index 0c551a2..b5b5f45 100644 --- a/arch/powerpc/include/asm/eeh.h +++ b/arch/powerpc/include/asm/eeh.h @@ -137,6 +137,7 @@ struct eeh_dev { int pcix_cap; /* Saved PCIx capability*/ int pcie_cap; /* Saved PCIe capability*/ int aer_cap;/* Saved AER capability */ + int af_cap; /* Saved AF capability */ struct eeh_pe *pe; /* Associated PE*/ struct list_head list; /* Form link list in the PE */ struct pci_controller *phb; /* Associated PHB */ diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c index 8c6005c..0d72462 100644 --- a/arch/powerpc/kernel/eeh.c +++ b/arch/powerpc/kernel/eeh.c @@ -761,7 +761,8 @@ int pcibios_set_pcie_reset_state(struct pci_dev *dev, enum pcie_reset_state stat case pcie_deassert_reset: eeh_ops->reset(pe, EEH_RESET_DEACTIVATE); eeh_unfreeze_pe(pe, false); - eeh_pe_state_clear(pe, EEH_PE_CFG_BLOCKED); + if (!(pe->type & EEH_PE_VF)) + eeh_pe_state_clear(pe, EEH_PE_CFG_BLOCKED); eeh_pe_dev_traverse(pe, eeh_restore_dev_state, dev); eeh_pe_state_clear(pe, EEH_PE_ISOLATED); break; @@ -769,14 +770,16 @@ int pcibios_set_pcie_reset_state(struct pci_dev *dev, enum pcie_reset_state stat eeh_pe_state_mark_with_cfg(pe, EEH_PE_ISOLATED); eeh_ops->set_option(pe, EEH_OPT_FREEZE_PE); eeh_pe_dev_traverse(pe, eeh_disable_and_save_dev_state, dev); - eeh_pe_state_mark(pe, EEH_PE_CFG_BLOCKED); + if (!(pe->type & EEH_PE_VF)) + eeh_pe_state_mark(pe, EEH_PE_CFG_BLOCKED); eeh_ops->reset(pe, EEH_RESET_HOT); break; case pcie_warm_reset: eeh_pe_state_mark_with_cfg(pe, EEH_PE_ISOLATED); eeh_ops->set_option(pe, EEH_OPT_FREEZE_PE); eeh_pe_dev_traverse(pe, eeh_disable_and_save_dev_state, dev); - eeh_pe_state_mark(pe, EEH_PE_CFG_BLOCKED); + if (!(pe->type & EEH_PE_VF)) + eeh_pe_state_mark(pe, EEH_PE_CFG_BLOCKED); eeh_ops->reset(pe, EEH_RESET_FUNDAMENTAL); break; default: diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c index 830526e..e26256b 100644 --- a/arch/powerpc/platforms/powernv/eeh-powernv.c +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c @@ -371,6 +371,7 @@ static void *pnv_eeh_probe(struct pci_dn *pdn, void *data) edev->mode &= 0xFF00; edev->pcix_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_PCIX); edev->pcie_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_EXP); + edev->af_cap = pnv_eeh_find_cap(pdn, PCI_CAP_ID_AF); edev->aer_cap = pnv_eeh_find_ecap(pdn, PCI_EXT_CAP_ID_ERR); if ((edev->class_code >> 8) == PCI_CLASS_BRIDGE_PCI) { edev->mode |= EEH_DEV_BRIDGE; @@ -879,6 +880,120 @@ void pnv_pci_reset_secondary_bus(struct pci_dev *dev) } } +static void pnv_eeh_wait_for_pending(struct pci_dn *pdn, const char *type, +int pos, u16 mask) +{ + struct eeh_dev *edev = pdn_to_eeh_dev(pdn); + int i, status = 0; + + /* Wait for Transaction Pending bit to be cleared */ + for (i = 0; i < 4; i++) { + eeh_ops->read_config(pdn, pos, 2, &status); + if (!(status & mask)) + return; + + msleep((1 << i) * 100); + } + + pr_warn("%s: Pending transaction while issuing %sFLR to %04x:%02x:%02x.%01x\n", + __func__, type, + edev->phb->global_number, pdn->busno, + PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn)); +} + +static int pnv_eeh_do_flr(struct pci_dn *pdn, int option) +{ + struct eeh_dev *edev = pdn_to_eeh_dev(pdn); + u32 reg = 0; + + if (WARN_ON(!edev->pcie_cap)) + return -ENOTTY; + + eeh_ops->read_config(pdn, edev->pcie_cap + PCI_EXP_DEVCAP, 4, &reg); + if (!(reg &
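The pending-transaction wait in pnv_eeh_wait_for_pending() above polls up to four times with exponentially growing sleeps between reads. A sketch of just the timing arithmetic (illustrative only, no config-space access):

```c
#include <assert.h>

/* Delay before the (attempt+1)-th re-read, mirroring the patch's
 * "msleep((1 << i) * 100)" with i in 0..3. */
static int flr_poll_delay_ms(int attempt)
{
    return (1 << attempt) * 100;
}

/* Worst-case total sleep if the Transaction Pending bit never clears:
 * 100 + 200 + 400 + 800 ms. */
static int flr_total_wait_ms(void)
{
    int i, total = 0;

    for (i = 0; i < 4; i++)
        total += flr_poll_delay_ms(i);
    return total;
}
```

So a stuck device costs at most 1.5 seconds of sleeping before the warning fires.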
[PATCH v14 6/9] powerpc/eeh: Create PE for VFs
From: Wei Yang This creates PEs for VFs in the weak function pcibios_bus_add_device(). Those PEs for VFs are identified with the newly introduced flag EEH_PE_VF so that we treat them differently during EEH recovery. Signed-off-by: Wei Yang Acked-by: Gavin Shan --- arch/powerpc/include/asm/eeh.h | 1 + arch/powerpc/kernel/eeh_pe.c | 10 -- arch/powerpc/platforms/powernv/eeh-powernv.c | 16 3 files changed, 25 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h index 574ed49a..0c551a2 100644 --- a/arch/powerpc/include/asm/eeh.h +++ b/arch/powerpc/include/asm/eeh.h @@ -72,6 +72,7 @@ struct pci_dn; #define EEH_PE_PHB (1 << 1)/* PHB PE*/ #define EEH_PE_DEVICE (1 << 2)/* Device PE */ #define EEH_PE_BUS (1 << 3)/* Bus PE*/ +#define EEH_PE_VF (1 << 4)/* VF PE */ #define EEH_PE_ISOLATED(1 << 0)/* Isolated PE */ #define EEH_PE_RECOVERING (1 << 1)/* Recovering PE*/ diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c index faaf19e..e441d7b 100644 --- a/arch/powerpc/kernel/eeh_pe.c +++ b/arch/powerpc/kernel/eeh_pe.c @@ -299,7 +299,10 @@ static struct eeh_pe *eeh_pe_get_parent(struct eeh_dev *edev) * EEH device already having associated PE, but * the direct parent EEH device doesn't have yet. */ - pdn = pdn ? pdn->parent : NULL; + if (edev->physfn) + pdn = pci_get_pdn(edev->physfn); + else + pdn = pdn ? pdn->parent : NULL; while (pdn) { /* We're poking out of PCI territory */ parent = pdn_to_eeh_dev(pdn); @@ -382,7 +385,10 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev) } /* Create a new EEH PE */ - pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE); + if (edev->physfn) + pe = eeh_pe_alloc(edev->phb, EEH_PE_VF); + else + pe = eeh_pe_alloc(edev->phb, EEH_PE_DEVICE); if (!pe) { pr_err("%s: out of memory!\n", __func__); return -ENOMEM; } diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c index 8119172..830526e 100644 --- a/arch/powerpc/platforms/powernv/eeh-powernv.c +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c @@ -1503,6 +1503,22 @@ static struct eeh_ops pnv_eeh_ops = { .restore_config = pnv_eeh_restore_config }; +void pcibios_bus_add_device(struct pci_dev *pdev) +{ + struct pci_dn *pdn = pci_get_pdn(pdev); + + if (!pdev->is_virtfn) + return; + + /* +* The following operations will fail if VF's sysfs files +* aren't created or its resources aren't finalized. +*/ + eeh_add_device_early(pdn); + eeh_add_device_late(pdev); + eeh_sysfs_add_device(pdev); +} + /** * eeh_powernv_init - Register platform dependent EEH operations * -- 2.1.0 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v3 01/18] cxl: Move common code away from bare-metal-specific files
Acked-by: Ian Munsie ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v3 02/18] cxl: Move bare-metal specific code to specialized files
Acked-by: Ian Munsie ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v3 04/18] cxl: Introduce implementation-specific API
Acked-by: Ian Munsie ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH] powerpc32: PAGE_EXEC required for inittext
PAGE_EXEC is required for inittext, otherwise CONFIG_DEBUG_PAGEALLOC ends up with an Oops [0.00] Inode-cache hash table entries: 8192 (order: 1, 32768 bytes) [0.00] Sorting __ex_table... [0.00] bootmem::free_all_bootmem_core nid=0 start=0 end=2000 [0.00] Unable to handle kernel paging request for instruction fetch [0.00] Faulting instruction address: 0xc045b970 [0.00] Oops: Kernel access of bad area, sig: 11 [#1] [0.00] PREEMPT DEBUG_PAGEALLOC CMPC885 [0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 3.18.25-local-dirty #1673 [0.00] task: c04d83d0 ti: c04f8000 task.ti: c04f8000 [0.00] NIP: c045b970 LR: c045b970 CTR: 000a [0.00] REGS: c04f9ea0 TRAP: 0400 Not tainted (3.18.25-local-dirty) [0.00] MSR: 08001032CR: 39955d35 XER: a000ff40 [0.00] GPR00: c045b970 c04f9f50 c04d83d0 c04dcdf4 0048 c04f6b10 GPR08: c04f6ab0 0001 c0563488 c04f6ab0 c04f8000 b6db6db7 GPR16: 3474 0180 2000 c7fec000 03ff 0176 c0415014 GPR24: c0471018 c0414ee8 c05304e8 c03aeaac c051 c0471018 c0471010 [0.00] NIP [c045b970] free_all_bootmem+0x164/0x228 [0.00] LR [c045b970] free_all_bootmem+0x164/0x228 [0.00] Call Trace: [0.00] [c04f9f50] [c045b970] free_all_bootmem+0x164/0x228 (unreliable) [0.00] [c04f9fa0] [c0454044] mem_init+0x3c/0xd0 [0.00] [c04f9fb0] [c045080c] start_kernel+0x1f4/0x390 [0.00] [c04f9ff0] [c0002214] start_here+0x38/0x98 [0.00] Instruction dump: [0.00] 2f15 7f968840 72a90001 3ad60001 56b5f87e 419a0028 419e0024 41a20018 [0.00] 807cc20c 3880 7c638214 4bffd2f5 <3a940001> 3a100024 4bc8 7e368b78 [0.00] ---[ end trace dc8fa200cb88537f ]--- Signed-off-by: Christophe Leroy --- This patch goes on top of the following serie: [PATCH v8 00/23] powerpc/8xx: Use large pages for RAM and IMMR and other improvments arch/powerpc/mm/pgtable_32.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c index 815ccd7..ba2ee66 100644 --- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -40,7 +40,7 @@ unsigned long 
ioremap_bot; EXPORT_SYMBOL(ioremap_bot);/* aka VMALLOC_END */ -extern char etext[], _stext[]; +extern char etext[], _stext[], _sinittext[], _einittext[]; #define PGDIR_ORDER(32 + PGD_T_LOG2 - PGDIR_SHIFT) @@ -289,7 +289,8 @@ void __init __mapin_ram_chunk(unsigned long offset, unsigned long top) v = PAGE_OFFSET + s; p = memstart_addr + s; for (; s < top; s += PAGE_SIZE) { - ktext = ((char *) v >= _stext && (char *) v < etext); + ktext = ((char *)v >= _stext && (char *)v < etext) || + ((char *)v >= _sinittext && (char *)v < _einittext); f = ktext ? pgprot_val(PAGE_KERNEL_TEXT) : pgprot_val(PAGE_KERNEL); map_page(v, p, f); #ifdef CONFIG_PPC_STD_MMU_32 -- 2.1.0 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
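The fix above boils down to extending the kernel-text predicate to cover a second address range (init text) in addition to [_stext, etext). A standalone sketch of the check, with made-up bounds standing in for the linker symbols:

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of the patched ktext test: an address gets the kernel-text
 * protection (which includes PAGE_EXEC) if it lies in [stext, etext)
 * OR in [sinittext, einittext). The bounds passed in here are
 * hypothetical values, not real linker-symbol addresses. */
static bool is_ktext(unsigned long v,
                     unsigned long stext, unsigned long etext,
                     unsigned long sinit, unsigned long einit)
{
    return (v >= stext && v < etext) || (v >= sinit && v < einit);
}
```

Before the patch, an init-text address fell through to the non-executable PAGE_KERNEL case, which is what triggered the instruction-fetch Oops under CONFIG_DEBUG_PAGEALLOC.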
Re: [PATCH v3 08/18] cxl: IRQ allocation for guests
Acked-by: Ian Munsie > +/* > + * Look for the interrupt number. > + * On bare-metal, we know the range 0 only contains the PSL > + * interrupt so, we could start counting at range 1 and initialize > + * afu_irq at 1. > + * In a guest, range 0 also contains AFU interrupts, so it must > + * be counted for, but we initialize afu_irq at 0 to take into > + * account the PSL interrupt. > + * > + * For code-readability, it just seems easier to go over all > + * the ranges. > + */ Thanks for adding that explanation :) > +if (cpu_has_feature(CPU_FTR_HVMODE)) > +alloc_count = count; > +else > +alloc_count = count + 1; Almost a shame you can't reuse the afu_irq_range_start function you defined for this, but doing so would probably make the code less readable, so fine to leave this as is. > /* We've allocated all memory now, so let's do the irq allocations */ > irq_name = list_first_entry(&ctx->irq_names, struct cxl_irq_name, list); > -for (r = 1; r < CXL_IRQ_RANGES; r++) { > +for (r = afu_irq_range_start(); r < CXL_IRQ_RANGES; r++) { > hwirq = ctx->irqs.offset[r]; > for (i = 0; i < ctx->irqs.range[r]; hwirq++, i++) { > -cxl_map_irq(ctx->afu->adapter, hwirq, > -cxl_irq_afu, ctx, irq_name->name); > +if (r == 0 && i == 0) > +/* PSL interrupt, only for guest */ That comment is perhaps not as clear as it could be - the interrupt is used on either, but it's only allocated per context on PowerVM guests. Cheers, -Ian ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v14 1/9] PCI/IOV: Rename and export virtfn_{add, remove}
From: Wei Yang During EEH recovery, hotplug is applied to the devices which don't have drivers or whose drivers don't support EEH. However, hotplug, which was implemented based on the PCI bus, can't be applied to a VF directly. Instead, we unplug and plug individual PCI devices (VFs). This renames virtfn_{add,remove}() and exports them so they can be used in PCI hotplug during EEH recovery. Signed-off-by: Wei Yang Reviewed-by: Gavin Shan Acked-by: Bjorn Helgaas --- drivers/pci/iov.c | 10 +- include/linux/pci.h | 8 2 files changed, 13 insertions(+), 5 deletions(-) diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c index 31f31d4..fa4f138 100644 --- a/drivers/pci/iov.c +++ b/drivers/pci/iov.c @@ -113,7 +113,7 @@ resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno) return dev->sriov->barsz[resno - PCI_IOV_RESOURCES]; } -static int virtfn_add(struct pci_dev *dev, int id, int reset) +int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset) { int i; int rc = -ENOMEM; @@ -188,7 +188,7 @@ failed: return rc; } -static void virtfn_remove(struct pci_dev *dev, int id, int reset) +void pci_iov_remove_virtfn(struct pci_dev *dev, int id, int reset) { char buf[VIRTFN_ID_LEN]; struct pci_dev *virtfn; @@ -321,7 +321,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn) } for (i = 0; i < initial; i++) { - rc = virtfn_add(dev, i, 0); + rc = pci_iov_add_virtfn(dev, i, 0); if (rc) goto failed; } @@ -333,7 +333,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn) failed: while (i--) - virtfn_remove(dev, i, 0); + pci_iov_remove_virtfn(dev, i, 0); pcibios_sriov_disable(dev); err_pcibios: @@ -359,7 +359,7 @@ static void sriov_disable(struct pci_dev *dev) return; for (i = 0; i < iov->num_VFs; i++) - virtfn_remove(dev, i, 0); + pci_iov_remove_virtfn(dev, i, 0); pcibios_sriov_disable(dev); diff --git a/include/linux/pci.h b/include/linux/pci.h index 27df4a6..3db5e30 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -1738,6 +1738,8 @@ int pci_iov_virtfn_devfn(struct pci_dev *dev, int id); int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn); void pci_disable_sriov(struct pci_dev *dev); +int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset); +void pci_iov_remove_virtfn(struct pci_dev *dev, int id, int reset); int pci_num_vf(struct pci_dev *dev); int pci_vfs_assigned(struct pci_dev *dev); int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs); @@ -1754,6 +1756,12 @@ static inline int pci_iov_virtfn_devfn(struct pci_dev *dev, int id) { } static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn) { return -ENODEV; } +static inline int pci_iov_add_virtfn(struct pci_dev *dev, int id, int reset) +{ + return -ENOSYS; +} +static inline void pci_iov_remove_virtfn(struct pci_dev *dev, +int id, int reset) { } static inline void pci_disable_sriov(struct pci_dev *dev) { } static inline int pci_num_vf(struct pci_dev *dev) { return 0; } static inline int pci_vfs_assigned(struct pci_dev *dev) -- 2.1.0 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v14 2/9] PCI: Add pcibios_bus_add_device() weak function
From: Wei Yang This adds a weak function, pcibios_bus_add_device(), so that arch-dependent code can do proper setup. For example, powerpc can set up EEH-related resources for SRIOV VFs. Signed-off-by: Wei Yang Reviewed-by: Gavin Shan Acked-by: Bjorn Helgaas --- drivers/pci/bus.c | 3 +++ include/linux/pci.h | 1 + 2 files changed, 4 insertions(+) diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c index 89b3bef..6469ff6 100644 --- a/drivers/pci/bus.c +++ b/drivers/pci/bus.c @@ -271,6 +271,8 @@ bool pci_bus_clip_resource(struct pci_dev *dev, int idx) void __weak pcibios_resource_survey_bus(struct pci_bus *bus) { } +void __weak pcibios_bus_add_device(struct pci_dev *pdev) { } + /** * pci_bus_add_device - start driver for a single device * @dev: device to add @@ -285,6 +287,7 @@ void pci_bus_add_device(struct pci_dev *dev) * Can not put in pci_device_add yet because resources * are not assigned yet for some devices. */ + pcibios_bus_add_device(dev); pci_fixup_device(pci_fixup_final, dev); pci_create_sysfs_dev_files(dev); pci_proc_attach_device(dev); diff --git a/include/linux/pci.h b/include/linux/pci.h index 3db5e30..bc435d62 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -770,6 +770,7 @@ extern struct list_head pci_root_buses; /* list of all known PCI buses */ int no_pci_devices(void); void pcibios_resource_survey_bus(struct pci_bus *bus); +void pcibios_bus_add_device(struct pci_dev *pdev); void pcibios_add_bus(struct pci_bus *bus); void pcibios_remove_bus(struct pci_bus *bus); void pcibios_fixup_bus(struct pci_bus *); -- 2.1.0 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v14 4/9] powerpc/eeh: Cache normal BARs, not windows or IOV BARs
From: Wei Yang This restricts the EEH address cache to use only the first 7 BARs. This makes __eeh_addr_cache_insert_dev() ignore PCI bridge windows and IOV BARs. As a result of this change, eeh_addr_cache_get_dev() will return VFs from VF resource addresses instead of their parent PFs. This also removes the PCI bridge check, as limiting __eeh_addr_cache_insert_dev() to 7 BARs effectively excludes PCI bridges from being cached. Signed-off-by: Wei Yang Acked-by: Gavin Shan --- arch/powerpc/kernel/eeh_cache.c | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/kernel/eeh_cache.c b/arch/powerpc/kernel/eeh_cache.c index a1e86e1..ddbcfab 100644 --- a/arch/powerpc/kernel/eeh_cache.c +++ b/arch/powerpc/kernel/eeh_cache.c @@ -195,8 +195,11 @@ static void __eeh_addr_cache_insert_dev(struct pci_dev *dev) return; } - /* Walk resources on this device, poke them into the tree */ - for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) { + /* +* Walk resources on this device, poke the first 7 (6 normal BARs and 1 +* ROM BAR) into the tree. +*/ + for (i = 0; i <= PCI_ROM_RESOURCE; i++) { resource_size_t start = pci_resource_start(dev,i); resource_size_t end = pci_resource_end(dev,i); unsigned long flags = pci_resource_flags(dev,i); @@ -222,10 +225,6 @@ void eeh_addr_cache_insert_dev(struct pci_dev *dev) { unsigned long flags; - /* Ignore PCI bridges */ - if ((dev->class >> 16) == PCI_BASE_CLASS_BRIDGE) - return; - spin_lock_irqsave(&pci_io_addr_cache_root.piar_lock, flags); __eeh_addr_cache_insert_dev(dev); spin_unlock_irqrestore(&pci_io_addr_cache_root.piar_lock, flags); -- 2.1.0 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
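The new loop bound is what guarantees only the 6 standard BARs plus the ROM BAR get cached: with PCI_ROM_RESOURCE defined as 6 (as in the PCI core), `i <= PCI_ROM_RESOURCE` visits exactly 7 resource slots, stopping short of bridge windows and IOV BARs which live at higher indices. A trivial standalone sketch of that counting:

```c
#include <assert.h>

/* Mirrors the PCI core's resource indexing: slots 0..5 are the
 * standard BARs, slot 6 is the expansion ROM. */
#define PCI_ROM_RESOURCE 6

/* Count how many resource slots the patched loop visits. */
static int cached_resource_count(void)
{
    int i, n = 0;

    for (i = 0; i <= PCI_ROM_RESOURCE; i++)
        n++;
    return n;
}
```

The old `i < DEVICE_COUNT_RESOURCE` bound walked all resource slots, including bridge windows, which is why a separate bridge check was needed before.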
[PATCH v14 9/9] powerpc/eeh: Support error recovery for VF PE
From: Wei Yang PFs are enumerated on the PCI bus, while VFs are created by the PF's driver. In EEH recovery, there are two cases: 1. The device and driver are EEH aware: error handlers are called. 2. The device and driver are not EEH aware: unplug the device and plug it again by enumerating it. The special thing happens in the second case. For a PF, we can use the original PCI core to enumerate the bus, while for a VF we need to record the VFs which are unplugged and then plug them again. The patch also caches the VF index in pci_dn, which can be used to calculate the VF's bus, device and function number. That information helps to locate the VF's PCI device instance when doing hotplug during EEH recovery if necessary. Signed-off-by: Wei Yang Acked-by: Gavin Shan --- arch/powerpc/include/asm/eeh.h| 2 + arch/powerpc/include/asm/pci-bridge.h | 1 + arch/powerpc/kernel/eeh.c | 8 ++ arch/powerpc/kernel/eeh_dev.c | 1 + arch/powerpc/kernel/eeh_driver.c | 137 +++--- arch/powerpc/kernel/pci_dn.c | 4 +- 6 files changed, 127 insertions(+), 26 deletions(-) diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h index b5b5f45..fb9f376 100644 --- a/arch/powerpc/include/asm/eeh.h +++ b/arch/powerpc/include/asm/eeh.h @@ -140,9 +140,11 @@ struct eeh_dev { int af_cap; /* Saved AF capability */ struct eeh_pe *pe; /* Associated PE*/ struct list_head list; /* Form link list in the PE */ + struct list_head rmv_list; /* Record the removed edevs */ struct pci_controller *phb; /* Associated PHB */ struct pci_dn *pdn; /* Associated PCI device node */ struct pci_dev *pdev; /* Associated PCI device*/ + bool in_error; /* Error flag for edev */ struct pci_dev *physfn; /* Associated SRIOV PF */ struct pci_bus *bus;/* PCI bus for partial hotplug */ }; diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h index f4d1758..9f165e8 100644 --- a/arch/powerpc/include/asm/pci-bridge.h +++ b/arch/powerpc/include/asm/pci-bridge.h @@ -212,6 +212,7 @@ struct pci_dn { #define IODA_INVALID_PE(-1) #ifdef CONFIG_PPC_POWERNV int pe_number; + int vf_index; /* VF index in the PF */ #ifdef CONFIG_PCI_IOV u16 vfs_expanded; /* number of VFs IOV BAR expanded */ u16 num_vfs;/* number of VFs enabled*/ diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c index 0d72462..b7338a9 100644 --- a/arch/powerpc/kernel/eeh.c +++ b/arch/powerpc/kernel/eeh.c @@ -1246,6 +1246,14 @@ void eeh_remove_device(struct pci_dev *dev) * from the parent PE during the BAR restore. */ edev->pdev = NULL; + + /* +* The flag "in_error" is used to trace EEH devices for VFs +* in error state or not. It's set in eeh_report_error(). If +* it's not set, eeh_report_{reset,resume}() won't be called +* for the VF EEH device. +*/ + edev->in_error = false; dev->dev.archdata.edev = NULL; if (!(edev->pe->state & EEH_PE_KEEP)) eeh_rmv_from_parent_pe(edev); diff --git a/arch/powerpc/kernel/eeh_dev.c b/arch/powerpc/kernel/eeh_dev.c index aabba94..7815095 100644 --- a/arch/powerpc/kernel/eeh_dev.c +++ b/arch/powerpc/kernel/eeh_dev.c @@ -67,6 +67,7 @@ void *eeh_dev_init(struct pci_dn *pdn, void *data) edev->pdn = pdn; edev->phb = phb; INIT_LIST_HEAD(&edev->list); + INIT_LIST_HEAD(&edev->rmv_list); return NULL; } diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c index 301be31..d1c65f5 100644 --- a/arch/powerpc/kernel/eeh_driver.c +++ b/arch/powerpc/kernel/eeh_driver.c @@ -34,6 +34,11 @@ #include #include +struct eeh_rmv_data { + struct list_head edev_list; + int removed; +}; + /** * eeh_pcid_name - Retrieve name of PCI device driver * @pdev: PCI device @@ -211,6 +216,7 @@ static void *eeh_report_error(void *data, void *userdata) if (rc == PCI_ERS_RESULT_NEED_RESET) *res = rc; if (*res == PCI_ERS_RESULT_NONE) *res = rc; + edev->in_error = true; eeh_pcid_put(dev); return NULL; } @@ -282,7 +288,8 @@ static void *eeh_report_reset(void *data, void *userdata) if (!driver->err_handler || !driver->err_handler->slot_reset || - (edev->mode & EEH_DEV_NO_HANDLER)) { +
(edev->mode & EEH_DEV_NO_HANDLER) || + (!edev->in_error)) { eeh_pcid_put(dev); return NULL; } @@ -326,6 +333,7 @@ static void *eeh_report_resume(void
Re: [PATCH v3 03/18] cxl: Define process problem state area at attach time only
Excerpts from Frederic Barrat's message of 2016-02-07 00:28:50 +1100: > Cxl kernel API was defining the process problem state area during > context initialization, making it possible to map the problem state > area before attaching the context. This won't work on a powerVM > guest. So do the logical thing, like in userspace: attach first, then > map the problem state area. > Remove calls to cxl_assign_psn_space during init. The function is > already called on the attach paths. Looks good. It might be a reasonable idea to make cxl_psa_map fail outright if it is called on a context that has not been attached yet like we do in the user api, but I trust kernel devs to get this right more than userspace so I'm not too worried :) Cheers, -Ian ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v3 06/18] cxl: Isolate a few bare-metal-specific calls
Acked-by: Ian Munsie ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev