[PATCH v1 00/13] powerpc/32s: Use BATs for STRICT_KERNEL_RWX
The purpose of this serie is to use BATs with STRICT_KERNEL_RWX See patch 12 for details. Christophe Leroy (13): powerpc/mm: add exec protection on powerpc 603 powerpc/mm/32: add base address to mmu_mapin_ram() powerpc/mm/32s: rework mmu_mapin_ram() powerpc/mm/32s: use generic mmu_mapin_ram() for all blocks. powerpc/wii: remove wii_mmu_mapin_mem2() powerpc/mm/32s: use _PAGE_EXEC in setbat() powerpc/mm/32s: add setibat() clearibat() and update_bats() powerpc/32: add helper to write into segment registers powerpc/mmu: add is_strict_kernel_rwx() helper powerpc/kconfig: define PAGE_SHIFT inside Kconfig powerpc/kconfig: define CONFIG_DATA_SHIFT and CONFIG_ETEXT_SHIFT powerpc/mm/32s: Use BATs for STRICT_KERNEL_RWX powerpc/kconfig: make _etext and data areas alignment configurable on Book3s 32 arch/powerpc/Kconfig | 46 +++ arch/powerpc/include/asm/book3s/32/hash.h | 1 + arch/powerpc/include/asm/book3s/32/mmu-hash.h | 2 + arch/powerpc/include/asm/book3s/32/pgtable.h | 29 ++-- arch/powerpc/include/asm/cputable.h| 8 +- arch/powerpc/include/asm/mmu.h | 11 ++ arch/powerpc/include/asm/page.h| 13 +- arch/powerpc/include/asm/reg.h | 5 + arch/powerpc/kernel/head_32.S | 37 - arch/powerpc/kernel/vmlinux.lds.S | 9 +- arch/powerpc/mm/40x_mmu.c | 2 +- arch/powerpc/mm/44x_mmu.c | 2 +- arch/powerpc/mm/8xx_mmu.c | 2 +- arch/powerpc/mm/dump_linuxpagetables-generic.c | 2 - arch/powerpc/mm/fsl_booke_mmu.c| 2 +- arch/powerpc/mm/init_32.c | 6 +- arch/powerpc/mm/mmu_decl.h | 10 +- arch/powerpc/mm/pgtable.c | 20 +-- arch/powerpc/mm/pgtable_32.c | 35 +++-- arch/powerpc/mm/ppc_mmu_32.c | 178 + arch/powerpc/platforms/embedded6xx/wii.c | 24 21 files changed, 324 insertions(+), 120 deletions(-) -- 2.13.3
[PATCH v1 04/13] powerpc/mm/32s: use generic mmu_mapin_ram() for all blocks.
Now that mmu_mapin_ram() is able to handle other blocks than the one starting at 0, the WII can use it for all its blocks. Signed-off-by: Christophe Leroy --- arch/powerpc/mm/pgtable_32.c | 25 +++-- 1 file changed, 7 insertions(+), 18 deletions(-) diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c index c030f24d1d05..68a5e2be5343 100644 --- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -271,26 +271,15 @@ static void __init __mapin_ram_chunk(unsigned long offset, unsigned long top) void __init mapin_ram(void) { - unsigned long s, top; - -#ifndef CONFIG_WII - top = total_lowmem; - s = mmu_mapin_ram(0, top); - __mapin_ram_chunk(s, top); -#else - if (!wii_hole_size) { - s = mmu_mapin_ram(0, total_lowmem); - __mapin_ram_chunk(s, total_lowmem); - } else { - top = wii_hole_start; - s = mmu_mapin_ram(0, top); - __mapin_ram_chunk(s, top); + struct memblock_region *reg; + + for_each_memblock(memory, reg) { + unsigned long base = reg->base; + unsigned long top = base + reg->size; - top = memblock_end_of_DRAM(); - s = wii_mmu_mapin_mem2(top); - __mapin_ram_chunk(s, top); + base = mmu_mapin_ram(base, top); + __mapin_ram_chunk(base, top); } -#endif } /* Scan the real Linux page tables and return a PTE pointer for -- 2.13.3
[PATCH v1 10/13] powerpc/kconfig: define PAGE_SHIFT inside Kconfig
Signed-off-by: Christophe Leroy --- arch/powerpc/Kconfig| 7 +++ arch/powerpc/include/asm/page.h | 13 ++--- 2 files changed, 9 insertions(+), 11 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 8be31261aec8..4a81a80d0635 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -711,6 +711,13 @@ config PPC_256K_PAGES endchoice +config PPC_PAGE_SHIFT + int + default 18 if PPC_256K_PAGES + default 16 if PPC_64K_PAGES + default 14 if PPC_16K_PAGES + default 12 + config THREAD_SHIFT int "Thread shift" if EXPERT range 13 15 diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h index 9ea903221a9f..d12a55441629 100644 --- a/arch/powerpc/include/asm/page.h +++ b/arch/powerpc/include/asm/page.h @@ -20,20 +20,11 @@ /* * On regular PPC32 page size is 4K (but we support 4K/16K/64K/256K pages - * on PPC44x). For PPC64 we support either 4K or 64K software + * on PPC44x and 4K/16K on 8xx). For PPC64 we support either 4K or 64K software * page size. When using 64K pages however, whether we are really supporting * 64K pages in HW or not is irrelevant to those definitions. */ -#if defined(CONFIG_PPC_256K_PAGES) -#define PAGE_SHIFT 18 -#elif defined(CONFIG_PPC_64K_PAGES) -#define PAGE_SHIFT 16 -#elif defined(CONFIG_PPC_16K_PAGES) -#define PAGE_SHIFT 14 -#else -#define PAGE_SHIFT 12 -#endif - +#define PAGE_SHIFT CONFIG_PPC_PAGE_SHIFT #define PAGE_SIZE (ASM_CONST(1) << PAGE_SHIFT) #ifndef __ASSEMBLY__ -- 2.13.3
Re: [PATCH v8 07/20] powerpc/mm: add helpers to get/set mm.context->pte_frag
Hi Christophe, Thank you for the patch! Yet something to improve: [auto build test ERROR on powerpc/next] [also build test ERROR on v4.20-rc4] [cannot apply to next-20181129] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Christophe-Leroy/powerpc-book3s32-Remove-CONFIG_BOOKE-dependent-code/20181129-210058 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: powerpc-defconfig (attached as .config) compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree GCC_VERSION=7.2.0 make.cross ARCH=powerpc All errors (new ones prefixed by >>): In file included from arch/powerpc/include/asm/book3s/64/mmu-hash.h:24:0, from arch/powerpc/include/asm/book3s/64/mmu.h:39, from arch/powerpc/include/asm/mmu.h:328, from arch/powerpc/include/asm/lppaca.h:36, from arch/powerpc/include/asm/paca.h:21, from arch/powerpc/include/asm/hw_irq.h:64, from arch/powerpc/include/asm/irqflags.h:12, from include/linux/irqflags.h:16, from include/linux/spinlock.h:54, from include/linux/mmzone.h:8, from include/linux/gfp.h:6, from include/linux/slab.h:15, from include/linux/crypto.h:24, from include/crypto/algapi.h:15, from include/crypto/internal/hash.h:16, from arch/powerpc/crypto/md5-glue.c:15: >> arch/powerpc/include/asm/book3s/64/pgtable.h:219:21: error: "__pte_frag_nr" >> is not defined, evaluates to 0 [-Werror=undef] #define PTE_FRAG_NR __pte_frag_nr ^ arch/powerpc/include/asm/pgtable.h:123:5: note: in expansion of macro 'PTE_FRAG_NR' #if PTE_FRAG_NR != 1 ^~~ cc1: all warnings being treated as errors vim +/__pte_frag_nr +219 arch/powerpc/include/asm/book3s/64/pgtable.h 5ed7ecd0 Aneesh Kumar K.V 2016-04-29 217 5ed7ecd0 Aneesh Kumar K.V 2016-04-29 218 extern unsigned long __pte_frag_nr; 5ed7ecd0 Aneesh Kumar K.V 2016-04-29 @219 #define PTE_FRAG_NR __pte_frag_nr 5ed7ecd0 Aneesh Kumar K.V 2016-04-29 220 extern unsigned long __pte_frag_size_shift; 5ed7ecd0 Aneesh Kumar K.V 2016-04-29 221 #define PTE_FRAG_SIZE_SHIFT __pte_frag_size_shift 5ed7ecd0 Aneesh Kumar K.V 2016-04-29 222 #define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT) dd1842a2 Aneesh Kumar K.V 2016-04-29 223 :: The code at line 219 was first introduced by commit :: 5ed7ecd08a0807d6d616c3d958402f9c723bb048 powerpc/mm: pte_frag abstraction :: TO: Aneesh Kumar K.V :: CC: Michael Ellerman --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
[PATCH v1 01/13] powerpc/mm: add exec protection on powerpc 603
The 603 doesn't have a HASH table, TLB misses are handled by software. It is then possible to generate page fault when _PAGE_EXEC is not set like in nohash/32. There is one "reserved" PTE bit available, this patch uses it for _PAGE_EXEC. In order to support it, set_pte_filter() and set_access_flags_filter() are made common, and the handling is made dependent on MMU_FTR_HPTE_TABLE Reviewed-by: Aneesh Kumar K.V Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/book3s/32/hash.h | 1 + arch/powerpc/include/asm/book3s/32/pgtable.h | 18 +- arch/powerpc/include/asm/cputable.h| 8 arch/powerpc/kernel/head_32.S | 2 +- arch/powerpc/mm/dump_linuxpagetables-generic.c | 2 -- arch/powerpc/mm/pgtable.c | 20 +++- 6 files changed, 26 insertions(+), 25 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/hash.h b/arch/powerpc/include/asm/book3s/32/hash.h index f2892c7ab73e..2a0a467d2985 100644 --- a/arch/powerpc/include/asm/book3s/32/hash.h +++ b/arch/powerpc/include/asm/book3s/32/hash.h @@ -26,6 +26,7 @@ #define _PAGE_WRITETHRU0x040 /* W: cache write-through */ #define _PAGE_DIRTY0x080 /* C: page changed */ #define _PAGE_ACCESSED 0x100 /* R: page referenced */ +#define _PAGE_EXEC 0x200 /* software: exec allowed */ #define _PAGE_RW 0x400 /* software: user write access allowed */ #define _PAGE_SPECIAL 0x800 /* software: Special page */ diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index c21d33704633..cf844fed4527 100644 --- a/arch/powerpc/include/asm/book3s/32/pgtable.h +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h @@ -10,9 +10,9 @@ /* And here we include common definitions */ #define _PAGE_KERNEL_RO0 -#define _PAGE_KERNEL_ROX 0 +#define _PAGE_KERNEL_ROX (_PAGE_EXEC) #define _PAGE_KERNEL_RW(_PAGE_DIRTY | _PAGE_RW) -#define _PAGE_KERNEL_RWX (_PAGE_DIRTY | _PAGE_RW) +#define _PAGE_KERNEL_RWX (_PAGE_DIRTY | _PAGE_RW | _PAGE_EXEC) #define _PAGE_HPTEFLAGS _PAGE_HASHPTE @@ -66,11 +66,11 @@ static inline bool pte_user(pte_t pte) */ #define PAGE_NONE __pgprot(_PAGE_BASE) #define PAGE_SHARED__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW) -#define PAGE_SHARED_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW) +#define PAGE_SHARED_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW | _PAGE_EXEC) #define PAGE_COPY __pgprot(_PAGE_BASE | _PAGE_USER) -#define PAGE_COPY_X__pgprot(_PAGE_BASE | _PAGE_USER) +#define PAGE_COPY_X__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC) #define PAGE_READONLY __pgprot(_PAGE_BASE | _PAGE_USER) -#define PAGE_READONLY_X__pgprot(_PAGE_BASE | _PAGE_USER) +#define PAGE_READONLY_X__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC) /* Permission masks used for kernel mappings */ #define PAGE_KERNEL__pgprot(_PAGE_BASE | _PAGE_KERNEL_RW) @@ -318,7 +318,7 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma, int psize) { unsigned long set = pte_val(entry) & - (_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW); + (_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC); pte_update(ptep, 0, set); @@ -384,7 +384,7 @@ static inline int pte_dirty(pte_t pte) { return !!(pte_val(pte) & _PAGE_DIRTY); static inline int pte_young(pte_t pte) { return !!(pte_val(pte) & _PAGE_ACCESSED); } static inline int pte_special(pte_t pte) { return !!(pte_val(pte) & _PAGE_SPECIAL); } static inline int pte_none(pte_t pte) { return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; } -static inline bool pte_exec(pte_t pte) { return true; } +static inline bool pte_exec(pte_t pte) { return pte_val(pte) & _PAGE_EXEC; } static inline int pte_present(pte_t pte) { @@ -451,7 +451,7 @@ static inline pte_t pte_wrprotect(pte_t pte) static inline pte_t pte_exprotect(pte_t pte) { - return pte; + return __pte(pte_val(pte) & ~_PAGE_EXEC); } static inline pte_t pte_mkclean(pte_t pte) @@ -466,7 +466,7 @@ static inline pte_t pte_mkold(pte_t pte) static inline pte_t pte_mkexec(pte_t pte) { - return pte; + return __pte(pte_val(pte) | _PAGE_EXEC); } static inline pte_t pte_mkpte(pte_t pte) diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h index 29f49a35d6ee..a0395ccbbe9e 100644 --- a/arch/powerpc/include/asm/cputable.h +++ b/arch/powerpc/include/asm/cputable.h @@ -296,7 +296,7 @@ static inline void cpu_feature_keys_init(void) { } #define CPU_FTRS_PPC601(CPU_FTR_COMMON | CPU_FTR_601 | \ CPU_FTR_COHERENT_ICACHE | CPU_FTR_UNIFIED_ID_CACHE | CPU_FTR_USE_RTC) #define CPU_FTRS_603 (CPU_FTR_COMMON | CPU_FTR_MAYBE_CAN_DOZE | \ - CPU_FTR_MAYBE_CAN_NAP | CPU_FTR_PPC_LE) +
[PATCH v1 06/13] powerpc/mm/32s: use _PAGE_EXEC in setbat()
Do not set IBAT when setbat() is called without _PAGE_EXEC Signed-off-by: Christophe Leroy --- arch/powerpc/mm/ppc_mmu_32.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/mm/ppc_mmu_32.c b/arch/powerpc/mm/ppc_mmu_32.c index 61c10ee00ba2..1078095d9407 100644 --- a/arch/powerpc/mm/ppc_mmu_32.c +++ b/arch/powerpc/mm/ppc_mmu_32.c @@ -130,6 +130,7 @@ unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top) * Set up one of the I/D BAT (block address translation) register pairs. * The parameters are not checked; in particular size must be a power * of 2 between 128k and 256M. + * On 603+, only set IBAT when _PAGE_EXEC is set */ void __init setbat(int index, unsigned long virt, phys_addr_t phys, unsigned int size, pgprot_t prot) @@ -156,11 +157,12 @@ void __init setbat(int index, unsigned long virt, phys_addr_t phys, bat[1].batu |= 1; /* Vp = 1 */ if (flags & _PAGE_GUARDED) { /* G bit must be zero in IBATs */ - bat[0].batu = bat[0].batl = 0; - } else { - /* make IBAT same as DBAT */ - bat[0] = bat[1]; + flags &= ~_PAGE_EXEC; } + if (flags & _PAGE_EXEC) + bat[0] = bat[1]; + else + bat[0].batu = bat[0].batl = 0; } else { /* 601 cpu */ if (bl > BL_8M) -- 2.13.3
Re: [PATCH v8 04/14] integrity: Introduce struct evm_xattr
On Fri, 2018-11-16 at 18:07 -0200, Thiago Jung Bauermann wrote: > Even though struct evm_ima_xattr_data includes a fixed-size array to hold a > SHA1 digest, most of the code ignores the array and uses the struct to mean > "type indicator followed by data of unspecified size" and tracks the real > size of what the struct represents in a separate length variable. > > The only exception to that is the EVM code, which correctly uses the > definition of struct evm_ima_xattr_data. > > So make this explicit in the code by removing the length specification from > the array in struct evm_ima_xattr_data. Also, change the name of the > element from digest to data since in most places the array doesn't hold a > digest. > > A separate struct evm_xattr is introduced, with the original definition of > evm_ima_xattr_data to be used in the places that actually expect that > definition. , specifically the EVM HMAC code. > > Signed-off-by: Thiago Jung Bauermann Other than commenting the evm_xattr usage is limited to HMAC before the structure definition, this looks good. Reviewed-by: Mimi Zohar > --- > security/integrity/evm/evm_main.c | 8 > security/integrity/ima/ima_appraise.c | 7 --- > security/integrity/integrity.h| 5 + > 3 files changed, 13 insertions(+), 7 deletions(-) > > diff --git a/security/integrity/evm/evm_main.c > b/security/integrity/evm/evm_main.c > index 7f3f54d89a6e..a1b42d10efc7 100644 > --- a/security/integrity/evm/evm_main.c > +++ b/security/integrity/evm/evm_main.c > @@ -169,7 +169,7 @@ static enum integrity_status evm_verify_hmac(struct > dentry *dentry, > /* check value type */ > switch (xattr_data->type) { > case EVM_XATTR_HMAC: > - if (xattr_len != sizeof(struct evm_ima_xattr_data)) { > + if (xattr_len != sizeof(struct evm_xattr)) { > evm_status = INTEGRITY_FAIL; > goto out; > } > @@ -179,7 +179,7 @@ static enum integrity_status evm_verify_hmac(struct > dentry *dentry, > xattr_value_len, ); > if (rc) > break; > - rc = crypto_memneq(xattr_data->digest, digest.digest, > + rc = crypto_memneq(xattr_data->data, digest.digest, > SHA1_DIGEST_SIZE); > if (rc) > rc = -EINVAL; > @@ -523,7 +523,7 @@ int evm_inode_init_security(struct inode *inode, >const struct xattr *lsm_xattr, >struct xattr *evm_xattr) > { > - struct evm_ima_xattr_data *xattr_data; > + struct evm_xattr *xattr_data; > int rc; > > if (!evm_key_loaded() || !evm_protected_xattr(lsm_xattr->name)) > @@ -533,7 +533,7 @@ int evm_inode_init_security(struct inode *inode, > if (!xattr_data) > return -ENOMEM; > > - xattr_data->type = EVM_XATTR_HMAC; > + xattr_data->data.type = EVM_XATTR_HMAC; > rc = evm_init_hmac(inode, lsm_xattr, xattr_data->digest); > if (rc < 0) > goto out; > diff --git a/security/integrity/ima/ima_appraise.c > b/security/integrity/ima/ima_appraise.c > index deec1804a00a..8bcef90939f8 100644 > --- a/security/integrity/ima/ima_appraise.c > +++ b/security/integrity/ima/ima_appraise.c > @@ -167,7 +167,8 @@ enum hash_algo ima_get_hash_algo(struct > evm_ima_xattr_data *xattr_value, > return sig->hash_algo; > break; > case IMA_XATTR_DIGEST_NG: > - ret = xattr_value->digest[0]; > + /* first byte contains algorithm id */ > + ret = xattr_value->data[0]; > if (ret < HASH_ALGO__LAST) > return ret; > break; > @@ -175,7 +176,7 @@ enum hash_algo ima_get_hash_algo(struct > evm_ima_xattr_data *xattr_value, > /* this is for backward compatibility */ > if (xattr_len == 21) { > unsigned int zero = 0; > - if (!memcmp(_value->digest[16], , 4)) > + if (!memcmp(_value->data[16], , 4)) > return HASH_ALGO_MD5; > else > return HASH_ALGO_SHA1; > @@ -274,7 +275,7 @@ int ima_appraise_measurement(enum ima_hooks func, > /* xattr length may be longer. md5 hash in previous > version occupied 20 bytes in xattr, instead of 16 >*/ > - rc = memcmp(_value->digest[hash_start], > + rc = memcmp(_value->data[hash_start], > iint->ima_hash->digest, > iint->ima_hash->length); > else > diff --git a/security/integrity/integrity.h b/security/integrity/integrity.h > index e60473b13a8d..20ac02bf1b84 100644 > --- a/security/integrity/integrity.h > +++
[PATCH v1 02/13] powerpc/mm/32: add base address to mmu_mapin_ram()
At the time being, mmu_mapin_ram() always maps RAM from the beginning. But some platforms like the WII have to map a second block of RAM. This patch adds to mmu_mapin_ram() the base address of the block. At the moment, only base address 0 is supported. Signed-off-by: Christophe Leroy --- arch/powerpc/mm/40x_mmu.c | 2 +- arch/powerpc/mm/44x_mmu.c | 2 +- arch/powerpc/mm/8xx_mmu.c | 2 +- arch/powerpc/mm/fsl_booke_mmu.c | 2 +- arch/powerpc/mm/mmu_decl.h | 2 +- arch/powerpc/mm/pgtable_32.c| 6 +++--- arch/powerpc/mm/ppc_mmu_32.c| 2 +- 7 files changed, 9 insertions(+), 9 deletions(-) diff --git a/arch/powerpc/mm/40x_mmu.c b/arch/powerpc/mm/40x_mmu.c index 61ac468c87c6..b9cf6f8764b0 100644 --- a/arch/powerpc/mm/40x_mmu.c +++ b/arch/powerpc/mm/40x_mmu.c @@ -93,7 +93,7 @@ void __init MMU_init_hw(void) #define LARGE_PAGE_SIZE_16M(1<<24) #define LARGE_PAGE_SIZE_4M (1<<22) -unsigned long __init mmu_mapin_ram(unsigned long top) +unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top) { unsigned long v, s, mapped; phys_addr_t p; diff --git a/arch/powerpc/mm/44x_mmu.c b/arch/powerpc/mm/44x_mmu.c index 12d92518e898..f59df82896a0 100644 --- a/arch/powerpc/mm/44x_mmu.c +++ b/arch/powerpc/mm/44x_mmu.c @@ -178,7 +178,7 @@ void __init MMU_init_hw(void) flush_instruction_cache(); } -unsigned long __init mmu_mapin_ram(unsigned long top) +unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top) { unsigned long addr; unsigned long memstart = memstart_addr & ~(PPC_PIN_SIZE - 1); diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c index 01b7f5107c3a..50b640c7a7f9 100644 --- a/arch/powerpc/mm/8xx_mmu.c +++ b/arch/powerpc/mm/8xx_mmu.c @@ -107,7 +107,7 @@ static void __init mmu_patch_cmp_limit(s32 *site, unsigned long mapped) patch_instruction_site(site, instr); } -unsigned long __init mmu_mapin_ram(unsigned long top) +unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top) { unsigned long mapped; diff --git a/arch/powerpc/mm/fsl_booke_mmu.c b/arch/powerpc/mm/fsl_booke_mmu.c index 080d49b26c3a..210cbc1faf63 100644 --- a/arch/powerpc/mm/fsl_booke_mmu.c +++ b/arch/powerpc/mm/fsl_booke_mmu.c @@ -221,7 +221,7 @@ unsigned long map_mem_in_cams(unsigned long ram, int max_cam_idx, bool dryrun) #error "LOWMEM_CAM_NUM must be less than NUM_TLBCAMS" #endif -unsigned long __init mmu_mapin_ram(unsigned long top) +unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top) { return tlbcam_addrs[tlbcam_index - 1].limit - PAGE_OFFSET + 1; } diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h index 8574fbbc45e0..c29f061b1678 100644 --- a/arch/powerpc/mm/mmu_decl.h +++ b/arch/powerpc/mm/mmu_decl.h @@ -130,7 +130,7 @@ extern void wii_memory_fixups(void); */ #ifdef CONFIG_PPC32 extern void MMU_init_hw(void); -extern unsigned long mmu_mapin_ram(unsigned long top); +unsigned long mmu_mapin_ram(unsigned long base, unsigned long top); #endif #ifdef CONFIG_PPC_FSL_BOOK3E diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c index bda3c6f1bd32..c030f24d1d05 100644 --- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -275,15 +275,15 @@ void __init mapin_ram(void) #ifndef CONFIG_WII top = total_lowmem; - s = mmu_mapin_ram(top); + s = mmu_mapin_ram(0, top); __mapin_ram_chunk(s, top); #else if (!wii_hole_size) { - s = mmu_mapin_ram(total_lowmem); + s = mmu_mapin_ram(0, total_lowmem); __mapin_ram_chunk(s, total_lowmem); } else { top = wii_hole_start; - s = mmu_mapin_ram(top); + s = mmu_mapin_ram(0, top); __mapin_ram_chunk(s, top); top = memblock_end_of_DRAM(); diff --git a/arch/powerpc/mm/ppc_mmu_32.c b/arch/powerpc/mm/ppc_mmu_32.c index f6f575bae3bc..3a29e88308b0 100644 --- a/arch/powerpc/mm/ppc_mmu_32.c +++ b/arch/powerpc/mm/ppc_mmu_32.c @@ -72,7 +72,7 @@ unsigned long p_block_mapped(phys_addr_t pa) return 0; } -unsigned long __init mmu_mapin_ram(unsigned long top) +unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top) { unsigned long tot, bl, done; unsigned long max_size = (256<<20); -- 2.13.3
[PATCH v1 07/13] powerpc/mm/32s: add setibat() clearibat() and update_bats()
setibat() and clearibat() allows to manipulate IBATs independently of DBATs. update_bats() allows to update bats after init. This is done with MMU off. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/book3s/32/mmu-hash.h | 2 ++ arch/powerpc/kernel/head_32.S | 35 +++ arch/powerpc/mm/ppc_mmu_32.c | 32 3 files changed, 69 insertions(+) diff --git a/arch/powerpc/include/asm/book3s/32/mmu-hash.h b/arch/powerpc/include/asm/book3s/32/mmu-hash.h index e38c91388c40..b4ccb832d4fb 100644 --- a/arch/powerpc/include/asm/book3s/32/mmu-hash.h +++ b/arch/powerpc/include/asm/book3s/32/mmu-hash.h @@ -83,6 +83,8 @@ typedef struct { unsigned long vdso_base; } mm_context_t; +void update_bats(void); + #endif /* !__ASSEMBLY__ */ /* We happily ignore the smaller BATs on 601, we don't actually use diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S index d1c39b5ccfd6..0f4c72ebb151 100644 --- a/arch/powerpc/kernel/head_32.S +++ b/arch/powerpc/kernel/head_32.S @@ -1101,6 +1101,41 @@ BEGIN_MMU_FTR_SECTION END_MMU_FTR_SECTION_IFSET(MMU_FTR_USE_HIGH_BATS) blr +_ENTRY(update_bats) + lis r4, 1f@h + ori r4, r4, 1f@l + tophys(r4, r4) + mfmsr r6 + mflrr7 + li r3, MSR_KERNEL & ~(MSR_IR | MSR_DR) + rlwinm r0, r6, 0, ~MSR_RI + rlwinm r0, r0, 0, ~MSR_EE + mtmsr r0 + mtspr SPRN_SRR0, r4 + mtspr SPRN_SRR1, r3 + SYNC + RFI +1: bl clear_bats + lis r3, BATS@ha + addir3, r3, BATS@l + tophys(r3, r3) + LOAD_BAT(0, r3, r4, r5) + LOAD_BAT(1, r3, r4, r5) + LOAD_BAT(2, r3, r4, r5) + LOAD_BAT(3, r3, r4, r5) +BEGIN_MMU_FTR_SECTION + LOAD_BAT(4, r3, r4, r5) + LOAD_BAT(5, r3, r4, r5) + LOAD_BAT(6, r3, r4, r5) + LOAD_BAT(7, r3, r4, r5) +END_MMU_FTR_SECTION_IFSET(MMU_FTR_USE_HIGH_BATS) + li r3, MSR_KERNEL & ~(MSR_IR | MSR_DR | MSR_RI) + mtmsr r3 + mtspr SPRN_SRR0, r7 + mtspr SPRN_SRR1, r6 + SYNC + RFI + flush_tlbs: lis r10, 0x40 1: addic. r10, r10, -0x1000 diff --git a/arch/powerpc/mm/ppc_mmu_32.c b/arch/powerpc/mm/ppc_mmu_32.c index 1078095d9407..58dd71686707 100644 --- a/arch/powerpc/mm/ppc_mmu_32.c +++ b/arch/powerpc/mm/ppc_mmu_32.c @@ -105,6 +105,38 @@ static unsigned int block_size(unsigned long base, unsigned long top) return min3(max_size, 1U << base_shift, 1U << block_shift); } +/* + * Set up one of the IBAT (block address translation) register pairs. + * The parameters are not checked; in particular size must be a power + * of 2 between 128k and 256M. + * Only for 603+ ... + */ +static void setibat(int index, unsigned long virt, phys_addr_t phys, + unsigned int size, pgprot_t prot) +{ + unsigned int bl = (size >> 17) - 1; + int wimgxpp; + struct ppc_bat *bat = BATS[index]; + unsigned long flags = pgprot_val(prot); + + if (!cpu_has_feature(CPU_FTR_NEED_COHERENT)) + flags &= ~_PAGE_COHERENT; + + wimgxpp = (flags & _PAGE_COHERENT) | (_PAGE_EXEC ? BPP_RX : BPP_XX); + bat[0].batu = virt | (bl << 2) | 2; /* Vs=1, Vp=0 */ + bat[0].batl = BAT_PHYS_ADDR(phys) | wimgxpp; + if (flags & _PAGE_USER) + bat[0].batu |= 1; /* Vp = 1 */ +} + +static void clearibat(int index) +{ + struct ppc_bat *bat = BATS[index]; + + bat[0].batu = 0; + bat[0].batl = 0; +} + unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top) { int idx; -- 2.13.3
Re: [PATCH v8 17/20] powerpc/8xx: Enable 512k hugepage support with HW assistance
Hi Christophe, Thank you for the patch! Yet something to improve: [auto build test ERROR on powerpc/next] [also build test ERROR on v4.20-rc4] [cannot apply to next-20181129] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Christophe-Leroy/powerpc-book3s32-Remove-CONFIG_BOOKE-dependent-code/20181129-210058 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: powerpc-defconfig (attached as .config) compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree GCC_VERSION=7.2.0 make.cross ARCH=powerpc All errors (new ones prefixed by >>): In file included from arch/powerpc/include/asm/book3s/64/mmu-hash.h:24:0, from arch/powerpc/include/asm/book3s/64/mmu.h:39, from arch/powerpc/include/asm/mmu.h:328, from arch/powerpc/include/asm/lppaca.h:36, from arch/powerpc/include/asm/paca.h:21, from arch/powerpc/include/asm/hw_irq.h:64, from arch/powerpc/include/asm/irqflags.h:12, from include/linux/irqflags.h:16, from include/linux/spinlock.h:54, from include/linux/mmzone.h:8, from include/linux/gfp.h:6, from include/linux/mm.h:10, from arch/powerpc//mm/hugetlbpage.c:11: arch/powerpc/include/asm/book3s/64/pgtable.h:219:21: error: "__pte_frag_nr" is not defined, evaluates to 0 [-Werror=undef] #define PTE_FRAG_NR __pte_frag_nr ^ arch/powerpc/include/asm/pgtable.h:123:5: note: in expansion of macro 'PTE_FRAG_NR' #if PTE_FRAG_NR != 1 ^~~ In file included from arch/powerpc/include/asm/book3s/pgalloc.h:10:0, from arch/powerpc/include/asm/pgalloc.h:24, from arch/powerpc//mm/hugetlbpage.c:23: arch/powerpc//mm/hugetlbpage.c: In function '__hugepte_alloc': >> arch/powerpc//mm/hugetlbpage.c:69:22: error: 'PTE_SHIFT' undeclared (first >> use in this function); did you mean 'PUD_SHIFT'? cachep = PGT_CACHE(PTE_SHIFT); ^ arch/powerpc/include/asm/book3s/64/pgalloc.h:40:40: note: in definition of macro 'PGT_CACHE' #define PGT_CACHE(shift) pgtable_cache[shift] ^ arch/powerpc//mm/hugetlbpage.c:69:22: note: each undeclared identifier is reported only once for each function it appears in cachep = PGT_CACHE(PTE_SHIFT); ^ arch/powerpc/include/asm/book3s/64/pgalloc.h:40:40: note: in definition of macro 'PGT_CACHE' #define PGT_CACHE(shift) pgtable_cache[shift] ^ arch/powerpc//mm/hugetlbpage.c: In function 'free_hugepd_range': arch/powerpc//mm/hugetlbpage.c:339:29: error: 'PTE_SHIFT' undeclared (first use in this function); did you mean 'PUD_SHIFT'? get_hugepd_cache_index(PTE_SHIFT)); ^ PUD_SHIFT arch/powerpc//mm/hugetlbpage.c: In function 'hugetlbpage_init': arch/powerpc//mm/hugetlbpage.c:710:22: error: 'PTE_SHIFT' undeclared (first use in this function); did you mean 'PUD_SHIFT'? pgtable_cache_add(PTE_SHIFT); ^ PUD_SHIFT cc1: all warnings being treated as errors vim +69 arch/powerpc//mm/hugetlbpage.c > 23 #include 24 #include 25 #include 26 #include 27 #include 28 29 30 #ifdef CONFIG_HUGETLB_PAGE 31 32 #define PAGE_SHIFT_64K 16 33 #define PAGE_SHIFT_512K 19 34 #define PAGE_SHIFT_8M 23 35 #define PAGE_SHIFT_16M 24 36 #define PAGE_SHIFT_16G 34 37 38 bool hugetlb_disabled = false; 39 40 unsigned int HPAGE_SHIFT; 41 EXPORT_SYMBOL(HPAGE_SHIFT); 42 43 #define hugepd_none(hpd)(hpd_val(hpd) == 0) 44 45 #define PTE_T_ORDER (__builtin_ffs(sizeof(pte_t)) - __builtin_ffs(sizeof(void *))) 46 47 pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr, unsigned long sz) 48 { 49 /* 50 * Only called for hugetlbfs pages, hence can ignore THP and the 51 * irq disabled walk. 52 */ 53 return __find_linux_pte(mm->pgd, addr, NULL, NULL); 54 } 55 56 static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp, 57 unsigned long address, unsigned int pdshift, 58 unsigned int pshift, spinlock_t *ptl
Re: [PATCH v8 07/20] powerpc/mm: add helpers to get/set mm.context->pte_frag
Hi Christophe, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on powerpc/next] [also build test WARNING on v4.20-rc4] [cannot apply to next-20181129] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Christophe-Leroy/powerpc-book3s32-Remove-CONFIG_BOOKE-dependent-code/20181129-210058 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: powerpc-allmodconfig (attached as .config) compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree GCC_VERSION=7.2.0 make.cross ARCH=powerpc All warnings (new ones prefixed by >>): In file included from arch/powerpc/include/asm/book3s/64/mmu-hash.h:24:0, from arch/powerpc/include/asm/book3s/64/mmu.h:39, from arch/powerpc/include/asm/mmu.h:328, from arch/powerpc/include/asm/lppaca.h:36, from arch/powerpc/include/asm/paca.h:21, from arch/powerpc/include/asm/hw_irq.h:64, from arch/powerpc/include/asm/irqflags.h:12, from include/linux/irqflags.h:16, from include/linux/spinlock.h:54, from include/linux/mmzone.h:8, from include/linux/gfp.h:6, from include/linux/slab.h:15, from drivers/usb/gadget/function/f_acm.c:14: >> arch/powerpc/include/asm/book3s/64/pgtable.h:219:21: warning: >> "__pte_frag_nr" is not defined, evaluates to 0 [-Wundef] #define PTE_FRAG_NR __pte_frag_nr ^ >> arch/powerpc/include/asm/pgtable.h:123:5: note: in expansion of macro >> 'PTE_FRAG_NR' #if PTE_FRAG_NR != 1 ^~~ -- In file included from arch/powerpc/include/asm/book3s/64/mmu-hash.h:24:0, from arch/powerpc/include/asm/book3s/64/mmu.h:39, from arch/powerpc/include/asm/mmu.h:328, from arch/powerpc/include/asm/lppaca.h:36, from arch/powerpc/include/asm/paca.h:21, from arch/powerpc/include/asm/current.h:16, from include/linux/sched.h:12, from drivers/usb/gadget/function/u_serial.c:18: >> arch/powerpc/include/asm/book3s/64/pgtable.h:219:21: warning: >> "__pte_frag_nr" is not defined, evaluates to 0 [-Wundef] #define PTE_FRAG_NR __pte_frag_nr ^ >> arch/powerpc/include/asm/pgtable.h:123:5: note: in expansion of macro >> 'PTE_FRAG_NR' #if PTE_FRAG_NR != 1 ^~~ In file included from arch/powerpc/include/asm/book3s/64/mmu-hash.h:24:0, from arch/powerpc/include/asm/book3s/64/mmu.h:39, from arch/powerpc/include/asm/mmu.h:328, from arch/powerpc/include/asm/lppaca.h:36, from arch/powerpc/include/asm/paca.h:21, from arch/powerpc/include/asm/current.h:16, from include/linux/sched.h:12, from drivers/usb/gadget/function/u_serial.c:18: >> arch/powerpc/include/asm/book3s/64/pgtable.h:219:21: warning: >> "__pte_frag_nr" is not defined, evaluates to 0 [-Wundef] #define PTE_FRAG_NR __pte_frag_nr ^ >> arch/powerpc/include/asm/pgtable.h:123:5: note: in expansion of macro >> 'PTE_FRAG_NR' #if PTE_FRAG_NR != 1 ^~~ -- In file included from arch/powerpc/include/asm/book3s/64/mmu-hash.h:24:0, from arch/powerpc/include/asm/book3s/64/mmu.h:39, from arch/powerpc/include/asm/mmu.h:328, from arch/powerpc/include/asm/lppaca.h:36, from arch/powerpc/include/asm/paca.h:21, from arch/powerpc/include/asm/current.h:16, from include/linux/sched.h:12, from drivers/staging/rtlwifi/rtl8822be/../wifi.h:20, from drivers/staging/rtlwifi/rtl8822be/trx.c:15: >> arch/powerpc/include/asm/book3s/64/pgtable.h:219:21: warning: >> "__pte_frag_nr" is not defined, evaluates to 0 [-Wundef] #define PTE_FRAG_NR __pte_frag_nr ^ >> arch/powerpc/include/asm/pgtable.h:123:5: note: in expansion of macro >> 'PTE_FRAG_NR' #if PTE_FRAG_NR != 1 ^~~ In file included from drivers/staging/rtlwifi/rtl8822be/trx.c:26:0: include/linux/vermagic.h:29:10: fatal error: generated/randomize_layout_hash.h: No such file or directory #include ^
[PATCH v1 03/13] powerpc/mm/32s: rework mmu_mapin_ram()
This patch reworks mmu_mapin_ram() to be more generic and map as much blocks as possible. It now supports blocks not starting at address 0. It scans DBATs array to find free ones instead of forcing the use of BAT2 and BAT3. Signed-off-by: Christophe Leroy --- arch/powerpc/mm/ppc_mmu_32.c | 61 +--- 1 file changed, 40 insertions(+), 21 deletions(-) diff --git a/arch/powerpc/mm/ppc_mmu_32.c b/arch/powerpc/mm/ppc_mmu_32.c index 3a29e88308b0..61c10ee00ba2 100644 --- a/arch/powerpc/mm/ppc_mmu_32.c +++ b/arch/powerpc/mm/ppc_mmu_32.c @@ -72,39 +72,58 @@ unsigned long p_block_mapped(phys_addr_t pa) return 0; } +static int find_free_bat(void) +{ + int b; + + if (cpu_has_feature(CPU_FTR_601)) { + for (b = 0; b < 4; b++) { + struct ppc_bat *bat = BATS[b]; + + if (!(bat[0].batl & 0x40)) + return b; + } + } else { + int n = mmu_has_feature(MMU_FTR_USE_HIGH_BATS) ? 8 : 4; + + for (b = 0; b < n; b++) { + struct ppc_bat *bat = BATS[b]; + + if (!(bat[1].batu & 3)) + return b; + } + } + return -1; +} + +static unsigned int block_size(unsigned long base, unsigned long top) +{ + unsigned int max_size = (cpu_has_feature(CPU_FTR_601) ? 8 : 256) << 20; + unsigned int base_shift = (fls(base) - 1) & 31; + unsigned int block_shift = (fls(top - base) - 1) & 31; + + return min3(max_size, 1U << base_shift, 1U << block_shift); +} + unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top) { - unsigned long tot, bl, done; - unsigned long max_size = (256<<20); + int idx; if (__map_without_bats) { printk(KERN_DEBUG "RAM mapped without BATs\n"); return 0; } - /* Set up BAT2 and if necessary BAT3 to cover RAM. */ + while ((idx = find_free_bat()) != -1 && base != top) { + unsigned int size = block_size(base, top); - /* Make sure we don't map a block larger than the - smallest alignment of the physical address. */ - tot = top; - for (bl = 128<<10; bl < max_size; bl <<= 1) { - if (bl * 2 > tot) + if (size < 128 << 10) break; + setbat(idx, PAGE_OFFSET + base, base, size, PAGE_KERNEL_TEXT); + base += size; } - setbat(2, PAGE_OFFSET, 0, bl, PAGE_KERNEL_X); - done = (unsigned long)bat_addrs[2].limit - PAGE_OFFSET + 1; - if ((done < tot) && !bat_addrs[3].limit) { - /* use BAT3 to cover a bit more */ - tot -= done; - for (bl = 128<<10; bl < max_size; bl <<= 1) - if (bl * 2 > tot) - break; - setbat(3, PAGE_OFFSET+done, done, bl, PAGE_KERNEL_X); - done = (unsigned long)bat_addrs[3].limit - PAGE_OFFSET + 1; - } - - return done; + return base; } /* -- 2.13.3
[PATCH v1 11/13] powerpc/kconfig: define CONFIG_DATA_SHIFT and CONFIG_ETEXT_SHIFT
CONFIG_STRICT_KERNEL_RWX requires a special alignment for DATA for some subarches. Today it is just defined as an #ifdef in vmlinux.lds.S In order to get more flexibility, this patch moves the definition of this alignment in Kconfig On some subarches, CONFIG_STRICT_KERNEL_RWX will require a special alignment of _etext. This patch also adds a configuration item for it in Kconfig Signed-off-by: Christophe Leroy --- arch/powerpc/Kconfig | 9 + arch/powerpc/kernel/vmlinux.lds.S | 9 +++-- 2 files changed, 12 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 4a81a80d0635..f3e420f3f1d7 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -728,6 +728,15 @@ config THREAD_SHIFT Used to define the stack size. The default is almost always what you want. Only change this if you know what you are doing. +config ETEXT_SHIFT + int + default PPC_PAGE_SHIFT + +config DATA_SHIFT + int + default 24 if STRICT_KERNEL_RWX && PPC64 + default PPC_PAGE_SHIFT + config FORCE_MAX_ZONEORDER int "Maximum zone order" range 8 9 if PPC64 && PPC_64K_PAGES diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S index 1148c3c60c3b..d210dcfe915a 100644 --- a/arch/powerpc/kernel/vmlinux.lds.S +++ b/arch/powerpc/kernel/vmlinux.lds.S @@ -12,11 +12,8 @@ #include #include -#if defined(CONFIG_STRICT_KERNEL_RWX) && !defined(CONFIG_PPC32) -#define STRICT_ALIGN_SIZE (1 << 24) -#else -#define STRICT_ALIGN_SIZE PAGE_SIZE -#endif +#define STRICT_ALIGN_SIZE (1 << CONFIG_DATA_SHIFT) +#define ETEXT_ALIGN_SIZE (1 << CONFIG_ETEXT_SHIFT) ENTRY(_stext) @@ -131,7 +128,7 @@ SECTIONS } :kernel - . = ALIGN(PAGE_SIZE); + . = ALIGN(ETEXT_ALIGN_SIZE); _etext = .; PROVIDE32 (etext = .); -- 2.13.3
[PATCH v1 13/13] powerpc/kconfig: make _etext and data areas alignment configurable on Book3s 32
Depending on the number of available BATs for mapping the different kernel areas, it might be needed to increase the alignment of _etext and/or of data areas. This patchs allows the user to do it via Kconfig. Signed-off-by: Christophe Leroy --- arch/powerpc/Kconfig | 32 ++-- 1 file changed, 30 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index ffcf4d7a1186..bab9dab815d9 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -728,16 +728,44 @@ config THREAD_SHIFT Used to define the stack size. The default is almost always what you want. Only change this if you know what you are doing. +config ETEXT_SHIFT_BOOL + bool "Set custom etext alignment" if STRICT_KERNEL_RWX && PPC_BOOK3S_32 + depends on ADVANCED_OPTIONS + help + This option allows you to set the kernel end of text alignment. When + RAM is mapped by blocks, the alignment needs to fit the size and + number of possible blocks. The default should be OK for most configs. + + Say N here unless you know what you are doing. + config ETEXT_SHIFT - int + int "_etext shift" if ETEXT_SHIFT_BOOL + range 17 28 if STRICT_KERNEL_RWX && PPC_BOOK3S_32 default 17 if STRICT_KERNEL_RWX && PPC_BOOK3S_32 default PPC_PAGE_SHIFT + help + On Book3S 32 (603+), IBATs are used to map kernel text. + Smaller is the alignment, greater is the number of necessary IBATs. + +config DATA_SHIFT_BOOL + bool "Set custom data alignment" if STRICT_KERNEL_RWX && PPC_BOOK3S_32 + depends on ADVANCED_OPTIONS + help + This option allows you to set the kernel data alignment. When + RAM is mapped by blocks, the alignment needs to fit the size and + number of possible blocks. The default should be OK for most configs. + + Say N here unless you know what you are doing. config DATA_SHIFT - int + int "Data shift" if DATA_SHIFT_BOOL default 24 if STRICT_KERNEL_RWX && PPC64 + range 17 28 if STRICT_KERNEL_RWX && PPC_BOOK3S_32 default 22 if STRICT_KERNEL_RWX && PPC_BOOK3S_32 default PPC_PAGE_SHIFT + help + On Book3S 32 (603+), DBATs are used to map kernel text and rodata RO. + Smaller is the alignment, greater is the number of necessary DBATs. config FORCE_MAX_ZONEORDER int "Maximum zone order" -- 2.13.3
[PATCH v1 08/13] powerpc/32: add helper to write into segment registers
This patch add an helper which wraps 'mtsrin' instruction to write into segment registers. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/reg.h | 5 + 1 file changed, 5 insertions(+) diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index de52c3166ba4..c9c382e57017 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -1423,6 +1423,11 @@ static inline void msr_check_and_clear(unsigned long bits) #define mfsrin(v) ({unsigned int rval; \ asm volatile("mfsrin %0,%1" : "=r" (rval) : "r" (v)); \ rval;}) + +static inline void mtsrin(u32 val, u32 idx) +{ + asm volatile("mtsrin %0, %1" : : "r" (val), "r" (idx)); +} #endif #define proc_trap()asm volatile("trap") -- 2.13.3
[PATCH v1 05/13] powerpc/wii: remove wii_mmu_mapin_mem2()
wii_mmu_mapin_mem2() is not used anymore, remove it. Signed-off-by: Christophe Leroy --- arch/powerpc/platforms/embedded6xx/wii.c | 24 1 file changed, 24 deletions(-) diff --git a/arch/powerpc/platforms/embedded6xx/wii.c b/arch/powerpc/platforms/embedded6xx/wii.c index ecf703ee3a76..235fe81aa2b1 100644 --- a/arch/powerpc/platforms/embedded6xx/wii.c +++ b/arch/powerpc/platforms/embedded6xx/wii.c @@ -54,10 +54,6 @@ static void __iomem *hw_ctrl; static void __iomem *hw_gpio; -unsigned long wii_hole_start; -unsigned long wii_hole_size; - - static int __init page_aligned(unsigned long x) { return !(x & (PAGE_SIZE-1)); @@ -69,26 +65,6 @@ void __init wii_memory_fixups(void) BUG_ON(memblock.memory.cnt != 2); BUG_ON(!page_aligned(p[0].base) || !page_aligned(p[1].base)); - - /* determine hole */ - wii_hole_start = ALIGN(p[0].base + p[0].size, PAGE_SIZE); - wii_hole_size = p[1].base - wii_hole_start; -} - -unsigned long __init wii_mmu_mapin_mem2(unsigned long top) -{ - unsigned long delta, size, bl; - unsigned long max_size = (256<<20); - - /* MEM2 64MB@0x1000 */ - delta = wii_hole_start + wii_hole_size; - size = top - delta; - for (bl = 128<<10; bl < max_size; bl <<= 1) { - if (bl * 2 > size) - break; - } - setbat(4, PAGE_OFFSET+delta, delta, bl, PAGE_KERNEL_X); - return delta + bl; } static void __noreturn wii_spin(void) -- 2.13.3
[PATCH v1 12/13] powerpc/mm/32s: Use BATs for STRICT_KERNEL_RWX
Today, STRICT_KERNEL_RWX is based on the use of regular pages to map kernel pages. On Book3s 32, it has three consequences: - Using pages instead of BAT for mapping kernel linear memory severely impacts performance. - Exec protection is not effective because no-execute cannot be set at page level (except on 603 which doesn't have hash tables) - Write protection is not effective because PP bits do not provide RO mode for kernel-only pages (except on 603 which handles it in software via PAGE_DIRTY) On the 603+, we have: - Independent IBAT and DBAT allowing limitation of exec parts. - NX bit can be set in segment registers to forbit execution on memory mapped by pages. - RO mode on DBATs even for kernel-only blocks. On the 601, there is nothing much we can do other than warn the user about it, because: - BATs are common to instructions and data. - BAT do not provide RO mode for kernel-only blocks. - segment registers don't have the NX bit. In order to use IBAT for exec protection, this patch: - Aligns _etext to BAT block sizes (128kb) - Set NX bit in kernel segment register (Except on vmalloc area when CONFIG_MODULES is selected) - Maps kernel text with IBATs. In order to use DBAT for exec protection, this patch: - Aligns RW DATA to BAT block sizes (4M) - Maps kernel RO area with write prohibited DBATs - Maps remaining memory with remaining DBATs Here is what we get with this patch on a 832x when activating STRICT_KERNEL_RWX: Symbols: c000 T _stext c068 R __start_rodata c068 R _etext c080 T __init_begin c080 T _sinittext ~# cat /sys/kernel/debug/block_address_translation ---[ Instruction Block Address Translation ]--- 0: 0xc000-0xc03f 0x Kernel EXEC coherent 1: 0xc040-0xc05f 0x0040 Kernel EXEC coherent 2: 0xc060-0xc067 0x0060 Kernel EXEC coherent 3: - 4: - 5: - 6: - 7: - ---[ Data Block Address Translation ]--- 0: 0xc000-0xc07f 0x Kernel RO coherent 1: 0xc080-0xc0ff 0x0080 Kernel RW coherent 2: 0xc100-0xc1ff 0x0100 Kernel RW coherent 3: 0xc200-0xc3ff 0x0200 Kernel RW coherent 4: 0xc400-0xc7ff 0x0400 Kernel RW coherent 5: 0xc800-0xcfff 0x0800 Kernel RW coherent 6: 0xd000-0xdfff 0x1000 Kernel RW coherent 7: - ~# cat /sys/kernel/debug/segment_registers ---[ User Segments ]--- 0x-0x0fff Kern key 1 User key 1 VSID 0xa085d0 0x1000-0x1fff Kern key 1 User key 1 VSID 0xa086e1 0x2000-0x2fff Kern key 1 User key 1 VSID 0xa087f2 0x3000-0x3fff Kern key 1 User key 1 VSID 0xa08903 0x4000-0x4fff Kern key 1 User key 1 VSID 0xa08a14 0x5000-0x5fff Kern key 1 User key 1 VSID 0xa08b25 0x6000-0x6fff Kern key 1 User key 1 VSID 0xa08c36 0x7000-0x7fff Kern key 1 User key 1 VSID 0xa08d47 0x8000-0x8fff Kern key 1 User key 1 VSID 0xa08e58 0x9000-0x9fff Kern key 1 User key 1 VSID 0xa08f69 0xa000-0xafff Kern key 1 User key 1 VSID 0xa0907a 0xb000-0xbfff Kern key 1 User key 1 VSID 0xa0918b ---[ Kernel Segments ]--- 0xc000-0xcfff Kern key 0 User key 1 No Exec VSID 0x000ccc 0xd000-0xdfff Kern key 0 User key 1 No Exec VSID 0x000ddd 0xe000-0xefff Kern key 0 User key 1 No Exec VSID 0x000eee 0xf000-0x Kern key 0 User key 1 No Exec VSID 0x000fff Aligning _etext to 128kb allows to map up to 32Mb text with 8 IBATs: 16Mb + 8Mb + 4Mb + 2Mb + 1Mb + 512kb + 256kb + 128kb (+ 128kb) = 32Mb (A 9th IBAT is unneeded as 32Mb would need only a single 32Mb block) Aligning data to 4M allows to map up to 512Mb data with 8 DBATs: 16Mb + 8Mb + 4Mb + 4Mb + 32Mb + 64Mb + 128Mb + 256Mb = 512Mb Because some processors only have 4 BATs and because some targets need DBATs for mapping other areas, the following patch will allow to modify _etext and data alignment. Signed-off-by: Christophe Leroy --- arch/powerpc/Kconfig | 2 + arch/powerpc/include/asm/book3s/32/pgtable.h | 11 arch/powerpc/mm/init_32.c| 4 +- arch/powerpc/mm/mmu_decl.h | 8 +++ arch/powerpc/mm/pgtable_32.c | 10 +++- arch/powerpc/mm/ppc_mmu_32.c | 87 ++-- 6 files changed, 112 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index f3e420f3f1d7..ffcf4d7a1186 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -730,11 +730,13 @@ config THREAD_SHIFT config ETEXT_SHIFT int + default 17 if STRICT_KERNEL_RWX && PPC_BOOK3S_32 default PPC_PAGE_SHIFT config DATA_SHIFT int default 24 if STRICT_KERNEL_RWX && PPC64 + default 22 if STRICT_KERNEL_RWX && PPC_BOOK3S_32 default PPC_PAGE_SHIFT config FORCE_MAX_ZONEORDER diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index
[PATCH] powerpc: Look for "stdout-path" when setting up legacy consoles
Commit 78e5dfea8 "powerpc: dts: replace 'linux,stdout-path' with 'stdout-path'" broke the default console on a number of embedded PowerPC systems, because it failed to also update the code in arch/powerpc/kernel/legacy_serial.c to look for that property in addition to the old one. This fixes it. Signed-off-by: Benjamin Herrenschmidt Fixes: 78e5dfea8 powerpc: dts: replace 'linux,stdout-path' with 'stdout-path' --- diff --git a/arch/powerpc/kernel/legacy_serial.c b/arch/powerpc/kernel/legacy_serial.c index 33b34a5..5b9dce1 100644 --- a/arch/powerpc/kernel/legacy_serial.c +++ b/arch/powerpc/kernel/legacy_serial.c @@ -372,6 +372,8 @@ void __init find_legacy_serial_ports(void) /* Now find out if one of these is out firmware console */ path = of_get_property(of_chosen, "linux,stdout-path", NULL); + if (path == NULL) + path = of_get_property(of_chosen, "stdout-path", NULL); if (path != NULL) { stdout = of_find_node_by_path(path); if (stdout) @@ -595,8 +597,10 @@ static int __init check_legacy_serial_console(void) /* We are getting a weird phandle from OF ... */ /* ... So use the full path instead */ name = of_get_property(of_chosen, "linux,stdout-path", NULL); + if (name == NULL) + name = of_get_property(of_chosen, "stdout-path", NULL); if (name == NULL) { - DBG(" no linux,stdout-path !\n"); + DBG(" no stdout-path !\n"); return -ENODEV; } prom_stdout = of_find_node_by_path(name);
[PATCH v8 05/20] powerpc/mm: move platform specific mmu-xxx.h in platform directories
The purpose of this patch is to move platform specific mmu-xxx.h files in platform directories like pte-xxx.h files. In the meantime this patch creates common nohash and nohash/32 + nohash/64 mmu.h files for future common parts. Reviewed-by: Aneesh Kumar K.V Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/mmu.h | 14 ++ arch/powerpc/include/asm/{ => nohash/32}/mmu-40x.h | 0 arch/powerpc/include/asm/{ => nohash/32}/mmu-44x.h | 0 arch/powerpc/include/asm/{ => nohash/32}/mmu-8xx.h | 0 arch/powerpc/include/asm/nohash/32/mmu.h | 19 +++ arch/powerpc/include/asm/nohash/64/mmu.h | 8 arch/powerpc/include/asm/{ => nohash}/mmu-book3e.h | 0 arch/powerpc/include/asm/nohash/mmu.h | 11 +++ arch/powerpc/kernel/cpu_setup_fsl_booke.S | 2 +- arch/powerpc/kvm/e500.h| 2 +- 10 files changed, 42 insertions(+), 14 deletions(-) rename arch/powerpc/include/asm/{ => nohash/32}/mmu-40x.h (100%) rename arch/powerpc/include/asm/{ => nohash/32}/mmu-44x.h (100%) rename arch/powerpc/include/asm/{ => nohash/32}/mmu-8xx.h (100%) create mode 100644 arch/powerpc/include/asm/nohash/32/mmu.h create mode 100644 arch/powerpc/include/asm/nohash/64/mmu.h rename arch/powerpc/include/asm/{ => nohash}/mmu-book3e.h (100%) create mode 100644 arch/powerpc/include/asm/nohash/mmu.h diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h index eb20eb3b8fb0..2184021b0e1c 100644 --- a/arch/powerpc/include/asm/mmu.h +++ b/arch/powerpc/include/asm/mmu.h @@ -341,18 +341,8 @@ static inline void mmu_early_init_devtree(void) { } #if defined(CONFIG_PPC_STD_MMU_32) /* 32-bit classic hash table MMU */ #include -#elif defined(CONFIG_40x) -/* 40x-style software loaded TLB */ -# include -#elif defined(CONFIG_44x) -/* 44x-style software loaded TLB */ -# include -#elif defined(CONFIG_PPC_BOOK3E_MMU) -/* Freescale Book-E software loaded TLB or Book-3e (ISA 2.06+) MMU */ -# include -#elif defined (CONFIG_PPC_8xx) -/* Motorola/Freescale 8xx software loaded TLB */ -# include +#elif defined(CONFIG_PPC_MMU_NOHASH) +#include #endif #endif /* __KERNEL__ */ diff --git a/arch/powerpc/include/asm/mmu-40x.h b/arch/powerpc/include/asm/nohash/32/mmu-40x.h similarity index 100% rename from arch/powerpc/include/asm/mmu-40x.h rename to arch/powerpc/include/asm/nohash/32/mmu-40x.h diff --git a/arch/powerpc/include/asm/mmu-44x.h b/arch/powerpc/include/asm/nohash/32/mmu-44x.h similarity index 100% rename from arch/powerpc/include/asm/mmu-44x.h rename to arch/powerpc/include/asm/nohash/32/mmu-44x.h diff --git a/arch/powerpc/include/asm/mmu-8xx.h b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h similarity index 100% rename from arch/powerpc/include/asm/mmu-8xx.h rename to arch/powerpc/include/asm/nohash/32/mmu-8xx.h diff --git a/arch/powerpc/include/asm/nohash/32/mmu.h b/arch/powerpc/include/asm/nohash/32/mmu.h new file mode 100644 index ..af0e8b54876a --- /dev/null +++ b/arch/powerpc/include/asm/nohash/32/mmu.h @@ -0,0 +1,19 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_POWERPC_NOHASH_32_MMU_H_ +#define _ASM_POWERPC_NOHASH_32_MMU_H_ + +#if defined(CONFIG_40x) +/* 40x-style software loaded TLB */ +#include +#elif defined(CONFIG_44x) +/* 44x-style software loaded TLB */ +#include +#elif defined(CONFIG_PPC_BOOK3E_MMU) +/* Freescale Book-E software loaded TLB or Book-3e (ISA 2.06+) MMU */ +#include +#elif defined (CONFIG_PPC_8xx) +/* Motorola/Freescale 8xx software loaded TLB */ +#include +#endif + +#endif /* _ASM_POWERPC_NOHASH_32_MMU_H_ */ diff --git a/arch/powerpc/include/asm/nohash/64/mmu.h b/arch/powerpc/include/asm/nohash/64/mmu.h new file mode 100644 index ..87871d027b75 --- /dev/null +++ b/arch/powerpc/include/asm/nohash/64/mmu.h @@ -0,0 +1,8 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_POWERPC_NOHASH_64_MMU_H_ +#define _ASM_POWERPC_NOHASH_64_MMU_H_ + +/* Freescale Book-E software loaded TLB or Book-3e (ISA 2.06+) MMU */ +#include + +#endif /* _ASM_POWERPC_NOHASH_64_MMU_H_ */ diff --git a/arch/powerpc/include/asm/mmu-book3e.h b/arch/powerpc/include/asm/nohash/mmu-book3e.h similarity index 100% rename from arch/powerpc/include/asm/mmu-book3e.h rename to arch/powerpc/include/asm/nohash/mmu-book3e.h diff --git a/arch/powerpc/include/asm/nohash/mmu.h b/arch/powerpc/include/asm/nohash/mmu.h new file mode 100644 index ..a037cb1efb57 --- /dev/null +++ b/arch/powerpc/include/asm/nohash/mmu.h @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_POWERPC_NOHASH_MMU_H_ +#define _ASM_POWERPC_NOHASH_MMU_H_ + +#ifdef CONFIG_PPC64 +#include +#else +#include +#endif + +#endif /* _ASM_POWERPC_NOHASH_MMU_H_ */ diff --git a/arch/powerpc/kernel/cpu_setup_fsl_booke.S b/arch/powerpc/kernel/cpu_setup_fsl_booke.S index 8d142e5d84cd..5fbc890d1094 100644 --- a/arch/powerpc/kernel/cpu_setup_fsl_booke.S +++
[PATCH v8 06/20] powerpc/mm: Move pgtable_t into platform headers
This patch move pgtable_t into platform headers. It gets rid of the CONFIG_PPC_64K_PAGES case for PPC64 as nohash/64 doesn't support CONFIG_PPC_64K_PAGES. Reviewed-by: Aneesh Kumar K.V Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/book3s/32/mmu-hash.h | 2 ++ arch/powerpc/include/asm/book3s/64/mmu.h | 9 + arch/powerpc/include/asm/nohash/32/mmu.h | 4 arch/powerpc/include/asm/nohash/64/mmu.h | 4 arch/powerpc/include/asm/page.h | 14 -- 5 files changed, 19 insertions(+), 14 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/mmu-hash.h b/arch/powerpc/include/asm/book3s/32/mmu-hash.h index e38c91388c40..5bd26c218b94 100644 --- a/arch/powerpc/include/asm/book3s/32/mmu-hash.h +++ b/arch/powerpc/include/asm/book3s/32/mmu-hash.h @@ -42,6 +42,8 @@ struct ppc_bat { u32 batu; u32 batl; }; + +typedef struct page *pgtable_t; #endif /* !__ASSEMBLY__ */ /* diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h index 6328857f259f..1ceee000c18d 100644 --- a/arch/powerpc/include/asm/book3s/64/mmu.h +++ b/arch/powerpc/include/asm/book3s/64/mmu.h @@ -2,6 +2,8 @@ #ifndef _ASM_POWERPC_BOOK3S_64_MMU_H_ #define _ASM_POWERPC_BOOK3S_64_MMU_H_ +#include + #ifndef __ASSEMBLY__ /* * Page size definition @@ -24,6 +26,13 @@ struct mmu_psize_def { }; extern struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT]; +/* + * For BOOK3s 64 with 4k and 64K linux page size + * we want to use pointers, because the page table + * actually store pfn + */ +typedef pte_t *pgtable_t; + #endif /* __ASSEMBLY__ */ /* 64-bit classic hash table MMU */ diff --git a/arch/powerpc/include/asm/nohash/32/mmu.h b/arch/powerpc/include/asm/nohash/32/mmu.h index af0e8b54876a..f61f933a4cd8 100644 --- a/arch/powerpc/include/asm/nohash/32/mmu.h +++ b/arch/powerpc/include/asm/nohash/32/mmu.h @@ -16,4 +16,8 @@ #include #endif +#ifndef __ASSEMBLY__ +typedef struct page *pgtable_t; +#endif + #endif /* _ASM_POWERPC_NOHASH_32_MMU_H_ */ diff --git a/arch/powerpc/include/asm/nohash/64/mmu.h b/arch/powerpc/include/asm/nohash/64/mmu.h index 87871d027b75..e6585480dfc4 100644 --- a/arch/powerpc/include/asm/nohash/64/mmu.h +++ b/arch/powerpc/include/asm/nohash/64/mmu.h @@ -5,4 +5,8 @@ /* Freescale Book-E software loaded TLB or Book-3e (ISA 2.06+) MMU */ #include +#ifndef __ASSEMBLY__ +typedef struct page *pgtable_t; +#endif + #endif /* _ASM_POWERPC_NOHASH_64_MMU_H_ */ diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h index 9ea903221a9f..a7624a3b1435 100644 --- a/arch/powerpc/include/asm/page.h +++ b/arch/powerpc/include/asm/page.h @@ -335,20 +335,6 @@ void arch_free_page(struct page *page, int order); #endif struct vm_area_struct; -#ifdef CONFIG_PPC_BOOK3S_64 -/* - * For BOOK3s 64 with 4k and 64K linux page size - * we want to use pointers, because the page table - * actually store pfn - */ -typedef pte_t *pgtable_t; -#else -#if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_PPC64) -typedef pte_t *pgtable_t; -#else -typedef struct page *pgtable_t; -#endif -#endif #include #endif /* __ASSEMBLY__ */ -- 2.13.3
[PATCH v8 08/20] powerpc/mm: Extend pte_fragment functionality to PPC32
In order to allow the 8xx to handle pte_fragments, this patch extends the use of pte_fragments to PPC32 platforms. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/book3s/32/mmu-hash.h | 5 - arch/powerpc/include/asm/book3s/32/pgalloc.h | 17 + arch/powerpc/include/asm/book3s/32/pgtable.h | 5 +++-- arch/powerpc/include/asm/mmu_context.h| 2 +- arch/powerpc/include/asm/nohash/32/mmu.h | 4 +++- arch/powerpc/include/asm/nohash/32/pgalloc.h | 22 +++--- arch/powerpc/include/asm/nohash/32/pgtable.h | 8 +--- arch/powerpc/mm/Makefile | 1 + arch/powerpc/mm/mmu_context.c | 10 ++ arch/powerpc/mm/mmu_context_nohash.c | 2 +- arch/powerpc/mm/pgtable_32.c | 25 - 11 files changed, 52 insertions(+), 49 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/mmu-hash.h b/arch/powerpc/include/asm/book3s/32/mmu-hash.h index 5bd26c218b94..2bb500d25de6 100644 --- a/arch/powerpc/include/asm/book3s/32/mmu-hash.h +++ b/arch/powerpc/include/asm/book3s/32/mmu-hash.h @@ -1,6 +1,7 @@ /* SPDX-License-Identifier: GPL-2.0 */ #ifndef _ASM_POWERPC_BOOK3S_32_MMU_HASH_H_ #define _ASM_POWERPC_BOOK3S_32_MMU_HASH_H_ + /* * 32-bit hash table MMU support */ @@ -9,6 +10,8 @@ * BATs */ +#include + /* Block size masks */ #define BL_128K0x000 #define BL_256K 0x001 @@ -43,7 +46,7 @@ struct ppc_bat { u32 batl; }; -typedef struct page *pgtable_t; +typedef pte_t *pgtable_t; #endif /* !__ASSEMBLY__ */ /* diff --git a/arch/powerpc/include/asm/book3s/32/pgalloc.h b/arch/powerpc/include/asm/book3s/32/pgalloc.h index eb8882c6dbb0..0f58e5b9dbe7 100644 --- a/arch/powerpc/include/asm/book3s/32/pgalloc.h +++ b/arch/powerpc/include/asm/book3s/32/pgalloc.h @@ -59,30 +59,31 @@ static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp, static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmdp, pgtable_t pte_page) { - *pmdp = __pmd((page_to_pfn(pte_page) << PAGE_SHIFT) | _PMD_PRESENT); + *pmdp = __pmd(__pa(pte_page) | _PMD_PRESENT); } -#define pmd_pgtable(pmd) pmd_page(pmd) +#define pmd_pgtable(pmd) ((pgtable_t)pmd_page_vaddr(pmd)) extern pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long addr); extern pgtable_t pte_alloc_one(struct mm_struct *mm, unsigned long addr); +void pte_frag_destroy(void *pte_frag); +pte_t *pte_fragment_alloc(struct mm_struct *mm, unsigned long vmaddr, int kernel); +void pte_fragment_free(unsigned long *table, int kernel); static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte) { - free_page((unsigned long)pte); + pte_fragment_free((unsigned long *)pte, 1); } static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage) { - pgtable_page_dtor(ptepage); - __free_page(ptepage); + pte_fragment_free((unsigned long *)ptepage, 0); } static inline void pgtable_free(void *table, unsigned index_size) { if (!index_size) { - pgtable_page_dtor(virt_to_page(table)); - free_page((unsigned long)table); + pte_fragment_free((unsigned long *)table, 0); } else { BUG_ON(index_size > MAX_PGTABLE_INDEX_SIZE); kmem_cache_free(PGT_CACHE(index_size), table); @@ -120,6 +121,6 @@ static inline void pgtable_free_tlb(struct mmu_gather *tlb, static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table, unsigned long address) { - pgtable_free_tlb(tlb, page_address(table), 0); + pgtable_free_tlb(tlb, table, 0); } #endif /* _ASM_POWERPC_BOOK3S_32_PGALLOC_H */ diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index 32c33eccc0e2..47156b93f9af 100644 --- a/arch/powerpc/include/asm/book3s/32/pgtable.h +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h @@ -329,7 +329,7 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma, #define pte_same(A,B) (((pte_val(A) ^ pte_val(B)) & ~_PAGE_HASHPTE) == 0) #define pmd_page_vaddr(pmd)\ - ((unsigned long) __va(pmd_val(pmd) & PAGE_MASK)) + ((unsigned long)__va(pmd_val(pmd) & ~(PTE_TABLE_SIZE - 1))) #define pmd_page(pmd) \ pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT) @@ -346,7 +346,8 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma, #define pte_offset_kernel(dir, addr) \ ((pte_t *) pmd_page_vaddr(*(dir)) + pte_index(addr)) #define pte_offset_map(dir, addr) \ - ((pte_t *) kmap_atomic(pmd_page(*(dir))) + pte_index(addr)) + ((pte_t *)(kmap_atomic(pmd_page(*(dir))) + \ + (pmd_page_vaddr(*(dir)) & ~PAGE_MASK)) + pte_index(addr)) #define pte_unmap(pte) kunmap_atomic(pte) /* diff --git
[PATCH v8 19/20] powerpc/8xx: don't use r12/SPRN_SPRG_SCRATCH2 in TLB Miss handlers
This patch reworks the TLB Miss handler in order to not use r12 register, hence avoiding having to save it into SPRN_SPRG_SCRATCH2. In the DAR Fixup code we can now use SPRN_M_TW, freeing SPRN_SPRG_SCRATCH2. Then SPRN_SPRG_SCRATCH2 may be used for something else in the future. Signed-off-by: Christophe Leroy --- arch/powerpc/kernel/head_8xx.S | 110 ++--- 1 file changed, 49 insertions(+), 61 deletions(-) diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S index 85fb4b8bf6c7..0a4f8a9c85ff 100644 --- a/arch/powerpc/kernel/head_8xx.S +++ b/arch/powerpc/kernel/head_8xx.S @@ -302,90 +302,87 @@ SystemCall: */ #ifdef CONFIG_8xx_CPU15 -#define INVALIDATE_ADJACENT_PAGES_CPU15(tmp, addr) \ - additmp, addr, PAGE_SIZE; \ - tlbie tmp;\ - additmp, addr, -PAGE_SIZE; \ - tlbie tmp +#define INVALIDATE_ADJACENT_PAGES_CPU15(addr) \ + addiaddr, addr, PAGE_SIZE; \ + tlbie addr; \ + addiaddr, addr, -(PAGE_SIZE << 1); \ + tlbie addr; \ + addiaddr, addr, PAGE_SIZE #else -#define INVALIDATE_ADJACENT_PAGES_CPU15(tmp, addr) +#define INVALIDATE_ADJACENT_PAGES_CPU15(addr) #endif InstructionTLBMiss: mtspr SPRN_SPRG_SCRATCH0, r10 +#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP) mtspr SPRN_SPRG_SCRATCH1, r11 -#ifdef ITLB_MISS_KERNEL - mtspr SPRN_SPRG_SCRATCH2, r12 #endif /* If we are faulting a kernel address, we have to use the * kernel page tables. */ mfspr r10, SPRN_SRR0 /* Get effective address of fault */ - INVALIDATE_ADJACENT_PAGES_CPU15(r11, r10) + INVALIDATE_ADJACENT_PAGES_CPU15(r10) mtspr SPRN_MD_EPN, r10 /* Only modules will cause ITLB Misses as we always * pin the first 8MB of kernel memory */ #ifdef ITLB_MISS_KERNEL - mfcrr12 + mfcrr11 #if defined(SIMPLE_KERNEL_ADDRESS) && defined(CONFIG_PIN_TLB_TEXT) - andis. r11, r10, 0x8000/* Address >= 0x8000 */ + cmpicr0, r10, 0 /* Address >= 0x8000 */ #else - rlwinm r11, r10, 16, 0xfff8 - cmpli cr0, r11, PAGE_OFFSET@h + rlwinm r10, r10, 16, 0xfff8 + cmpli cr0, r10, PAGE_OFFSET@h #ifndef CONFIG_PIN_TLB_TEXT /* It is assumed that kernel code fits into the first 8M page */ -0: cmpli cr7, r11, (PAGE_OFFSET + 0x080)@h +0: cmpli cr7, r10, (PAGE_OFFSET + 0x080)@h patch_site 0b, patch__itlbmiss_linmem_top #endif #endif #endif - mfspr r11, SPRN_M_TWB /* Get level 1 table */ + mfspr r10, SPRN_M_TWB /* Get level 1 table */ #ifdef ITLB_MISS_KERNEL #if defined(SIMPLE_KERNEL_ADDRESS) && defined(CONFIG_PIN_TLB_TEXT) - beq+3f + bge+3f #else blt+3f #endif #ifndef CONFIG_PIN_TLB_TEXT blt cr7, ITLBMissLinear #endif - rlwinm r11, r11, 0, 20, 31 - orisr11, r11, (swapper_pg_dir - PAGE_OFFSET)@ha + rlwinm r10, r10, 0, 20, 31 + orisr10, r10, (swapper_pg_dir - PAGE_OFFSET)@ha 3: #endif - lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11)/* Get the level 1 entry */ + lwz r10, (swapper_pg_dir-PAGE_OFFSET)@l(r10)/* Get level 1 entry */ + mtspr SPRN_MI_TWC, r10/* Set segment attributes */ - mtspr SPRN_MD_TWC, r11 + mtspr SPRN_MD_TWC, r10 mfspr r10, SPRN_MD_TWC lwz r10, 0(r10) /* Get the pte */ #ifdef ITLB_MISS_KERNEL - mtcrr12 + mtcrr11 #endif - /* Load the MI_TWC with the attributes for this "segment." */ - mtspr SPRN_MI_TWC, r11/* Set segment attributes */ - #ifdef CONFIG_SWAP rlwinm r11, r10, 32-5, _PAGE_PRESENT and r11, r11, r10 rlwimi r10, r11, 0, _PAGE_PRESENT #endif - li r11, RPN_PATTERN | 0x200 /* The Linux PTE won't go exactly into the MMU TLB. * Software indicator bits 20 and 23 must be clear. * Software indicator bits 22, 24, 25, 26, and 27 must be * set. All other Linux PTE bits control the behavior * of the MMU. */ - rlwimi r11, r10, 4, 0x0400 /* Copy _PAGE_EXEC into bit 21 */ - rlwimi r10, r11, 0, 0x0ff0 /* Set 22, 24-27, clear 20,23 */ + rlwimi r10, r10, 0, 0x0f00 /* Clear bits 20-23 */ + rlwimi r10, r10, 4, 0x0400 /* Copy _PAGE_EXEC into bit 21 */ + ori r10, r10, RPN_PATTERN | 0x200 /* Set 22 and 24-27 */ mtspr SPRN_MI_RPN, r10/* Update TLB entry */ /* Restore registers */ 0: mfspr r10, SPRN_SPRG_SCRATCH0 +#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP) mfspr r11, SPRN_SPRG_SCRATCH1 -#ifdef ITLB_MISS_KERNEL - mfspr r12, SPRN_SPRG_SCRATCH2 #endif rfi patch_site 0b,
[PATCH v8 01/20] powerpc/book3s32: Remove CONFIG_BOOKE dependent code
BOOK3S/32 cannot be BOOKE, so remove useless code Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/book3s/32/pgalloc.h | 18 -- arch/powerpc/include/asm/book3s/32/pgtable.h | 14 -- 2 files changed, 32 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/pgalloc.h b/arch/powerpc/include/asm/book3s/32/pgalloc.h index 82e44b1a00ae..eb8882c6dbb0 100644 --- a/arch/powerpc/include/asm/book3s/32/pgalloc.h +++ b/arch/powerpc/include/asm/book3s/32/pgalloc.h @@ -50,8 +50,6 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd) #define __pmd_free_tlb(tlb,x,a)do { } while (0) /* #define pgd_populate(mm, pmd, pte) BUG() */ -#ifndef CONFIG_BOOKE - static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp, pte_t *pte) { @@ -65,22 +63,6 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmdp, } #define pmd_pgtable(pmd) pmd_page(pmd) -#else - -static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp, - pte_t *pte) -{ - *pmdp = __pmd((unsigned long)pte | _PMD_PRESENT); -} - -static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmdp, - pgtable_t pte_page) -{ - *pmdp = __pmd((unsigned long)lowmem_page_address(pte_page) | _PMD_PRESENT); -} - -#define pmd_pgtable(pmd) pmd_page(pmd) -#endif extern pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long addr); extern pgtable_t pte_alloc_one(struct mm_struct *mm, unsigned long addr); diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index c21d33704633..32c33eccc0e2 100644 --- a/arch/powerpc/include/asm/book3s/32/pgtable.h +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h @@ -328,24 +328,10 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma, #define __HAVE_ARCH_PTE_SAME #define pte_same(A,B) (((pte_val(A) ^ pte_val(B)) & ~_PAGE_HASHPTE) == 0) -/* - * Note that on Book E processors, the pmd contains the kernel virtual - * (lowmem) address of the pte page. The physical address is less useful - * because everything runs with translation enabled (even the TLB miss - * handler). On everything else the pmd contains the physical address - * of the pte page. -- paulus - */ -#ifndef CONFIG_BOOKE #define pmd_page_vaddr(pmd)\ ((unsigned long) __va(pmd_val(pmd) & PAGE_MASK)) #define pmd_page(pmd) \ pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT) -#else -#define pmd_page_vaddr(pmd)\ - ((unsigned long) (pmd_val(pmd) & PAGE_MASK)) -#define pmd_page(pmd) \ - pfn_to_page((__pa(pmd_val(pmd)) >> PAGE_SHIFT)) -#endif /* to find an entry in a kernel page-table-directory */ #define pgd_offset_k(address) pgd_offset(_mm, address) -- 2.13.3
[PATCH v8 03/20] powerpc/mm: Move pte_fragment_alloc() to a common location
In preparation of next patch which generalises the use of pte_fragment_alloc() for all, this patch moves the related functions in a place that is common to all subarches. The 8xx will need that for supporting 16k pages, as in that mode page tables still have a size of 4k. Since pte_fragment with only once fragment is not different from what is done in the general case, we can easily migrate all subarchs to pte fragments. Reviewed-by: Aneesh Kumar K.V Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/book3s/64/pgalloc.h | 1 + arch/powerpc/mm/Makefile | 4 +- arch/powerpc/mm/mmu_context_book3s64.c | 15 arch/powerpc/mm/pgtable-book3s64.c | 85 arch/powerpc/mm/pgtable-frag.c | 116 +++ 5 files changed, 120 insertions(+), 101 deletions(-) create mode 100644 arch/powerpc/mm/pgtable-frag.c diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h index 391ed2c3b697..f949dd90af9b 100644 --- a/arch/powerpc/include/asm/book3s/64/pgalloc.h +++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h @@ -50,6 +50,7 @@ extern void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift); #ifdef CONFIG_SMP extern void __tlb_remove_table(void *_table); #endif +void pte_frag_destroy(void *pte_frag); static inline pgd_t *radix__pgd_alloc(struct mm_struct *mm) { diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile index ca96e7be4d0e..3cbb1acf0745 100644 --- a/arch/powerpc/mm/Makefile +++ b/arch/powerpc/mm/Makefile @@ -15,7 +15,9 @@ obj-$(CONFIG_PPC_MMU_NOHASH) += mmu_context_nohash.o tlb_nohash.o \ obj-$(CONFIG_PPC_BOOK3E) += tlb_low_$(BITS)e.o hash64-$(CONFIG_PPC_NATIVE):= hash_native_64.o obj-$(CONFIG_PPC_BOOK3E_64) += pgtable-book3e.o -obj-$(CONFIG_PPC_BOOK3S_64)+= pgtable-hash64.o hash_utils_64.o slb.o $(hash64-y) mmu_context_book3s64.o pgtable-book3s64.o +obj-$(CONFIG_PPC_BOOK3S_64)+= pgtable-hash64.o hash_utils_64.o slb.o \ + $(hash64-y) mmu_context_book3s64.o \ + pgtable-book3s64.o pgtable-frag.o obj-$(CONFIG_PPC_RADIX_MMU)+= pgtable-radix.o tlb-radix.o obj-$(CONFIG_PPC_STD_MMU_32) += ppc_mmu_32.o hash_low_32.o mmu_context_hash32.o obj-$(CONFIG_PPC_STD_MMU) += tlb_hash$(BITS).o diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c index 510f103d7813..f720c5cc0b5e 100644 --- a/arch/powerpc/mm/mmu_context_book3s64.c +++ b/arch/powerpc/mm/mmu_context_book3s64.c @@ -164,21 +164,6 @@ static void destroy_contexts(mm_context_t *ctx) } } -static void pte_frag_destroy(void *pte_frag) -{ - int count; - struct page *page; - - page = virt_to_page(pte_frag); - /* drop all the pending references */ - count = ((unsigned long)pte_frag & ~PAGE_MASK) >> PTE_FRAG_SIZE_SHIFT; - /* We allow PTE_FRAG_NR fragments from a PTE page */ - if (atomic_sub_and_test(PTE_FRAG_NR - count, >pt_frag_refcount)) { - pgtable_page_dtor(page); - __free_page(page); - } -} - static void pmd_frag_destroy(void *pmd_frag) { int count; diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c index 9f93c9f985c5..0c0fd173208a 100644 --- a/arch/powerpc/mm/pgtable-book3s64.c +++ b/arch/powerpc/mm/pgtable-book3s64.c @@ -322,91 +322,6 @@ void pmd_fragment_free(unsigned long *pmd) } } -static pte_t *get_pte_from_cache(struct mm_struct *mm) -{ - void *pte_frag, *ret; - - spin_lock(>page_table_lock); - ret = mm->context.pte_frag; - if (ret) { - pte_frag = ret + PTE_FRAG_SIZE; - /* -* If we have taken up all the fragments mark PTE page NULL -*/ - if (((unsigned long)pte_frag & ~PAGE_MASK) == 0) - pte_frag = NULL; - mm->context.pte_frag = pte_frag; - } - spin_unlock(>page_table_lock); - return (pte_t *)ret; -} - -static pte_t *__alloc_for_ptecache(struct mm_struct *mm, int kernel) -{ - void *ret = NULL; - struct page *page; - - if (!kernel) { - page = alloc_page(PGALLOC_GFP | __GFP_ACCOUNT); - if (!page) - return NULL; - if (!pgtable_page_ctor(page)) { - __free_page(page); - return NULL; - } - } else { - page = alloc_page(PGALLOC_GFP); - if (!page) - return NULL; - } - - atomic_set(>pt_frag_refcount, 1); - - ret = page_address(page); - /* -* if we support only one fragment just return the -* allocated page. -*/ - if (PTE_FRAG_NR == 1) - return ret; - spin_lock(>page_table_lock); - /*
[PATCH v8 10/20] powerpc/mm: replace hugetlb_cache by PGT_CACHE(PTE_T_ORDER)
Instead of opencoding cache handling for the special case of hugepage tables having a single pte_t element, this patch makes use of the common pgtable_cache helpers Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/hugetlb.h | 2 -- arch/powerpc/mm/hugetlbpage.c | 26 +++--- 2 files changed, 7 insertions(+), 21 deletions(-) diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h index 98004262bc87..dfb8bf236586 100644 --- a/arch/powerpc/include/asm/hugetlb.h +++ b/arch/powerpc/include/asm/hugetlb.h @@ -5,8 +5,6 @@ #ifdef CONFIG_HUGETLB_PAGE #include -extern struct kmem_cache *hugepte_cache; - #ifdef CONFIG_PPC_BOOK3S_64 #include diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index 8cf035e68378..c4f1263228b8 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -42,6 +42,8 @@ EXPORT_SYMBOL(HPAGE_SHIFT); #define hugepd_none(hpd) (hpd_val(hpd) == 0) +#define PTE_T_ORDER(__builtin_ffs(sizeof(pte_t)) - __builtin_ffs(sizeof(void *))) + pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr, unsigned long sz) { /* @@ -61,7 +63,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp, int num_hugepd; if (pshift >= pdshift) { - cachep = hugepte_cache; + cachep = PGT_CACHE(PTE_T_ORDER); num_hugepd = 1 << (pshift - pdshift); } else { cachep = PGT_CACHE(pdshift - pshift); @@ -264,7 +266,7 @@ static void hugepd_free_rcu_callback(struct rcu_head *head) unsigned int i; for (i = 0; i < batch->index; i++) - kmem_cache_free(hugepte_cache, batch->ptes[i]); + kmem_cache_free(PGT_CACHE(PTE_T_ORDER), batch->ptes[i]); free_page((unsigned long)batch); } @@ -277,7 +279,7 @@ static void hugepd_free(struct mmu_gather *tlb, void *hugepte) if (atomic_read(>mm->mm_users) < 2 || mm_is_thread_local(tlb->mm)) { - kmem_cache_free(hugepte_cache, hugepte); + kmem_cache_free(PGT_CACHE(PTE_T_ORDER), hugepte); put_cpu_var(hugepd_freelist_cur); return; } @@ -652,7 +654,6 @@ static int __init hugepage_setup_sz(char *str) } __setup("hugepagesz=", hugepage_setup_sz); -struct kmem_cache *hugepte_cache; static int __init hugetlbpage_init(void) { int psize; @@ -702,21 +703,8 @@ static int __init hugetlbpage_init(void) if (pdshift > shift) pgtable_cache_add(pdshift - shift, NULL); #if defined(CONFIG_PPC_FSL_BOOK3E) || defined(CONFIG_PPC_8xx) - else if (!hugepte_cache) { - /* -* Create a kmem cache for hugeptes. The bottom bits in -* the pte have size information encoded in them, so -* align them to allow this -*/ - hugepte_cache = kmem_cache_create("hugepte-cache", - sizeof(pte_t), - HUGEPD_SHIFT_MASK + 1, - 0, NULL); - if (hugepte_cache == NULL) - panic("%s: Unable to create kmem cache " - "for hugeptes\n", __func__); - - } + else + pgtable_cache_add(PTE_T_ORDER, NULL); #endif } -- 2.13.3
[PATCH v8 14/20] powerpc/8xx: Temporarily disable 16k pages and hugepages
In preparation of making use of hardware assistance in TLB handlers, this patch temporarily disables 16K pages and hugepages. The reason is that when using HW assistance in 4K pages mode, the linux model fit with the HW model for 4K pages and 8M pages. However for 16K pages and 512K mode some additional work is needed to get linux model fit with HW model. For the 8M pages, they will naturaly come back when we switch to HW assistance, without any additional handling. In order to keep the following patch smaller, the removal of the current special handling for 8M pages gets removed here as well. Therefore the 4K pages mode will be implemented first and without support for 512k hugepages. Then the 512k hugepages will be brought back. And the 16K pages will be implemented in the following step. Signed-off-by: Christophe Leroy --- arch/powerpc/Kconfig | 2 +- arch/powerpc/kernel/head_8xx.S | 74 +++--- arch/powerpc/mm/tlb_nohash.c | 6 3 files changed, 6 insertions(+), 76 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 8be31261aec8..ddfccdf004fe 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -689,7 +689,7 @@ config PPC_4K_PAGES config PPC_16K_PAGES bool "16k page size" - depends on 44x || PPC_8xx + depends on 44x config PPC_64K_PAGES bool "64k page size" diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S index c203defe49a4..01f58b1d9ae7 100644 --- a/arch/powerpc/kernel/head_8xx.S +++ b/arch/powerpc/kernel/head_8xx.S @@ -314,7 +314,7 @@ SystemCall: InstructionTLBMiss: mtspr SPRN_SPRG_SCRATCH0, r10 mtspr SPRN_SPRG_SCRATCH1, r11 -#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE) +#ifdef ITLB_MISS_KERNEL mtspr SPRN_SPRG_SCRATCH2, r12 #endif @@ -325,10 +325,8 @@ InstructionTLBMiss: INVALIDATE_ADJACENT_PAGES_CPU15(r11, r10) /* Only modules will cause ITLB Misses as we always * pin the first 8MB of kernel memory */ -#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE) - mfcrr12 -#endif #ifdef ITLB_MISS_KERNEL + mfcrr12 #if defined(SIMPLE_KERNEL_ADDRESS) && defined(CONFIG_PIN_TLB_TEXT) andis. r11, r10, 0x8000/* Address >= 0x8000 */ #else @@ -360,15 +358,9 @@ InstructionTLBMiss: /* Extract level 2 index */ rlwinm r10, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29 -#ifdef CONFIG_HUGETLB_PAGE - mtcrr11 - bt- 28, 10f /* bit 28 = Large page (8M) */ - bt- 29, 20f /* bit 29 = Large page (8M or 512k) */ -#endif rlwimi r10, r11, 0, 0, 32 - PAGE_SHIFT - 1 /* Add level 2 base */ lwz r10, 0(r10) /* Get the pte */ -4: -#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE) +#ifdef ITLB_MISS_KERNEL mtcrr12 #endif /* Load the MI_TWC with the attributes for this "segment." */ @@ -393,7 +385,7 @@ InstructionTLBMiss: /* Restore registers */ 0: mfspr r10, SPRN_SPRG_SCRATCH0 mfspr r11, SPRN_SPRG_SCRATCH1 -#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE) +#ifdef ITLB_MISS_KERNEL mfspr r12, SPRN_SPRG_SCRATCH2 #endif rfi @@ -406,35 +398,12 @@ InstructionTLBMiss: stw r10, (itlb_miss_counter - PAGE_OFFSET)@l(0) mfspr r10, SPRN_SPRG_SCRATCH0 mfspr r11, SPRN_SPRG_SCRATCH1 -#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE) +#ifdef ITLB_MISS_KERNEL mfspr r12, SPRN_SPRG_SCRATCH2 #endif rfi #endif -#ifdef CONFIG_HUGETLB_PAGE -10:/* 8M pages */ -#ifdef CONFIG_PPC_16K_PAGES - /* Extract level 2 index */ - rlwinm r10, r10, 32 - (PAGE_SHIFT_8M - PAGE_SHIFT), 32 + PAGE_SHIFT_8M - (PAGE_SHIFT << 1), 29 - /* Add level 2 base */ - rlwimi r10, r11, 0, 0, 32 + PAGE_SHIFT_8M - (PAGE_SHIFT << 1) - 1 -#else - /* Level 2 base */ - rlwinm r10, r11, 0, ~HUGEPD_SHIFT_MASK -#endif - lwz r10, 0(r10) /* Get the pte */ - b 4b - -20:/* 512k pages */ - /* Extract level 2 index */ - rlwinm r10, r10, 32 - (PAGE_SHIFT_512K - PAGE_SHIFT), 32 + PAGE_SHIFT_512K - (PAGE_SHIFT << 1), 29 - /* Add level 2 base */ - rlwimi r10, r11, 0, 0, 32 + PAGE_SHIFT_512K - (PAGE_SHIFT << 1) - 1 - lwz r10, 0(r10) /* Get the pte */ - b 4b -#endif - . = 0x1200 DataStoreTLBMiss: mtspr SPRN_SPRG_SCRATCH0, r10 @@ -472,11 +441,6 @@ DataStoreTLBMiss: */ /* Extract level 2 index */ rlwinm r10, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29 -#ifdef CONFIG_HUGETLB_PAGE - mtcrr11 - bt- 28, 10f /* bit 28 = Large page (8M) */ - bt- 29, 20f /* bit 29 = Large page (8M or 512k) */ -#endif rlwimi r10, r11, 0, 0, 32 - PAGE_SHIFT - 1 /* Add level 2 base */
Re: [PATCH 0/3] System call table generation support
Hi Sathish, Thanks for your email. On Thu, 29 Nov 2018 at 12:05, Satheesh Rajendran wrote: > > On Fri, Sep 14, 2018 at 02:02:57PM +0530, Firoz Khan wrote: > > The purpose of this patch series is: > > 1. We can easily add/modify/delete system call by changing entry > > in syscall.tbl file. No need to manually edit many files. > > > > 2. It is easy to unify the system call implementation across all > > the architectures. > > > > The system call tables are in different format in all architecture > > and it will be difficult to manually add or modify the system calls > > in the respective files manually. To make it easy by keeping a script > > and which'll generate the header file and syscall table file so this > > change will unify them across all architectures. > > > > syscall.tbl contains the list of available system calls along with > > system call number and corresponding entry point. Add a new system > > call in this architecture will be possible by adding new entry in > > the syscall.tbl file. > > > > Adding a new table entry consisting of: > > - System call number. > > - ABI. > > - System call name. > > - Entry point name. > > - Compat entry name, if required. > > > > ARM, s390 and x86 architecuture does exist the similar support. I > > leverage their implementation to come up with a generic solution. > > > > I have done the same support for work for alpha, m68k, microblaze, > > ia64, mips, parisc, sh, sparc, and xtensa. But I started sending > > the patch for one architecuture for review. Below mentioned git > > repository contains more details. > > Git repo:- https://github.com/frzkhn/system_call_table_generator/ > > > > Finally, this is the ground work for solving the Y2038 issue. We > > need to add/change two dozen of system calls to solve Y2038 issue. > > So this patch series will help to easily modify from existing > > system call to Y2038 compatible system calls. > > > > I started working system call table generation on 4.17-rc1. I used > > marcin's script - https://github.com/hrw/syscalls-table to generate > > the syscall.tbl file. And this will be the input to the system call > > table generation script. But there are couple system call got add > > in the latest rc release. If run Marcin's script on latest release, > > It will generate a new syscall.tbl. But I still use the old file - > > syscall.tbl and once all review got over I'll update syscall.tbl > > alone w.r.to the tip of the kernel. The impact of this thing, few > > of the system call won't work. > > > > Firoz Khan (3): > > powerpc: Replace NR_syscalls macro from asm/unistd.h > > powerpc: Add system call table generation support > > powerpc: uapi header and system call table file generation > > > > arch/powerpc/Makefile | 3 + > > arch/powerpc/include/asm/Kbuild | 3 + > > arch/powerpc/include/asm/unistd.h | 3 +- > > arch/powerpc/include/uapi/asm/Kbuild| 2 + > > arch/powerpc/include/uapi/asm/unistd.h | 391 > > +--- > > arch/powerpc/kernel/Makefile| 3 +- > > arch/powerpc/kernel/syscall_table_32.S | 9 + > > arch/powerpc/kernel/syscall_table_64.S | 17 ++ > > arch/powerpc/kernel/syscalls/Makefile | 51 > > arch/powerpc/kernel/syscalls/syscall_32.tbl | 378 > > +++ > > arch/powerpc/kernel/syscalls/syscall_64.tbl | 372 > > ++ > > arch/powerpc/kernel/syscalls/syscallhdr.sh | 37 +++ > > arch/powerpc/kernel/syscalls/syscalltbl.sh | 38 +++ > > arch/powerpc/kernel/systbl.S| 50 > > 14 files changed, 916 insertions(+), 441 deletions(-) > > create mode 100644 arch/powerpc/kernel/syscall_table_32.S > > create mode 100644 arch/powerpc/kernel/syscall_table_64.S > > create mode 100644 arch/powerpc/kernel/syscalls/Makefile > > create mode 100644 arch/powerpc/kernel/syscalls/syscall_32.tbl > > create mode 100644 arch/powerpc/kernel/syscalls/syscall_64.tbl > > create mode 100644 arch/powerpc/kernel/syscalls/syscallhdr.sh > > create mode 100644 arch/powerpc/kernel/syscalls/syscalltbl.sh > > delete mode 100644 arch/powerpc/kernel/systbl.S > > Hi, > > This patch series failed to boot in IBM Power8 box with below base commit and > built with ppc64le_defconfig, > https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?h=merge=183cbf93be88d1a4fb572e27b1e08aa0ad85 I think you are applied some old patch series. Could you please perform the boot test on powerpc v3 which I have sent few hour before. Thanks Firoz > > Complete boot log attached. > > > [1.577383] SGI XFS with ACLs, security attributes, no debug enabled > [1.581550] Bad kernel stack pointer 6e69 at c0e2ceec > [1.581558] Oops: Bad kernel stack pointer, sig: 6 [#1] > [1.581562] LE SMP NR_CPUS=2048 NUMA PowerNV > [1.581567] Modules linked in: > [1.581572] CPU: 3 PID: 1937
[PATCH v8 12/20] powerpc/mm: remove unnecessary test in pgtable_cache_init()
pgtable_cache_add() gracefully handles the case when a cache that size already exists by returning early with the following test: if (PGT_CACHE(shift)) return; /* Already have a cache of this size */ It is then not needed to test the existence of the cache before. Signed-off-by: Christophe Leroy --- arch/powerpc/mm/init-common.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/mm/init-common.c b/arch/powerpc/mm/init-common.c index b7ca03643d0b..1e6910eb70ed 100644 --- a/arch/powerpc/mm/init-common.c +++ b/arch/powerpc/mm/init-common.c @@ -111,13 +111,13 @@ void pgtable_cache_init(void) { pgtable_cache_add(PGD_INDEX_SIZE); - if (PMD_CACHE_INDEX && !PGT_CACHE(PMD_CACHE_INDEX)) + if (PMD_CACHE_INDEX) pgtable_cache_add(PMD_CACHE_INDEX); /* * In all current configs, when the PUD index exists it's the * same size as either the pgd or pmd index except with THP enabled * on book3s 64 */ - if (PUD_CACHE_INDEX && !PGT_CACHE(PUD_CACHE_INDEX)) + if (PUD_CACHE_INDEX) pgtable_cache_add(PUD_CACHE_INDEX); } -- 2.13.3
[PATCH v8 13/20] powerpc/8xx: Move SW perf counters in first 32kb of memory
In order to simplify time critical exceptions handling 8xx specific SW perf counters, this patch moves the counters into the beginning of memory. This is possible because .text is readable and the counters are never modified outside of the handlers. By doing this, we avoid having to set a second register with the upper part of the address of the counters. Signed-off-by: Christophe Leroy --- arch/powerpc/kernel/head_8xx.S | 58 -- 1 file changed, 28 insertions(+), 30 deletions(-) diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S index 3b67b9533c82..c203defe49a4 100644 --- a/arch/powerpc/kernel/head_8xx.S +++ b/arch/powerpc/kernel/head_8xx.S @@ -106,6 +106,23 @@ turn_on_mmu: mtspr SPRN_SRR0,r0 rfi /* enables MMU */ + +#ifdef CONFIG_PERF_EVENTS + .align 4 + + .globl itlb_miss_counter +itlb_miss_counter: + .space 4 + + .globl dtlb_miss_counter +dtlb_miss_counter: + .space 4 + + .globl instruction_counter +instruction_counter: + .space 4 +#endif + /* * Exception entry code. This code runs with address translation * turned off, i.e. using physical addresses. @@ -384,17 +401,16 @@ InstructionTLBMiss: #ifdef CONFIG_PERF_EVENTS patch_site 0f, patch__itlbmiss_perf -0: lis r10, (itlb_miss_counter - PAGE_OFFSET)@ha - lwz r11, (itlb_miss_counter - PAGE_OFFSET)@l(r10) - addir11, r11, 1 - stw r11, (itlb_miss_counter - PAGE_OFFSET)@l(r10) -#endif +0: lwz r10, (itlb_miss_counter - PAGE_OFFSET)@l(0) + addir10, r10, 1 + stw r10, (itlb_miss_counter - PAGE_OFFSET)@l(0) mfspr r10, SPRN_SPRG_SCRATCH0 mfspr r11, SPRN_SPRG_SCRATCH1 #if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE) mfspr r12, SPRN_SPRG_SCRATCH2 #endif rfi +#endif #ifdef CONFIG_HUGETLB_PAGE 10:/* 8M pages */ @@ -509,15 +525,14 @@ DataStoreTLBMiss: #ifdef CONFIG_PERF_EVENTS patch_site 0f, patch__dtlbmiss_perf -0: lis r10, (dtlb_miss_counter - PAGE_OFFSET)@ha - lwz r11, (dtlb_miss_counter - PAGE_OFFSET)@l(r10) - addir11, r11, 1 - stw r11, (dtlb_miss_counter - PAGE_OFFSET)@l(r10) -#endif +0: lwz r10, (dtlb_miss_counter - PAGE_OFFSET)@l(0) + addir10, r10, 1 + stw r10, (dtlb_miss_counter - PAGE_OFFSET)@l(0) mfspr r10, SPRN_SPRG_SCRATCH0 mfspr r11, SPRN_SPRG_SCRATCH1 mfspr r12, SPRN_SPRG_SCRATCH2 rfi +#endif #ifdef CONFIG_HUGETLB_PAGE 10:/* 8M pages */ @@ -625,16 +640,13 @@ DataBreakpoint: . = 0x1d00 InstructionBreakpoint: mtspr SPRN_SPRG_SCRATCH0, r10 - mtspr SPRN_SPRG_SCRATCH1, r11 - lis r10, (instruction_counter - PAGE_OFFSET)@ha - lwz r11, (instruction_counter - PAGE_OFFSET)@l(r10) - addir11, r11, -1 - stw r11, (instruction_counter - PAGE_OFFSET)@l(r10) + lwz r10, (instruction_counter - PAGE_OFFSET)@l(0) + addir10, r10, -1 + stw r10, (instruction_counter - PAGE_OFFSET)@l(0) lis r10, 0x ori r10, r10, 0x01 mtspr SPRN_COUNTA, r10 mfspr r10, SPRN_SPRG_SCRATCH0 - mfspr r11, SPRN_SPRG_SCRATCH1 rfi #else EXCEPTION(0x1d00, Trap_1d, unknown_exception, EXC_XFER_EE) @@ -1065,17 +1077,3 @@ swapper_pg_dir: */ abatron_pteptrs: .space 8 - -#ifdef CONFIG_PERF_EVENTS - .globl itlb_miss_counter -itlb_miss_counter: - .space 4 - - .globl dtlb_miss_counter -dtlb_miss_counter: - .space 4 - - .globl instruction_counter -instruction_counter: - .space 4 -#endif -- 2.13.3
[PATCH v8 15/20] powerpc/8xx: Use hardware assistance in TLB handlers
Today, on the 8xx the TLB handlers do SW tablewalk by doing all the calculation in ASM, in order to match with the Linux page table structure. The 8xx offers hardware assistance which allows significant size reduction of the TLB handlers, hence also reduces the time spent in the handlers. However, using this HW assistance implies some constraints on the page table structure: - Regardless of the main page size used (4k or 16k), the level 1 table (PGD) contains 1024 entries and each PGD entry covers a 4Mbytes area which is managed by a level 2 table (PTE) containing also 1024 entries each describing a 4k page. - 16k pages require 4 identifical entries in the L2 table - 512k pages PTE have to be spread every 128 bytes in the L2 table - 8M pages PTE are at the address pointed by the L1 entry and each 8M page require 2 identical entries in the PGD. This patch modifies the TLB handlers to use HW assistance for 4K PAGES. Before that patch, the mean time spent in TLB miss handlers is: - ITLB miss: 80 ticks - DTLB miss: 62 ticks After that patch, the mean time spent in TLB miss handlers is: - ITLB miss: 72 ticks - DTLB miss: 54 ticks So the improvement is 10% for ITLB and 13% for DTLB misses Signed-off-by: Christophe Leroy --- arch/powerpc/kernel/head_8xx.S | 58 +- arch/powerpc/mm/8xx_mmu.c | 4 +-- 2 files changed, 26 insertions(+), 36 deletions(-) diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S index 01f58b1d9ae7..85fb4b8bf6c7 100644 --- a/arch/powerpc/kernel/head_8xx.S +++ b/arch/powerpc/kernel/head_8xx.S @@ -292,7 +292,7 @@ SystemCall: . = 0x1100 /* * For the MPC8xx, this is a software tablewalk to load the instruction - * TLB. The task switch loads the M_TW register with the pointer to the first + * TLB. The task switch loads the M_TWB register with the pointer to the first * level table. * If we discover there is no second level table (value is zero) or if there * is an invalid pte, we load that into the TLB, which causes another fault @@ -323,6 +323,7 @@ InstructionTLBMiss: */ mfspr r10, SPRN_SRR0 /* Get effective address of fault */ INVALIDATE_ADJACENT_PAGES_CPU15(r11, r10) + mtspr SPRN_MD_EPN, r10 /* Only modules will cause ITLB Misses as we always * pin the first 8MB of kernel memory */ #ifdef ITLB_MISS_KERNEL @@ -339,7 +340,7 @@ InstructionTLBMiss: #endif #endif #endif - mfspr r11, SPRN_M_TW /* Get level 1 table */ + mfspr r11, SPRN_M_TWB /* Get level 1 table */ #ifdef ITLB_MISS_KERNEL #if defined(SIMPLE_KERNEL_ADDRESS) && defined(CONFIG_PIN_TLB_TEXT) beq+3f @@ -349,16 +350,14 @@ InstructionTLBMiss: #ifndef CONFIG_PIN_TLB_TEXT blt cr7, ITLBMissLinear #endif - lis r11, (swapper_pg_dir-PAGE_OFFSET)@ha + rlwinm r11, r11, 0, 20, 31 + orisr11, r11, (swapper_pg_dir - PAGE_OFFSET)@ha 3: #endif - /* Insert level 1 index */ - rlwimi r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 29 lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11)/* Get the level 1 entry */ - /* Extract level 2 index */ - rlwinm r10, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29 - rlwimi r10, r11, 0, 0, 32 - PAGE_SHIFT - 1 /* Add level 2 base */ + mtspr SPRN_MD_TWC, r11 + mfspr r10, SPRN_MD_TWC lwz r10, 0(r10) /* Get the pte */ #ifdef ITLB_MISS_KERNEL mtcrr12 @@ -417,7 +416,7 @@ DataStoreTLBMiss: mfspr r10, SPRN_MD_EPN rlwinm r11, r10, 16, 0xfff8 cmpli cr0, r11, PAGE_OFFSET@h - mfspr r11, SPRN_M_TW /* Get level 1 table */ + mfspr r11, SPRN_M_TWB /* Get level 1 table */ blt+3f rlwinm r11, r10, 16, 0xfff8 #ifndef CONFIG_PIN_TLB_IMMR @@ -430,20 +429,16 @@ DataStoreTLBMiss: patch_site 0b, patch__dtlbmiss_immr_jmp #endif blt cr7, DTLBMissLinear - lis r11, (swapper_pg_dir-PAGE_OFFSET)@ha + mfspr r11, SPRN_M_TWB /* Get level 1 table */ + rlwinm r11, r11, 0, 20, 31 + orisr11, r11, (swapper_pg_dir - PAGE_OFFSET)@ha 3: - - /* Insert level 1 index */ - rlwimi r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 29 lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11)/* Get the level 1 entry */ - /* We have a pte table, so load fetch the pte from the table. -*/ - /* Extract level 2 index */ - rlwinm r10, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29 - rlwimi r10, r11, 0, 0, 32 - PAGE_SHIFT - 1 /* Add level 2 base */ + mtspr SPRN_MD_TWC, r11 + mfspr r10, SPRN_MD_TWC lwz r10, 0(r10) /* Get the pte */ -4: + mtcrr12 /* Insert the Guarded flag into the TWC from the Linux PTE. @@ -668,9 +663,10 @@ FixupDAR:/* Entry point for dcbx workaround. */ mtspr
[PATCH v8 18/20] powerpc/8xx: reintroduce 16K pages with HW assistance
Using this HW assistance implies some constraints on the page table structure: - Regardless of the main page size used (4k or 16k), the level 1 table (PGD) contains 1024 entries and each PGD entry covers a 4Mbytes area which is managed by a level 2 table (PTE) containing also 1024 entries each describing a 4k page. - 16k pages require 4 identifical entries in the L2 table - 512k pages PTE have to be spread every 128 bytes in the L2 table - 8M pages PTE are at the address pointed by the L1 entry and each 8M page require 2 identical entries in the PGD. In order to use hardware assistance with 16K pages, this patch does the following modifications: - Make PGD size independent of the main page size - In 16k pages mode, redefine pte_t as a struct with 4 elements, and populate those 4 elements in __set_pte_at() and pte_update() - Adapt the size of the hugepage tables. - Define a PTE_FRAGMENT_NB so that a 16k page contains 4 page tables. Signed-off-by: Christophe Leroy --- arch/powerpc/Kconfig | 2 +- arch/powerpc/include/asm/nohash/32/mmu-8xx.h | 1 + arch/powerpc/include/asm/nohash/32/pgtable.h | 19 ++- arch/powerpc/include/asm/nohash/pgtable.h| 4 arch/powerpc/include/asm/pgtable-types.h | 4 5 files changed, 28 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index ddfccdf004fe..8be31261aec8 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -689,7 +689,7 @@ config PPC_4K_PAGES config PPC_16K_PAGES bool "16k page size" - depends on 44x + depends on 44x || PPC_8xx config PPC_64K_PAGES bool "64k page size" diff --git a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h index fa05aa566ece..25f05131afd5 100644 --- a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h +++ b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h @@ -190,6 +190,7 @@ typedef struct { struct slice_mask mask_8m; # endif #endif + void *pte_frag; } mm_context_t; #define PHYS_IMMR_BASE (mfspr(SPRN_IMMR) & 0xfff8) diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h index 31a03e9a42c4..e3e81b078432 100644 --- a/arch/powerpc/include/asm/nohash/32/pgtable.h +++ b/arch/powerpc/include/asm/nohash/32/pgtable.h @@ -19,7 +19,14 @@ extern int icache_44x_need_flush; #endif /* __ASSEMBLY__ */ +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES) +#define PTE_INDEX_SIZE (PTE_SHIFT - 2) +#define PTE_FRAG_NR4 +#define PTE_FRAG_SIZE_SHIFT12 +#define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT) +#else #define PTE_INDEX_SIZE PTE_SHIFT +#endif #define PMD_INDEX_SIZE 0 #define PUD_INDEX_SIZE 0 @@ -49,7 +56,11 @@ extern int icache_44x_need_flush; * -Matt */ /* PGDIR_SHIFT determines what a top-level page table entry can map */ +#ifdef CONFIG_PPC_8xx +#define PGDIR_SHIFT22 +#else #define PGDIR_SHIFT(PAGE_SHIFT + PTE_INDEX_SIZE) +#endif #define PGDIR_SIZE (1UL << PGDIR_SHIFT) #define PGDIR_MASK (~(PGDIR_SIZE-1)) @@ -233,7 +244,13 @@ static inline unsigned long pte_update(pte_t *p, : "cc" ); #else /* PTE_ATOMIC_UPDATES */ unsigned long old = pte_val(*p); - *p = __pte((old & ~clr) | set); + unsigned long new = (old & ~clr) | set; + +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES) + p->pte = p->pte1 = p->pte2 = p->pte3 = new; +#else + *p = __pte(new); +#endif #endif /* !PTE_ATOMIC_UPDATES */ #ifdef CONFIG_44x diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h index 70ff23974b59..1ca1c1864b32 100644 --- a/arch/powerpc/include/asm/nohash/pgtable.h +++ b/arch/powerpc/include/asm/nohash/pgtable.h @@ -209,7 +209,11 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr, /* Anything else just stores the PTE normally. That covers all 64-bit * cases, and 32-bit non-hash with 32-bit PTEs. */ +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES) + ptep->pte = ptep->pte1 = ptep->pte2 = ptep->pte3 = pte_val(pte); +#else *ptep = pte; +#endif /* * With hardware tablewalk, a sync is needed to ensure that diff --git a/arch/powerpc/include/asm/pgtable-types.h b/arch/powerpc/include/asm/pgtable-types.h index eccb30b38b47..3b0edf041b2e 100644 --- a/arch/powerpc/include/asm/pgtable-types.h +++ b/arch/powerpc/include/asm/pgtable-types.h @@ -3,7 +3,11 @@ #define _ASM_POWERPC_PGTABLE_TYPES_H /* PTE level */ +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES) +typedef struct { pte_basic_t pte, pte1, pte2, pte3; } pte_t; +#else typedef struct { pte_basic_t pte; } pte_t; +#endif #define __pte(x) ((pte_t) { (x) }) static inline pte_basic_t pte_val(pte_t x) { -- 2.13.3
[PATCH v8 00/20] Implement use of HW assistance on TLB table walk on 8xx
The purpose of this serie is to implement hardware assistance for TLB table walk on the 8xx. First part prepares for using HW assistance in TLB routines: - Trivial fixes: - Remove CONFIG_BOOKE stuff from book3S headers. - Removal of unneeded atomic PTE update requirement for 8xx. - move book3s64 page fragment code in a common part for reusing it by the 8xx as 16k page size mode still uses 4k page tables. - Fixing a bug in memcache handling when standard pages and hugepages share caches of the same size (see original discussion in https://patchwork.ozlabs.org/patch/957565/) - Optimise access to 8xx perf counters (hence reducing number of registers used) Second part implements HW assistance in TLB routines in the following steps: - Disable 16k page size mode and 512k hugepages - Switch 4k to HW assistance - Bring back 512k hugepages - Bring back 16k page size mode. Last part cleans up: - Take benefit of Miss handler size reduction to regroup related parts - Reduce number of registers used in miss handlers, freeing them for future use. Tested successfully on 8xx and 83xx (book3s/32) Changes in v8: - Moved definitions in pgalloc.h to avoid conflicting with memcache patches. - Included the memcache bugfix serie in this serie to avoid conflicts between the two series when coming to the 512k pages patch. - In the 512k HW assistance patch, reduced the #ifdef mess by using IS_ENABLED(CONFIG_PPC_8xx) instead. Changes in v7: - Reordered to get trivial and already reviewed patches in front. - Reordered to regroup all HW assistance related patches together. - Rebased on today merge branch (28 Nov) - Added a helper for access to mm_context_t.frag - Reduced the amount of changes in PPC32 to support pte_fragment - Applied pte_fragment to both nohash/32 and book3s/32 Changes in v6: - Droped the part related to handling GUARD attribute at PGD/PMD level. - Moved the commonalisation of page_fragment in the begining (this part has been reviewed by Aneesh) - Rebased on today merge branch (19 Oct) Changes in v5: - Also avoid useless lock in get_pmd_from_cache() - A new patch to relocate mmu headers in platform specific directories - A new patch to distribute pgtable_t typedefs in platform specific mmu headers instead of the uggly #ifdef - Moved early_pte_alloc_kernel() in platform specific pgalloc - Restricted definition of PTE_FRAG_SIZE and PTE_FRAG_NR to platforms using the pte fragmentation. - arch_exit_mmap() and destroy_pagetable_cache() are now platform specific. Changes in v4: - Reordered the serie to put at the end the modifications which makes L1 and L2 entries independant. - No modifications to ppc64 ioremap (we still have an opportunity to merge them, for a future patch serie) - 8xx code modified to use patch_site instead of patch_instruction to get a clearer code and avoid object pollution with global symbols - Moved perf counters in first 32kb of memory to optimise access - Split the big bang to HW assistance in several steps: 1. Temporarily removes support of 16k pages and 512k hugepages 2. Change TLB routines to use HW assistance for 4k pages and 8M hugepages 3. Add back support for 512k hugepages 4. Add back support for 16k pages (using pte_fragment as page tables are still 4k) Changes in v3: - Fixed an issue in the 09/14 when CONFIG_PIN_TLB_TEXT was not enabled - Added performance measurement in the 09/14 commit log - Rebased on latest 'powerpc/merge' tree, which conflicted with 13/14 Changes in v2: - Removed the 3 first patchs which have been applied already - Fixed compilation errors reported by Michael - Squashed the commonalisation of ioremap functions into a single patch - Fixed the use of pte_fragment - Added a patch optimising perf counting of TLB misses and instructions Christophe Leroy (20): powerpc/book3s32: Remove CONFIG_BOOKE dependent code powerpc/8xx: Remove PTE_ATOMIC_UPDATES powerpc/mm: Move pte_fragment_alloc() to a common location powerpc/mm: Avoid useless lock with single page fragments powerpc/mm: move platform specific mmu-xxx.h in platform directories powerpc/mm: Move pgtable_t into platform headers powerpc/mm: add helpers to get/set mm.context->pte_frag powerpc/mm: Extend pte_fragment functionality to PPC32 powerpc/mm: enable the use of page table cache of order 0 powerpc/mm: replace hugetlb_cache by PGT_CACHE(PTE_T_ORDER) powerpc/mm: fix a warning when a cache is common to PGD and hugepages powerpc/mm: remove unnecessary test in pgtable_cache_init() powerpc/8xx: Move SW perf counters in first 32kb of memory powerpc/8xx: Temporarily disable 16k pages and hugepages powerpc/8xx: Use hardware assistance in TLB handlers powerpc/8xx: Enable 8M hugepage support with HW assistance powerpc/8xx: Enable 512k hugepage support with HW assistance powerpc/8xx: reintroduce 16K pages with HW assistance powerpc/8xx: don't use r12/SPRN_SPRG_SCRATCH2 in TLB Miss handlers
[PATCH v8 16/20] powerpc/8xx: Enable 8M hugepage support with HW assistance
HW assistance naturally supports 8M huge pages without further modifications. Signed-off-by: Christophe Leroy --- arch/powerpc/mm/tlb_nohash.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/powerpc/mm/tlb_nohash.c b/arch/powerpc/mm/tlb_nohash.c index 4f79639e432f..8ad7aab150b7 100644 --- a/arch/powerpc/mm/tlb_nohash.c +++ b/arch/powerpc/mm/tlb_nohash.c @@ -97,6 +97,9 @@ struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT] = { .shift = 14, }, #endif + [MMU_PAGE_8M] = { + .shift = 23, + }, }; #else struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT] = { -- 2.13.3
Re: [PATCH v3 0/4] powerpc: system call table generation support
++ sathn...@linux.vnet.ibm.com On Thu, 29 Nov 2018 at 09:57, Firoz Khan wrote: > > The purpose of this patch series is, we can easily > add/modify/delete system call table support by cha- > nging entry in syscall.tbl file instead of manually > changing many files. The other goal is to unify the > system call table generation support implementation > across all the architectures. > > The system call tables are in different format in > all architecture. It will be difficult to manually > add, modify or delete the system calls in the resp- > ective files manually. To make it easy by keeping a > script and which'll generate uapi header file and > syscall table file. > > syscall.tbl contains the list of available system > calls along with system call number and correspond- > ing entry point. Add a new system call in this arch- > itecture will be possible by adding new entry in > the syscall.tbl file. > > Adding a new table entry consisting of: > - System call number. > - ABI. > - System call name. > - Entry point name. > - Compat entry name, if required. > - spu entry name, if required. > > ARM, s390 and x86 architecuture does exist the sim- > ilar support. I leverage their implementation to > come up with a generic solution. > > I have done the same support for work for alpha, > ia64, m68k, microblaze, mips, parisc, sh, sparc, > and xtensa. Below mentioned git repository contains > more details about the workflow. > > https://github.com/frzkhn/system_call_table_generator/ > > Finally, this is the ground work to solve the Y2038 > issue. We need to add two dozen of system calls to > solve Y2038 issue. So this patch series will help to > add new system calls easily by adding new entry in the > syscall.tbl. > > changes since v2: > - modified/optimized the syscall.tbl to avoid duplicate >for the spu entries. > - updated the syscalltbl.sh to meet the above point. > > changes since v1: > - optimized/updated the syscall table generation >scripts. > - fixed all mixed indentation issues in syscall.tbl. > - added "comments" in syscall_*.tbl. > - changed from generic-y to generated-y in Kbuild. > > Firoz Khan (4): > powerpc: add __NR_syscalls along with NR_syscalls > powerpc: move macro definition from asm/systbl.h > powerpc: add system call table generation support > powerpc: generate uapi header and system call table files > > arch/powerpc/Makefile | 3 + > arch/powerpc/include/asm/Kbuild | 4 + > arch/powerpc/include/asm/systbl.h | 396 -- > arch/powerpc/include/asm/unistd.h | 3 +- > arch/powerpc/include/uapi/asm/Kbuild| 2 + > arch/powerpc/include/uapi/asm/unistd.h | 389 + > arch/powerpc/kernel/Makefile| 10 - > arch/powerpc/kernel/syscalls/Makefile | 63 > arch/powerpc/kernel/syscalls/syscall.tbl| 427 > > arch/powerpc/kernel/syscalls/syscallhdr.sh | 36 +++ > arch/powerpc/kernel/syscalls/syscalltbl.sh | 36 +++ > arch/powerpc/kernel/systbl.S| 37 +-- > arch/powerpc/kernel/systbl_chk.c| 60 > arch/powerpc/platforms/cell/spu_callbacks.c | 17 +- > 14 files changed, 591 insertions(+), 892 deletions(-) > delete mode 100644 arch/powerpc/include/asm/systbl.h > create mode 100644 arch/powerpc/kernel/syscalls/Makefile > create mode 100644 arch/powerpc/kernel/syscalls/syscall.tbl > create mode 100644 arch/powerpc/kernel/syscalls/syscallhdr.sh > create mode 100644 arch/powerpc/kernel/syscalls/syscalltbl.sh > delete mode 100644 arch/powerpc/kernel/systbl_chk.c > > -- > 1.9.1 >
[PATCH v8 02/20] powerpc/8xx: Remove PTE_ATOMIC_UPDATES
commit 1bc54c03117b9 ("powerpc: rework 4xx PTE access and TLB miss") introduced non atomic PTE updates and started the work of removing PTE updates in TLB miss handlers, but kept PTE_ATOMIC_UPDATES for the 8xx with the following comment: /* Until my rework is finished, 8xx still needs atomic PTE updates */ commit fe11dc3f9628e ("powerpc/8xx: Update TLB asm so it behaves as linux mm expects") removed all PTE updates done in TLB miss handlers Therefore, atomic PTE updates are not needed anymore for the 8xx Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/nohash/32/pte-8xx.h | 3 --- 1 file changed, 3 deletions(-) diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h b/arch/powerpc/include/asm/nohash/32/pte-8xx.h index 6bfe041ef59d..c9e4b2d90f65 100644 --- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h +++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h @@ -65,9 +65,6 @@ #define _PTE_NONE_MASK 0 -/* Until my rework is finished, 8xx still needs atomic PTE updates */ -#define PTE_ATOMIC_UPDATES 1 - #ifdef CONFIG_PPC_16K_PAGES #define _PAGE_PSIZE_PAGE_SPS #else -- 2.13.3
[PATCH v8 04/20] powerpc/mm: Avoid useless lock with single page fragments
There is no point in taking the page table lock as pte_frag or pmd_frag are always NULL when we have only one fragment. Reviewed-by: Aneesh Kumar K.V Signed-off-by: Christophe Leroy --- arch/powerpc/mm/pgtable-book3s64.c | 3 +++ arch/powerpc/mm/pgtable-frag.c | 3 +++ 2 files changed, 6 insertions(+) diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c index 0c0fd173208a..f3c31f5e1026 100644 --- a/arch/powerpc/mm/pgtable-book3s64.c +++ b/arch/powerpc/mm/pgtable-book3s64.c @@ -244,6 +244,9 @@ static pmd_t *get_pmd_from_cache(struct mm_struct *mm) { void *pmd_frag, *ret; + if (PMD_FRAG_NR == 1) + return NULL; + spin_lock(>page_table_lock); ret = mm->context.pmd_frag; if (ret) { diff --git a/arch/powerpc/mm/pgtable-frag.c b/arch/powerpc/mm/pgtable-frag.c index d61e7c2a9a79..7544d0d7177d 100644 --- a/arch/powerpc/mm/pgtable-frag.c +++ b/arch/powerpc/mm/pgtable-frag.c @@ -34,6 +34,9 @@ static pte_t *get_pte_from_cache(struct mm_struct *mm) { void *pte_frag, *ret; + if (PTE_FRAG_NR == 1) + return NULL; + spin_lock(>page_table_lock); ret = mm->context.pte_frag; if (ret) { -- 2.13.3
[PATCH v8 07/20] powerpc/mm: add helpers to get/set mm.context->pte_frag
In order to handle pte_fragment functions with single fragment without adding pte_frag in all mm_context_t, this patch creates two helpers which do nothing on platforms using a single fragment. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/pgtable.h | 31 +++ arch/powerpc/mm/pgtable-frag.c | 8 2 files changed, 35 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h index 9679b7519a35..734df2210749 100644 --- a/arch/powerpc/include/asm/pgtable.h +++ b/arch/powerpc/include/asm/pgtable.h @@ -110,6 +110,37 @@ void mark_initmem_nx(void); static inline void mark_initmem_nx(void) { } #endif +/* + * When used, PTE_FRAG_NR is defined in subarch pgtable.h + * so we are sure it is included when arriving here. + */ +#ifndef PTE_FRAG_NR +#define PTE_FRAG_NR1 +#define PTE_FRAG_SIZE_SHIFTPAGE_SHIFT +#define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT) +#endif + +#if PTE_FRAG_NR != 1 +static inline void *pte_frag_get(mm_context_t *ctx) +{ + return ctx->pte_frag; +} + +static inline void pte_frag_set(mm_context_t *ctx, void *p) +{ + ctx->pte_frag = p; +} +#else +static inline void *pte_frag_get(mm_context_t *ctx) +{ + return NULL; +} + +static inline void pte_frag_set(mm_context_t *ctx, void *p) +{ +} +#endif + #endif /* __ASSEMBLY__ */ #endif /* _ASM_POWERPC_PGTABLE_H */ diff --git a/arch/powerpc/mm/pgtable-frag.c b/arch/powerpc/mm/pgtable-frag.c index 7544d0d7177d..af23a587f019 100644 --- a/arch/powerpc/mm/pgtable-frag.c +++ b/arch/powerpc/mm/pgtable-frag.c @@ -38,7 +38,7 @@ static pte_t *get_pte_from_cache(struct mm_struct *mm) return NULL; spin_lock(>page_table_lock); - ret = mm->context.pte_frag; + ret = pte_frag_get(>context); if (ret) { pte_frag = ret + PTE_FRAG_SIZE; /* @@ -46,7 +46,7 @@ static pte_t *get_pte_from_cache(struct mm_struct *mm) */ if (((unsigned long)pte_frag & ~PAGE_MASK) == 0) pte_frag = NULL; - mm->context.pte_frag = pte_frag; + pte_frag_set(>context, pte_frag); } spin_unlock(>page_table_lock); return (pte_t *)ret; @@ -86,9 +86,9 @@ static pte_t *__alloc_for_ptecache(struct mm_struct *mm, int kernel) * the allocated page with single fragement * count. */ - if (likely(!mm->context.pte_frag)) { + if (likely(!pte_frag_get(>context))) { atomic_set(>pt_frag_refcount, PTE_FRAG_NR); - mm->context.pte_frag = ret + PTE_FRAG_SIZE; + pte_frag_set(>context, ret + PTE_FRAG_SIZE); } spin_unlock(>page_table_lock); -- 2.13.3
[PATCH v8 09/20] powerpc/mm: enable the use of page table cache of order 0
hugepages uses a cache of order 0. Lets allow page tables of order 0 in the common part in order to avoid open coding in hugetlb Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/book3s/32/pgalloc.h | 5 + arch/powerpc/include/asm/book3s/64/pgalloc.h | 5 + arch/powerpc/include/asm/nohash/32/pgalloc.h | 5 + arch/powerpc/include/asm/nohash/64/pgalloc.h | 5 + arch/powerpc/mm/init-common.c| 6 +++--- 5 files changed, 7 insertions(+), 19 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/pgalloc.h b/arch/powerpc/include/asm/book3s/32/pgalloc.h index 0f58e5b9dbe7..b5b955eb2fb7 100644 --- a/arch/powerpc/include/asm/book3s/32/pgalloc.h +++ b/arch/powerpc/include/asm/book3s/32/pgalloc.h @@ -25,10 +25,7 @@ extern void __bad_pte(pmd_t *pmd); extern struct kmem_cache *pgtable_cache[]; -#define PGT_CACHE(shift) ({\ - BUG_ON(!(shift)); \ - pgtable_cache[(shift) - 1]; \ - }) +#define PGT_CACHE(shift) pgtable_cache[shift] static inline pgd_t *pgd_alloc(struct mm_struct *mm) { diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h index f949dd90af9b..4aba625389c4 100644 --- a/arch/powerpc/include/asm/book3s/64/pgalloc.h +++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h @@ -37,10 +37,7 @@ extern struct vmemmap_backing *vmemmap_list; #define MAX_PGTABLE_INDEX_SIZE 0xf extern struct kmem_cache *pgtable_cache[]; -#define PGT_CACHE(shift) ({\ - BUG_ON(!(shift)); \ - pgtable_cache[(shift) - 1]; \ - }) +#define PGT_CACHE(shift) pgtable_cache[shift] extern pte_t *pte_fragment_alloc(struct mm_struct *, unsigned long, int); extern pmd_t *pmd_fragment_alloc(struct mm_struct *, unsigned long); diff --git a/arch/powerpc/include/asm/nohash/32/pgalloc.h b/arch/powerpc/include/asm/nohash/32/pgalloc.h index 7e234582dce5..17963951bdb0 100644 --- a/arch/powerpc/include/asm/nohash/32/pgalloc.h +++ b/arch/powerpc/include/asm/nohash/32/pgalloc.h @@ -25,10 +25,7 @@ extern void __bad_pte(pmd_t *pmd); extern struct kmem_cache *pgtable_cache[]; -#define PGT_CACHE(shift) ({\ - BUG_ON(!(shift)); \ - pgtable_cache[(shift) - 1]; \ - }) +#define PGT_CACHE(shift) pgtable_cache[shift] static inline pgd_t *pgd_alloc(struct mm_struct *mm) { diff --git a/arch/powerpc/include/asm/nohash/64/pgalloc.h b/arch/powerpc/include/asm/nohash/64/pgalloc.h index e2d62d033708..e95eb499a174 100644 --- a/arch/powerpc/include/asm/nohash/64/pgalloc.h +++ b/arch/powerpc/include/asm/nohash/64/pgalloc.h @@ -36,10 +36,7 @@ extern struct vmemmap_backing *vmemmap_list; #define MAX_PGTABLE_INDEX_SIZE 0xf extern struct kmem_cache *pgtable_cache[]; -#define PGT_CACHE(shift) ({\ - BUG_ON(!(shift)); \ - pgtable_cache[(shift) - 1]; \ - }) +#define PGT_CACHE(shift) pgtable_cache[shift] static inline pgd_t *pgd_alloc(struct mm_struct *mm) { diff --git a/arch/powerpc/mm/init-common.c b/arch/powerpc/mm/init-common.c index 2b656e67f2ea..41190f2b60c2 100644 --- a/arch/powerpc/mm/init-common.c +++ b/arch/powerpc/mm/init-common.c @@ -40,7 +40,7 @@ static void pmd_ctor(void *addr) memset(addr, 0, PMD_TABLE_SIZE); } -struct kmem_cache *pgtable_cache[MAX_PGTABLE_INDEX_SIZE]; +struct kmem_cache *pgtable_cache[MAX_PGTABLE_INDEX_SIZE + 1]; EXPORT_SYMBOL_GPL(pgtable_cache); /* used by kvm_hv module */ /* @@ -71,7 +71,7 @@ void pgtable_cache_add(unsigned shift, void (*ctor)(void *)) * moment, gcc doesn't seem to recognize is_power_of_2 as a * constant expression, so so much for that. */ BUG_ON(!is_power_of_2(minalign)); - BUG_ON((shift < 1) || (shift > MAX_PGTABLE_INDEX_SIZE)); + BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE); if (PGT_CACHE(shift)) return; /* Already have a cache of this size */ @@ -83,7 +83,7 @@ void pgtable_cache_add(unsigned shift, void (*ctor)(void *)) panic("Could not allocate pgtable cache for order %d", shift); kfree(name); - pgtable_cache[shift - 1] = new; + pgtable_cache[shift] = new; pr_debug("Allocated pgtable cache for order %d\n", shift); } -- 2.13.3
[PATCH v8 11/20] powerpc/mm: fix a warning when a cache is common to PGD and hugepages
While implementing TLB miss HW assistance on the 8xx, the following warning was encountered: [ 423.732965] WARNING: CPU: 0 PID: 345 at mm/slub.c:2412 ___slab_alloc.constprop.30+0x26c/0x46c [ 423.733033] CPU: 0 PID: 345 Comm: mmap Not tainted 4.18.0-rc8-00664-g2dfff9121c55 #671 [ 423.733075] NIP: c0108f90 LR: c0109ad0 CTR: 0004 [ 423.733121] REGS: c455bba0 TRAP: 0700 Not tainted (4.18.0-rc8-00664-g2dfff9121c55) [ 423.733147] MSR: 00021032 CR: 24224848 XER: 2000 [ 423.733319] [ 423.733319] GPR00: c0109ad0 c455bc50 c4521910 c60053c0 007080c0 c0011b34 c7fa41e0 c455be30 [ 423.733319] GPR08: 0001 c00103a0 c7fa41e0 c49afcc4 24282842 10018840 c079b37c 0040 [ 423.733319] GPR16: 73f0 00210d00 0001 c455a000 0100 0200 c455a000 [ 423.733319] GPR24: c60053c0 c0011b34 007080c0 c455a000 c455a000 c7fa41e0 9032 [ 423.734190] NIP [c0108f90] ___slab_alloc.constprop.30+0x26c/0x46c [ 423.734257] LR [c0109ad0] kmem_cache_alloc+0x210/0x23c [ 423.734283] Call Trace: [ 423.734326] [c455bc50] [0100] 0x100 (unreliable) [ 423.734430] [c455bcc0] [c0109ad0] kmem_cache_alloc+0x210/0x23c [ 423.734543] [c455bcf0] [c0011b34] huge_pte_alloc+0xc0/0x1dc [ 423.734633] [c455bd20] [c01044dc] hugetlb_fault+0x408/0x48c [ 423.734720] [c455bdb0] [c0104b20] follow_hugetlb_page+0x14c/0x44c [ 423.734826] [c455be10] [c00e8e54] __get_user_pages+0x1c4/0x3dc [ 423.734919] [c455be80] [c00e9924] __mm_populate+0xac/0x140 [ 423.735020] [c455bec0] [c00db14c] vm_mmap_pgoff+0xb4/0xb8 [ 423.735127] [c455bf00] [c00f27c0] ksys_mmap_pgoff+0xcc/0x1fc [ 423.735222] [c455bf40] [c000e0f8] ret_from_syscall+0x0/0x38 [ 423.735271] Instruction dump: [ 423.735321] 7cbf482e 38fd0008 7fa6eb78 7fc4f378 4bfff5dd 7fe3fb78 4bfffe24 81370010 [ 423.735536] 71280004 41a2ff88 4840c571 4b80 <0fe0> 4bfffeb8 81340010 712a0004 [ 423.735757] ---[ end trace e9b222919a470790 ]--- This warning occurs when calling kmem_cache_zalloc() on a cache having a constructor. In this case it happens because PGD cache and 512k hugepte cache are the same size (4k). While a cache with constructor is created for the PGD, hugepages create cache without constructor and uses kmem_cache_zalloc(). As both expect a cache with the same size, the hugepages reuse the cache created for PGD, hence the conflict. In order to avoid this conflict, this patch: - modifies pgtable_cache_add() so that a zeroising constructor is added for any cache size. - replaces calls to kmem_cache_zalloc() by kmem_cache_alloc() Signed-off-by: Christophe Leroy --- see original discussion in https://patchwork.ozlabs.org/patch/957565/ arch/powerpc/include/asm/pgtable.h | 2 +- arch/powerpc/mm/hugetlbpage.c | 6 ++--- arch/powerpc/mm/init-common.c | 46 ++ 3 files changed, 36 insertions(+), 18 deletions(-) diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h index 734df2210749..74810bba45d2 100644 --- a/arch/powerpc/include/asm/pgtable.h +++ b/arch/powerpc/include/asm/pgtable.h @@ -101,7 +101,7 @@ extern int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr, /* can we use this in kvm */ unsigned long vmalloc_to_phys(void *vmalloc_addr); -void pgtable_cache_add(unsigned shift, void (*ctor)(void *)); +void pgtable_cache_add(unsigned int shift); void pgtable_cache_init(void); #if defined(CONFIG_STRICT_KERNEL_RWX) || defined(CONFIG_PPC32) diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index c4f1263228b8..bc97874d7c74 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -70,7 +70,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp, num_hugepd = 1; } - new = kmem_cache_zalloc(cachep, pgtable_gfp_flags(mm, GFP_KERNEL)); + new = kmem_cache_alloc(cachep, pgtable_gfp_flags(mm, GFP_KERNEL)); BUG_ON(pshift > HUGEPD_SHIFT_MASK); BUG_ON((unsigned long)new & HUGEPD_SHIFT_MASK); @@ -701,10 +701,10 @@ static int __init hugetlbpage_init(void) * use pgt cache for hugepd. */ if (pdshift > shift) - pgtable_cache_add(pdshift - shift, NULL); + pgtable_cache_add(pdshift - shift); #if defined(CONFIG_PPC_FSL_BOOK3E) || defined(CONFIG_PPC_8xx) else - pgtable_cache_add(PTE_T_ORDER, NULL); + pgtable_cache_add(PTE_T_ORDER); #endif } diff --git a/arch/powerpc/mm/init-common.c b/arch/powerpc/mm/init-common.c index 41190f2b60c2..b7ca03643d0b 100644 --- a/arch/powerpc/mm/init-common.c +++ b/arch/powerpc/mm/init-common.c @@ -25,19 +25,37 @@ #include #include -static void pgd_ctor(void *addr) -{ - memset(addr, 0, PGD_TABLE_SIZE); +#define CTOR(shift) static void ctor_##shift(void *addr) \ +{ \ +
[PATCH v8 17/20] powerpc/8xx: Enable 512k hugepage support with HW assistance
For using 512k pages with hardware assistance, the PTEs have to be spread every 128 bytes in the L2 table. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/hugetlb.h | 4 +++- arch/powerpc/mm/hugetlbpage.c | 10 +- arch/powerpc/mm/tlb_nohash.c | 3 +++ 3 files changed, 15 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h index dfb8bf236586..62a0ca02ca7d 100644 --- a/arch/powerpc/include/asm/hugetlb.h +++ b/arch/powerpc/include/asm/hugetlb.h @@ -74,7 +74,9 @@ static inline pte_t *hugepte_offset(hugepd_t hpd, unsigned long addr, unsigned long idx = 0; pte_t *dir = hugepd_page(hpd); -#ifndef CONFIG_PPC_FSL_BOOK3E +#ifdef CONFIG_PPC_8xx + idx = (addr & ((1UL << pdshift) - 1)) >> PAGE_SHIFT; +#elif !defined(CONFIG_PPC_FSL_BOOK3E) idx = (addr & ((1UL << pdshift) - 1)) >> hugepd_shift(hpd); #endif diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index bc97874d7c74..5f22f1d399b3 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -65,6 +65,9 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp, if (pshift >= pdshift) { cachep = PGT_CACHE(PTE_T_ORDER); num_hugepd = 1 << (pshift - pdshift); + } else if (IS_ENABLED(CONFIG_PPC_8xx)) { + cachep = PGT_CACHE(PTE_SHIFT); + num_hugepd = 1; } else { cachep = PGT_CACHE(pdshift - pshift); num_hugepd = 1; @@ -331,6 +334,9 @@ static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int pdshif if (shift >= pdshift) hugepd_free(tlb, hugepte); + else if (IS_ENABLED(CONFIG_PPC_8xx)) + pgtable_free_tlb(tlb, hugepte, +get_hugepd_cache_index(PTE_SHIFT)); else pgtable_free_tlb(tlb, hugepte, get_hugepd_cache_index(pdshift - shift)); @@ -700,7 +706,9 @@ static int __init hugetlbpage_init(void) * if we have pdshift and shift value same, we don't * use pgt cache for hugepd. */ - if (pdshift > shift) + if (pdshift > shift && IS_ENABLED(CONFIG_PPC_8xx)) + pgtable_cache_add(PTE_SHIFT); + else if (pdshift > shift) pgtable_cache_add(pdshift - shift); #if defined(CONFIG_PPC_FSL_BOOK3E) || defined(CONFIG_PPC_8xx) else diff --git a/arch/powerpc/mm/tlb_nohash.c b/arch/powerpc/mm/tlb_nohash.c index 8ad7aab150b7..ae5d568e267f 100644 --- a/arch/powerpc/mm/tlb_nohash.c +++ b/arch/powerpc/mm/tlb_nohash.c @@ -97,6 +97,9 @@ struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT] = { .shift = 14, }, #endif + [MMU_PAGE_512K] = { + .shift = 19, + }, [MMU_PAGE_8M] = { .shift = 23, }, -- 2.13.3
[PATCH v8 20/20] powerpc/8xx: regroup TLB handler routines
As this is running with MMU off, the CPU only does speculative fetch for code in the same page. Following the significant size reduction of TLB handler routines, the side handlers can be brought back close to the main part, ie in the same page. Signed-off-by: Christophe Leroy --- arch/powerpc/kernel/head_8xx.S | 112 - 1 file changed, 54 insertions(+), 58 deletions(-) diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S index 0a4f8a9c85ff..b171b7c0a0e7 100644 --- a/arch/powerpc/kernel/head_8xx.S +++ b/arch/powerpc/kernel/head_8xx.S @@ -399,6 +399,23 @@ InstructionTLBMiss: rfi #endif +#ifndef CONFIG_PIN_TLB_TEXT +ITLBMissLinear: + mtcrr11 + /* Set 8M byte page and mark it valid */ + li r11, MI_PS8MEG | MI_SVALID + mtspr SPRN_MI_TWC, r11 + rlwinm r10, r10, 20, 0x0f80/* 8xx supports max 256Mb RAM */ + ori r10, r10, 0xf0 | MI_SPS16K | _PAGE_SH | _PAGE_DIRTY | \ + _PAGE_PRESENT + mtspr SPRN_MI_RPN, r10/* Update TLB entry */ + +0: mfspr r10, SPRN_SPRG_SCRATCH0 + mfspr r11, SPRN_SPRG_SCRATCH1 + rfi + patch_site 0b, patch__itlbmiss_exit_2 +#endif + . = 0x1200 DataStoreTLBMiss: mtspr SPRN_SPRG_SCRATCH0, r10 @@ -484,6 +501,43 @@ DataStoreTLBMiss: rfi #endif +DTLBMissIMMR: + mtcrr11 + /* Set 512k byte guarded page and mark it valid */ + li r10, MD_PS512K | MD_GUARDED | MD_SVALID + mtspr SPRN_MD_TWC, r10 + mfspr r10, SPRN_IMMR /* Get current IMMR */ + rlwinm r10, r10, 0, 0xfff8 /* Get 512 kbytes boundary */ + ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_SH | _PAGE_DIRTY | \ + _PAGE_PRESENT | _PAGE_NO_CACHE + mtspr SPRN_MD_RPN, r10/* Update TLB entry */ + + li r11, RPN_PATTERN + mtspr SPRN_DAR, r11 /* Tag DAR */ + +0: mfspr r10, SPRN_SPRG_SCRATCH0 + mfspr r11, SPRN_SPRG_SCRATCH1 + rfi + patch_site 0b, patch__dtlbmiss_exit_2 + +DTLBMissLinear: + mtcrr11 + /* Set 8M byte page and mark it valid */ + li r11, MD_PS8MEG | MD_SVALID + mtspr SPRN_MD_TWC, r11 + rlwinm r10, r10, 20, 0x0f80/* 8xx supports max 256Mb RAM */ + ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_SH | _PAGE_DIRTY | \ + _PAGE_PRESENT + mtspr SPRN_MD_RPN, r10/* Update TLB entry */ + + li r11, RPN_PATTERN + mtspr SPRN_DAR, r11 /* Tag DAR */ + +0: mfspr r10, SPRN_SPRG_SCRATCH0 + mfspr r11, SPRN_SPRG_SCRATCH1 + rfi + patch_site 0b, patch__dtlbmiss_exit_3 + /* This is an instruction TLB error on the MPC8xx. This could be due * to many reasons, such as executing guarded memory or illegal instruction * addresses. There is nothing to do but handle a big time error fault. @@ -583,64 +637,6 @@ InstructionBreakpoint: . = 0x2000 -/* - * Bottom part of DataStoreTLBMiss handlers for IMMR area and linear RAM. - * not enough space in the DataStoreTLBMiss area. - */ -DTLBMissIMMR: - mtcrr11 - /* Set 512k byte guarded page and mark it valid */ - li r10, MD_PS512K | MD_GUARDED | MD_SVALID - mtspr SPRN_MD_TWC, r10 - mfspr r10, SPRN_IMMR /* Get current IMMR */ - rlwinm r10, r10, 0, 0xfff8 /* Get 512 kbytes boundary */ - ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_SH | _PAGE_DIRTY | \ - _PAGE_PRESENT | _PAGE_NO_CACHE - mtspr SPRN_MD_RPN, r10/* Update TLB entry */ - - li r11, RPN_PATTERN - mtspr SPRN_DAR, r11 /* Tag DAR */ - -0: mfspr r10, SPRN_SPRG_SCRATCH0 - mfspr r11, SPRN_SPRG_SCRATCH1 - rfi - patch_site 0b, patch__dtlbmiss_exit_2 - -DTLBMissLinear: - mtcrr11 - /* Set 8M byte page and mark it valid */ - li r11, MD_PS8MEG | MD_SVALID - mtspr SPRN_MD_TWC, r11 - rlwinm r10, r10, 20, 0x0f80/* 8xx supports max 256Mb RAM */ - ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_SH | _PAGE_DIRTY | \ - _PAGE_PRESENT - mtspr SPRN_MD_RPN, r10/* Update TLB entry */ - - li r11, RPN_PATTERN - mtspr SPRN_DAR, r11 /* Tag DAR */ - -0: mfspr r10, SPRN_SPRG_SCRATCH0 - mfspr r11, SPRN_SPRG_SCRATCH1 - rfi - patch_site 0b, patch__dtlbmiss_exit_3 - -#ifndef CONFIG_PIN_TLB_TEXT -ITLBMissLinear: - mtcrr11 - /* Set 8M byte page and mark it valid */ - li r11, MI_PS8MEG | MI_SVALID - mtspr SPRN_MI_TWC, r11 - rlwinm r10, r10, 20, 0x0f80/* 8xx supports max 256Mb RAM */ - ori r10, r10, 0xf0 | MI_SPS16K | _PAGE_SH | _PAGE_DIRTY | \ -
Re: [PATCH 0/3] System call table generation support
On Thu, Nov 29, 2018 at 01:48:16PM +0530, Firoz Khan wrote: > Hi Sathish, > > Thanks for your email. > > On Thu, 29 Nov 2018 at 12:05, Satheesh Rajendran > wrote: > > > > On Fri, Sep 14, 2018 at 02:02:57PM +0530, Firoz Khan wrote: > > > The purpose of this patch series is: > > > 1. We can easily add/modify/delete system call by changing entry > > > in syscall.tbl file. No need to manually edit many files. > > > > > > 2. It is easy to unify the system call implementation across all > > > the architectures. > > > > > > The system call tables are in different format in all architecture > > > and it will be difficult to manually add or modify the system calls > > > in the respective files manually. To make it easy by keeping a script > > > and which'll generate the header file and syscall table file so this > > > change will unify them across all architectures. > > > > > > syscall.tbl contains the list of available system calls along with > > > system call number and corresponding entry point. Add a new system > > > call in this architecture will be possible by adding new entry in > > > the syscall.tbl file. > > > > > > Adding a new table entry consisting of: > > > - System call number. > > > - ABI. > > > - System call name. > > > - Entry point name. > > > - Compat entry name, if required. > > > > > > ARM, s390 and x86 architecuture does exist the similar support. I > > > leverage their implementation to come up with a generic solution. > > > > > > I have done the same support for work for alpha, m68k, microblaze, > > > ia64, mips, parisc, sh, sparc, and xtensa. But I started sending > > > the patch for one architecuture for review. Below mentioned git > > > repository contains more details. > > > Git repo:- https://github.com/frzkhn/system_call_table_generator/ > > > > > > Finally, this is the ground work for solving the Y2038 issue. We > > > need to add/change two dozen of system calls to solve Y2038 issue. > > > So this patch series will help to easily modify from existing > > > system call to Y2038 compatible system calls. > > > > > > I started working system call table generation on 4.17-rc1. I used > > > marcin's script - https://github.com/hrw/syscalls-table to generate > > > the syscall.tbl file. And this will be the input to the system call > > > table generation script. But there are couple system call got add > > > in the latest rc release. If run Marcin's script on latest release, > > > It will generate a new syscall.tbl. But I still use the old file - > > > syscall.tbl and once all review got over I'll update syscall.tbl > > > alone w.r.to the tip of the kernel. The impact of this thing, few > > > of the system call won't work. > > > > > > Firoz Khan (3): > > > powerpc: Replace NR_syscalls macro from asm/unistd.h > > > powerpc: Add system call table generation support > > > powerpc: uapi header and system call table file generation > > > > > > arch/powerpc/Makefile | 3 + > > > arch/powerpc/include/asm/Kbuild | 3 + > > > arch/powerpc/include/asm/unistd.h | 3 +- > > > arch/powerpc/include/uapi/asm/Kbuild| 2 + > > > arch/powerpc/include/uapi/asm/unistd.h | 391 > > > +--- > > > arch/powerpc/kernel/Makefile| 3 +- > > > arch/powerpc/kernel/syscall_table_32.S | 9 + > > > arch/powerpc/kernel/syscall_table_64.S | 17 ++ > > > arch/powerpc/kernel/syscalls/Makefile | 51 > > > arch/powerpc/kernel/syscalls/syscall_32.tbl | 378 > > > +++ > > > arch/powerpc/kernel/syscalls/syscall_64.tbl | 372 > > > ++ > > > arch/powerpc/kernel/syscalls/syscallhdr.sh | 37 +++ > > > arch/powerpc/kernel/syscalls/syscalltbl.sh | 38 +++ > > > arch/powerpc/kernel/systbl.S| 50 > > > 14 files changed, 916 insertions(+), 441 deletions(-) > > > create mode 100644 arch/powerpc/kernel/syscall_table_32.S > > > create mode 100644 arch/powerpc/kernel/syscall_table_64.S > > > create mode 100644 arch/powerpc/kernel/syscalls/Makefile > > > create mode 100644 arch/powerpc/kernel/syscalls/syscall_32.tbl > > > create mode 100644 arch/powerpc/kernel/syscalls/syscall_64.tbl > > > create mode 100644 arch/powerpc/kernel/syscalls/syscallhdr.sh > > > create mode 100644 arch/powerpc/kernel/syscalls/syscalltbl.sh > > > delete mode 100644 arch/powerpc/kernel/systbl.S > > > > Hi, > > > > This patch series failed to boot in IBM Power8 box with below base commit > > and built with ppc64le_defconfig, > > https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?h=merge=183cbf93be88d1a4fb572e27b1e08aa0ad85 > > I think you are applied some old patch series. Could you please > perform the boot test on powerpc v3 which I have sent few hour before. Hi Firoz, Looks like I chose a wrong mail to reply, but did test with v3 series itself.
Re: [PATCH 12/24] powerpc/mm: Fix reporting of kernel execute faults
Le 30/11/2018 à 06:50, Aneesh Kumar K.V a écrit : Christophe LEROY writes: Hi Ben, I have an issue on the 8xx with this change Le 19/07/2017 à 06:49, Benjamin Herrenschmidt a écrit : We currently test for is_exec and DSISR_PROTFAULT but that doesn't make sense as this is the wrong error bit to test for an execute permission failure. On the 8xx, on an exec permission failure, this is the correct BIT, see below extract from reference manual: Note that only one of bits 1, 3, and 4 will be set. 1 1 if the translation of an attempted access is not in the translation tables. Otherwise 0 3 1 if the fetch access was to guarded memory when MSR[IR] = 1. Otherwise 0 4 1 if the access is not permitted by the protection mechanism; otherwise 0. So on the 8xx, bit 3 is not DSISR_NOEXEC_OR_G but only DSISR_G. When the PPP bits are set to No-Execute, we really get bit 4 that is DSISR_PROTFAULT. Do you have an url for the document? I am wondering whether we can get Documentation/powerpc/cpu_families.txt updated with these urls? https://www.nxp.com/docs/en/reference-manual/MPC885RM.pdf Christophe In fact, we had code that would return early if we had an exec fault in kernel mode so I think that was just dead code anyway. Finally the location of that test is awkward and prevents further simplifications. So instead move that test into a helper along with the existing early test for kernel exec faults and out of range accesses, and put it all in a "bad_kernel_fault()" helper. While at it test the correct error bits. Signed-off-by: Benjamin Herrenschmidt --- arch/powerpc/mm/fault.c | 21 +++-- 1 file changed, 15 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c index e8d6acc888c5..aead07cf8a5b 100644 --- a/arch/powerpc/mm/fault.c +++ b/arch/powerpc/mm/fault.c @@ -180,6 +180,20 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault) return MM_FAULT_CONTINUE; } +/* Is this a bad kernel fault ? */ +static bool bad_kernel_fault(bool is_exec, unsigned long error_code, +unsigned long address) +{ + if (is_exec && (error_code & (DSISR_NOEXEC_OR_G | DSISR_KEYFAULT))) { Do you mind if we had DSISR_PROTFAULT here as well ? Christophe + printk_ratelimited(KERN_CRIT "kernel tried to execute" + " exec-protected page (%lx) -" + "exploit attempt? (uid: %d)\n", + address, from_kuid(_user_ns, + current_uid())); + } + return is_exec || (address >= TASK_SIZE); +} + /* * Define the correct "is_write" bit in error_code based * on the processor family @@ -252,7 +266,7 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address, * The kernel should never take an execute fault nor should it * take a page fault to a kernel address. */ - if (!is_user && (is_exec || (address >= TASK_SIZE))) + if (unlikely(!is_user && bad_kernel_fault(is_exec, error_code, address))) return SIGSEGV; /* We restore the interrupt state now */ @@ -491,11 +505,6 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address, return 0; } - if (is_exec && (error_code & DSISR_PROTFAULT)) - printk_ratelimited(KERN_CRIT "kernel tried to execute NX-protected" - " page (%lx) - exploit attempt? (uid: %d)\n", - address, from_kuid(_user_ns, current_uid())); - return SIGSEGV; } NOKPROBE_SYMBOL(__do_page_fault);
Re: [PATCH] powerpc/mm: add exec protection on powerpc 603
On 11/16/18 10:50 PM, Christophe Leroy wrote: The 603 doesn't have a HASH table, TLB misses are handled by software. It is then possible to generate page fault when _PAGE_EXEC is not set like in nohash/32. In order to support it, set_pte_filter() and set_access_flags_filter() are made common, and the handling is made dependent on MMU_FTR_HPTE_TABLE Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/book3s/32/hash.h| 1 + arch/powerpc/include/asm/book3s/32/pgtable.h | 18 +- arch/powerpc/include/asm/cputable.h | 8 arch/powerpc/kernel/head_32.S| 2 ++ arch/powerpc/mm/pgtable.c| 20 +++- 5 files changed, 27 insertions(+), 22 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/hash.h b/arch/powerpc/include/asm/book3s/32/hash.h index f2892c7ab73e..2a0a467d2985 100644 --- a/arch/powerpc/include/asm/book3s/32/hash.h +++ b/arch/powerpc/include/asm/book3s/32/hash.h @@ -26,6 +26,7 @@ #define _PAGE_WRITETHRU 0x040 /* W: cache write-through */ #define _PAGE_DIRTY 0x080 /* C: page changed */ #define _PAGE_ACCESSED0x100 /* R: page referenced */ +#define _PAGE_EXEC 0x200 /* software: exec allowed */ #define _PAGE_RW 0x400 /* software: user write access allowed */ #define _PAGE_SPECIAL 0x800 /* software: Special page */ diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index c21d33704633..cf844fed4527 100644 --- a/arch/powerpc/include/asm/book3s/32/pgtable.h +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h @@ -10,9 +10,9 @@ /* And here we include common definitions */ #define _PAGE_KERNEL_RO 0 -#define _PAGE_KERNEL_ROX 0 +#define _PAGE_KERNEL_ROX (_PAGE_EXEC) #define _PAGE_KERNEL_RW (_PAGE_DIRTY | _PAGE_RW) -#define _PAGE_KERNEL_RWX (_PAGE_DIRTY | _PAGE_RW) +#define _PAGE_KERNEL_RWX (_PAGE_DIRTY | _PAGE_RW | _PAGE_EXEC) #define _PAGE_HPTEFLAGS _PAGE_HASHPTE @@ -66,11 +66,11 @@ static inline bool pte_user(pte_t pte) */ #define PAGE_NONE __pgprot(_PAGE_BASE) #define PAGE_SHARED __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW) -#define PAGE_SHARED_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW) +#define PAGE_SHARED_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW | _PAGE_EXEC) #define PAGE_COPY __pgprot(_PAGE_BASE | _PAGE_USER) -#define PAGE_COPY_X__pgprot(_PAGE_BASE | _PAGE_USER) +#define PAGE_COPY_X__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC) #define PAGE_READONLY __pgprot(_PAGE_BASE | _PAGE_USER) -#define PAGE_READONLY_X__pgprot(_PAGE_BASE | _PAGE_USER) +#define PAGE_READONLY_X__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC) /* Permission masks used for kernel mappings */ #define PAGE_KERNEL __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW) @@ -318,7 +318,7 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma, int psize) { unsigned long set = pte_val(entry) & - (_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW); + (_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC); pte_update(ptep, 0, set); @@ -384,7 +384,7 @@ static inline int pte_dirty(pte_t pte) { return !!(pte_val(pte) & _PAGE_DIRTY); static inline int pte_young(pte_t pte){ return !!(pte_val(pte) & _PAGE_ACCESSED); } static inline int pte_special(pte_t pte) { return !!(pte_val(pte) & _PAGE_SPECIAL); } static inline int pte_none(pte_t pte) { return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; } -static inline bool pte_exec(pte_t pte) { return true; } +static inline bool pte_exec(pte_t pte) { return pte_val(pte) & _PAGE_EXEC; } static inline int pte_present(pte_t pte) { @@ -451,7 +451,7 @@ static inline pte_t pte_wrprotect(pte_t pte) static inline pte_t pte_exprotect(pte_t pte) { - return pte; + return __pte(pte_val(pte) & ~_PAGE_EXEC); } static inline pte_t pte_mkclean(pte_t pte) @@ -466,7 +466,7 @@ static inline pte_t pte_mkold(pte_t pte) static inline pte_t pte_mkexec(pte_t pte) { - return pte; + return __pte(pte_val(pte) | _PAGE_EXEC); } static inline pte_t pte_mkpte(pte_t pte) diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h index 29f49a35d6ee..a0395ccbbe9e 100644 --- a/arch/powerpc/include/asm/cputable.h +++ b/arch/powerpc/include/asm/cputable.h @@ -296,7 +296,7 @@ static inline void cpu_feature_keys_init(void) { } #define CPU_FTRS_PPC601 (CPU_FTR_COMMON | CPU_FTR_601 | \ CPU_FTR_COHERENT_ICACHE | CPU_FTR_UNIFIED_ID_CACHE | CPU_FTR_USE_RTC) #define CPU_FTRS_603 (CPU_FTR_COMMON | CPU_FTR_MAYBE_CAN_DOZE | \ - CPU_FTR_MAYBE_CAN_NAP | CPU_FTR_PPC_LE) + CPU_FTR_MAYBE_CAN_NAP | CPU_FTR_PPC_LE | CPU_FTR_NOEXECUTE) #define CPU_FTRS_604
Re: [PATCH] powerpc/mm: add exec protection on powerpc 603
Le 29/11/2018 à 12:25, Aneesh Kumar K.V a écrit : On 11/16/18 10:50 PM, Christophe Leroy wrote: The 603 doesn't have a HASH table, TLB misses are handled by software. It is then possible to generate page fault when _PAGE_EXEC is not set like in nohash/32. In order to support it, set_pte_filter() and set_access_flags_filter() are made common, and the handling is made dependent on MMU_FTR_HPTE_TABLE Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/book3s/32/hash.h | 1 + arch/powerpc/include/asm/book3s/32/pgtable.h | 18 +- arch/powerpc/include/asm/cputable.h | 8 arch/powerpc/kernel/head_32.S | 2 ++ arch/powerpc/mm/pgtable.c | 20 +++- 5 files changed, 27 insertions(+), 22 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/hash.h b/arch/powerpc/include/asm/book3s/32/hash.h index f2892c7ab73e..2a0a467d2985 100644 --- a/arch/powerpc/include/asm/book3s/32/hash.h +++ b/arch/powerpc/include/asm/book3s/32/hash.h @@ -26,6 +26,7 @@ #define _PAGE_WRITETHRU 0x040 /* W: cache write-through */ #define _PAGE_DIRTY 0x080 /* C: page changed */ #define _PAGE_ACCESSED 0x100 /* R: page referenced */ +#define _PAGE_EXEC 0x200 /* software: exec allowed */ #define _PAGE_RW 0x400 /* software: user write access allowed */ #define _PAGE_SPECIAL 0x800 /* software: Special page */ diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index c21d33704633..cf844fed4527 100644 --- a/arch/powerpc/include/asm/book3s/32/pgtable.h +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h @@ -10,9 +10,9 @@ /* And here we include common definitions */ #define _PAGE_KERNEL_RO 0 -#define _PAGE_KERNEL_ROX 0 +#define _PAGE_KERNEL_ROX (_PAGE_EXEC) #define _PAGE_KERNEL_RW (_PAGE_DIRTY | _PAGE_RW) -#define _PAGE_KERNEL_RWX (_PAGE_DIRTY | _PAGE_RW) +#define _PAGE_KERNEL_RWX (_PAGE_DIRTY | _PAGE_RW | _PAGE_EXEC) #define _PAGE_HPTEFLAGS _PAGE_HASHPTE @@ -66,11 +66,11 @@ static inline bool pte_user(pte_t pte) */ #define PAGE_NONE __pgprot(_PAGE_BASE) #define PAGE_SHARED __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW) -#define PAGE_SHARED_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW) +#define PAGE_SHARED_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW | _PAGE_EXEC) #define PAGE_COPY __pgprot(_PAGE_BASE | _PAGE_USER) -#define PAGE_COPY_X __pgprot(_PAGE_BASE | _PAGE_USER) +#define PAGE_COPY_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC) #define PAGE_READONLY __pgprot(_PAGE_BASE | _PAGE_USER) -#define PAGE_READONLY_X __pgprot(_PAGE_BASE | _PAGE_USER) +#define PAGE_READONLY_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC) /* Permission masks used for kernel mappings */ #define PAGE_KERNEL __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW) @@ -318,7 +318,7 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma, int psize) { unsigned long set = pte_val(entry) & - (_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW); + (_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC); pte_update(ptep, 0, set); @@ -384,7 +384,7 @@ static inline int pte_dirty(pte_t pte) { return !!(pte_val(pte) & _PAGE_DIRTY); static inline int pte_young(pte_t pte) { return !!(pte_val(pte) & _PAGE_ACCESSED); } static inline int pte_special(pte_t pte) { return !!(pte_val(pte) & _PAGE_SPECIAL); } static inline int pte_none(pte_t pte) { return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; } -static inline bool pte_exec(pte_t pte) { return true; } +static inline bool pte_exec(pte_t pte) { return pte_val(pte) & _PAGE_EXEC; } static inline int pte_present(pte_t pte) { @@ -451,7 +451,7 @@ static inline pte_t pte_wrprotect(pte_t pte) static inline pte_t pte_exprotect(pte_t pte) { - return pte; + return __pte(pte_val(pte) & ~_PAGE_EXEC); } static inline pte_t pte_mkclean(pte_t pte) @@ -466,7 +466,7 @@ static inline pte_t pte_mkold(pte_t pte) static inline pte_t pte_mkexec(pte_t pte) { - return pte; + return __pte(pte_val(pte) | _PAGE_EXEC); } static inline pte_t pte_mkpte(pte_t pte) diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h index 29f49a35d6ee..a0395ccbbe9e 100644 --- a/arch/powerpc/include/asm/cputable.h +++ b/arch/powerpc/include/asm/cputable.h @@ -296,7 +296,7 @@ static inline void cpu_feature_keys_init(void) { } #define CPU_FTRS_PPC601 (CPU_FTR_COMMON | CPU_FTR_601 | \ CPU_FTR_COHERENT_ICACHE | CPU_FTR_UNIFIED_ID_CACHE | CPU_FTR_USE_RTC) #define CPU_FTRS_603 (CPU_FTR_COMMON | CPU_FTR_MAYBE_CAN_DOZE | \ - CPU_FTR_MAYBE_CAN_NAP | CPU_FTR_PPC_LE) + CPU_FTR_MAYBE_CAN_NAP | CPU_FTR_PPC_LE | CPU_FTR_NOEXECUTE) #define CPU_FTRS_604 (CPU_FTR_COMMON | CPU_FTR_PPC_LE) #define CPU_FTRS_740_NOTAU
Re: [PATCH] powerpc/boot: Copy serial.c in Makefile
Hi dja, Daniel Axtens writes: > Right, so as both 0-day and snowpatch tell me, this patch is wrong. > > It turns out that this: >> $(obj)/serial.c: $(obj)/autoconf.h >> +$(Q)cp $< $@ > is identical to: > cp arch/powerpc/boot/autoconf.h arch/powerpc/boot/serial.c > > (Clearly my make mastery is inadequate.) > > Amusingly this which works for my 64e uImage but obviously not for > anything that actually needs code from serial.c. > > Further analysis suggests that making with -j1 triggers the issue, but > everything works with -j2 and above. That would make sense with the > timeline of when I discovered the issue because I changed my build > script to not build in parallel. I don't get why -j makes a difference, but that does explain why we haven't seen it, none of my tests use -j 1 :) I don't think we actually want to copy serial.c, we just want to specify a dependency, does this work for you? cheers diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile index 39354365f54a..ed9883169190 100644 --- a/arch/powerpc/boot/Makefile +++ b/arch/powerpc/boot/Makefile @@ -197,7 +197,7 @@ $(addprefix $(obj)/,$(libfdt) $(libfdtheader)): $(obj)/%: $(srctree)/scripts/dtc $(obj)/zImage.coff.lds $(obj)/zImage.ps3.lds : $(obj)/%: $(srctree)/$(src)/%.S $(Q)cp $< $@ -$(obj)/serial.c: $(obj)/autoconf.h +$(srctree)/$(src)/serial.c: $(obj)/autoconf.h $(obj)/autoconf.h: $(obj)/%: $(objtree)/include/generated/% $(Q)cp $< $@
Re: [PATCH] powerpc/boot: Copy serial.c in Makefile
Hi mpe, >> Further analysis suggests that making with -j1 triggers the issue, but >> everything works with -j2 and above. That would make sense with the >> timeline of when I discovered the issue because I changed my build >> script to not build in parallel. > > I don't get why -j makes a difference, but that does explain why we > haven't seen it, none of my tests use -j 1 :) I don't understand either - only that with -j1 V=1 I get powerpc64le-linux-gnu-gcc ... -c -o arch/powerpc/boot/serial.o arch/powerpc/boot/serial.c and with -j2 I get: powerpc64le-linux-gnu-gcc ... -c -o arch/powerpc/boot/serial.o /home/dja/dev/linux/linux/arch/powerpc/boot/serial.c So for some reason j2 is getting the absolute path and j1 is getting a relative path. I have absolutely no idea why this would be. > I don't think we actually want to copy serial.c, we just want to specify > a dependency, does this work for you? Yeah I think you're right, I misunderstood the boot wrapper. That patch works for me - it causes the full path to be used. Thanks heaps! Regards, Daniel > cheers > > diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile > index 39354365f54a..ed9883169190 100644 > --- a/arch/powerpc/boot/Makefile > +++ b/arch/powerpc/boot/Makefile > @@ -197,7 +197,7 @@ $(addprefix $(obj)/,$(libfdt) $(libfdtheader)): $(obj)/%: > $(srctree)/scripts/dtc > $(obj)/zImage.coff.lds $(obj)/zImage.ps3.lds : $(obj)/%: > $(srctree)/$(src)/%.S > $(Q)cp $< $@ > > -$(obj)/serial.c: $(obj)/autoconf.h > +$(srctree)/$(src)/serial.c: $(obj)/autoconf.h > > $(obj)/autoconf.h: $(obj)/%: $(objtree)/include/generated/% > $(Q)cp $< $@
Re: pkeys: Reserve PKEY_DISABLE_READ
* Dave Hansen: > On 11/27/18 3:57 AM, Florian Weimer wrote: >> I would have expected something that translates PKEY_DISABLE_WRITE | >> PKEY_DISABLE_READ into PKEY_DISABLE_ACCESS, and also accepts >> PKEY_DISABLE_ACCESS | PKEY_DISABLE_READ, for consistency with POWER. >> >> (My understanding is that PKEY_DISABLE_ACCESS does not disable all >> access, but produces execute-only memory.) > > Correct, it disables all data access, but not execution. So I would expect something like this (completely untested, I did not even compile this): diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h index 20ebf153c871..bed23f9e8336 100644 --- a/arch/powerpc/include/asm/pkeys.h +++ b/arch/powerpc/include/asm/pkeys.h @@ -199,6 +199,11 @@ static inline bool arch_pkeys_enabled(void) return !static_branch_likely(_disabled); } +static inline bool arch_pkey_access_rights_valid(unsigned long rights) +{ + return (rights & ~(unsigned long)PKEY_ACCESS_MASK) == 0; +} + extern void pkey_mm_init(struct mm_struct *mm); extern bool arch_supports_pkeys(int cap); extern unsigned int arch_usable_pkeys(void); diff --git a/arch/x86/include/asm/pkeys.h b/arch/x86/include/asm/pkeys.h index 19b137f1b3be..e3e1d5a316e8 100644 --- a/arch/x86/include/asm/pkeys.h +++ b/arch/x86/include/asm/pkeys.h @@ -14,6 +14,17 @@ static inline bool arch_pkeys_enabled(void) return boot_cpu_has(X86_FEATURE_OSPKE); } +static inline bool arch_pkey_access_rights_valid(unsigned long rights) +{ + if (rights & ~(unsigned long)PKEY_ACCESS_MASK) + return false; + if (rights & PKEY_DISABLE_READ) { + /* x86 can only disable read access along with write access. */ + return rights & (PKEY_DISABLE_WRITE | PKEY_DISABLE_ACCESS); + } + return true; +} + /* * Try to dedicate one of the protection keys to be used as an * execute-only protection key. diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 87a57b7642d3..b9b78145017f 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -928,7 +928,13 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, return -EINVAL; /* Set the bits we need in PKRU: */ - if (init_val & PKEY_DISABLE_ACCESS) + if (init_val & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_READ)) + /* +* arch_pkey_access_rights_valid checked that +* PKEY_DISABLE_READ is actually representable on x86 +* (that is, it comes with PKEY_DISABLE_ACCESS or +* PKEY_DISABLE_WRITE). +*/ new_pkru_bits |= PKRU_AD_BIT; if (init_val & PKEY_DISABLE_WRITE) new_pkru_bits |= PKRU_WD_BIT; diff --git a/include/linux/pkeys.h b/include/linux/pkeys.h index 2955ba976048..2c330fabbe55 100644 --- a/include/linux/pkeys.h +++ b/include/linux/pkeys.h @@ -48,6 +48,11 @@ static inline void copy_init_pkru_to_fpregs(void) { } +static inline bool arch_pkey_access_rights_valid(unsigned long rights) +{ + return false; +} + #endif /* ! CONFIG_ARCH_HAS_PKEYS */ #endif /* _LINUX_PKEYS_H */ diff --git a/mm/mprotect.c b/mm/mprotect.c index 6d331620b9e5..f4cefc3540df 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -597,7 +597,7 @@ SYSCALL_DEFINE2(pkey_alloc, unsigned long, flags, unsigned long, init_val) if (flags) return -EINVAL; /* check for unsupported init values */ - if (init_val & ~PKEY_ACCESS_MASK) + if (!arch_pkey_access_rights_valid(init_val)) return -EINVAL; down_write(>mm->mmap_sem); Thanks, Florian
Re: use generic DMA mapping code in powerpc V4
On 28 November 2018 at 12:05PM, Michael Ellerman wrote: Christoph Hellwig writes: Any comments? I'd like to at least get the ball moving on the easy bits. Nothing specific yet. I'm a bit worried it might break one of the many old obscure platforms we have that aren't well tested. There's not much we can do about that, but I'll just try and test it on everything I can find. Is the plan that you take these via the dma-mapping tree or that they go via powerpc? cheers Hi All, I compiled a test kernel from the following Git today. http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/powerpc-dma.4 Command: git clone git://git.infradead.org/users/hch/misc.git -b powerpc-dma.4 a Unfortunately I get some DMA error messages and the PASEMI ethernet doesn't work anymore. [ 367.627623] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.627631] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.627639] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.627647] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.627655] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.627686] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.628418] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.628505] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.628592] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.629324] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.629417] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.629495] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.629589] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 430.424732]pasemi_mac: rcmdsta error: 0x04ef3001 I tested this kernel with the Nemo board (CPU: PWRficient PA6T-1682M). The PASEMI ethernet works with the RC4 of kernel 4.20. Cheers, Christian
Re: [PATCH v3] powerpc/mm: add exec protection on powerpc 603
Le 28/11/2018 à 18:21, Christophe Leroy a écrit : The 603 doesn't have a HASH table, TLB misses are handled by software. It is then possible to generate page fault when _PAGE_EXEC is not set like in nohash/32. There is one "reserved" PTE bit available, this patch uses it for _PAGE_EXEC. In order to support it, set_pte_filter() and set_access_flags_filter() are made common, and the handling is made dependent on MMU_FTR_HPTE_TABLE Signed-off-by: Christophe Leroy Reviewed-by: Aneesh Kumar K.V --- v3: Included the _PAGE_EXEC flag in the existing test in TLB handler, no additional insns. v2: Amended commit log and removed #ifdef in pagetable dump arch/powerpc/include/asm/book3s/32/hash.h | 1 + arch/powerpc/include/asm/book3s/32/pgtable.h | 18 +- arch/powerpc/include/asm/cputable.h| 8 arch/powerpc/kernel/head_32.S | 2 +- arch/powerpc/mm/dump_linuxpagetables-generic.c | 2 -- arch/powerpc/mm/pgtable.c | 20 +++- 6 files changed, 26 insertions(+), 25 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/hash.h b/arch/powerpc/include/asm/book3s/32/hash.h index f2892c7ab73e..2a0a467d2985 100644 --- a/arch/powerpc/include/asm/book3s/32/hash.h +++ b/arch/powerpc/include/asm/book3s/32/hash.h @@ -26,6 +26,7 @@ #define _PAGE_WRITETHRU 0x040 /* W: cache write-through */ #define _PAGE_DIRTY 0x080 /* C: page changed */ #define _PAGE_ACCESSED0x100 /* R: page referenced */ +#define _PAGE_EXEC 0x200 /* software: exec allowed */ #define _PAGE_RW 0x400 /* software: user write access allowed */ #define _PAGE_SPECIAL 0x800 /* software: Special page */ diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index b849b45429d5..6accf3e686af 100644 --- a/arch/powerpc/include/asm/book3s/32/pgtable.h +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h @@ -10,9 +10,9 @@ /* And here we include common definitions */ #define _PAGE_KERNEL_RO 0 -#define _PAGE_KERNEL_ROX 0 +#define _PAGE_KERNEL_ROX (_PAGE_EXEC) #define _PAGE_KERNEL_RW (_PAGE_DIRTY | _PAGE_RW) -#define _PAGE_KERNEL_RWX (_PAGE_DIRTY | _PAGE_RW) +#define _PAGE_KERNEL_RWX (_PAGE_DIRTY | _PAGE_RW | _PAGE_EXEC) #define _PAGE_HPTEFLAGS _PAGE_HASHPTE @@ -66,11 +66,11 @@ static inline bool pte_user(pte_t pte) */ #define PAGE_NONE __pgprot(_PAGE_BASE) #define PAGE_SHARED __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW) -#define PAGE_SHARED_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW) +#define PAGE_SHARED_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW | _PAGE_EXEC) #define PAGE_COPY __pgprot(_PAGE_BASE | _PAGE_USER) -#define PAGE_COPY_X__pgprot(_PAGE_BASE | _PAGE_USER) +#define PAGE_COPY_X__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC) #define PAGE_READONLY __pgprot(_PAGE_BASE | _PAGE_USER) -#define PAGE_READONLY_X__pgprot(_PAGE_BASE | _PAGE_USER) +#define PAGE_READONLY_X__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC) /* Permission masks used for kernel mappings */ #define PAGE_KERNEL __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW) @@ -318,7 +318,7 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma, int psize) { unsigned long set = pte_val(entry) & - (_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW); + (_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC); pte_update(ptep, 0, set); @@ -384,7 +384,7 @@ static inline int pte_dirty(pte_t pte) { return !!(pte_val(pte) & _PAGE_DIRTY); static inline int pte_young(pte_t pte){ return !!(pte_val(pte) & _PAGE_ACCESSED); } static inline int pte_special(pte_t pte) { return !!(pte_val(pte) & _PAGE_SPECIAL); } static inline int pte_none(pte_t pte) { return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; } -static inline bool pte_exec(pte_t pte) { return true; } +static inline bool pte_exec(pte_t pte) { return pte_val(pte) & _PAGE_EXEC; } static inline int pte_present(pte_t pte) { @@ -451,7 +451,7 @@ static inline pte_t pte_wrprotect(pte_t pte) static inline pte_t pte_exprotect(pte_t pte) { - return pte; + return __pte(pte_val(pte) & ~_PAGE_EXEC); } static inline pte_t pte_mkclean(pte_t pte) @@ -466,7 +466,7 @@ static inline pte_t pte_mkold(pte_t pte) static inline pte_t pte_mkexec(pte_t pte) { - return pte; + return __pte(pte_val(pte) | _PAGE_EXEC); } static inline pte_t pte_mkpte(pte_t pte) diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h index 29f49a35d6ee..a0395ccbbe9e 100644 --- a/arch/powerpc/include/asm/cputable.h +++ b/arch/powerpc/include/asm/cputable.h @@ -296,7 +296,7 @@ static inline void cpu_feature_keys_init(void) { } #define CPU_FTRS_PPC601
[PATCH v9 01/20] powerpc/book3s32: Remove CONFIG_BOOKE dependent code
BOOK3S/32 cannot be BOOKE, so remove useless code Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/book3s/32/pgalloc.h | 18 -- arch/powerpc/include/asm/book3s/32/pgtable.h | 14 -- 2 files changed, 32 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/pgalloc.h b/arch/powerpc/include/asm/book3s/32/pgalloc.h index 82e44b1a00ae..eb8882c6dbb0 100644 --- a/arch/powerpc/include/asm/book3s/32/pgalloc.h +++ b/arch/powerpc/include/asm/book3s/32/pgalloc.h @@ -50,8 +50,6 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd) #define __pmd_free_tlb(tlb,x,a)do { } while (0) /* #define pgd_populate(mm, pmd, pte) BUG() */ -#ifndef CONFIG_BOOKE - static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp, pte_t *pte) { @@ -65,22 +63,6 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmdp, } #define pmd_pgtable(pmd) pmd_page(pmd) -#else - -static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp, - pte_t *pte) -{ - *pmdp = __pmd((unsigned long)pte | _PMD_PRESENT); -} - -static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmdp, - pgtable_t pte_page) -{ - *pmdp = __pmd((unsigned long)lowmem_page_address(pte_page) | _PMD_PRESENT); -} - -#define pmd_pgtable(pmd) pmd_page(pmd) -#endif extern pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long addr); extern pgtable_t pte_alloc_one(struct mm_struct *mm, unsigned long addr); diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index c21d33704633..32c33eccc0e2 100644 --- a/arch/powerpc/include/asm/book3s/32/pgtable.h +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h @@ -328,24 +328,10 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma, #define __HAVE_ARCH_PTE_SAME #define pte_same(A,B) (((pte_val(A) ^ pte_val(B)) & ~_PAGE_HASHPTE) == 0) -/* - * Note that on Book E processors, the pmd contains the kernel virtual - * (lowmem) address of the pte page. The physical address is less useful - * because everything runs with translation enabled (even the TLB miss - * handler). On everything else the pmd contains the physical address - * of the pte page. -- paulus - */ -#ifndef CONFIG_BOOKE #define pmd_page_vaddr(pmd)\ ((unsigned long) __va(pmd_val(pmd) & PAGE_MASK)) #define pmd_page(pmd) \ pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT) -#else -#define pmd_page_vaddr(pmd)\ - ((unsigned long) (pmd_val(pmd) & PAGE_MASK)) -#define pmd_page(pmd) \ - pfn_to_page((__pa(pmd_val(pmd)) >> PAGE_SHIFT)) -#endif /* to find an entry in a kernel page-table-directory */ #define pgd_offset_k(address) pgd_offset(_mm, address) -- 2.13.3
[PATCH v9 02/20] powerpc/8xx: Remove PTE_ATOMIC_UPDATES
commit 1bc54c03117b9 ("powerpc: rework 4xx PTE access and TLB miss") introduced non atomic PTE updates and started the work of removing PTE updates in TLB miss handlers, but kept PTE_ATOMIC_UPDATES for the 8xx with the following comment: /* Until my rework is finished, 8xx still needs atomic PTE updates */ commit fe11dc3f9628e ("powerpc/8xx: Update TLB asm so it behaves as linux mm expects") removed all PTE updates done in TLB miss handlers Therefore, atomic PTE updates are not needed anymore for the 8xx Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/nohash/32/pte-8xx.h | 3 --- 1 file changed, 3 deletions(-) diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h b/arch/powerpc/include/asm/nohash/32/pte-8xx.h index 6bfe041ef59d..c9e4b2d90f65 100644 --- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h +++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h @@ -65,9 +65,6 @@ #define _PTE_NONE_MASK 0 -/* Until my rework is finished, 8xx still needs atomic PTE updates */ -#define PTE_ATOMIC_UPDATES 1 - #ifdef CONFIG_PPC_16K_PAGES #define _PAGE_PSIZE_PAGE_SPS #else -- 2.13.3
[PATCH v9 04/20] powerpc/mm: Avoid useless lock with single page fragments
There is no point in taking the page table lock as pte_frag or pmd_frag are always NULL when we have only one fragment. Reviewed-by: Aneesh Kumar K.V Signed-off-by: Christophe Leroy --- arch/powerpc/mm/pgtable-book3s64.c | 3 +++ arch/powerpc/mm/pgtable-frag.c | 3 +++ 2 files changed, 6 insertions(+) diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c index 0c0fd173208a..f3c31f5e1026 100644 --- a/arch/powerpc/mm/pgtable-book3s64.c +++ b/arch/powerpc/mm/pgtable-book3s64.c @@ -244,6 +244,9 @@ static pmd_t *get_pmd_from_cache(struct mm_struct *mm) { void *pmd_frag, *ret; + if (PMD_FRAG_NR == 1) + return NULL; + spin_lock(>page_table_lock); ret = mm->context.pmd_frag; if (ret) { diff --git a/arch/powerpc/mm/pgtable-frag.c b/arch/powerpc/mm/pgtable-frag.c index d61e7c2a9a79..7544d0d7177d 100644 --- a/arch/powerpc/mm/pgtable-frag.c +++ b/arch/powerpc/mm/pgtable-frag.c @@ -34,6 +34,9 @@ static pte_t *get_pte_from_cache(struct mm_struct *mm) { void *pte_frag, *ret; + if (PTE_FRAG_NR == 1) + return NULL; + spin_lock(>page_table_lock); ret = mm->context.pte_frag; if (ret) { -- 2.13.3
[PATCH v9 07/20] powerpc/mm: add helpers to get/set mm.context->pte_frag
In order to handle pte_fragment functions with single fragment without adding pte_frag in all mm_context_t, this patch creates two helpers which do nothing on platforms using a single fragment. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/pgtable.h | 25 + arch/powerpc/mm/pgtable-frag.c | 8 2 files changed, 29 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h index 9679b7519a35..314a2890a972 100644 --- a/arch/powerpc/include/asm/pgtable.h +++ b/arch/powerpc/include/asm/pgtable.h @@ -110,6 +110,31 @@ void mark_initmem_nx(void); static inline void mark_initmem_nx(void) { } #endif +/* + * When used, PTE_FRAG_NR is defined in subarch pgtable.h + * so we are sure it is included when arriving here. + */ +#ifdef PTE_FRAG_NR +static inline void *pte_frag_get(mm_context_t *ctx) +{ + return ctx->pte_frag; +} + +static inline void pte_frag_set(mm_context_t *ctx, void *p) +{ + ctx->pte_frag = p; +} +#else +static inline void *pte_frag_get(mm_context_t *ctx) +{ + return NULL; +} + +static inline void pte_frag_set(mm_context_t *ctx, void *p) +{ +} +#endif + #endif /* __ASSEMBLY__ */ #endif /* _ASM_POWERPC_PGTABLE_H */ diff --git a/arch/powerpc/mm/pgtable-frag.c b/arch/powerpc/mm/pgtable-frag.c index 7544d0d7177d..af23a587f019 100644 --- a/arch/powerpc/mm/pgtable-frag.c +++ b/arch/powerpc/mm/pgtable-frag.c @@ -38,7 +38,7 @@ static pte_t *get_pte_from_cache(struct mm_struct *mm) return NULL; spin_lock(>page_table_lock); - ret = mm->context.pte_frag; + ret = pte_frag_get(>context); if (ret) { pte_frag = ret + PTE_FRAG_SIZE; /* @@ -46,7 +46,7 @@ static pte_t *get_pte_from_cache(struct mm_struct *mm) */ if (((unsigned long)pte_frag & ~PAGE_MASK) == 0) pte_frag = NULL; - mm->context.pte_frag = pte_frag; + pte_frag_set(>context, pte_frag); } spin_unlock(>page_table_lock); return (pte_t *)ret; @@ -86,9 +86,9 @@ static pte_t *__alloc_for_ptecache(struct mm_struct *mm, int kernel) * the allocated page with single fragement * count. */ - if (likely(!mm->context.pte_frag)) { + if (likely(!pte_frag_get(>context))) { atomic_set(>pt_frag_refcount, PTE_FRAG_NR); - mm->context.pte_frag = ret + PTE_FRAG_SIZE; + pte_frag_set(>context, ret + PTE_FRAG_SIZE); } spin_unlock(>page_table_lock); -- 2.13.3
[PATCH v9 08/20] powerpc/mm: Extend pte_fragment functionality to PPC32
In order to allow the 8xx to handle pte_fragments, this patch extends the use of pte_fragments to PPC32 platforms. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/book3s/32/mmu-hash.h | 5 - arch/powerpc/include/asm/book3s/32/pgalloc.h | 17 + arch/powerpc/include/asm/book3s/32/pgtable.h | 5 +++-- arch/powerpc/include/asm/mmu_context.h| 2 +- arch/powerpc/include/asm/nohash/32/mmu.h | 4 +++- arch/powerpc/include/asm/nohash/32/pgalloc.h | 22 +++--- arch/powerpc/include/asm/nohash/32/pgtable.h | 7 --- arch/powerpc/include/asm/pgtable.h| 4 arch/powerpc/mm/Makefile | 1 + arch/powerpc/mm/mmu_context.c | 10 ++ arch/powerpc/mm/mmu_context_nohash.c | 2 +- arch/powerpc/mm/pgtable_32.c | 25 - 12 files changed, 55 insertions(+), 49 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/mmu-hash.h b/arch/powerpc/include/asm/book3s/32/mmu-hash.h index 5bd26c218b94..2bb500d25de6 100644 --- a/arch/powerpc/include/asm/book3s/32/mmu-hash.h +++ b/arch/powerpc/include/asm/book3s/32/mmu-hash.h @@ -1,6 +1,7 @@ /* SPDX-License-Identifier: GPL-2.0 */ #ifndef _ASM_POWERPC_BOOK3S_32_MMU_HASH_H_ #define _ASM_POWERPC_BOOK3S_32_MMU_HASH_H_ + /* * 32-bit hash table MMU support */ @@ -9,6 +10,8 @@ * BATs */ +#include + /* Block size masks */ #define BL_128K0x000 #define BL_256K 0x001 @@ -43,7 +46,7 @@ struct ppc_bat { u32 batl; }; -typedef struct page *pgtable_t; +typedef pte_t *pgtable_t; #endif /* !__ASSEMBLY__ */ /* diff --git a/arch/powerpc/include/asm/book3s/32/pgalloc.h b/arch/powerpc/include/asm/book3s/32/pgalloc.h index eb8882c6dbb0..0f58e5b9dbe7 100644 --- a/arch/powerpc/include/asm/book3s/32/pgalloc.h +++ b/arch/powerpc/include/asm/book3s/32/pgalloc.h @@ -59,30 +59,31 @@ static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp, static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmdp, pgtable_t pte_page) { - *pmdp = __pmd((page_to_pfn(pte_page) << PAGE_SHIFT) | _PMD_PRESENT); + *pmdp = __pmd(__pa(pte_page) | _PMD_PRESENT); } -#define pmd_pgtable(pmd) pmd_page(pmd) +#define pmd_pgtable(pmd) ((pgtable_t)pmd_page_vaddr(pmd)) extern pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long addr); extern pgtable_t pte_alloc_one(struct mm_struct *mm, unsigned long addr); +void pte_frag_destroy(void *pte_frag); +pte_t *pte_fragment_alloc(struct mm_struct *mm, unsigned long vmaddr, int kernel); +void pte_fragment_free(unsigned long *table, int kernel); static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte) { - free_page((unsigned long)pte); + pte_fragment_free((unsigned long *)pte, 1); } static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage) { - pgtable_page_dtor(ptepage); - __free_page(ptepage); + pte_fragment_free((unsigned long *)ptepage, 0); } static inline void pgtable_free(void *table, unsigned index_size) { if (!index_size) { - pgtable_page_dtor(virt_to_page(table)); - free_page((unsigned long)table); + pte_fragment_free((unsigned long *)table, 0); } else { BUG_ON(index_size > MAX_PGTABLE_INDEX_SIZE); kmem_cache_free(PGT_CACHE(index_size), table); @@ -120,6 +121,6 @@ static inline void pgtable_free_tlb(struct mmu_gather *tlb, static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table, unsigned long address) { - pgtable_free_tlb(tlb, page_address(table), 0); + pgtable_free_tlb(tlb, table, 0); } #endif /* _ASM_POWERPC_BOOK3S_32_PGALLOC_H */ diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index 32c33eccc0e2..47156b93f9af 100644 --- a/arch/powerpc/include/asm/book3s/32/pgtable.h +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h @@ -329,7 +329,7 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma, #define pte_same(A,B) (((pte_val(A) ^ pte_val(B)) & ~_PAGE_HASHPTE) == 0) #define pmd_page_vaddr(pmd)\ - ((unsigned long) __va(pmd_val(pmd) & PAGE_MASK)) + ((unsigned long)__va(pmd_val(pmd) & ~(PTE_TABLE_SIZE - 1))) #define pmd_page(pmd) \ pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT) @@ -346,7 +346,8 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma, #define pte_offset_kernel(dir, addr) \ ((pte_t *) pmd_page_vaddr(*(dir)) + pte_index(addr)) #define pte_offset_map(dir, addr) \ - ((pte_t *) kmap_atomic(pmd_page(*(dir))) + pte_index(addr)) + ((pte_t *)(kmap_atomic(pmd_page(*(dir))) + \ + (pmd_page_vaddr(*(dir)) & ~PAGE_MASK)) + pte_index(addr)) #define pte_unmap(pte) kunmap_atomic(pte)
[PATCH v9 11/20] powerpc/mm: fix a warning when a cache is common to PGD and hugepages
While implementing TLB miss HW assistance on the 8xx, the following warning was encountered: [ 423.732965] WARNING: CPU: 0 PID: 345 at mm/slub.c:2412 ___slab_alloc.constprop.30+0x26c/0x46c [ 423.733033] CPU: 0 PID: 345 Comm: mmap Not tainted 4.18.0-rc8-00664-g2dfff9121c55 #671 [ 423.733075] NIP: c0108f90 LR: c0109ad0 CTR: 0004 [ 423.733121] REGS: c455bba0 TRAP: 0700 Not tainted (4.18.0-rc8-00664-g2dfff9121c55) [ 423.733147] MSR: 00021032 CR: 24224848 XER: 2000 [ 423.733319] [ 423.733319] GPR00: c0109ad0 c455bc50 c4521910 c60053c0 007080c0 c0011b34 c7fa41e0 c455be30 [ 423.733319] GPR08: 0001 c00103a0 c7fa41e0 c49afcc4 24282842 10018840 c079b37c 0040 [ 423.733319] GPR16: 73f0 00210d00 0001 c455a000 0100 0200 c455a000 [ 423.733319] GPR24: c60053c0 c0011b34 007080c0 c455a000 c455a000 c7fa41e0 9032 [ 423.734190] NIP [c0108f90] ___slab_alloc.constprop.30+0x26c/0x46c [ 423.734257] LR [c0109ad0] kmem_cache_alloc+0x210/0x23c [ 423.734283] Call Trace: [ 423.734326] [c455bc50] [0100] 0x100 (unreliable) [ 423.734430] [c455bcc0] [c0109ad0] kmem_cache_alloc+0x210/0x23c [ 423.734543] [c455bcf0] [c0011b34] huge_pte_alloc+0xc0/0x1dc [ 423.734633] [c455bd20] [c01044dc] hugetlb_fault+0x408/0x48c [ 423.734720] [c455bdb0] [c0104b20] follow_hugetlb_page+0x14c/0x44c [ 423.734826] [c455be10] [c00e8e54] __get_user_pages+0x1c4/0x3dc [ 423.734919] [c455be80] [c00e9924] __mm_populate+0xac/0x140 [ 423.735020] [c455bec0] [c00db14c] vm_mmap_pgoff+0xb4/0xb8 [ 423.735127] [c455bf00] [c00f27c0] ksys_mmap_pgoff+0xcc/0x1fc [ 423.735222] [c455bf40] [c000e0f8] ret_from_syscall+0x0/0x38 [ 423.735271] Instruction dump: [ 423.735321] 7cbf482e 38fd0008 7fa6eb78 7fc4f378 4bfff5dd 7fe3fb78 4bfffe24 81370010 [ 423.735536] 71280004 41a2ff88 4840c571 4b80 <0fe0> 4bfffeb8 81340010 712a0004 [ 423.735757] ---[ end trace e9b222919a470790 ]--- This warning occurs when calling kmem_cache_zalloc() on a cache having a constructor. In this case it happens because PGD cache and 512k hugepte cache are the same size (4k). While a cache with constructor is created for the PGD, hugepages create cache without constructor and uses kmem_cache_zalloc(). As both expect a cache with the same size, the hugepages reuse the cache created for PGD, hence the conflict. In order to avoid this conflict, this patch: - modifies pgtable_cache_add() so that a zeroising constructor is added for any cache size. - replaces calls to kmem_cache_zalloc() by kmem_cache_alloc() Signed-off-by: Christophe Leroy --- see original discussion in https://patchwork.ozlabs.org/patch/957565/ arch/powerpc/include/asm/pgtable.h | 2 +- arch/powerpc/mm/hugetlbpage.c | 6 ++--- arch/powerpc/mm/init-common.c | 46 ++ 3 files changed, 36 insertions(+), 18 deletions(-) diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h index 56ef5437eb2f..f2bfaf674674 100644 --- a/arch/powerpc/include/asm/pgtable.h +++ b/arch/powerpc/include/asm/pgtable.h @@ -101,7 +101,7 @@ extern int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr, /* can we use this in kvm */ unsigned long vmalloc_to_phys(void *vmalloc_addr); -void pgtable_cache_add(unsigned shift, void (*ctor)(void *)); +void pgtable_cache_add(unsigned int shift); void pgtable_cache_init(void); #if defined(CONFIG_STRICT_KERNEL_RWX) || defined(CONFIG_PPC32) diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index c4f1263228b8..bc97874d7c74 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -70,7 +70,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp, num_hugepd = 1; } - new = kmem_cache_zalloc(cachep, pgtable_gfp_flags(mm, GFP_KERNEL)); + new = kmem_cache_alloc(cachep, pgtable_gfp_flags(mm, GFP_KERNEL)); BUG_ON(pshift > HUGEPD_SHIFT_MASK); BUG_ON((unsigned long)new & HUGEPD_SHIFT_MASK); @@ -701,10 +701,10 @@ static int __init hugetlbpage_init(void) * use pgt cache for hugepd. */ if (pdshift > shift) - pgtable_cache_add(pdshift - shift, NULL); + pgtable_cache_add(pdshift - shift); #if defined(CONFIG_PPC_FSL_BOOK3E) || defined(CONFIG_PPC_8xx) else - pgtable_cache_add(PTE_T_ORDER, NULL); + pgtable_cache_add(PTE_T_ORDER); #endif } diff --git a/arch/powerpc/mm/init-common.c b/arch/powerpc/mm/init-common.c index 41190f2b60c2..b7ca03643d0b 100644 --- a/arch/powerpc/mm/init-common.c +++ b/arch/powerpc/mm/init-common.c @@ -25,19 +25,37 @@ #include #include -static void pgd_ctor(void *addr) -{ - memset(addr, 0, PGD_TABLE_SIZE); +#define CTOR(shift) static void ctor_##shift(void *addr) \ +{ \ +
[PATCH v9 03/20] powerpc/mm: Move pte_fragment_alloc() to a common location
In preparation of next patch which generalises the use of pte_fragment_alloc() for all, this patch moves the related functions in a place that is common to all subarches. The 8xx will need that for supporting 16k pages, as in that mode page tables still have a size of 4k. Since pte_fragment with only once fragment is not different from what is done in the general case, we can easily migrate all subarchs to pte fragments. Reviewed-by: Aneesh Kumar K.V Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/book3s/64/pgalloc.h | 1 + arch/powerpc/mm/Makefile | 4 +- arch/powerpc/mm/mmu_context_book3s64.c | 15 arch/powerpc/mm/pgtable-book3s64.c | 85 arch/powerpc/mm/pgtable-frag.c | 116 +++ 5 files changed, 120 insertions(+), 101 deletions(-) create mode 100644 arch/powerpc/mm/pgtable-frag.c diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h index 391ed2c3b697..f949dd90af9b 100644 --- a/arch/powerpc/include/asm/book3s/64/pgalloc.h +++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h @@ -50,6 +50,7 @@ extern void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift); #ifdef CONFIG_SMP extern void __tlb_remove_table(void *_table); #endif +void pte_frag_destroy(void *pte_frag); static inline pgd_t *radix__pgd_alloc(struct mm_struct *mm) { diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile index ca96e7be4d0e..3cbb1acf0745 100644 --- a/arch/powerpc/mm/Makefile +++ b/arch/powerpc/mm/Makefile @@ -15,7 +15,9 @@ obj-$(CONFIG_PPC_MMU_NOHASH) += mmu_context_nohash.o tlb_nohash.o \ obj-$(CONFIG_PPC_BOOK3E) += tlb_low_$(BITS)e.o hash64-$(CONFIG_PPC_NATIVE):= hash_native_64.o obj-$(CONFIG_PPC_BOOK3E_64) += pgtable-book3e.o -obj-$(CONFIG_PPC_BOOK3S_64)+= pgtable-hash64.o hash_utils_64.o slb.o $(hash64-y) mmu_context_book3s64.o pgtable-book3s64.o +obj-$(CONFIG_PPC_BOOK3S_64)+= pgtable-hash64.o hash_utils_64.o slb.o \ + $(hash64-y) mmu_context_book3s64.o \ + pgtable-book3s64.o pgtable-frag.o obj-$(CONFIG_PPC_RADIX_MMU)+= pgtable-radix.o tlb-radix.o obj-$(CONFIG_PPC_STD_MMU_32) += ppc_mmu_32.o hash_low_32.o mmu_context_hash32.o obj-$(CONFIG_PPC_STD_MMU) += tlb_hash$(BITS).o diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c index 510f103d7813..f720c5cc0b5e 100644 --- a/arch/powerpc/mm/mmu_context_book3s64.c +++ b/arch/powerpc/mm/mmu_context_book3s64.c @@ -164,21 +164,6 @@ static void destroy_contexts(mm_context_t *ctx) } } -static void pte_frag_destroy(void *pte_frag) -{ - int count; - struct page *page; - - page = virt_to_page(pte_frag); - /* drop all the pending references */ - count = ((unsigned long)pte_frag & ~PAGE_MASK) >> PTE_FRAG_SIZE_SHIFT; - /* We allow PTE_FRAG_NR fragments from a PTE page */ - if (atomic_sub_and_test(PTE_FRAG_NR - count, >pt_frag_refcount)) { - pgtable_page_dtor(page); - __free_page(page); - } -} - static void pmd_frag_destroy(void *pmd_frag) { int count; diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c index 9f93c9f985c5..0c0fd173208a 100644 --- a/arch/powerpc/mm/pgtable-book3s64.c +++ b/arch/powerpc/mm/pgtable-book3s64.c @@ -322,91 +322,6 @@ void pmd_fragment_free(unsigned long *pmd) } } -static pte_t *get_pte_from_cache(struct mm_struct *mm) -{ - void *pte_frag, *ret; - - spin_lock(>page_table_lock); - ret = mm->context.pte_frag; - if (ret) { - pte_frag = ret + PTE_FRAG_SIZE; - /* -* If we have taken up all the fragments mark PTE page NULL -*/ - if (((unsigned long)pte_frag & ~PAGE_MASK) == 0) - pte_frag = NULL; - mm->context.pte_frag = pte_frag; - } - spin_unlock(>page_table_lock); - return (pte_t *)ret; -} - -static pte_t *__alloc_for_ptecache(struct mm_struct *mm, int kernel) -{ - void *ret = NULL; - struct page *page; - - if (!kernel) { - page = alloc_page(PGALLOC_GFP | __GFP_ACCOUNT); - if (!page) - return NULL; - if (!pgtable_page_ctor(page)) { - __free_page(page); - return NULL; - } - } else { - page = alloc_page(PGALLOC_GFP); - if (!page) - return NULL; - } - - atomic_set(>pt_frag_refcount, 1); - - ret = page_address(page); - /* -* if we support only one fragment just return the -* allocated page. -*/ - if (PTE_FRAG_NR == 1) - return ret; - spin_lock(>page_table_lock); - /*
[PATCH v9 05/20] powerpc/mm: move platform specific mmu-xxx.h in platform directories
The purpose of this patch is to move platform specific mmu-xxx.h files in platform directories like pte-xxx.h files. In the meantime this patch creates common nohash and nohash/32 + nohash/64 mmu.h files for future common parts. Reviewed-by: Aneesh Kumar K.V Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/mmu.h | 14 ++ arch/powerpc/include/asm/{ => nohash/32}/mmu-40x.h | 0 arch/powerpc/include/asm/{ => nohash/32}/mmu-44x.h | 0 arch/powerpc/include/asm/{ => nohash/32}/mmu-8xx.h | 0 arch/powerpc/include/asm/nohash/32/mmu.h | 19 +++ arch/powerpc/include/asm/nohash/64/mmu.h | 8 arch/powerpc/include/asm/{ => nohash}/mmu-book3e.h | 0 arch/powerpc/include/asm/nohash/mmu.h | 11 +++ arch/powerpc/kernel/cpu_setup_fsl_booke.S | 2 +- arch/powerpc/kvm/e500.h| 2 +- 10 files changed, 42 insertions(+), 14 deletions(-) rename arch/powerpc/include/asm/{ => nohash/32}/mmu-40x.h (100%) rename arch/powerpc/include/asm/{ => nohash/32}/mmu-44x.h (100%) rename arch/powerpc/include/asm/{ => nohash/32}/mmu-8xx.h (100%) create mode 100644 arch/powerpc/include/asm/nohash/32/mmu.h create mode 100644 arch/powerpc/include/asm/nohash/64/mmu.h rename arch/powerpc/include/asm/{ => nohash}/mmu-book3e.h (100%) create mode 100644 arch/powerpc/include/asm/nohash/mmu.h diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h index eb20eb3b8fb0..2184021b0e1c 100644 --- a/arch/powerpc/include/asm/mmu.h +++ b/arch/powerpc/include/asm/mmu.h @@ -341,18 +341,8 @@ static inline void mmu_early_init_devtree(void) { } #if defined(CONFIG_PPC_STD_MMU_32) /* 32-bit classic hash table MMU */ #include -#elif defined(CONFIG_40x) -/* 40x-style software loaded TLB */ -# include -#elif defined(CONFIG_44x) -/* 44x-style software loaded TLB */ -# include -#elif defined(CONFIG_PPC_BOOK3E_MMU) -/* Freescale Book-E software loaded TLB or Book-3e (ISA 2.06+) MMU */ -# include -#elif defined (CONFIG_PPC_8xx) -/* Motorola/Freescale 8xx software loaded TLB */ -# include +#elif defined(CONFIG_PPC_MMU_NOHASH) +#include #endif #endif /* __KERNEL__ */ diff --git a/arch/powerpc/include/asm/mmu-40x.h b/arch/powerpc/include/asm/nohash/32/mmu-40x.h similarity index 100% rename from arch/powerpc/include/asm/mmu-40x.h rename to arch/powerpc/include/asm/nohash/32/mmu-40x.h diff --git a/arch/powerpc/include/asm/mmu-44x.h b/arch/powerpc/include/asm/nohash/32/mmu-44x.h similarity index 100% rename from arch/powerpc/include/asm/mmu-44x.h rename to arch/powerpc/include/asm/nohash/32/mmu-44x.h diff --git a/arch/powerpc/include/asm/mmu-8xx.h b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h similarity index 100% rename from arch/powerpc/include/asm/mmu-8xx.h rename to arch/powerpc/include/asm/nohash/32/mmu-8xx.h diff --git a/arch/powerpc/include/asm/nohash/32/mmu.h b/arch/powerpc/include/asm/nohash/32/mmu.h new file mode 100644 index ..af0e8b54876a --- /dev/null +++ b/arch/powerpc/include/asm/nohash/32/mmu.h @@ -0,0 +1,19 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_POWERPC_NOHASH_32_MMU_H_ +#define _ASM_POWERPC_NOHASH_32_MMU_H_ + +#if defined(CONFIG_40x) +/* 40x-style software loaded TLB */ +#include +#elif defined(CONFIG_44x) +/* 44x-style software loaded TLB */ +#include +#elif defined(CONFIG_PPC_BOOK3E_MMU) +/* Freescale Book-E software loaded TLB or Book-3e (ISA 2.06+) MMU */ +#include +#elif defined (CONFIG_PPC_8xx) +/* Motorola/Freescale 8xx software loaded TLB */ +#include +#endif + +#endif /* _ASM_POWERPC_NOHASH_32_MMU_H_ */ diff --git a/arch/powerpc/include/asm/nohash/64/mmu.h b/arch/powerpc/include/asm/nohash/64/mmu.h new file mode 100644 index ..87871d027b75 --- /dev/null +++ b/arch/powerpc/include/asm/nohash/64/mmu.h @@ -0,0 +1,8 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_POWERPC_NOHASH_64_MMU_H_ +#define _ASM_POWERPC_NOHASH_64_MMU_H_ + +/* Freescale Book-E software loaded TLB or Book-3e (ISA 2.06+) MMU */ +#include + +#endif /* _ASM_POWERPC_NOHASH_64_MMU_H_ */ diff --git a/arch/powerpc/include/asm/mmu-book3e.h b/arch/powerpc/include/asm/nohash/mmu-book3e.h similarity index 100% rename from arch/powerpc/include/asm/mmu-book3e.h rename to arch/powerpc/include/asm/nohash/mmu-book3e.h diff --git a/arch/powerpc/include/asm/nohash/mmu.h b/arch/powerpc/include/asm/nohash/mmu.h new file mode 100644 index ..a037cb1efb57 --- /dev/null +++ b/arch/powerpc/include/asm/nohash/mmu.h @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_POWERPC_NOHASH_MMU_H_ +#define _ASM_POWERPC_NOHASH_MMU_H_ + +#ifdef CONFIG_PPC64 +#include +#else +#include +#endif + +#endif /* _ASM_POWERPC_NOHASH_MMU_H_ */ diff --git a/arch/powerpc/kernel/cpu_setup_fsl_booke.S b/arch/powerpc/kernel/cpu_setup_fsl_booke.S index 8d142e5d84cd..5fbc890d1094 100644 --- a/arch/powerpc/kernel/cpu_setup_fsl_booke.S +++
[PATCH v9 06/20] powerpc/mm: Move pgtable_t into platform headers
This patch move pgtable_t into platform headers. It gets rid of the CONFIG_PPC_64K_PAGES case for PPC64 as nohash/64 doesn't support CONFIG_PPC_64K_PAGES. Reviewed-by: Aneesh Kumar K.V Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/book3s/32/mmu-hash.h | 2 ++ arch/powerpc/include/asm/book3s/64/mmu.h | 9 + arch/powerpc/include/asm/nohash/32/mmu.h | 4 arch/powerpc/include/asm/nohash/64/mmu.h | 4 arch/powerpc/include/asm/page.h | 14 -- 5 files changed, 19 insertions(+), 14 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/mmu-hash.h b/arch/powerpc/include/asm/book3s/32/mmu-hash.h index e38c91388c40..5bd26c218b94 100644 --- a/arch/powerpc/include/asm/book3s/32/mmu-hash.h +++ b/arch/powerpc/include/asm/book3s/32/mmu-hash.h @@ -42,6 +42,8 @@ struct ppc_bat { u32 batu; u32 batl; }; + +typedef struct page *pgtable_t; #endif /* !__ASSEMBLY__ */ /* diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h index 6328857f259f..1ceee000c18d 100644 --- a/arch/powerpc/include/asm/book3s/64/mmu.h +++ b/arch/powerpc/include/asm/book3s/64/mmu.h @@ -2,6 +2,8 @@ #ifndef _ASM_POWERPC_BOOK3S_64_MMU_H_ #define _ASM_POWERPC_BOOK3S_64_MMU_H_ +#include + #ifndef __ASSEMBLY__ /* * Page size definition @@ -24,6 +26,13 @@ struct mmu_psize_def { }; extern struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT]; +/* + * For BOOK3s 64 with 4k and 64K linux page size + * we want to use pointers, because the page table + * actually store pfn + */ +typedef pte_t *pgtable_t; + #endif /* __ASSEMBLY__ */ /* 64-bit classic hash table MMU */ diff --git a/arch/powerpc/include/asm/nohash/32/mmu.h b/arch/powerpc/include/asm/nohash/32/mmu.h index af0e8b54876a..f61f933a4cd8 100644 --- a/arch/powerpc/include/asm/nohash/32/mmu.h +++ b/arch/powerpc/include/asm/nohash/32/mmu.h @@ -16,4 +16,8 @@ #include #endif +#ifndef __ASSEMBLY__ +typedef struct page *pgtable_t; +#endif + #endif /* _ASM_POWERPC_NOHASH_32_MMU_H_ */ diff --git a/arch/powerpc/include/asm/nohash/64/mmu.h b/arch/powerpc/include/asm/nohash/64/mmu.h index 87871d027b75..e6585480dfc4 100644 --- a/arch/powerpc/include/asm/nohash/64/mmu.h +++ b/arch/powerpc/include/asm/nohash/64/mmu.h @@ -5,4 +5,8 @@ /* Freescale Book-E software loaded TLB or Book-3e (ISA 2.06+) MMU */ #include +#ifndef __ASSEMBLY__ +typedef struct page *pgtable_t; +#endif + #endif /* _ASM_POWERPC_NOHASH_64_MMU_H_ */ diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h index 9ea903221a9f..a7624a3b1435 100644 --- a/arch/powerpc/include/asm/page.h +++ b/arch/powerpc/include/asm/page.h @@ -335,20 +335,6 @@ void arch_free_page(struct page *page, int order); #endif struct vm_area_struct; -#ifdef CONFIG_PPC_BOOK3S_64 -/* - * For BOOK3s 64 with 4k and 64K linux page size - * we want to use pointers, because the page table - * actually store pfn - */ -typedef pte_t *pgtable_t; -#else -#if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_PPC64) -typedef pte_t *pgtable_t; -#else -typedef struct page *pgtable_t; -#endif -#endif #include #endif /* __ASSEMBLY__ */ -- 2.13.3
[PATCH v9 09/20] powerpc/mm: enable the use of page table cache of order 0
hugepages uses a cache of order 0. Lets allow page tables of order 0 in the common part in order to avoid open coding in hugetlb Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/book3s/32/pgalloc.h | 5 + arch/powerpc/include/asm/book3s/64/pgalloc.h | 5 + arch/powerpc/include/asm/nohash/32/pgalloc.h | 5 + arch/powerpc/include/asm/nohash/64/pgalloc.h | 5 + arch/powerpc/mm/init-common.c| 6 +++--- 5 files changed, 7 insertions(+), 19 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/pgalloc.h b/arch/powerpc/include/asm/book3s/32/pgalloc.h index 0f58e5b9dbe7..b5b955eb2fb7 100644 --- a/arch/powerpc/include/asm/book3s/32/pgalloc.h +++ b/arch/powerpc/include/asm/book3s/32/pgalloc.h @@ -25,10 +25,7 @@ extern void __bad_pte(pmd_t *pmd); extern struct kmem_cache *pgtable_cache[]; -#define PGT_CACHE(shift) ({\ - BUG_ON(!(shift)); \ - pgtable_cache[(shift) - 1]; \ - }) +#define PGT_CACHE(shift) pgtable_cache[shift] static inline pgd_t *pgd_alloc(struct mm_struct *mm) { diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h index f949dd90af9b..4aba625389c4 100644 --- a/arch/powerpc/include/asm/book3s/64/pgalloc.h +++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h @@ -37,10 +37,7 @@ extern struct vmemmap_backing *vmemmap_list; #define MAX_PGTABLE_INDEX_SIZE 0xf extern struct kmem_cache *pgtable_cache[]; -#define PGT_CACHE(shift) ({\ - BUG_ON(!(shift)); \ - pgtable_cache[(shift) - 1]; \ - }) +#define PGT_CACHE(shift) pgtable_cache[shift] extern pte_t *pte_fragment_alloc(struct mm_struct *, unsigned long, int); extern pmd_t *pmd_fragment_alloc(struct mm_struct *, unsigned long); diff --git a/arch/powerpc/include/asm/nohash/32/pgalloc.h b/arch/powerpc/include/asm/nohash/32/pgalloc.h index 7e234582dce5..17963951bdb0 100644 --- a/arch/powerpc/include/asm/nohash/32/pgalloc.h +++ b/arch/powerpc/include/asm/nohash/32/pgalloc.h @@ -25,10 +25,7 @@ extern void __bad_pte(pmd_t *pmd); extern struct kmem_cache *pgtable_cache[]; -#define PGT_CACHE(shift) ({\ - BUG_ON(!(shift)); \ - pgtable_cache[(shift) - 1]; \ - }) +#define PGT_CACHE(shift) pgtable_cache[shift] static inline pgd_t *pgd_alloc(struct mm_struct *mm) { diff --git a/arch/powerpc/include/asm/nohash/64/pgalloc.h b/arch/powerpc/include/asm/nohash/64/pgalloc.h index e2d62d033708..e95eb499a174 100644 --- a/arch/powerpc/include/asm/nohash/64/pgalloc.h +++ b/arch/powerpc/include/asm/nohash/64/pgalloc.h @@ -36,10 +36,7 @@ extern struct vmemmap_backing *vmemmap_list; #define MAX_PGTABLE_INDEX_SIZE 0xf extern struct kmem_cache *pgtable_cache[]; -#define PGT_CACHE(shift) ({\ - BUG_ON(!(shift)); \ - pgtable_cache[(shift) - 1]; \ - }) +#define PGT_CACHE(shift) pgtable_cache[shift] static inline pgd_t *pgd_alloc(struct mm_struct *mm) { diff --git a/arch/powerpc/mm/init-common.c b/arch/powerpc/mm/init-common.c index 2b656e67f2ea..41190f2b60c2 100644 --- a/arch/powerpc/mm/init-common.c +++ b/arch/powerpc/mm/init-common.c @@ -40,7 +40,7 @@ static void pmd_ctor(void *addr) memset(addr, 0, PMD_TABLE_SIZE); } -struct kmem_cache *pgtable_cache[MAX_PGTABLE_INDEX_SIZE]; +struct kmem_cache *pgtable_cache[MAX_PGTABLE_INDEX_SIZE + 1]; EXPORT_SYMBOL_GPL(pgtable_cache); /* used by kvm_hv module */ /* @@ -71,7 +71,7 @@ void pgtable_cache_add(unsigned shift, void (*ctor)(void *)) * moment, gcc doesn't seem to recognize is_power_of_2 as a * constant expression, so so much for that. */ BUG_ON(!is_power_of_2(minalign)); - BUG_ON((shift < 1) || (shift > MAX_PGTABLE_INDEX_SIZE)); + BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE); if (PGT_CACHE(shift)) return; /* Already have a cache of this size */ @@ -83,7 +83,7 @@ void pgtable_cache_add(unsigned shift, void (*ctor)(void *)) panic("Could not allocate pgtable cache for order %d", shift); kfree(name); - pgtable_cache[shift - 1] = new; + pgtable_cache[shift] = new; pr_debug("Allocated pgtable cache for order %d\n", shift); } -- 2.13.3
[PATCH v9 10/20] powerpc/mm: replace hugetlb_cache by PGT_CACHE(PTE_T_ORDER)
Instead of opencoding cache handling for the special case of hugepage tables having a single pte_t element, this patch makes use of the common pgtable_cache helpers Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/hugetlb.h | 2 -- arch/powerpc/mm/hugetlbpage.c | 26 +++--- 2 files changed, 7 insertions(+), 21 deletions(-) diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h index 98004262bc87..dfb8bf236586 100644 --- a/arch/powerpc/include/asm/hugetlb.h +++ b/arch/powerpc/include/asm/hugetlb.h @@ -5,8 +5,6 @@ #ifdef CONFIG_HUGETLB_PAGE #include -extern struct kmem_cache *hugepte_cache; - #ifdef CONFIG_PPC_BOOK3S_64 #include diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index 8cf035e68378..c4f1263228b8 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -42,6 +42,8 @@ EXPORT_SYMBOL(HPAGE_SHIFT); #define hugepd_none(hpd) (hpd_val(hpd) == 0) +#define PTE_T_ORDER(__builtin_ffs(sizeof(pte_t)) - __builtin_ffs(sizeof(void *))) + pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr, unsigned long sz) { /* @@ -61,7 +63,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp, int num_hugepd; if (pshift >= pdshift) { - cachep = hugepte_cache; + cachep = PGT_CACHE(PTE_T_ORDER); num_hugepd = 1 << (pshift - pdshift); } else { cachep = PGT_CACHE(pdshift - pshift); @@ -264,7 +266,7 @@ static void hugepd_free_rcu_callback(struct rcu_head *head) unsigned int i; for (i = 0; i < batch->index; i++) - kmem_cache_free(hugepte_cache, batch->ptes[i]); + kmem_cache_free(PGT_CACHE(PTE_T_ORDER), batch->ptes[i]); free_page((unsigned long)batch); } @@ -277,7 +279,7 @@ static void hugepd_free(struct mmu_gather *tlb, void *hugepte) if (atomic_read(>mm->mm_users) < 2 || mm_is_thread_local(tlb->mm)) { - kmem_cache_free(hugepte_cache, hugepte); + kmem_cache_free(PGT_CACHE(PTE_T_ORDER), hugepte); put_cpu_var(hugepd_freelist_cur); return; } @@ -652,7 +654,6 @@ static int __init hugepage_setup_sz(char *str) } __setup("hugepagesz=", hugepage_setup_sz); -struct kmem_cache *hugepte_cache; static int __init hugetlbpage_init(void) { int psize; @@ -702,21 +703,8 @@ static int __init hugetlbpage_init(void) if (pdshift > shift) pgtable_cache_add(pdshift - shift, NULL); #if defined(CONFIG_PPC_FSL_BOOK3E) || defined(CONFIG_PPC_8xx) - else if (!hugepte_cache) { - /* -* Create a kmem cache for hugeptes. The bottom bits in -* the pte have size information encoded in them, so -* align them to allow this -*/ - hugepte_cache = kmem_cache_create("hugepte-cache", - sizeof(pte_t), - HUGEPD_SHIFT_MASK + 1, - 0, NULL); - if (hugepte_cache == NULL) - panic("%s: Unable to create kmem cache " - "for hugeptes\n", __func__); - - } + else + pgtable_cache_add(PTE_T_ORDER, NULL); #endif } -- 2.13.3
[PATCH v9 12/20] powerpc/mm: remove unnecessary test in pgtable_cache_init()
pgtable_cache_add() gracefully handles the case when a cache that size already exists by returning early with the following test: if (PGT_CACHE(shift)) return; /* Already have a cache of this size */ It is then not needed to test the existence of the cache before. Signed-off-by: Christophe Leroy --- arch/powerpc/mm/init-common.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/mm/init-common.c b/arch/powerpc/mm/init-common.c index b7ca03643d0b..1e6910eb70ed 100644 --- a/arch/powerpc/mm/init-common.c +++ b/arch/powerpc/mm/init-common.c @@ -111,13 +111,13 @@ void pgtable_cache_init(void) { pgtable_cache_add(PGD_INDEX_SIZE); - if (PMD_CACHE_INDEX && !PGT_CACHE(PMD_CACHE_INDEX)) + if (PMD_CACHE_INDEX) pgtable_cache_add(PMD_CACHE_INDEX); /* * In all current configs, when the PUD index exists it's the * same size as either the pgd or pmd index except with THP enabled * on book3s 64 */ - if (PUD_CACHE_INDEX && !PGT_CACHE(PUD_CACHE_INDEX)) + if (PUD_CACHE_INDEX) pgtable_cache_add(PUD_CACHE_INDEX); } -- 2.13.3
[PATCH v9 13/20] powerpc/8xx: Move SW perf counters in first 32kb of memory
In order to simplify time critical exceptions handling 8xx specific SW perf counters, this patch moves the counters into the beginning of memory. This is possible because .text is readable and the counters are never modified outside of the handlers. By doing this, we avoid having to set a second register with the upper part of the address of the counters. Signed-off-by: Christophe Leroy --- arch/powerpc/kernel/head_8xx.S | 58 -- 1 file changed, 28 insertions(+), 30 deletions(-) diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S index 3b67b9533c82..c203defe49a4 100644 --- a/arch/powerpc/kernel/head_8xx.S +++ b/arch/powerpc/kernel/head_8xx.S @@ -106,6 +106,23 @@ turn_on_mmu: mtspr SPRN_SRR0,r0 rfi /* enables MMU */ + +#ifdef CONFIG_PERF_EVENTS + .align 4 + + .globl itlb_miss_counter +itlb_miss_counter: + .space 4 + + .globl dtlb_miss_counter +dtlb_miss_counter: + .space 4 + + .globl instruction_counter +instruction_counter: + .space 4 +#endif + /* * Exception entry code. This code runs with address translation * turned off, i.e. using physical addresses. @@ -384,17 +401,16 @@ InstructionTLBMiss: #ifdef CONFIG_PERF_EVENTS patch_site 0f, patch__itlbmiss_perf -0: lis r10, (itlb_miss_counter - PAGE_OFFSET)@ha - lwz r11, (itlb_miss_counter - PAGE_OFFSET)@l(r10) - addir11, r11, 1 - stw r11, (itlb_miss_counter - PAGE_OFFSET)@l(r10) -#endif +0: lwz r10, (itlb_miss_counter - PAGE_OFFSET)@l(0) + addir10, r10, 1 + stw r10, (itlb_miss_counter - PAGE_OFFSET)@l(0) mfspr r10, SPRN_SPRG_SCRATCH0 mfspr r11, SPRN_SPRG_SCRATCH1 #if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE) mfspr r12, SPRN_SPRG_SCRATCH2 #endif rfi +#endif #ifdef CONFIG_HUGETLB_PAGE 10:/* 8M pages */ @@ -509,15 +525,14 @@ DataStoreTLBMiss: #ifdef CONFIG_PERF_EVENTS patch_site 0f, patch__dtlbmiss_perf -0: lis r10, (dtlb_miss_counter - PAGE_OFFSET)@ha - lwz r11, (dtlb_miss_counter - PAGE_OFFSET)@l(r10) - addir11, r11, 1 - stw r11, (dtlb_miss_counter - PAGE_OFFSET)@l(r10) -#endif +0: lwz r10, (dtlb_miss_counter - PAGE_OFFSET)@l(0) + addir10, r10, 1 + stw r10, (dtlb_miss_counter - PAGE_OFFSET)@l(0) mfspr r10, SPRN_SPRG_SCRATCH0 mfspr r11, SPRN_SPRG_SCRATCH1 mfspr r12, SPRN_SPRG_SCRATCH2 rfi +#endif #ifdef CONFIG_HUGETLB_PAGE 10:/* 8M pages */ @@ -625,16 +640,13 @@ DataBreakpoint: . = 0x1d00 InstructionBreakpoint: mtspr SPRN_SPRG_SCRATCH0, r10 - mtspr SPRN_SPRG_SCRATCH1, r11 - lis r10, (instruction_counter - PAGE_OFFSET)@ha - lwz r11, (instruction_counter - PAGE_OFFSET)@l(r10) - addir11, r11, -1 - stw r11, (instruction_counter - PAGE_OFFSET)@l(r10) + lwz r10, (instruction_counter - PAGE_OFFSET)@l(0) + addir10, r10, -1 + stw r10, (instruction_counter - PAGE_OFFSET)@l(0) lis r10, 0x ori r10, r10, 0x01 mtspr SPRN_COUNTA, r10 mfspr r10, SPRN_SPRG_SCRATCH0 - mfspr r11, SPRN_SPRG_SCRATCH1 rfi #else EXCEPTION(0x1d00, Trap_1d, unknown_exception, EXC_XFER_EE) @@ -1065,17 +1077,3 @@ swapper_pg_dir: */ abatron_pteptrs: .space 8 - -#ifdef CONFIG_PERF_EVENTS - .globl itlb_miss_counter -itlb_miss_counter: - .space 4 - - .globl dtlb_miss_counter -dtlb_miss_counter: - .space 4 - - .globl instruction_counter -instruction_counter: - .space 4 -#endif -- 2.13.3
[PATCH v9 00/20] Implement use of HW assistance on TLB table walk on 8xx
The purpose of this serie is to implement hardware assistance for TLB table walk on the 8xx. First part prepares for using HW assistance in TLB routines: - Trivial fixes: - Remove CONFIG_BOOKE stuff from book3S headers. - Removal of unneeded atomic PTE update requirement for 8xx. - move book3s64 page fragment code in a common part for reusing it by the 8xx as 16k page size mode still uses 4k page tables. - Fixing a bug in memcache handling when standard pages and hugepages share caches of the same size (see original discussion in https://patchwork.ozlabs.org/patch/957565/) - Optimise access to 8xx perf counters (hence reducing number of registers used) Second part implements HW assistance in TLB routines in the following steps: - Disable 16k page size mode and 512k hugepages - Switch 4k to HW assistance - Bring back 512k hugepages - Bring back 16k page size mode. Last part cleans up: - Take benefit of Miss handler size reduction to regroup related parts - Reduce number of registers used in miss handlers, freeing them for future use. Tested successfully on 8xx and 83xx (book3s/32) Changes in v9: - Fixed PTE_INDEX_SIZE and PTE_SHIFT so that we always have PTE_SHIFT == PTE_INDEX_SIZE - Using PTE_INDEX_SIZE instead of PTE_SHIFT to avoid build failure on PPC64 - Moved PTE_FRAG_NR from pgtable.h to mmu-8xx.h to reduce ifdefs Changes in v8: - Moved definitions in pgalloc.h to avoid conflicting with memcache patches. - Included the memcache bugfix serie in this serie to avoid conflicts between the two series when coming to the 512k pages patch. - In the 512k HW assistance patch, reduced the #ifdef mess by using IS_ENABLED(CONFIG_PPC_8xx) instead. Changes in v7: - Reordered to get trivial and already reviewed patches in front. - Reordered to regroup all HW assistance related patches together. - Rebased on today merge branch (28 Nov) - Added a helper for access to mm_context_t.frag - Reduced the amount of changes in PPC32 to support pte_fragment - Applied pte_fragment to both nohash/32 and book3s/32 Changes in v6: - Droped the part related to handling GUARD attribute at PGD/PMD level. - Moved the commonalisation of page_fragment in the begining (this part has been reviewed by Aneesh) - Rebased on today merge branch (19 Oct) Changes in v5: - Also avoid useless lock in get_pmd_from_cache() - A new patch to relocate mmu headers in platform specific directories - A new patch to distribute pgtable_t typedefs in platform specific mmu headers instead of the uggly #ifdef - Moved early_pte_alloc_kernel() in platform specific pgalloc - Restricted definition of PTE_FRAG_SIZE and PTE_FRAG_NR to platforms using the pte fragmentation. - arch_exit_mmap() and destroy_pagetable_cache() are now platform specific. Changes in v4: - Reordered the serie to put at the end the modifications which makes L1 and L2 entries independant. - No modifications to ppc64 ioremap (we still have an opportunity to merge them, for a future patch serie) - 8xx code modified to use patch_site instead of patch_instruction to get a clearer code and avoid object pollution with global symbols - Moved perf counters in first 32kb of memory to optimise access - Split the big bang to HW assistance in several steps: 1. Temporarily removes support of 16k pages and 512k hugepages 2. Change TLB routines to use HW assistance for 4k pages and 8M hugepages 3. Add back support for 512k hugepages 4. Add back support for 16k pages (using pte_fragment as page tables are still 4k) Changes in v3: - Fixed an issue in the 09/14 when CONFIG_PIN_TLB_TEXT was not enabled - Added performance measurement in the 09/14 commit log - Rebased on latest 'powerpc/merge' tree, which conflicted with 13/14 Changes in v2: - Removed the 3 first patchs which have been applied already - Fixed compilation errors reported by Michael - Squashed the commonalisation of ioremap functions into a single patch - Fixed the use of pte_fragment - Added a patch optimising perf counting of TLB misses and instructions Christophe Leroy (20): powerpc/book3s32: Remove CONFIG_BOOKE dependent code powerpc/8xx: Remove PTE_ATOMIC_UPDATES powerpc/mm: Move pte_fragment_alloc() to a common location powerpc/mm: Avoid useless lock with single page fragments powerpc/mm: move platform specific mmu-xxx.h in platform directories powerpc/mm: Move pgtable_t into platform headers powerpc/mm: add helpers to get/set mm.context->pte_frag powerpc/mm: Extend pte_fragment functionality to PPC32 powerpc/mm: enable the use of page table cache of order 0 powerpc/mm: replace hugetlb_cache by PGT_CACHE(PTE_T_ORDER) powerpc/mm: fix a warning when a cache is common to PGD and hugepages powerpc/mm: remove unnecessary test in pgtable_cache_init() powerpc/8xx: Move SW perf counters in first 32kb of memory powerpc/8xx: Temporarily disable 16k pages and hugepages powerpc/8xx: Use hardware assistance in TLB handlers
[PATCH -next] powerpc/64s: Fix debugfs_simple_attr.cocci warnings
Use DEFINE_DEBUGFS_ATTRIBUTE rather than DEFINE_SIMPLE_ATTRIBUTE for debugfs files. Semantic patch information: Rationale: DEFINE_SIMPLE_ATTRIBUTE + debugfs_create_file() imposes some significant overhead as compared to DEFINE_DEBUGFS_ATTRIBUTE + debugfs_create_file_unsafe(). Generated by: scripts/coccinelle/api/debugfs/debugfs_simple_attr.cocci Signed-off-by: YueHaibing --- arch/powerpc/kernel/security.c | 24 ++-- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c index 9703dce..14bb806 100644 --- a/arch/powerpc/kernel/security.c +++ b/arch/powerpc/kernel/security.c @@ -90,13 +90,14 @@ static int barrier_nospec_get(void *data, u64 *val) return 0; } -DEFINE_SIMPLE_ATTRIBUTE(fops_barrier_nospec, - barrier_nospec_get, barrier_nospec_set, "%llu\n"); +DEFINE_DEBUGFS_ATTRIBUTE(fops_barrier_nospec, barrier_nospec_get, +barrier_nospec_set, "%llu\n"); static __init int barrier_nospec_debugfs_init(void) { - debugfs_create_file("barrier_nospec", 0600, powerpc_debugfs_root, NULL, - _barrier_nospec); + debugfs_create_file_unsafe("barrier_nospec", 0600, + powerpc_debugfs_root, NULL, + _barrier_nospec); return 0; } device_initcall(barrier_nospec_debugfs_init); @@ -339,11 +340,13 @@ static int stf_barrier_get(void *data, u64 *val) return 0; } -DEFINE_SIMPLE_ATTRIBUTE(fops_stf_barrier, stf_barrier_get, stf_barrier_set, "%llu\n"); +DEFINE_DEBUGFS_ATTRIBUTE(fops_stf_barrier, stf_barrier_get, stf_barrier_set, +"%llu\n"); static __init int stf_barrier_debugfs_init(void) { - debugfs_create_file("stf_barrier", 0600, powerpc_debugfs_root, NULL, _stf_barrier); + debugfs_create_file_unsafe("stf_barrier", 0600, powerpc_debugfs_root, + NULL, _stf_barrier); return 0; } device_initcall(stf_barrier_debugfs_init); @@ -404,13 +407,14 @@ static int count_cache_flush_get(void *data, u64 *val) return 0; } -DEFINE_SIMPLE_ATTRIBUTE(fops_count_cache_flush, count_cache_flush_get, - count_cache_flush_set, "%llu\n"); +DEFINE_DEBUGFS_ATTRIBUTE(fops_count_cache_flush, count_cache_flush_get, +count_cache_flush_set, "%llu\n"); static __init int count_cache_flush_debugfs_init(void) { - debugfs_create_file("count_cache_flush", 0600, powerpc_debugfs_root, - NULL, _count_cache_flush); + debugfs_create_file_unsafe("count_cache_flush", 0600, + powerpc_debugfs_root, NULL, + _count_cache_flush); return 0; } device_initcall(count_cache_flush_debugfs_init);
[PATCH v9 14/20] powerpc/8xx: Temporarily disable 16k pages and hugepages
In preparation of making use of hardware assistance in TLB handlers, this patch temporarily disables 16K pages and hugepages. The reason is that when using HW assistance in 4K pages mode, the linux model fit with the HW model for 4K pages and 8M pages. However for 16K pages and 512K mode some additional work is needed to get linux model fit with HW model. For the 8M pages, they will naturaly come back when we switch to HW assistance, without any additional handling. In order to keep the following patch smaller, the removal of the current special handling for 8M pages gets removed here as well. Therefore the 4K pages mode will be implemented first and without support for 512k hugepages. Then the 512k hugepages will be brought back. And the 16K pages will be implemented in the following step. Signed-off-by: Christophe Leroy --- arch/powerpc/Kconfig | 2 +- arch/powerpc/kernel/head_8xx.S | 74 +++--- arch/powerpc/mm/tlb_nohash.c | 6 3 files changed, 6 insertions(+), 76 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 8be31261aec8..ddfccdf004fe 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -689,7 +689,7 @@ config PPC_4K_PAGES config PPC_16K_PAGES bool "16k page size" - depends on 44x || PPC_8xx + depends on 44x config PPC_64K_PAGES bool "64k page size" diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S index c203defe49a4..01f58b1d9ae7 100644 --- a/arch/powerpc/kernel/head_8xx.S +++ b/arch/powerpc/kernel/head_8xx.S @@ -314,7 +314,7 @@ SystemCall: InstructionTLBMiss: mtspr SPRN_SPRG_SCRATCH0, r10 mtspr SPRN_SPRG_SCRATCH1, r11 -#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE) +#ifdef ITLB_MISS_KERNEL mtspr SPRN_SPRG_SCRATCH2, r12 #endif @@ -325,10 +325,8 @@ InstructionTLBMiss: INVALIDATE_ADJACENT_PAGES_CPU15(r11, r10) /* Only modules will cause ITLB Misses as we always * pin the first 8MB of kernel memory */ -#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE) - mfcrr12 -#endif #ifdef ITLB_MISS_KERNEL + mfcrr12 #if defined(SIMPLE_KERNEL_ADDRESS) && defined(CONFIG_PIN_TLB_TEXT) andis. r11, r10, 0x8000/* Address >= 0x8000 */ #else @@ -360,15 +358,9 @@ InstructionTLBMiss: /* Extract level 2 index */ rlwinm r10, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29 -#ifdef CONFIG_HUGETLB_PAGE - mtcrr11 - bt- 28, 10f /* bit 28 = Large page (8M) */ - bt- 29, 20f /* bit 29 = Large page (8M or 512k) */ -#endif rlwimi r10, r11, 0, 0, 32 - PAGE_SHIFT - 1 /* Add level 2 base */ lwz r10, 0(r10) /* Get the pte */ -4: -#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE) +#ifdef ITLB_MISS_KERNEL mtcrr12 #endif /* Load the MI_TWC with the attributes for this "segment." */ @@ -393,7 +385,7 @@ InstructionTLBMiss: /* Restore registers */ 0: mfspr r10, SPRN_SPRG_SCRATCH0 mfspr r11, SPRN_SPRG_SCRATCH1 -#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE) +#ifdef ITLB_MISS_KERNEL mfspr r12, SPRN_SPRG_SCRATCH2 #endif rfi @@ -406,35 +398,12 @@ InstructionTLBMiss: stw r10, (itlb_miss_counter - PAGE_OFFSET)@l(0) mfspr r10, SPRN_SPRG_SCRATCH0 mfspr r11, SPRN_SPRG_SCRATCH1 -#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE) +#ifdef ITLB_MISS_KERNEL mfspr r12, SPRN_SPRG_SCRATCH2 #endif rfi #endif -#ifdef CONFIG_HUGETLB_PAGE -10:/* 8M pages */ -#ifdef CONFIG_PPC_16K_PAGES - /* Extract level 2 index */ - rlwinm r10, r10, 32 - (PAGE_SHIFT_8M - PAGE_SHIFT), 32 + PAGE_SHIFT_8M - (PAGE_SHIFT << 1), 29 - /* Add level 2 base */ - rlwimi r10, r11, 0, 0, 32 + PAGE_SHIFT_8M - (PAGE_SHIFT << 1) - 1 -#else - /* Level 2 base */ - rlwinm r10, r11, 0, ~HUGEPD_SHIFT_MASK -#endif - lwz r10, 0(r10) /* Get the pte */ - b 4b - -20:/* 512k pages */ - /* Extract level 2 index */ - rlwinm r10, r10, 32 - (PAGE_SHIFT_512K - PAGE_SHIFT), 32 + PAGE_SHIFT_512K - (PAGE_SHIFT << 1), 29 - /* Add level 2 base */ - rlwimi r10, r11, 0, 0, 32 + PAGE_SHIFT_512K - (PAGE_SHIFT << 1) - 1 - lwz r10, 0(r10) /* Get the pte */ - b 4b -#endif - . = 0x1200 DataStoreTLBMiss: mtspr SPRN_SPRG_SCRATCH0, r10 @@ -472,11 +441,6 @@ DataStoreTLBMiss: */ /* Extract level 2 index */ rlwinm r10, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29 -#ifdef CONFIG_HUGETLB_PAGE - mtcrr11 - bt- 28, 10f /* bit 28 = Large page (8M) */ - bt- 29, 20f /* bit 29 = Large page (8M or 512k) */ -#endif rlwimi r10, r11, 0, 0, 32 - PAGE_SHIFT - 1 /* Add level 2 base */
[PATCH v9 15/20] powerpc/8xx: Use hardware assistance in TLB handlers
Today, on the 8xx the TLB handlers do SW tablewalk by doing all the calculation in ASM, in order to match with the Linux page table structure. The 8xx offers hardware assistance which allows significant size reduction of the TLB handlers, hence also reduces the time spent in the handlers. However, using this HW assistance implies some constraints on the page table structure: - Regardless of the main page size used (4k or 16k), the level 1 table (PGD) contains 1024 entries and each PGD entry covers a 4Mbytes area which is managed by a level 2 table (PTE) containing also 1024 entries each describing a 4k page. - 16k pages require 4 identifical entries in the L2 table - 512k pages PTE have to be spread every 128 bytes in the L2 table - 8M pages PTE are at the address pointed by the L1 entry and each 8M page require 2 identical entries in the PGD. This patch modifies the TLB handlers to use HW assistance for 4K PAGES. Before that patch, the mean time spent in TLB miss handlers is: - ITLB miss: 80 ticks - DTLB miss: 62 ticks After that patch, the mean time spent in TLB miss handlers is: - ITLB miss: 72 ticks - DTLB miss: 54 ticks So the improvement is 10% for ITLB and 13% for DTLB misses Signed-off-by: Christophe Leroy --- arch/powerpc/kernel/head_8xx.S | 58 +- arch/powerpc/mm/8xx_mmu.c | 4 +-- 2 files changed, 26 insertions(+), 36 deletions(-) diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S index 01f58b1d9ae7..85fb4b8bf6c7 100644 --- a/arch/powerpc/kernel/head_8xx.S +++ b/arch/powerpc/kernel/head_8xx.S @@ -292,7 +292,7 @@ SystemCall: . = 0x1100 /* * For the MPC8xx, this is a software tablewalk to load the instruction - * TLB. The task switch loads the M_TW register with the pointer to the first + * TLB. The task switch loads the M_TWB register with the pointer to the first * level table. * If we discover there is no second level table (value is zero) or if there * is an invalid pte, we load that into the TLB, which causes another fault @@ -323,6 +323,7 @@ InstructionTLBMiss: */ mfspr r10, SPRN_SRR0 /* Get effective address of fault */ INVALIDATE_ADJACENT_PAGES_CPU15(r11, r10) + mtspr SPRN_MD_EPN, r10 /* Only modules will cause ITLB Misses as we always * pin the first 8MB of kernel memory */ #ifdef ITLB_MISS_KERNEL @@ -339,7 +340,7 @@ InstructionTLBMiss: #endif #endif #endif - mfspr r11, SPRN_M_TW /* Get level 1 table */ + mfspr r11, SPRN_M_TWB /* Get level 1 table */ #ifdef ITLB_MISS_KERNEL #if defined(SIMPLE_KERNEL_ADDRESS) && defined(CONFIG_PIN_TLB_TEXT) beq+3f @@ -349,16 +350,14 @@ InstructionTLBMiss: #ifndef CONFIG_PIN_TLB_TEXT blt cr7, ITLBMissLinear #endif - lis r11, (swapper_pg_dir-PAGE_OFFSET)@ha + rlwinm r11, r11, 0, 20, 31 + orisr11, r11, (swapper_pg_dir - PAGE_OFFSET)@ha 3: #endif - /* Insert level 1 index */ - rlwimi r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 29 lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11)/* Get the level 1 entry */ - /* Extract level 2 index */ - rlwinm r10, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29 - rlwimi r10, r11, 0, 0, 32 - PAGE_SHIFT - 1 /* Add level 2 base */ + mtspr SPRN_MD_TWC, r11 + mfspr r10, SPRN_MD_TWC lwz r10, 0(r10) /* Get the pte */ #ifdef ITLB_MISS_KERNEL mtcrr12 @@ -417,7 +416,7 @@ DataStoreTLBMiss: mfspr r10, SPRN_MD_EPN rlwinm r11, r10, 16, 0xfff8 cmpli cr0, r11, PAGE_OFFSET@h - mfspr r11, SPRN_M_TW /* Get level 1 table */ + mfspr r11, SPRN_M_TWB /* Get level 1 table */ blt+3f rlwinm r11, r10, 16, 0xfff8 #ifndef CONFIG_PIN_TLB_IMMR @@ -430,20 +429,16 @@ DataStoreTLBMiss: patch_site 0b, patch__dtlbmiss_immr_jmp #endif blt cr7, DTLBMissLinear - lis r11, (swapper_pg_dir-PAGE_OFFSET)@ha + mfspr r11, SPRN_M_TWB /* Get level 1 table */ + rlwinm r11, r11, 0, 20, 31 + orisr11, r11, (swapper_pg_dir - PAGE_OFFSET)@ha 3: - - /* Insert level 1 index */ - rlwimi r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 29 lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11)/* Get the level 1 entry */ - /* We have a pte table, so load fetch the pte from the table. -*/ - /* Extract level 2 index */ - rlwinm r10, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29 - rlwimi r10, r11, 0, 0, 32 - PAGE_SHIFT - 1 /* Add level 2 base */ + mtspr SPRN_MD_TWC, r11 + mfspr r10, SPRN_MD_TWC lwz r10, 0(r10) /* Get the pte */ -4: + mtcrr12 /* Insert the Guarded flag into the TWC from the Linux PTE. @@ -668,9 +663,10 @@ FixupDAR:/* Entry point for dcbx workaround. */ mtspr
[PATCH v9 17/20] powerpc/8xx: Enable 512k hugepage support with HW assistance
For using 512k pages with hardware assistance, the PTEs have to be spread every 128 bytes in the L2 table. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/hugetlb.h | 4 +++- arch/powerpc/mm/hugetlbpage.c | 10 +- arch/powerpc/mm/tlb_nohash.c | 3 +++ 3 files changed, 15 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h index dfb8bf236586..62a0ca02ca7d 100644 --- a/arch/powerpc/include/asm/hugetlb.h +++ b/arch/powerpc/include/asm/hugetlb.h @@ -74,7 +74,9 @@ static inline pte_t *hugepte_offset(hugepd_t hpd, unsigned long addr, unsigned long idx = 0; pte_t *dir = hugepd_page(hpd); -#ifndef CONFIG_PPC_FSL_BOOK3E +#ifdef CONFIG_PPC_8xx + idx = (addr & ((1UL << pdshift) - 1)) >> PAGE_SHIFT; +#elif !defined(CONFIG_PPC_FSL_BOOK3E) idx = (addr & ((1UL << pdshift) - 1)) >> hugepd_shift(hpd); #endif diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index bc97874d7c74..5b236621d302 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -65,6 +65,9 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp, if (pshift >= pdshift) { cachep = PGT_CACHE(PTE_T_ORDER); num_hugepd = 1 << (pshift - pdshift); + } else if (IS_ENABLED(CONFIG_PPC_8xx)) { + cachep = PGT_CACHE(PTE_INDEX_SIZE); + num_hugepd = 1; } else { cachep = PGT_CACHE(pdshift - pshift); num_hugepd = 1; @@ -331,6 +334,9 @@ static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int pdshif if (shift >= pdshift) hugepd_free(tlb, hugepte); + else if (IS_ENABLED(CONFIG_PPC_8xx)) + pgtable_free_tlb(tlb, hugepte, +get_hugepd_cache_index(PTE_INDEX_SIZE)); else pgtable_free_tlb(tlb, hugepte, get_hugepd_cache_index(pdshift - shift)); @@ -700,7 +706,9 @@ static int __init hugetlbpage_init(void) * if we have pdshift and shift value same, we don't * use pgt cache for hugepd. */ - if (pdshift > shift) + if (pdshift > shift && IS_ENABLED(CONFIG_PPC_8xx)) + pgtable_cache_add(PTE_INDEX_SIZE); + else if (pdshift > shift) pgtable_cache_add(pdshift - shift); #if defined(CONFIG_PPC_FSL_BOOK3E) || defined(CONFIG_PPC_8xx) else diff --git a/arch/powerpc/mm/tlb_nohash.c b/arch/powerpc/mm/tlb_nohash.c index 8ad7aab150b7..ae5d568e267f 100644 --- a/arch/powerpc/mm/tlb_nohash.c +++ b/arch/powerpc/mm/tlb_nohash.c @@ -97,6 +97,9 @@ struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT] = { .shift = 14, }, #endif + [MMU_PAGE_512K] = { + .shift = 19, + }, [MMU_PAGE_8M] = { .shift = 23, }, -- 2.13.3
[PATCH v9 18/20] powerpc/8xx: reintroduce 16K pages with HW assistance
Using this HW assistance implies some constraints on the page table structure: - Regardless of the main page size used (4k or 16k), the level 1 table (PGD) contains 1024 entries and each PGD entry covers a 4Mbytes area which is managed by a level 2 table (PTE) containing also 1024 entries each describing a 4k page. - 16k pages require 4 identifical entries in the L2 table - 512k pages PTE have to be spread every 128 bytes in the L2 table - 8M pages PTE are at the address pointed by the L1 entry and each 8M page require 2 identical entries in the PGD. In order to use hardware assistance with 16K pages, this patch does the following modifications: - Make PGD size independent of the main page size - In 16k pages mode, redefine pte_t as a struct with 4 elements, and populate those 4 elements in __set_pte_at() and pte_update() - Adapt the size of the hugepage tables. - Define a PTE_FRAGMENT_NB so that a 16k page contains 4 page tables. Signed-off-by: Christophe Leroy --- arch/powerpc/Kconfig | 2 +- arch/powerpc/include/asm/nohash/32/mmu-8xx.h | 4 arch/powerpc/include/asm/nohash/32/pgtable.h | 8 +++- arch/powerpc/include/asm/nohash/pgtable.h| 4 arch/powerpc/include/asm/page_32.h | 3 ++- arch/powerpc/include/asm/pgtable-types.h | 4 6 files changed, 22 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index ddfccdf004fe..8be31261aec8 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -689,7 +689,7 @@ config PPC_4K_PAGES config PPC_16K_PAGES bool "16k page size" - depends on 44x + depends on 44x || PPC_8xx config PPC_64K_PAGES bool "64k page size" diff --git a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h index fa05aa566ece..b0f764c827c0 100644 --- a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h +++ b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h @@ -190,6 +190,7 @@ typedef struct { struct slice_mask mask_8m; # endif #endif + void *pte_frag; } mm_context_t; #define PHYS_IMMR_BASE (mfspr(SPRN_IMMR) & 0xfff8) @@ -244,6 +245,9 @@ extern s32 patch__itlbmiss_perf, patch__dtlbmiss_perf; #define mmu_virtual_psize MMU_PAGE_4K #elif defined(CONFIG_PPC_16K_PAGES) #define mmu_virtual_psize MMU_PAGE_16K +#define PTE_FRAG_NR4 +#define PTE_FRAG_SIZE_SHIFT12 +#define PTE_FRAG_SIZE (1UL << 12) #else #error "Unsupported PAGE_SIZE" #endif diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h index e9c8604cc8c5..bed433358260 100644 --- a/arch/powerpc/include/asm/nohash/32/pgtable.h +++ b/arch/powerpc/include/asm/nohash/32/pgtable.h @@ -232,7 +232,13 @@ static inline unsigned long pte_update(pte_t *p, : "cc" ); #else /* PTE_ATOMIC_UPDATES */ unsigned long old = pte_val(*p); - *p = __pte((old & ~clr) | set); + unsigned long new = (old & ~clr) | set; + +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES) + p->pte = p->pte1 = p->pte2 = p->pte3 = new; +#else + *p = __pte(new); +#endif #endif /* !PTE_ATOMIC_UPDATES */ #ifdef CONFIG_44x diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h index 70ff23974b59..1ca1c1864b32 100644 --- a/arch/powerpc/include/asm/nohash/pgtable.h +++ b/arch/powerpc/include/asm/nohash/pgtable.h @@ -209,7 +209,11 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr, /* Anything else just stores the PTE normally. That covers all 64-bit * cases, and 32-bit non-hash with 32-bit PTEs. */ +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES) + ptep->pte = ptep->pte1 = ptep->pte2 = ptep->pte3 = pte_val(pte); +#else *ptep = pte; +#endif /* * With hardware tablewalk, a sync is needed to ensure that diff --git a/arch/powerpc/include/asm/page_32.h b/arch/powerpc/include/asm/page_32.h index 5c378e9b78c8..683dfbc67ca8 100644 --- a/arch/powerpc/include/asm/page_32.h +++ b/arch/powerpc/include/asm/page_32.h @@ -22,7 +22,8 @@ #define PTE_FLAGS_OFFSET 0 #endif -#ifdef CONFIG_PPC_256K_PAGES +#if defined(CONFIG_PPC_256K_PAGES) || \ +(defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)) #define PTE_SHIFT (PAGE_SHIFT - PTE_T_LOG2 - 2) /* 1/4 of a page */ #else #define PTE_SHIFT (PAGE_SHIFT - PTE_T_LOG2) /* full page */ diff --git a/arch/powerpc/include/asm/pgtable-types.h b/arch/powerpc/include/asm/pgtable-types.h index eccb30b38b47..3b0edf041b2e 100644 --- a/arch/powerpc/include/asm/pgtable-types.h +++ b/arch/powerpc/include/asm/pgtable-types.h @@ -3,7 +3,11 @@ #define _ASM_POWERPC_PGTABLE_TYPES_H /* PTE level */ +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES) +typedef struct { pte_basic_t pte, pte1, pte2, pte3; } pte_t; +#else typedef struct { pte_basic_t pte; } pte_t;
[PATCH v9 20/20] powerpc/8xx: regroup TLB handler routines
As this is running with MMU off, the CPU only does speculative fetch for code in the same page. Following the significant size reduction of TLB handler routines, the side handlers can be brought back close to the main part, ie in the same page. Signed-off-by: Christophe Leroy --- arch/powerpc/kernel/head_8xx.S | 112 - 1 file changed, 54 insertions(+), 58 deletions(-) diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S index 0a4f8a9c85ff..b171b7c0a0e7 100644 --- a/arch/powerpc/kernel/head_8xx.S +++ b/arch/powerpc/kernel/head_8xx.S @@ -399,6 +399,23 @@ InstructionTLBMiss: rfi #endif +#ifndef CONFIG_PIN_TLB_TEXT +ITLBMissLinear: + mtcrr11 + /* Set 8M byte page and mark it valid */ + li r11, MI_PS8MEG | MI_SVALID + mtspr SPRN_MI_TWC, r11 + rlwinm r10, r10, 20, 0x0f80/* 8xx supports max 256Mb RAM */ + ori r10, r10, 0xf0 | MI_SPS16K | _PAGE_SH | _PAGE_DIRTY | \ + _PAGE_PRESENT + mtspr SPRN_MI_RPN, r10/* Update TLB entry */ + +0: mfspr r10, SPRN_SPRG_SCRATCH0 + mfspr r11, SPRN_SPRG_SCRATCH1 + rfi + patch_site 0b, patch__itlbmiss_exit_2 +#endif + . = 0x1200 DataStoreTLBMiss: mtspr SPRN_SPRG_SCRATCH0, r10 @@ -484,6 +501,43 @@ DataStoreTLBMiss: rfi #endif +DTLBMissIMMR: + mtcrr11 + /* Set 512k byte guarded page and mark it valid */ + li r10, MD_PS512K | MD_GUARDED | MD_SVALID + mtspr SPRN_MD_TWC, r10 + mfspr r10, SPRN_IMMR /* Get current IMMR */ + rlwinm r10, r10, 0, 0xfff8 /* Get 512 kbytes boundary */ + ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_SH | _PAGE_DIRTY | \ + _PAGE_PRESENT | _PAGE_NO_CACHE + mtspr SPRN_MD_RPN, r10/* Update TLB entry */ + + li r11, RPN_PATTERN + mtspr SPRN_DAR, r11 /* Tag DAR */ + +0: mfspr r10, SPRN_SPRG_SCRATCH0 + mfspr r11, SPRN_SPRG_SCRATCH1 + rfi + patch_site 0b, patch__dtlbmiss_exit_2 + +DTLBMissLinear: + mtcrr11 + /* Set 8M byte page and mark it valid */ + li r11, MD_PS8MEG | MD_SVALID + mtspr SPRN_MD_TWC, r11 + rlwinm r10, r10, 20, 0x0f80/* 8xx supports max 256Mb RAM */ + ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_SH | _PAGE_DIRTY | \ + _PAGE_PRESENT + mtspr SPRN_MD_RPN, r10/* Update TLB entry */ + + li r11, RPN_PATTERN + mtspr SPRN_DAR, r11 /* Tag DAR */ + +0: mfspr r10, SPRN_SPRG_SCRATCH0 + mfspr r11, SPRN_SPRG_SCRATCH1 + rfi + patch_site 0b, patch__dtlbmiss_exit_3 + /* This is an instruction TLB error on the MPC8xx. This could be due * to many reasons, such as executing guarded memory or illegal instruction * addresses. There is nothing to do but handle a big time error fault. @@ -583,64 +637,6 @@ InstructionBreakpoint: . = 0x2000 -/* - * Bottom part of DataStoreTLBMiss handlers for IMMR area and linear RAM. - * not enough space in the DataStoreTLBMiss area. - */ -DTLBMissIMMR: - mtcrr11 - /* Set 512k byte guarded page and mark it valid */ - li r10, MD_PS512K | MD_GUARDED | MD_SVALID - mtspr SPRN_MD_TWC, r10 - mfspr r10, SPRN_IMMR /* Get current IMMR */ - rlwinm r10, r10, 0, 0xfff8 /* Get 512 kbytes boundary */ - ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_SH | _PAGE_DIRTY | \ - _PAGE_PRESENT | _PAGE_NO_CACHE - mtspr SPRN_MD_RPN, r10/* Update TLB entry */ - - li r11, RPN_PATTERN - mtspr SPRN_DAR, r11 /* Tag DAR */ - -0: mfspr r10, SPRN_SPRG_SCRATCH0 - mfspr r11, SPRN_SPRG_SCRATCH1 - rfi - patch_site 0b, patch__dtlbmiss_exit_2 - -DTLBMissLinear: - mtcrr11 - /* Set 8M byte page and mark it valid */ - li r11, MD_PS8MEG | MD_SVALID - mtspr SPRN_MD_TWC, r11 - rlwinm r10, r10, 20, 0x0f80/* 8xx supports max 256Mb RAM */ - ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_SH | _PAGE_DIRTY | \ - _PAGE_PRESENT - mtspr SPRN_MD_RPN, r10/* Update TLB entry */ - - li r11, RPN_PATTERN - mtspr SPRN_DAR, r11 /* Tag DAR */ - -0: mfspr r10, SPRN_SPRG_SCRATCH0 - mfspr r11, SPRN_SPRG_SCRATCH1 - rfi - patch_site 0b, patch__dtlbmiss_exit_3 - -#ifndef CONFIG_PIN_TLB_TEXT -ITLBMissLinear: - mtcrr11 - /* Set 8M byte page and mark it valid */ - li r11, MI_PS8MEG | MI_SVALID - mtspr SPRN_MI_TWC, r11 - rlwinm r10, r10, 20, 0x0f80/* 8xx supports max 256Mb RAM */ - ori r10, r10, 0xf0 | MI_SPS16K | _PAGE_SH | _PAGE_DIRTY | \ -
[PATCH v9 16/20] powerpc/8xx: Enable 8M hugepage support with HW assistance
HW assistance naturally supports 8M huge pages without further modifications. Signed-off-by: Christophe Leroy --- arch/powerpc/mm/tlb_nohash.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/powerpc/mm/tlb_nohash.c b/arch/powerpc/mm/tlb_nohash.c index 4f79639e432f..8ad7aab150b7 100644 --- a/arch/powerpc/mm/tlb_nohash.c +++ b/arch/powerpc/mm/tlb_nohash.c @@ -97,6 +97,9 @@ struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT] = { .shift = 14, }, #endif + [MMU_PAGE_8M] = { + .shift = 23, + }, }; #else struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT] = { -- 2.13.3
[PATCH v9 19/20] powerpc/8xx: don't use r12/SPRN_SPRG_SCRATCH2 in TLB Miss handlers
This patch reworks the TLB Miss handler in order to not use r12 register, hence avoiding having to save it into SPRN_SPRG_SCRATCH2. In the DAR Fixup code we can now use SPRN_M_TW, freeing SPRN_SPRG_SCRATCH2. Then SPRN_SPRG_SCRATCH2 may be used for something else in the future. Signed-off-by: Christophe Leroy --- arch/powerpc/kernel/head_8xx.S | 110 ++--- 1 file changed, 49 insertions(+), 61 deletions(-) diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S index 85fb4b8bf6c7..0a4f8a9c85ff 100644 --- a/arch/powerpc/kernel/head_8xx.S +++ b/arch/powerpc/kernel/head_8xx.S @@ -302,90 +302,87 @@ SystemCall: */ #ifdef CONFIG_8xx_CPU15 -#define INVALIDATE_ADJACENT_PAGES_CPU15(tmp, addr) \ - additmp, addr, PAGE_SIZE; \ - tlbie tmp;\ - additmp, addr, -PAGE_SIZE; \ - tlbie tmp +#define INVALIDATE_ADJACENT_PAGES_CPU15(addr) \ + addiaddr, addr, PAGE_SIZE; \ + tlbie addr; \ + addiaddr, addr, -(PAGE_SIZE << 1); \ + tlbie addr; \ + addiaddr, addr, PAGE_SIZE #else -#define INVALIDATE_ADJACENT_PAGES_CPU15(tmp, addr) +#define INVALIDATE_ADJACENT_PAGES_CPU15(addr) #endif InstructionTLBMiss: mtspr SPRN_SPRG_SCRATCH0, r10 +#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP) mtspr SPRN_SPRG_SCRATCH1, r11 -#ifdef ITLB_MISS_KERNEL - mtspr SPRN_SPRG_SCRATCH2, r12 #endif /* If we are faulting a kernel address, we have to use the * kernel page tables. */ mfspr r10, SPRN_SRR0 /* Get effective address of fault */ - INVALIDATE_ADJACENT_PAGES_CPU15(r11, r10) + INVALIDATE_ADJACENT_PAGES_CPU15(r10) mtspr SPRN_MD_EPN, r10 /* Only modules will cause ITLB Misses as we always * pin the first 8MB of kernel memory */ #ifdef ITLB_MISS_KERNEL - mfcrr12 + mfcrr11 #if defined(SIMPLE_KERNEL_ADDRESS) && defined(CONFIG_PIN_TLB_TEXT) - andis. r11, r10, 0x8000/* Address >= 0x8000 */ + cmpicr0, r10, 0 /* Address >= 0x8000 */ #else - rlwinm r11, r10, 16, 0xfff8 - cmpli cr0, r11, PAGE_OFFSET@h + rlwinm r10, r10, 16, 0xfff8 + cmpli cr0, r10, PAGE_OFFSET@h #ifndef CONFIG_PIN_TLB_TEXT /* It is assumed that kernel code fits into the first 8M page */ -0: cmpli cr7, r11, (PAGE_OFFSET + 0x080)@h +0: cmpli cr7, r10, (PAGE_OFFSET + 0x080)@h patch_site 0b, patch__itlbmiss_linmem_top #endif #endif #endif - mfspr r11, SPRN_M_TWB /* Get level 1 table */ + mfspr r10, SPRN_M_TWB /* Get level 1 table */ #ifdef ITLB_MISS_KERNEL #if defined(SIMPLE_KERNEL_ADDRESS) && defined(CONFIG_PIN_TLB_TEXT) - beq+3f + bge+3f #else blt+3f #endif #ifndef CONFIG_PIN_TLB_TEXT blt cr7, ITLBMissLinear #endif - rlwinm r11, r11, 0, 20, 31 - orisr11, r11, (swapper_pg_dir - PAGE_OFFSET)@ha + rlwinm r10, r10, 0, 20, 31 + orisr10, r10, (swapper_pg_dir - PAGE_OFFSET)@ha 3: #endif - lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11)/* Get the level 1 entry */ + lwz r10, (swapper_pg_dir-PAGE_OFFSET)@l(r10)/* Get level 1 entry */ + mtspr SPRN_MI_TWC, r10/* Set segment attributes */ - mtspr SPRN_MD_TWC, r11 + mtspr SPRN_MD_TWC, r10 mfspr r10, SPRN_MD_TWC lwz r10, 0(r10) /* Get the pte */ #ifdef ITLB_MISS_KERNEL - mtcrr12 + mtcrr11 #endif - /* Load the MI_TWC with the attributes for this "segment." */ - mtspr SPRN_MI_TWC, r11/* Set segment attributes */ - #ifdef CONFIG_SWAP rlwinm r11, r10, 32-5, _PAGE_PRESENT and r11, r11, r10 rlwimi r10, r11, 0, _PAGE_PRESENT #endif - li r11, RPN_PATTERN | 0x200 /* The Linux PTE won't go exactly into the MMU TLB. * Software indicator bits 20 and 23 must be clear. * Software indicator bits 22, 24, 25, 26, and 27 must be * set. All other Linux PTE bits control the behavior * of the MMU. */ - rlwimi r11, r10, 4, 0x0400 /* Copy _PAGE_EXEC into bit 21 */ - rlwimi r10, r11, 0, 0x0ff0 /* Set 22, 24-27, clear 20,23 */ + rlwimi r10, r10, 0, 0x0f00 /* Clear bits 20-23 */ + rlwimi r10, r10, 4, 0x0400 /* Copy _PAGE_EXEC into bit 21 */ + ori r10, r10, RPN_PATTERN | 0x200 /* Set 22 and 24-27 */ mtspr SPRN_MI_RPN, r10/* Update TLB entry */ /* Restore registers */ 0: mfspr r10, SPRN_SPRG_SCRATCH0 +#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP) mfspr r11, SPRN_SPRG_SCRATCH1 -#ifdef ITLB_MISS_KERNEL - mfspr r12, SPRN_SPRG_SCRATCH2 #endif rfi patch_site 0b,
Re: use generic DMA mapping code in powerpc V4
On 29 November 2018 at 1:05PM, Christian Zigotzky wrote: On 28 November 2018 at 12:05PM, Michael Ellerman wrote: Christoph Hellwig writes: Any comments? I'd like to at least get the ball moving on the easy bits. Nothing specific yet. I'm a bit worried it might break one of the many old obscure platforms we have that aren't well tested. There's not much we can do about that, but I'll just try and test it on everything I can find. Is the plan that you take these via the dma-mapping tree or that they go via powerpc? cheers Hi All, I compiled a test kernel from the following Git today. http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/powerpc-dma.4 Command: git clone git://git.infradead.org/users/hch/misc.git -b powerpc-dma.4 a Unfortunately I get some DMA error messages and the PASEMI ethernet doesn't work anymore. [ 367.627623] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.627631] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.627639] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.627647] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.627655] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.627686] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.628418] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.628505] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.628592] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.629324] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.629417] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.629495] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 367.629589] pci :00:1a.0: dma_direct_map_page: overflow 0x00026bcb5002+110 of device mask bus mask 0 [ 430.424732]pasemi_mac: rcmdsta error: 0x04ef3001 I tested this kernel with the Nemo board (CPU: PWRficient PA6T-1682M). The PASEMI ethernet works with the RC4 of kernel 4.20. Cheers, Christian Hi All, I tested this kernel on my NXP QorIQ P5020 board. U-Boot loads the dtb file and the kernel and after that the booting stops. This board works with the RC4 of kernel 4.20. Please test this kernel on your NXP and PASEMI boards. Thanks, Christian
[PATCH v1 09/13] powerpc/mmu: add is_strict_kernel_rwx() helper
Add a helper to know whether STRICT_KERNEL_RWX is enabled. This is based on rodata_enabled flag which is defined only when CONFIG_STRICT_KERNEL_RWX is selected. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/mmu.h | 11 +++ arch/powerpc/mm/init_32.c | 4 +--- 2 files changed, 12 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h index eb20eb3b8fb0..6343cbf5b651 100644 --- a/arch/powerpc/include/asm/mmu.h +++ b/arch/powerpc/include/asm/mmu.h @@ -268,6 +268,17 @@ static inline u16 get_mm_addr_key(struct mm_struct *mm, unsigned long address) } #endif /* CONFIG_PPC_MEM_KEYS */ +#ifdef CONFIG_STRICT_KERNEL_RWX +static inline bool strict_kernel_rwx_enabled(void) +{ + return rodata_enabled; +} +#else +static inline bool strict_kernel_rwx_enabled(void) +{ + return false; +} +#endif #endif /* !__ASSEMBLY__ */ /* The kernel use the constants below to index in the page sizes array. diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c index 3e59e5d64b01..ee5a430b9a18 100644 --- a/arch/powerpc/mm/init_32.c +++ b/arch/powerpc/mm/init_32.c @@ -108,12 +108,10 @@ static void __init MMU_setup(void) __map_without_bats = 1; __map_without_ltlbs = 1; } -#ifdef CONFIG_STRICT_KERNEL_RWX - if (rodata_enabled) { + if (strict_kernel_rwx_enabled()) { __map_without_bats = 1; __map_without_ltlbs = 1; } -#endif } /* -- 2.13.3
Re: use generic DMA mapping code in powerpc V4
> > Please don't apply the new DMA mapping code if you don't be sure if it > > works on all supported PowerPC machines. Is the new DMA mapping code > > really necessary? It's not really nice, to rewrote code if the old code > > works perfect. We must not forget, that we work for the end users. Does > > the end user have advantages with this new code? Is it faster? The old > > code works without any problems. > > There is another service provided to the users as well: new code that is > cleaner and simpler which allows easier bug fixes and new features. > Without being familiar with the DMA mapping code I cannot really say if > that's the case here. Yes, the main point is to move all architecturs to common code for the dma direct mapping code. This means we have one code bases that sees bugs fixed and features introduced the same way for everyone.
Re: use generic DMA mapping code in powerpc V4
On Wed, Nov 28, 2018 at 10:05:19PM +1100, Michael Ellerman wrote: > Is the plan that you take these via the dma-mapping tree or that they go > via powerpc? In principle either way is fine with me. If it goes through the powerpc tree we might run into a few minor conflicts with the dma-mapping tree depending on how some of the current discussions go.
Re: use generic DMA mapping code in powerpc V4
On Thu, Nov 29, 2018 at 01:05:23PM +0100, Christian Zigotzky wrote: > I compiled a test kernel from the following Git today. > > http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/powerpc-dma.4 > > Command: git clone git://git.infradead.org/users/hch/misc.git -b > powerpc-dma.4 a > > Unfortunately I get some DMA error messages and the PASEMI ethernet doesn't > work anymore. What kind of machine is this (and your other one)? Can you send me (or point me to) the .config files?