[PATCH] ARC: add hugetlb definitions
From: Pavel Kozlov

Add hugetlb definitions if THP enabled. ARC doesn't support
HugeTLB FS but it supports THP. Some kernel code such as pagemap
uses hugetlb definitions with THP.

This patch fixes ARC build issue (HPAGE_SIZE undeclared error) with
TRANSPARENT_HUGEPAGE enabled.

Signed-off-by: Pavel Kozlov
---
 arch/arc/include/asm/hugepage.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/arc/include/asm/hugepage.h b/arch/arc/include/asm/hugepage.h
index ef8d4166370c..8a2441670a8f 100644
--- a/arch/arc/include/asm/hugepage.h
+++ b/arch/arc/include/asm/hugepage.h
@@ -10,6 +10,13 @@
 #include
 #include
 
+/*
+ * Hugetlb definitions.
+ */
+#define HPAGE_SHIFT PMD_SHIFT
+#define HPAGE_SIZE (_AC(1, UL) << HPAGE_SHIFT)
+#define HPAGE_MASK (~(HPAGE_SIZE - 1))
+
 static inline pte_t pmd_pte(pmd_t pmd)
 {
 	return __pte(pmd_val(pmd));
--
2.25.1

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc
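As a rough illustration of what the three macros compute, here is a minimal standalone sketch. It assumes PMD_SHIFT is 21 (a 2 MB huge page), which is a value picked for the example rather than taken from the patch, and it uses a plain 1UL in place of the kernel's _AC(1, UL). The helper names hpage_align_down() and covers_one_hpage() are hypothetical.

```c
#include <assert.h>

/* Assumption: 2 MB huge pages, i.e. PMD_SHIFT == 21. Illustrative only;
 * the real value comes from the ARC page-table configuration. */
#define PMD_SHIFT   21

/* Same shape as the definitions in the patch, with 1UL standing in
 * for the kernel's _AC(1, UL). */
#define HPAGE_SHIFT PMD_SHIFT
#define HPAGE_SIZE  (1UL << HPAGE_SHIFT)
#define HPAGE_MASK  (~(HPAGE_SIZE - 1))

/* Hypothetical helper: round an address down to the start of its huge page. */
static unsigned long hpage_align_down(unsigned long addr)
{
    return addr & HPAGE_MASK;
}

/* Hypothetical helper in the style of the check in pagemap_scan_thp_entry():
 * does [start, end) span exactly one huge page? */
static int covers_one_hpage(unsigned long start, unsigned long end)
{
    return end == start + HPAGE_SIZE;
}
```

The second helper mirrors the expression from fs/proc/task_mmu.c that fails to compile when HPAGE_SIZE is missing.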
Re: [PATCH 0/5] ARC updates
Hi Vineet,

> A pile of accrued changes, compile tested only.
> Please test.
> Thx,
> -Vineet

I'm testing your patches. At first glance everything is fine, with no obvious regressions.

However, I've found an ARC build issue with the latest upstream source (tag: v6.7-rc5). It is not related to your updates. The build issue occurs when the CONFIG_TRANSPARENT_HUGEPAGE option is enabled:

| ../fs/proc/task_mmu.c: In function ‘pagemap_scan_thp_entry’:
| ../fs/proc/task_mmu.c:2115:28: error: ‘HPAGE_SIZE’ undeclared (first use in this function); did you mean ‘PAGE_SIZE’?
|  2115 |   if (end != start + HPAGE_SIZE) {
|       |                      ^~~~~~~~~~
|       |                      PAGE_SIZE
| ../fs/proc/task_mmu.c:2115:28: note: each undeclared identifier is reported only once for each function it appears in
| make[4]: *** [../scripts/Makefile.build:243: fs/proc/task_mmu.o] Error 1
| make[3]: *** [../scripts/Makefile.build:480: fs/proc] Error 2

Maybe you could address this issue in your updates too?

Thanks
Pavel
Re: [PATCH 00/20] ARC updates
> Hi,
>
> This is a pile of misc improvements/updates sitting in one of my local
> trees.
> Given the recent warning fix, we could also push them out.
> @Alexey, @Shahab: care to give these a spin on hsdk (and test ARC700
> build/boot on nSIM if possible).
>
> Thx,
> -Vineet

The entire "ARC updates" patch series (including all subsequent patches with updates) has been tested on the HSDK board for ARCv2 using the glibc and kselftest test frameworks, and on nSIM for ARCompact using the uclibc tests. No regressions found.

Tested-by: Pavel Kozlov

Regards,
Pavel
Re: [PATCH 16/20] ARC: entry: Add more common chores to EXCEPTION_PROLOGUE
Hi Vineet,

> Subject: [PATCH 16/20] ARC: entry: Add more common chores to
> EXCEPTION_PROLOGUE
>
> The high level structure of most ARC exception handlers is
> 1. save regfile with EXCEPTION_PROLOGUE
> 2. setup r0: EFA (not part of pt_regs)
> 3. setup r1: pointer to pt_regs (SP)
> 4. drop down to pure kernel mode (from exception)
> 5. call the Linux "C" handler
>
> Remove the boiler plate code by moving #2, #3, #4 into #1.
>
> The exceptions to most exceptions are syscall Trap and Machine check
> which don't do some of above for various reasons, so call a newly
> introduced variant EXCEPTION_PROLOGUE_KEEP_AE (same as original
> EXCEPTION_PROLOGUE)

I'm observing an ARC700 (nSIM) system freeze after this patch.

...
f000.serial: ttyS0 at MMIO 0xf000 (irq = 24, base_baud = 3125000) is a 16550A
printk: console [ttyS0] enabled
printk: console [ttyS0] enabled
printk: bootconsole [uart8250] disabled
printk: bootconsole [uart8250] disabled
NET: Registered PF_PACKET protocol family
NET: Registered PF_KEY protocol family
clk: Disabling unused clocks
Freeing unused kernel image (initmem) memory: 2856K
This architecture does not have kernel memory protection.
Run /init as init process

> @@ -128,11 +123,6 @@ ENTRY(EV_PrivilegeV)
>
>  	EXCEPTION_PROLOGUE
>
> -	lr r0, [efa]
> -	mov r1, sp
> -
> -	FAKE_RET_FROM_EXCPN
> -
>  	bl do_privilege_fault
>  	b ret_from_exception

The same update is also required for the call_do_page_fault wrapper for ARCompact.

Regards,
Pavel
Re: [PATCH 20/20] ARC: pt_regs: create seperate type for ecr
Hi Vineet,

I'm testing your updates and ran into the same build issue reported by the build robot.
http://lists.infradead.org/pipermail/linux-snps-arc/2023-August/007522.html

> #ifdef CONFIG_ISA_ARCOMPACT
> @@ -40,18 +51,7 @@ struct pt_regs {
>  	 *Last word used by Linux for extra state mgmt (syscall-restart)
>  	 * For interrupts, use artificial ECR values to note current prio-level
>  	 */
> -	union {
> -		struct {
> -#ifdef CONFIG_CPU_BIG_ENDIAN
> -			unsigned long state:8, ecr_vec:8,
> -				      ecr_cause:8, ecr_param:8;
> -#else
> -			unsigned long ecr_param:8, ecr_cause:8,
> -				      ecr_vec:8, state:8;
> -#endif
> -		};
> -		unsigned long event;
> -	};
> +	ecr_reg ecr;
>  }
>
>  #define MAX_REG_OFFSET offsetof(struct pt_regs, event)

This change causes a build issue for ARC700, as the event field has been removed and the MAX_REG_OFFSET macro hasn't been updated.

Regards,
Pavel
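One possible way to resolve this, sketched below, is to make the macro name the new last member instead of the removed event field. The types here are hypothetical, heavily trimmed stand-ins (ecr_reg_sketch, pt_regs_sketch, MAX_REG_OFFSET_SKETCH) and not the actual kernel definitions; the fix that was eventually applied may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the new ecr_reg type
 * (little-endian field order only, unsigned int for portability). */
typedef union {
    struct {
        unsigned int param:8, cause:8, vec:8, state:8;
    };
    unsigned int full;
} ecr_reg_sketch;

/* Hypothetical, heavily trimmed stand-in for the ARCompact pt_regs. */
struct pt_regs_sketch {
    unsigned long bta;
    unsigned long sp;
    ecr_reg_sketch ecr;   /* replaces the old anonymous union with 'event' */
};

/* 'event' no longer exists, so the macro has to refer to 'ecr'. */
#define MAX_REG_OFFSET_SKETCH offsetof(struct pt_regs_sketch, ecr)
```

With the old spelling, offsetof(struct pt_regs, event), the compiler rejects the macro at its first use because the member is gone.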
[PATCH] ARC: avoid unwanted gcc optimizations in atomic operations
From: Pavel Kozlov

Notify the compiler about write operations and prevent unwanted
optimizations. Add the "memory" clobber to the clobber list.

An obvious problem with unwanted compiler optimizations appeared after
the cpumask optimization commit 596ff4a09b89 ("cpumask: re-introduce
constant-sized cpumask optimizations").

After this commit SMP kernels for ARC no longer boot because of a
failed assert in the percpu allocator initialization routine:

percpu: BUG: failure at mm/percpu.c:2981/pcpu_build_alloc_info()!

The write operation performed by the scond instruction in the atomic
inline asm code is not properly communicated to the compiler. As a
result, the compiler cannot correctly optimize the nested loop that
runs through the cpumask in the pcpu_build_alloc_info() function.

Add the "memory" clobber to fix this.

Link: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/135
Cc: # v6.3+
Signed-off-by: Pavel Kozlov
---
 arch/arc/include/asm/atomic-llsc.h    | 6 +++---
 arch/arc/include/asm/atomic64-arcv2.h | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arc/include/asm/atomic-llsc.h b/arch/arc/include/asm/atomic-llsc.h
index 1b0ffaeee16d..5258cb81a16b 100644
--- a/arch/arc/include/asm/atomic-llsc.h
+++ b/arch/arc/include/asm/atomic-llsc.h
@@ -18,7 +18,7 @@ static inline void arch_atomic_##op(int i, atomic_t *v) \
 	: [val] "=&r" (val) /* Early clobber to prevent reg reuse */ \
 	: [ctr] "r" (&v->counter), /* Not "m": llock only supports reg direct addr mode */ \
 	  [i] "ir" (i) \
-	: "cc"); \
+	: "cc", "memory"); \
 } \
 
 #define ATOMIC_OP_RETURN(op, asm_op) \
@@ -34,7 +34,7 @@ static inline int arch_atomic_##op##_return_relaxed(int i, atomic_t *v) \
 	: [val] "=&r" (val) \
 	: [ctr] "r" (&v->counter), \
 	  [i] "ir" (i) \
-	: "cc"); \
+	: "cc", "memory"); \
 \
 	return val; \
 }
@@ -56,7 +56,7 @@ static inline int arch_atomic_fetch_##op##_relaxed(int i, atomic_t *v) \
 	  [orig] "=&r" (orig) \
 	: [ctr] "r" (&v->counter), \
 	  [i] "ir" (i) \
-	: "cc"); \
+	: "cc", "memory"); \
 \
 	return orig; \
 }
diff --git a/arch/arc/include/asm/atomic64-arcv2.h b/arch/arc/include/asm/atomic64-arcv2.h
index 6b6db981967a..9b5791b85471 100644
--- a/arch/arc/include/asm/atomic64-arcv2.h
+++ b/arch/arc/include/asm/atomic64-arcv2.h
@@ -60,7 +60,7 @@ static inline void arch_atomic64_##op(s64 a, atomic64_t *v) \
 	"	bnz 1b \n" \
 	: "=&r"(val) \
 	: "r"(&v->counter), "ir"(a) \
-	: "cc"); \
+	: "cc", "memory"); \
 } \
 
 #define ATOMIC64_OP_RETURN(op, op1, op2) \
@@ -77,7 +77,7 @@ static inline s64 arch_atomic64_##op##_return_relaxed(s64 a, atomic64_t *v) \
 	"	bnz 1b \n" \
 	: [val] "=&r"(val) \
 	: "r"(&v->counter), "ir"(a) \
-	: "cc"); /* memory clobber comes from smp_mb() */ \
+	: "cc", "memory"); \
 \
 	return val; \
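The effect of the "memory" clobber can be illustrated outside ARC with an empty extended-asm statement. The sketch below is not the llock/scond sequence from the patch; it assumes a GCC- or Clang-compatible compiler, and the function names are hypothetical. It only shows where the clobber goes and what it tells the compiler.

```c
#include <assert.h>

/*
 * Empty asm statement that takes a pointer as input. Without the "memory"
 * clobber the compiler is allowed to assume *p is unchanged across it and
 * may keep a previously loaded value in a register.
 */
static inline void asm_no_clobber(int *p)
{
    __asm__ __volatile__("" : : "r"(p) : "cc");
}

/*
 * Same statement with "memory" added, as in the patch: the compiler must
 * assume memory may have been modified and reload values afterwards.
 */
static inline void asm_with_clobber(int *p)
{
    __asm__ __volatile__("" : : "r"(p) : "cc", "memory");
}

/* Reads *p before and after the asm; the second read is a real reload. */
static int read_twice(int *p)
{
    int first = *p;
    asm_with_clobber(p);
    int second = *p;
    return first == second;
}
```

Because the asm body is empty, both variants behave identically at run time here; the difference is only in what the optimizer is permitted to assume, which is the property the ARC atomics were missing.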
[PATCH 2/2] ARC: run child from the separate start block in __clone
From: Pavel Kozlov

For a better debugging experience, use a separate code block with extra
cfi_* directives to run the child (same as in __clone3).
---
 sysdeps/unix/sysv/linux/arc/clone.S | 40 ++---
 1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/sysdeps/unix/sysv/linux/arc/clone.S b/sysdeps/unix/sysv/linux/arc/clone.S
index 766649625658..0029aaeb8170 100644
--- a/sysdeps/unix/sysv/linux/arc/clone.S
+++ b/sysdeps/unix/sysv/linux/arc/clone.S
@@ -20,9 +20,6 @@
 #include
 #define _ERRNO_H 1
 #include
-#include
-
-#define CLONE_SETTLS 0x0008
 
 /* int clone(int (*fn)(void *), void *child_stack, int flags, void *arg, ...
@@ -63,19 +60,9 @@ ENTRY (__clone)
 	ARC_TRAP_INSN
 
 	cmp	r0, 0		/* return code : 0 new process, !0 parent.  */
+	beq	thread_start_clone
 	blt	L (__sys_err2)	/* < 0 (signed) error.  */
-	jnz	[blink]		/* Parent returns.  */
-
-	/* child jumps off to @fn with @arg as argument
-	   TP register already set by kernel.  */
-	jl.d	[r10]
-	mov	r0, r11
-
-	/* exit() with result from @fn (already in r0).  */
-	mov	r8, __NR_exit
-	ARC_TRAP_INSN
-	/* In case it ever came back.  */
-	flag	1
+	j	[blink]		/* Parent returns.  */
 
 L (__sys_err):
 	mov	r0, -EINVAL
@@ -89,5 +76,28 @@ L (__sys_err2):
 	   position independent.  */
 	b	__syscall_error
 PSEUDO_END (__clone)
+
+
+	.align 4
+	.type thread_start_clone, %function
+thread_start_clone:
+	cfi_startproc
+	/* Terminate call stack by noting ra is undefined.  */
+	cfi_undefined (blink)
+
+	/* Child jumps off to @fn with @arg as argument.  */
+	jl.d	[r10]
+	mov	r0, r11
+
+	/* exit() with result from @fn (already in r0).  */
+	mov	r8, __NR_exit
+	ARC_TRAP_INSN
+
+	/* In case it ever came back.  */
+	flag	1
+
+	cfi_endproc
+	.size thread_start_clone, .-thread_start_clone
+
 libc_hidden_def (__clone)
 weak_alias (__clone, clone)
--
2.25.1
[PATCH 1/2] ARC: Add the clone3 wrapper
From: Pavel Kozlov

Use the clone3 wrapper on ARC. It doesn't care about stack alignment.
All callers should provide an aligned stack.
It follows the internal signature:

extern int clone3 (struct clone_args *__cl_args, size_t __size,
                   int (*__func) (void *__arg), void *__arg);
---
Checked on arc-linux-gnu.
The previously observed tst-misalign-clone-internal test failure was caused by my use of an outdated master branch.
The full testsuite runs without regressions. However, I also see a failure of the new tst-spawn7 test, as already reported at [1].

[1] https://sourceware.org/pipermail/libc-alpha/2023-February/145937.html

 sysdeps/unix/sysv/linux/arc/clone3.S | 90
 sysdeps/unix/sysv/linux/arc/sysdep.h | 2 +
 2 files changed, 92 insertions(+)
 create mode 100644 sysdeps/unix/sysv/linux/arc/clone3.S

diff --git a/sysdeps/unix/sysv/linux/arc/clone3.S b/sysdeps/unix/sysv/linux/arc/clone3.S
new file mode 100644
index ..87a8272a3977
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/arc/clone3.S
@@ -0,0 +1,90 @@
+/* The clone3 syscall wrapper.  Linux/arc version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include
+#define _ERRNO_H 1
+#include
+
+/* The userland implementation is:
+   int clone3 (struct clone_args *cl_args, size_t size,
+               int (*func)(void *arg), void *arg);
+
+   the kernel entry is:
+   int clone3 (struct clone_args *cl_args, size_t size);
+
+   The parameters are passed in registers from userland:
+   r0: cl_args
+   r1: size
+   r2: func
+   r3: arg  */
+
+ENTRY(__clone3)
+
+	/* Save args for the child.  */
+	mov	r10, r0		/* cl_args  */
+	mov	r11, r2		/* func  */
+	mov	r12, r3		/* args  */
+
+	/* Sanity check args.  */
+	breq	r10, 0, L (__sys_err)	/* No NULL cl_args pointer.  */
+	breq	r11, 0, L (__sys_err)	/* No NULL function pointer.  */
+
+	/* Do the system call, the kernel expects:
+	   r8: system call number
+	   r0: cl_args
+	   r1: size  */
+	mov	r0, r10
+	mov	r8, __NR_clone3
+	ARC_TRAP_INSN
+
+	cmp	r0, 0
+	beq	thread_start_clone3	/* Child returns.  */
+	blt	L (__sys_err2)
+	j	[blink]			/* Parent returns.  */
+
+L (__sys_err):
+	mov	r0, -EINVAL
+L (__sys_err2):
+	b	__syscall_error
+PSEUDO_END (__clone3)
+
+
+	.align 4
+	.type thread_start_clone3, %function
+thread_start_clone3:
+	cfi_startproc
+	/* Terminate call stack by noting ra is undefined.  */
+	cfi_undefined (blink)
+
+	/* Child jumps off to @fn with @arg as argument.  */
+	jl.d	[r11]
+	mov	r0, r12
+
+	/* exit() with result from @fn (already in r0).  */
+	mov	r8, __NR_exit
+	ARC_TRAP_INSN
+
+	/* In case it ever came back.  */
+	flag	1
+
+	cfi_endproc
+	.size thread_start_clone3, .-thread_start_clone3
+
+libc_hidden_def (__clone3)
+weak_alias (__clone3, clone3)
diff --git a/sysdeps/unix/sysv/linux/arc/sysdep.h b/sysdeps/unix/sysv/linux/arc/sysdep.h
index dd6fe73445f9..88dc1dff017f 100644
--- a/sysdeps/unix/sysv/linux/arc/sysdep.h
+++ b/sysdeps/unix/sysv/linux/arc/sysdep.h
@@ -141,6 +141,8 @@ hidden_proto (__syscall_error)
 
 # define ARC_TRAP_INSN "trap_s 0 \n\t"
 
+# define HAVE_CLONE3_WRAPPER 1
+
 # undef INTERNAL_SYSCALL_NCS
 # define INTERNAL_SYSCALL_NCS(number, nr_args, args...) \
   ({ \
--
2.25.1
Pavel new maintainer of ARC port
Hi all,

I'm excited to introduce myself and become a part of the community. I'm a software engineer at Synopsys and a member of a team working to enhance support for ARC CPUs in the GNU/Linux ecosystem.

I would appreciate any guidance or advice on how I can get involved and contribute to the community. I'm always looking for opportunities to connect with other developers and learn from their experiences.

As a final step of the setup, I've updated the wiki and added myself as the ARC maintainer. Also, thanks, Carlos, for the help with my setup.

Best regards,
Pavel Kozlov
[PATCH] ARC: fix compile time assert fail for the config with PAE40 and 4K pages
From: Pavel Kozlov

Add support for the configuration with 4K Page Size and ARC_PAE40
enabled. Set the two-level Page Table to 10:9:12, since with PAE40 a 4k
page can hold only 512 entries (with PAE40 the size of a PTE entry
increases from 4 to 8 bytes).
In this configuration the Page Table can describe only a 31-bit (2Gb)
virtual space, but this is not a problem as the ARC MMUv4 supports only
2Gb of virtual address space.

This patch doesn't affect other configurations.

The patch fixes this compile-time assert failure:

include/linux/compiler_types.h:328:45: error: call to '__compiletime_assert_288'
declared with attribute error: BUILD_BUG_ON failed: (PTRS_PER_PTE * sizeof(pte_t)) > PAGE_SIZE

Reported-by: kernel test robot
Signed-off-by: Pavel Kozlov
---
Added the same ARC_VADDR_BITS macro name as used in the ARCv3 port.

The configuration with 4K Page Size and PAE enabled was tested on nSIM: load, allocation/free of all memory (low and high), and user space tests from the glibc test suite (malloc, string).

 arch/arc/include/asm/pgtable-levels.h | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/arch/arc/include/asm/pgtable-levels.h b/arch/arc/include/asm/pgtable-levels.h
index ef68758b69f7..3b6afee9c272 100644
--- a/arch/arc/include/asm/pgtable-levels.h
+++ b/arch/arc/include/asm/pgtable-levels.h
@@ -42,12 +42,22 @@
  * (so 4K page can only have 1K entries: or 10 bits)
  */
 #ifdef CONFIG_ARC_PAGE_SIZE_4K
+#ifdef CONFIG_ARC_HAS_PAE40
+/*
+ * For PAE40 and 4K page size set 10:9:12 Page Table
+ * (as with PAE40 4k page can only have 512 entries)
+ * Page Table can describe only 31-bit (2Gb) virtual space
+ */
+#define PGDIR_SHIFT 21
+#define ARC_VADDR_BITS 31
+#else
 #define PGDIR_SHIFT 22
+#endif /* CONFIG_ARC_HAS_PAE40 */
 #else
 #define PGDIR_SHIFT 21
-#endif
+#endif /* CONFIG_ARC_PAGE_SIZE_4K */
 
-#endif
+#endif /* CONFIG_ARC_HUGEPAGE_16M */
 
 #else /* CONFIG_PGTABLE_LEVELS != 2 */
 
@@ -67,9 +77,13 @@
 
 #endif /* CONFIG_PGTABLE_LEVELS */
 
+#ifndef ARC_VADDR_BITS
+#define ARC_VADDR_BITS 32
+#endif
+
 #define PGDIR_SIZE BIT(PGDIR_SHIFT)
 #define PGDIR_MASK (~(PGDIR_SIZE - 1))
-#define PTRS_PER_PGD BIT(32 - PGDIR_SHIFT)
+#define PTRS_PER_PGD BIT(ARC_VADDR_BITS - PGDIR_SHIFT)
 
 #if CONFIG_PGTABLE_LEVELS > 3
 #define PUD_SIZE BIT(PUD_SHIFT)
--
2.25.1
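The arithmetic behind the 10:9:12 split can be checked with a small standalone sketch. The shift values and the 8-byte PTE size come from the patch description; the macro and function names (BIT_SKETCH, pte_table_bytes, and the *_PAE40 constants) are made up for this example and do not exist in the kernel.

```c
#include <assert.h>

#define BIT_SKETCH(n)      (1UL << (n))

/* Values from the patch for 4K pages with PAE40. */
#define PAGE_SHIFT_4K      12
#define PGDIR_SHIFT_PAE40  21
#define VADDR_BITS_PAE40   31
#define PTE_SIZE_PAE40     8UL   /* PTE grows from 4 to 8 bytes with PAE40 */

/* Two-level table: the PTE index covers the bits between the page offset
 * and the PGD index. */
#define PTRS_PER_PTE_PAE40 BIT_SKETCH(PGDIR_SHIFT_PAE40 - PAGE_SHIFT_4K)
#define PTRS_PER_PGD_PAE40 BIT_SKETCH(VADDR_BITS_PAE40 - PGDIR_SHIFT_PAE40)

/* The condition behind the BUILD_BUG_ON: one PTE table must fit in a page. */
_Static_assert(PTRS_PER_PTE_PAE40 * PTE_SIZE_PAE40 <= BIT_SKETCH(PAGE_SHIFT_4K),
               "PTE table must fit in one page");

/* Size in bytes of one PTE table for a given PGDIR_SHIFT and PTE size. */
static unsigned long pte_table_bytes(unsigned int pgdir_shift,
                                     unsigned long pte_size)
{
    return BIT_SKETCH(pgdir_shift - PAGE_SHIFT_4K) * pte_size;
}
```

With the previous PGDIR_SHIFT of 22, pte_table_bytes(22, 8) gives 8192 bytes, twice the 4096-byte page, which is the overflow the compile-time assert reported.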
Re: Pavel steping up for ARC glibc maintenance
Hi Carlos, Hi all,

Thanks for your greeting. I'm glad to be here.

Can you please add me to the wiki EditorGroup?
My wiki account: PavelKozlov
I have some test results that I want to post for the coming release.

I intend to help maintain the ARC glibc port, so feel free to ask if any code, HW verification or review is required for anything concerning ARC.

I've requested the sourceware.org account. Looking forward to the next steps.

Thank you,
Pavel

From: Carlos O'Donell
Sent: Wednesday, February 1, 2023 12:33 AM
To: Vineet Gupta
Cc: arcml; Pavel Kozlov; GNU C Library
Subject: Re: Pavel steping up for ARC glibc maintenance

On 1/30/23 22:07, Vineet Gupta wrote:
> Hi Carlos,
>
> I'd like to introduce Pavel who intends to step up to maintain ARC glibc port
> and do periodic wiki updates and such.
> I've pointed him to links [1] and [2].

Sounds great! Welcome Pavel!

> To begin with can he be granted wiki edit access and subsequently also
> approve his write access to glibc sourceware repo.

We can absolutely help with the next steps.

Please follow [2] and if you get stuck anywhere just reach out on IRC or by email.

> Thx,
> -Vineet
>
>
> [1]
> https://urldefense.com/v3/__https://sourceware.org/glibc/wiki/MAINTAINERS*Becoming_a_maintainer_.28developer.29__;Iw!!A4F2R9G_pg!aaKwbgpslcHvrrdypX2BrPwxZt7I0x_g_mYJUHIfZhZk-D9UuJWQsmzd6SqrQWU9KTROPZ8zqEyfPU98J1EM$
>
> [2]
> https://urldefense.com/v3/__https://sourceware.org/glibc/wiki/MAINTAINERS*AccountsOnSourceware__;Iw!!A4F2R9G_pg!aaKwbgpslcHvrrdypX2BrPwxZt7I0x_g_mYJUHIfZhZk-D9UuJWQsmzd6SqrQWU9KTROPZ8zqEyfPTXW0Rz_$
>
>

--
Cheers,
Carlos.
Re: [PATCH] ARC: align child stack in clone
> LGTM, although I can't really test it since the Synopsys qemu tree does not
> have qemu-user support [1].

Thank you for the review.
I've checked this patch on QEMU with a Linux system and on the ARC HSDK board.

Also, it is possible to run ARC binaries on QEMU [1] in user mode. Currently not all instructions are supported in ARC QEMU, so it is recommended to build binaries with the extra -mcpu=archs compiler option to restrict the instruction set used.
Maybe this [2] will also be useful.

It would be great to have this patch in the upcoming 2.37.

[1] https://github.com/foss-for-synopsys-dwc-arc-processors/qemu
[2] https://github.com/foss-for-synopsys-dwc-arc-processors/glibc/wiki/Glibc-test-suite-for-a-target-without-native-GNU-toolchain#glibc-test-suite-execution-with-qemu-user-mode-emulation

--
Pavel
Re: [PATCH] ARC:fpu: add extra capability check before use of sqrt and fma builtins
> This is wrong, the sqrt macros do not belong in the fma switch file.

Thank you for the review and for pointing this out.
I was inattentive when moving changes from the branch I had.
This has been fixed in v2 of the patch [1].

I've manually checked (by reviewing the objdump -d output) that the expected code for libm is now generated in both cases: when the compiler provides support for the extra instructions and defines the macros __ARC_FPU_DP_DIV__, __ARC_FPU_SP_DIV__, __ARC_FPU_DP_FMA__ and __ARC_FPU_SP_FMA__, and when it does not.

[1] http://lists.infradead.org/pipermail/linux-snps-arc/2023-January/006771.html

--
Pavel
[PATCH v2] ARC:fpu: add extra capability check before use of sqrt and fma builtins
From: Pavel Kozlov

Add extra check for compiler definitions to ensure that compiler provides
sqrt and fma hw fpu instructions else use software implementation.

As divide/sqrt and FMA hw support from CPU side is optional,
the compiler can be configured by options to generate hw FPU instructions,
but without use of FDDIV, FDSQRT, FSDIV, FSSQRT, FDMADD and FSMADD
instructions. In this case __builtin_sqrt and __builtin_sqrtf provided by
compiler can't be used inside the glibc code, as these builtins are used
in implementations of sqrt() and sqrtf() functions but at the same time
these builtins unfold to sqrt() and sqrtf(). So it is possible to receive
code like that:

0001c4b4 <__ieee754_sqrtf>:
   1c4b4:	0001		b 0	;1c4b4 <__ieee754_sqrtf>

The same is also true for __builtin_fma and __builtin_fmaf.
---
Changes in v2:
- Fixed macros definitions for FMA

 sysdeps/arc/fpu/math-use-builtins-fma.h  | 14 --
 sysdeps/arc/fpu/math-use-builtins-sqrt.h | 14 --
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/sysdeps/arc/fpu/math-use-builtins-fma.h b/sysdeps/arc/fpu/math-use-builtins-fma.h
index eede75aa41be..2acd8113ce2c 100644
--- a/sysdeps/arc/fpu/math-use-builtins-fma.h
+++ b/sysdeps/arc/fpu/math-use-builtins-fma.h
@@ -1,4 +1,14 @@
-#define USE_FMA_BUILTIN 1
-#define USE_FMAF_BUILTIN 1
+#if defined __ARC_FPU_DP_FMA__
+# define USE_FMA_BUILTIN 1
+#else
+# define USE_FMA_BUILTIN 0
+#endif
+
+#if defined __ARC_FPU_SP_FMA__
+# define USE_FMAF_BUILTIN 1
+#else
+# define USE_FMAF_BUILTIN 0
+#endif
+
 #define USE_FMAL_BUILTIN 0
 #define USE_FMAF128_BUILTIN 0
diff --git a/sysdeps/arc/fpu/math-use-builtins-sqrt.h b/sysdeps/arc/fpu/math-use-builtins-sqrt.h
index e94c915ba66a..a449bc609295 100644
--- a/sysdeps/arc/fpu/math-use-builtins-sqrt.h
+++ b/sysdeps/arc/fpu/math-use-builtins-sqrt.h
@@ -1,4 +1,14 @@
-#define USE_SQRT_BUILTIN 1
-#define USE_SQRTF_BUILTIN 1
+#if defined __ARC_FPU_DP_DIV__
+# define USE_SQRT_BUILTIN 1
+#else
+# define USE_SQRT_BUILTIN 0
+#endif
+
+#if defined __ARC_FPU_SP_DIV__
+# define USE_SQRTF_BUILTIN 1
+#else
+# define USE_SQRTF_BUILTIN 0
+#endif
+
 #define USE_SQRTL_BUILTIN 0
 #define USE_SQRTF128_BUILTIN 0
--
2.25.1
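The capability-check pattern from the patch can be sketched in a standalone file as follows. The USE_SQRT_BUILTIN switch is copied from the patch; sqrt_sketch() and soft_sqrt_sketch() are hypothetical names, and the Newton iteration is only a placeholder for a software path. glibc's actual software sqrt is a different, fully IEEE-conformant implementation.

```c
#include <assert.h>

/* Same switch as in the patch: use the builtin only if the compiler
 * advertises hardware divide/sqrt support for double precision. */
#if defined __ARC_FPU_DP_DIV__
# define USE_SQRT_BUILTIN 1
#else
# define USE_SQRT_BUILTIN 0
#endif

/* Hypothetical software fallback (Newton's method). Simplification:
 * zero and negative inputs return 0, NaN and infinity are not handled. */
static double soft_sqrt_sketch(double x)
{
    if (x <= 0.0)
        return 0.0;
    double guess = x > 1.0 ? x : 1.0;   /* start at or above the root */
    for (int i = 0; i < 2000; i++) {
        double next = 0.5 * (guess + x / guess);
        if (next == guess)
            break;                      /* converged */
        guess = next;
    }
    return guess;
}

/* Selects the hardware-backed builtin or the software path at compile time. */
static double sqrt_sketch(double x)
{
#if USE_SQRT_BUILTIN
    return __builtin_sqrt(x);
#else
    return soft_sqrt_sketch(x);
#endif
}
```

On a host compiler that does not define __ARC_FPU_DP_DIV__, the software path is selected, which avoids the situation described above where the builtin expands back into a call to the very function being implemented.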
[PATCH] ARC:fpu: add extra capability check before use of sqrt and fma builtins
From: Pavel Kozlov

Add extra check for compiler definitions to ensure that compiler provides
sqrt and fma hw fpu instructions else use software implementation.

As divide/sqrt and FMA hw support from CPU side is optional,
the compiler can be configured by options to generate hw FPU instructions,
but without use of FDDIV, FDSQRT, FSDIV, FSSQRT, FDMADD and FSMADD
instructions. In this case __builtin_sqrt and __builtin_sqrtf provided by
compiler can't be used inside the glibc code, as these builtins are used
in implementations of sqrt() and sqrtf() functions but at the same time
these builtins unfold to sqrt() and sqrtf(). So it is possible to receive
code like that:

0001c4b4 <__ieee754_sqrtf>:
   1c4b4:	0001		b 0	;1c4b4 <__ieee754_sqrtf>

The same is also true for __builtin_fma and __builtin_fmaf.
---
 sysdeps/arc/fpu/math-use-builtins-fma.h  | 14 --
 sysdeps/arc/fpu/math-use-builtins-sqrt.h | 14 --
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/sysdeps/arc/fpu/math-use-builtins-fma.h b/sysdeps/arc/fpu/math-use-builtins-fma.h
index eede75aa41be..082badf48201 100644
--- a/sysdeps/arc/fpu/math-use-builtins-fma.h
+++ b/sysdeps/arc/fpu/math-use-builtins-fma.h
@@ -1,4 +1,14 @@
-#define USE_FMA_BUILTIN 1
-#define USE_FMAF_BUILTIN 1
+#if defined __ARC_FPU_DP_DIV__
+# define USE_SQRT_BUILTIN 1
+#else
+# define USE_SQRT_BUILTIN 0
+#endif
+
+#if defined __ARC_FPU_SP_DIV__
+# define USE_SQRTF_BUILTIN 1
+#else
+# define USE_SQRTF_BUILTIN 0
+#endif
+
 #define USE_FMAL_BUILTIN 0
 #define USE_FMAF128_BUILTIN 0
diff --git a/sysdeps/arc/fpu/math-use-builtins-sqrt.h b/sysdeps/arc/fpu/math-use-builtins-sqrt.h
index e94c915ba66a..a449bc609295 100644
--- a/sysdeps/arc/fpu/math-use-builtins-sqrt.h
+++ b/sysdeps/arc/fpu/math-use-builtins-sqrt.h
@@ -1,4 +1,14 @@
-#define USE_SQRT_BUILTIN 1
-#define USE_SQRTF_BUILTIN 1
+#if defined __ARC_FPU_DP_DIV__
+# define USE_SQRT_BUILTIN 1
+#else
+# define USE_SQRT_BUILTIN 0
+#endif
+
+#if defined __ARC_FPU_SP_DIV__
+# define USE_SQRTF_BUILTIN 1
+#else
+# define USE_SQRTF_BUILTIN 0
+#endif
+
 #define USE_SQRTL_BUILTIN 0
 #define USE_SQRTF128_BUILTIN 0
--
2.25.1
[PATCH] ARC: align child stack in clone
From: Pavel Kozlov

The ARCv2 ABI requires 4-byte stack pointer alignment. Don't allow an
unaligned child stack to be used in clone. As the stack grows down,
align it down.

This was pointed out by the misc/tst-misalign-clone-internal and
misc/tst-misalign-clone tests. Aligning the stack fixes these test
failures.
---
 sysdeps/unix/sysv/linux/arc/clone.S | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sysdeps/unix/sysv/linux/arc/clone.S b/sysdeps/unix/sysv/linux/arc/clone.S
index bd924890844a..f32c83f17a65 100644
--- a/sysdeps/unix/sysv/linux/arc/clone.S
+++ b/sysdeps/unix/sysv/linux/arc/clone.S
@@ -41,6 +41,7 @@ ENTRY (__clone)
 	cmp	r0, 0		/* @fn can't be NULL.  */
+	and	r1,r1,-4	/* @child_stack be 4 bytes aligned per ABI.  */
 	cmp.ne	r1, 0		/* @child_stack can't be NULL.  */
 	bz	L (__sys_err)
--
2.25.1
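In C terms, the added `and r1,r1,-4` instruction corresponds to clearing the two low bits of the stack pointer. A minimal sketch, with a hypothetical helper name align_stack_down4():

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round a child stack pointer down to a 4-byte boundary, mirroring
 * "and r1,r1,-4" (-4 is ~3 in two's complement). */
static void *align_stack_down4(void *child_stack)
{
    uintptr_t sp = (uintptr_t)child_stack;
    return (void *)(sp & ~(uintptr_t)3);
}
```

Because the alignment is applied before the NULL check in the patch, a bogus stack value between 1 and 3 is rounded down to 0 and then rejected by the existing `cmp.ne r1, 0` test.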
Re: [PATCH] ARC: mm: fix leakage of memory allocated for PTE
> Good catch. Curious how did you find it. KMEMCHECK or some such or just oom.

I ran the glibc tests with a 5.16 kernel and got many oom-killer messages and even a kernel panic because of a lack of memory. These symptoms pointed to an issue. kmemleak didn't show anything, and I didn't try kmemcheck. I prepared a small test and used git bisect to find the commit to blame.

>> Cc: # 4.15.x
> You meant 5.15.x

Yes, you're right, that's my typo.

> Added to for-curr.

Thanks!

Regards,
Pavel
[PATCH] ARC: mm: fix leakage of memory allocated for PTE
From: Pavel Kozlov

Since commit d9820ff ("ARC: mm: switch pgtable_t back to struct page *")
a memory leak occurs. Memory allocated for page table entries is not
released during process termination. This issue can be reproduced by
a small program that allocates a large amount of memory. After several
runs, you'll see that the amount of free memory has decreased, and it
will continue to decrease after each run. All ARC CPUs are affected by
this issue. The issue was introduced in kernel v5.15-rc1.

As described in commit d9820ff, after switching pgtable_t back to
struct page *, a pointer to "struct page" and the appropriate functions
are used to allocate and free a memory page for PTEs, but the
pmd_pgtable macro wasn't changed and still returns the direct virtual
address from the PMD (PGD) entry. This address is then passed as a
parameter to __pte_free(), and as a result this function can't release
the memory page allocated for PTEs.

Fix this issue by changing the pmd_pgtable macro to return a pointer
to struct page.

Fixes: d9820ff76f95 ("ARC: mm: switch pgtable_t back to struct page *")
Signed-off-by: Pavel Kozlov
Cc: Vineet Gupta
Cc: Mike Rapoport
Cc: # 4.15.x
---
 arch/arc/include/asm/pgtable-levels.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arc/include/asm/pgtable-levels.h b/arch/arc/include/asm/pgtable-levels.h
index 64ca25d199be..ef68758b69f7 100644
--- a/arch/arc/include/asm/pgtable-levels.h
+++ b/arch/arc/include/asm/pgtable-levels.h
@@ -161,7 +161,7 @@
 #define pmd_pfn(pmd) ((pmd_val(pmd) & PAGE_MASK) >> PAGE_SHIFT)
 #define pmd_page(pmd) virt_to_page(pmd_page_vaddr(pmd))
 #define set_pmd(pmdp, pmd) (*(pmdp) = pmd)
-#define pmd_pgtable(pmd) ((pgtable_t) pmd_page_vaddr(pmd))
+#define pmd_pgtable(pmd) ((pgtable_t) pmd_page(pmd))
 
 /*
  * 4th level paging: pte
--
2.25.1
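The distinction between a page's virtual address and its struct page descriptor can be modelled with a toy example. Everything below is a hypothetical user-space model (memory_sketch, mem_map_sketch, virt_to_page_sketch, pte_free_sketch) and not kernel code; it only illustrates why a routine that expects a descriptor must be given the result of the address-to-descriptor conversion rather than the raw address.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_SKETCH 4096
#define NR_PAGES_SKETCH  4

/* Toy page descriptor, standing in for struct page. */
struct page_sketch { int in_use; };

/* Toy "physical memory" and its descriptor array (like mem_map). */
static unsigned char memory_sketch[NR_PAGES_SKETCH][PAGE_SIZE_SKETCH];
static struct page_sketch mem_map_sketch[NR_PAGES_SKETCH];

/* Toy virt_to_page(): map an address inside memory_sketch to its descriptor. */
static struct page_sketch *virt_to_page_sketch(void *vaddr)
{
    size_t off = (size_t)((unsigned char *)vaddr - &memory_sketch[0][0]);
    return &mem_map_sketch[off / PAGE_SIZE_SKETCH];
}

/* Toy free routine that, like __pte_free(), expects a page descriptor. */
static void pte_free_sketch(struct page_sketch *page)
{
    page->in_use = 0;
}
```

In this model, passing `(struct page_sketch *)&memory_sketch[2][0]` to pte_free_sketch() would write into the page's data instead of clearing its descriptor, leaving the page marked as in use; that is analogous to the leak the patch fixes by returning pmd_page(pmd).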