[PATCH] ARC: add hugetlb definitions

2023-12-13 Thread Pavel Kozlov
From: Pavel Kozlov 

Add hugetlb definitions if THP is enabled. ARC doesn't support
HugeTLB FS, but it does support THP. Some kernel code, such as pagemap,
uses hugetlb definitions with THP.

This patch fixes the ARC build issue (HPAGE_SIZE undeclared error) with
TRANSPARENT_HUGEPAGE enabled.

Signed-off-by: Pavel Kozlov 
---
 arch/arc/include/asm/hugepage.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/arc/include/asm/hugepage.h b/arch/arc/include/asm/hugepage.h
index ef8d4166370c..8a2441670a8f 100644
--- a/arch/arc/include/asm/hugepage.h
+++ b/arch/arc/include/asm/hugepage.h
@@ -10,6 +10,13 @@
 #include 
 #include 
 
+/*
+ * Hugetlb definitions.
+ */
+#define HPAGE_SHIFT PMD_SHIFT
+#define HPAGE_SIZE (_AC(1, UL) << HPAGE_SHIFT)
+#define HPAGE_MASK (~(HPAGE_SIZE - 1))
+
 static inline pte_t pmd_pte(pmd_t pmd)
 {
return __pte(pmd_val(pmd));
-- 
2.25.1


___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc
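
The three definitions in the patch above can be exercised outside the kernel. The following is a minimal sketch, assuming 2 MB huge pages (PMD_SHIFT == 21, which the patch itself does not state); all DEMO_* and demo_* names are hypothetical and exist only for illustration.

```c
/* Assumption: PMD_SHIFT == 21 (2 MB huge pages); not stated in the patch. */
#define DEMO_PMD_SHIFT   21
#define DEMO_HPAGE_SHIFT DEMO_PMD_SHIFT
#define DEMO_HPAGE_SIZE  (1UL << DEMO_HPAGE_SHIFT)
#define DEMO_HPAGE_MASK  (~(DEMO_HPAGE_SIZE - 1))

/* Round an address down to the start of the huge page that contains it. */
static inline unsigned long demo_hpage_base(unsigned long addr)
{
    return addr & DEMO_HPAGE_MASK;
}

/*
 * Same shape as the check that failed to compile in fs/proc/task_mmu.c:
 * does [start, end) cover exactly one huge page?
 */
static inline int demo_covers_whole_hpage(unsigned long start, unsigned long end)
{
    return end == start + DEMO_HPAGE_SIZE;
}
```

The mask is derived from the size, and the size from the shift, so only the shift has to follow the architecture's PMD geometry.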


Re: [PATCH 0/5] ARC updates

2023-12-11 Thread Pavel . Kozlov
Hi Vineet,

> A pile of accrued changes, compile tested only.
> Please test.

> Thx,
> -Vineet

I'm testing your patches. At first glance everything is fine, no obvious
regressions.

But I've found an ARC build issue with the latest upstream source
(tag: v6.7-rc5). It is not linked with your updates.
The build issue occurs when the CONFIG_TRANSPARENT_HUGEPAGE option
is enabled:

| ../fs/proc/task_mmu.c: In function ‘pagemap_scan_thp_entry’:
| ../fs/proc/task_mmu.c:2115:28: error: ‘HPAGE_SIZE’ undeclared (first use in this function); did you mean ‘PAGE_SIZE’?
|  2115 | if (end != start + HPAGE_SIZE) {
|   |^~
|   |PAGE_SIZE
| ../fs/proc/task_mmu.c:2115:28: note: each undeclared identifier is reported only once for each function it appears in
| make[4]: *** [../scripts/Makefile.build:243: fs/proc/task_mmu.o] Error 1
| make[3]: *** [../scripts/Makefile.build:480: fs/proc] Error 2

Maybe you could address this issue in your updates too?

Thanks
Pavel



Re: [PATCH 00/20] ARC updates

2023-08-22 Thread Pavel Kozlov
> Hi,
> 
> This is a pile of misc improvements/updates sitting in one of my local 
> trees.
> Given the recent warning fix, we could also push them out.
> @Alexey, @Shahab: care to give these a spin on hsdk (and test ARC700 
> build/boot on nSIM if possible).
> 
> Thx,
> -Vineet

The entire "ARC updates" patch series (including all follow-up patches)
has been tested on the HSDK board for ARCv2 using the glibc and kselftest test
frameworks, and on nSIM for ARCompact using the uclibc tests. No regressions
found.

Tested-by: Pavel Kozlov 

Regards,
Pavel



Re: [PATCH 16/20] ARC: entry: Add more common chores to EXCEPTION_PROLOGUE

2023-08-18 Thread Pavel . Kozlov
Hi Vineet,

> Subject: [PATCH 16/20] ARC: entry: Add more common chores to
> EXCEPTION_PROLOGUE
> 
> The high level structure of most ARC exception handlers is
>  1. save regfile with EXCEPTION_PROLOGUE
>  2. setup r0: EFA (not part of pt_regs)
>  3. setup r1: pointer to pt_regs (SP)
>  4. drop down to pure kernel mode (from exception)
>  5. call the Linux "C" handler
> 
> Remove the boiler plate code by moving #2, #3, #4 into #1.
> 
> The exceptions to most exceptions are syscall Trap and Machine check
> which don't do some of above for various reasons, so call a newly
> introduced variant EXCEPTION_PROLOGUE_KEEP_AE (same as original
> EXCEPTION_PROLOGUE)

I'm observing the ARC700 (nSIM) system freeze after this patch.

...
f000.serial: ttyS0 at MMIO 0xf000 (irq = 24, base_baud = 3125000) is a 16550A
printk: console [ttyS0] enabled
printk: console [ttyS0] enabled
printk: bootconsole [uart8250] disabled
printk: bootconsole [uart8250] disabled
NET: Registered PF_PACKET protocol family
NET: Registered PF_KEY protocol family
clk: Disabling unused clocks
Freeing unused kernel image (initmem) memory: 2856K
This architecture does not have kernel memory protection.
Run /init as init process

> @@ -128,11 +123,6 @@ ENTRY(EV_PrivilegeV)
> 
>  EXCEPTION_PROLOGUE
> 
> -   lr  r0, [efa]
> -   mov r1, sp
> -
> -   FAKE_RET_FROM_EXCPN
> -
>  bl  do_privilege_fault
>  b   ret_from_exception

The same update is also required for the call_do_page_fault wrapper for
ARCompact.

Regards,
Pavel





Re: [PATCH 20/20] ARC: pt_regs: create seperate type for ecr

2023-08-17 Thread Pavel . Kozlov
Hi Vineet,

I'm testing your updates and ran into the same build issue reported by the
build robot.
http://lists.infradead.org/pipermail/linux-snps-arc/2023-August/007522.html

> #ifdef CONFIG_ISA_ARCOMPACT
> @@ -40,18 +51,7 @@ struct pt_regs {
>   *Last word used by Linux for extra state mgmt 
> (syscall-restart)
>   * For interrupts, use artificial ECR values to note current 
> prio-level
>   */
> -   union {
> -   struct {
> -#ifdef CONFIG_CPU_BIG_ENDIAN
> -   unsigned long state:8, ecr_vec:8,
> - ecr_cause:8, ecr_param:8;
> -#else
> -   unsigned long ecr_param:8, ecr_cause:8,
> - ecr_vec:8, state:8;
> -#endif
> -   };
> -   unsigned long event;
> -   };
> +   ecr_reg ecr;
> }
>
> #define MAX_REG_OFFSET offsetof(struct pt_regs, event)

This change causes a build issue for ARC700, as the event field has been
removed and the MAX_REG_OFFSET macro hasn't been updated.
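
To make the breakage concrete, here is a small sketch. The real ecr_reg definition is not shown in the quoted hunk, so the demo_ecr_reg layout below is an assumption modelled on the little-endian branch of the removed union, and the two leading struct members are placeholders. The point is only that the offset macro has to name a member that still exists.

```c
#include <stddef.h>

/* Hypothetical layout; the real ecr_reg is not part of the quoted hunk. */
typedef union {
    struct {
        /* little-endian field order taken from the removed union */
        unsigned int param:8, cause:8, vec:8, state:8;
    };
    unsigned long full;
} demo_ecr_reg;

struct demo_pt_regs {
    unsigned long word0;   /* placeholder for earlier pt_regs members */
    unsigned long word1;   /* placeholder */
    demo_ecr_reg ecr;      /* replaces the old anonymous union / 'event' */
};

/* Referring to 'event' here would no longer compile; 'ecr' does. */
#define DEMO_MAX_REG_OFFSET offsetof(struct demo_pt_regs, ecr)
```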

Regards,
Pavel



[PATCH] ARC: avoid unwanted gcc optimizations in atomic operations

2023-08-15 Thread Pavel . Kozlov
From: Pavel Kozlov 

Notify the compiler about write operations and prevent unwanted
optimizations. Add the "memory" clobber to the clobber list.

An obvious problem with unwanted compiler optimizations appeared after
the cpumask optimization commit 596ff4a09b89 ("cpumask: re-introduce
constant-sized cpumask optimizations").

After this commit, SMP kernels for ARC no longer load because of a
failed assert in the percpu allocator initialization routine:

percpu: BUG: failure at mm/percpu.c:2981/pcpu_build_alloc_info()!

The write operation performed by the scond instruction in the atomic
inline asm code is not properly passed to the compiler. The compiler
cannot correctly optimize a nested loop that runs through the cpumask
in the pcpu_build_alloc_info() function.

Add the "memory" clobber to fix this.

Link: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/135
Cc:  # v6.3+
Signed-off-by: Pavel Kozlov 
---
 arch/arc/include/asm/atomic-llsc.h| 6 +++---
 arch/arc/include/asm/atomic64-arcv2.h | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arc/include/asm/atomic-llsc.h 
b/arch/arc/include/asm/atomic-llsc.h
index 1b0ffaeee16d..5258cb81a16b 100644
--- a/arch/arc/include/asm/atomic-llsc.h
+++ b/arch/arc/include/asm/atomic-llsc.h
@@ -18,7 +18,7 @@ static inline void arch_atomic_##op(int i, atomic_t *v)   
\
: [val] "=&r" (val) /* Early clobber to prevent reg reuse */  \
: [ctr] "r" (&v->counter), /* Not "m": llock only supports reg direct addr mode */  \
  [i]   "ir"(i) \
-   : "cc");\
+   : "cc", "memory");  \
 }  \
 
 #define ATOMIC_OP_RETURN(op, asm_op)   \
@@ -34,7 +34,7 @@ static inline int arch_atomic_##op##_return_relaxed(int i, 
atomic_t *v)   \
: [val] "=&r" (val)   \
: [ctr] "r" (&v->counter),  \
  [i]   "ir"(i) \
-   : "cc");\
+   : "cc", "memory");  \
\
return val; \
 }
@@ -56,7 +56,7 @@ static inline int arch_atomic_fetch_##op##_relaxed(int i, 
atomic_t *v)\
  [orig] "=&r" (orig)   \
: [ctr] "r" (&v->counter),  \
  [i]   "ir"(i) \
-   : "cc");\
+   : "cc", "memory");  \
\
return orig;\
 }
diff --git a/arch/arc/include/asm/atomic64-arcv2.h 
b/arch/arc/include/asm/atomic64-arcv2.h
index 6b6db981967a..9b5791b85471 100644
--- a/arch/arc/include/asm/atomic64-arcv2.h
+++ b/arch/arc/include/asm/atomic64-arcv2.h
@@ -60,7 +60,7 @@ static inline void arch_atomic64_##op(s64 a, atomic64_t *v)   
\
"   bnz 1b  \n" \
: "=&r"(val)\
: "r"(&v->counter), "ir"(a) \
-   : "cc");\
+   : "cc", "memory");  \
 }  \
 
 #define ATOMIC64_OP_RETURN(op, op1, op2)   \
@@ -77,7 +77,7 @@ static inline s64 arch_atomic64_##op##_return_relaxed(s64 a, 
atomic64_t *v)   \
"   bnz 1b  \n" \
: [val] "=&r"(val)  \
: "r"(&v->counter), "ir"(a) \
-   : "cc");/* memory clobber comes from smp_mb() */\
+   : "cc", "memory");  \
\
return val; \
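
The operand shape that motivates the fix can be sketched in portable GCC/Clang C. This is not the ARC implementation: the empty asm template stands in for the llock/scond retry loop, the update itself is done in plain C so the sketch runs on any host, and the demo_* names are hypothetical. What it shows is the constraint layout: the counter's address is passed in a register ("r"), so the operands alone do not tell the compiler that memory behind the pointer is written, and the "memory" clobber says so explicitly.

```c
/* Hypothetical stand-in for atomic_t and arch_atomic_add(). */
typedef struct { int counter; } demo_atomic_t;

static inline void demo_atomic_add(int i, demo_atomic_t *v)
{
    int *ctr = &v->counter;

    /* Placeholder for the llock/scond loop: plain C update. */
    *ctr += i;

    /*
     * Same clobber list as the patched code. With only an "r" input
     * for the address, "memory" is what tells the compiler that
     * values cached from memory may be stale after this statement.
     */
    __asm__ __volatile__("" : : "r"(ctr) : "cc", "memory");
}

/* Reads the counter after each add; it must be reloaded both times. */
static inline int demo_add_twice(demo_atomic_t *v)
{
    demo_atomic_add(1, v);
    int first = v->counter;
    demo_atomic_add(1, v);
    return first + v->counter;
}
```

In the real headers the store happens inside the asm, so without the clobber the compiler is free to keep a stale copy of the counter (or of neighbouring memory such as a cpumask) in a register across the statement.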
 

[PATCH 2/2] ARC: run child from the separate start block in __clone

2023-03-02 Thread Pavel . Kozlov
From: Pavel Kozlov 

For a better debugging experience, use a separate code block with extra
cfi_* directives to run the child (same as in __clone3).
---
 sysdeps/unix/sysv/linux/arc/clone.S | 40 ++---
 1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/sysdeps/unix/sysv/linux/arc/clone.S 
b/sysdeps/unix/sysv/linux/arc/clone.S
index 766649625658..0029aaeb8170 100644
--- a/sysdeps/unix/sysv/linux/arc/clone.S
+++ b/sysdeps/unix/sysv/linux/arc/clone.S
@@ -20,9 +20,6 @@
 #include 
 #define _ERRNO_H   1
 #include 
-#include 
-
-#define CLONE_SETTLS   0x0008
 
 /* int clone(int (*fn)(void *), void *child_stack,
int flags, void *arg, ...
@@ -63,19 +60,9 @@ ENTRY (__clone)
ARC_TRAP_INSN
 
cmp r0, 0   /* return code : 0 new process, !0 parent.  */
+   beq thread_start_clone
blt L (__sys_err2)  /* < 0 (signed) error.  */
-   jnz [blink] /* Parent returns.  */
-
-   /* child jumps off to @fn with @arg as argument
-   TP register already set by kernel.  */
-   jl.d    [r10]
-   mov r0, r11
-
-   /* exit() with result from @fn (already in r0).  */
-   mov r8, __NR_exit
-   ARC_TRAP_INSN
-   /* In case it ever came back.  */
-   flag    1
+   j   [blink] /* Parent returns.  */
 
 L (__sys_err):
mov r0, -EINVAL
@@ -89,5 +76,28 @@ L (__sys_err2):
   position independent.  */
b   __syscall_error
 PSEUDO_END (__clone)
+
+
+   .align 4
+   .type thread_start_clone, %function
+thread_start_clone:
+   cfi_startproc
+   /* Terminate call stack by noting ra is undefined.  */
+   cfi_undefined (blink)
+
+   /* Child jumps off to @fn with @arg as argument.  */
+   jl.d    [r10]
+   mov r0, r11
+
+   /* exit() with result from @fn (already in r0).  */
+   mov r8, __NR_exit
+   ARC_TRAP_INSN
+
+   /* In case it ever came back.  */
+   flag    1
+
+   cfi_endproc
+   .size thread_start_clone, .-thread_start_clone
+
 libc_hidden_def (__clone)
 weak_alias (__clone, clone)
-- 
2.25.1




[PATCH 1/2] ARC: Add the clone3 wrapper

2023-03-02 Thread Pavel . Kozlov
From: Pavel Kozlov 

Use the clone3 wrapper on ARC. It doesn't care about stack alignment.
All callers should provide an aligned stack.
It follows the internal signature:

extern int clone3 (struct clone_args *__cl_args, size_t __size,
 int (*__func) (void *__arg), void *__arg);
---
Checked on arc-linux-gnu. The previously observed tst-misalign-clone-internal
test failure was because I used an outdated master branch.
The full testsuite runs without regressions.
But I also see a failure of the new tst-spawn7, as already reported at [1].

[1]
https://sourceware.org/pipermail/libc-alpha/2023-February/145937.html

 sysdeps/unix/sysv/linux/arc/clone3.S | 90 
 sysdeps/unix/sysv/linux/arc/sysdep.h |  2 +
 2 files changed, 92 insertions(+)
 create mode 100644 sysdeps/unix/sysv/linux/arc/clone3.S

diff --git a/sysdeps/unix/sysv/linux/arc/clone3.S 
b/sysdeps/unix/sysv/linux/arc/clone3.S
new file mode 100644
index ..87a8272a3977
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/arc/clone3.S
@@ -0,0 +1,90 @@
+/* The clone3 syscall wrapper.  Linux/arc version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include 
+#define _ERRNO_H   1
+#include 
+
+/* The userland implementation is:
+   int clone3 (struct clone_args *cl_args, size_t size,
+   int (*func)(void *arg), void *arg);
+
+   the kernel entry is:
+   int clone3 (struct clone_args *cl_args, size_t size);
+
+   The parameters are passed in registers from userland:
+   r0: cl_args
+   r1: size
+   r2: func
+   r3: arg  */
+
+ENTRY(__clone3)
+
+   /* Save args for the child.  */
+   mov r10, r0 /* cl_args  */
+   mov r11, r2 /* func  */
+   mov r12, r3 /* args  */
+
+   /* Sanity check args.  */
+   breq    r10, 0, L (__sys_err)   /* No NULL cl_args pointer.  */
+   breq    r11, 0, L (__sys_err)   /* No NULL function pointer.  */
+
+   /* Do the system call, the kernel expects:
+  r8: system call number
+  r0: cl_args
+  r1: size  */
+   mov r0, r10
+   mov r8, __NR_clone3
+   ARC_TRAP_INSN
+
+   cmp r0, 0
+   beq thread_start_clone3 /* Child returns.  */
+   blt L (__sys_err2)
+   j   [blink] /* Parent returns.  */
+
+L (__sys_err):
+   mov r0, -EINVAL
+L (__sys_err2):
+   b   __syscall_error
+PSEUDO_END (__clone3)
+
+
+   .align 4
+   .type thread_start_clone3, %function
+thread_start_clone3:
+   cfi_startproc
+   /* Terminate call stack by noting ra is undefined.  */
+   cfi_undefined (blink)
+
+   /* Child jumps off to @fn with @arg as argument.  */
+   jl.d    [r11]
+   mov r0, r12
+
+   /* exit() with result from @fn (already in r0).  */
+   mov r8, __NR_exit
+   ARC_TRAP_INSN
+
+   /* In case it ever came back.  */
+   flag    1
+
+   cfi_endproc
+   .size thread_start_clone3, .-thread_start_clone3
+
+libc_hidden_def (__clone3)
+weak_alias (__clone3, clone3)
diff --git a/sysdeps/unix/sysv/linux/arc/sysdep.h 
b/sysdeps/unix/sysv/linux/arc/sysdep.h
index dd6fe73445f9..88dc1dff017f 100644
--- a/sysdeps/unix/sysv/linux/arc/sysdep.h
+++ b/sysdeps/unix/sysv/linux/arc/sysdep.h
@@ -141,6 +141,8 @@ hidden_proto (__syscall_error)
 
 # define ARC_TRAP_INSN "trap_s 0   \n\t"
 
+# define HAVE_CLONE3_WRAPPER   1
+
 # undef INTERNAL_SYSCALL_NCS
 # define INTERNAL_SYSCALL_NCS(number, nr_args, args...)\
   ({   \
-- 
2.25.1
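
The control flow of the wrapper above can be modelled in C for readers who do not read ARC assembly. This is only a sketch: the trap is replaced by a hypothetical function pointer, errors are returned as raw negative values instead of going through __syscall_error, the child path returns the result of func instead of calling exit(), and every demo_* name is invented for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the trap: 0 in the child, >0 in the
   parent, <0 on error. */
typedef long (*demo_clone3_trap_fn)(void *cl_args, size_t size);

static long demo_clone3(void *cl_args, size_t size,
                        int (*func)(void *), void *arg,
                        demo_clone3_trap_fn trap)
{
    /* Sanity checks, as in the two breq instructions. */
    if (cl_args == NULL || func == NULL)
        return -EINVAL;

    long ret = trap(cl_args, size);
    if (ret == 0)
        return func(arg);   /* child: the real code then calls exit() */
    return ret;             /* parent (>0) or error (<0) */
}

/* Fake traps and a child function, used only to exercise the model. */
static long demo_trap_parent(void *cl_args, size_t size)
{
    (void)cl_args; (void)size;
    return 42;
}

static long demo_trap_child(void *cl_args, size_t size)
{
    (void)cl_args; (void)size;
    return 0;
}

static int demo_child_fn(void *arg)
{
    return *(int *)arg + 1;
}
```

Passing the trap as a parameter keeps the model testable on any host; the real wrapper issues the system call directly and never returns on the child path.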




Pavel new maintainer of ARC port

2023-02-21 Thread Pavel Kozlov
Hi all,

I'm excited to introduce myself and become a part of the community. 

I'm a software engineer at Synopsys and a member of a team working to enhance
support for ARC CPUs in the GNU/Linux ecosystem.

I would appreciate any guidance or advice on how I can get involved and 
contribute to the community. I'm always looking for opportunities to connect 
with other developers and learn from their experiences.

As a final step of the setup, I've updated the wiki and added myself as the ARC
maintainer. Also, thanks to Carlos for the help with my setup.

Best regards,
Pavel Kozlov


[PATCH] ARC: fix compile time assert fail for the config with PAE40 and 4K pages

2023-02-10 Thread Pavel . Kozlov
From: Pavel Kozlov 

Add support for the configuration with 4K page size and ARC_PAE40 enabled.
Set the two-level page table to 10:9:12, as with PAE40 a 4K page can hold
only 512 entries (with PAE40 the size of a PTE increases from 4 to 8
bytes).

In this configuration the page table can describe only a 31-bit (2 GB) virtual
address space, but this is not a problem as the ARC MMUv4 supports only 2 GB of
virtual address space.
This patch doesn't affect other configurations.

The patch fixes this compile-time assert failure:

  include/linux/compiler_types.h:328:45: error:
  call to '__compiletime_assert_288' declared with attribute error:
  BUILD_BUG_ON failed: (PTRS_PER_PTE * sizeof(pte_t)) > PAGE_SIZE

Reported-by: kernel test robot 
Signed-off-by: Pavel Kozlov 
---
Added the same ARC_VADDR_BITS macro name as used in ARCv3 port.
The 4K Page Size and enabled PAE configuration was tested on nSIM:
load, allocation/free of all memory (low and high), user space
tests from glibc test suite (malloc, string).

 arch/arc/include/asm/pgtable-levels.h | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/arch/arc/include/asm/pgtable-levels.h 
b/arch/arc/include/asm/pgtable-levels.h
index ef68758b69f7..3b6afee9c272 100644
--- a/arch/arc/include/asm/pgtable-levels.h
+++ b/arch/arc/include/asm/pgtable-levels.h
@@ -42,12 +42,22 @@
  * (so 4K page can only have 1K entries: or 10 bits)
  */
 #ifdef CONFIG_ARC_PAGE_SIZE_4K
+#ifdef CONFIG_ARC_HAS_PAE40
+/*
+ * For PAE40 and 4K page size set 10:9:12 Page Table
+ * (as with PAE40 4k page can only have 512 entries)
+ * Page Table can describe only 31-bit (2Gb) virtual space
+ */
+#define PGDIR_SHIFT 21
+#define ARC_VADDR_BITS 31
+#else
 #define PGDIR_SHIFT 22
+#endif /* CONFIG_ARC_HAS_PAE40 */
 #else
 #define PGDIR_SHIFT 21
-#endif
+#endif /* CONFIG_ARC_PAGE_SIZE_4K */
 
-#endif
+#endif /* CONFIG_ARC_HUGEPAGE_16M */
 
 #else /* CONFIG_PGTABLE_LEVELS != 2 */
 
@@ -67,9 +77,13 @@
 
 #endif /* CONFIG_PGTABLE_LEVELS */
 
+#ifndef ARC_VADDR_BITS
+#define ARC_VADDR_BITS 32
+#endif
+
 #define PGDIR_SIZE BIT(PGDIR_SHIFT)
 #define PGDIR_MASK (~(PGDIR_SIZE - 1))
-#define PTRS_PER_PGD   BIT(32 - PGDIR_SHIFT)
+#define PTRS_PER_PGD   BIT(ARC_VADDR_BITS - PGDIR_SHIFT)
 
 #if CONFIG_PGTABLE_LEVELS > 3
 #define PUD_SIZE   BIT(PUD_SHIFT)
-- 
2.25.1
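
The 10:9:12 split follows from the numbers in the commit message, and the arithmetic can be checked with a few lines of C. This is a sketch with hypothetical DEMO_*/demo_* names; the only inputs are the 4K page size, the 8-byte PTE with PAE40, and the 31-bit virtual address width.

```c
/* Inputs taken from the commit message. */
#define DEMO_PAGE_SHIFT  12u   /* 4K pages */
#define DEMO_PTE_BYTES   8u    /* PTE size with PAE40 */
#define DEMO_VADDR_BITS  31u   /* ARC_VADDR_BITS in this configuration */

/* Integer log2 for exact powers of two. */
static inline unsigned demo_log2(unsigned long v)
{
    unsigned n = 0;
    while (v > 1) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Number of PTEs that fit in one page table page. */
static inline unsigned long demo_ptrs_per_pte(void)
{
    return (1ul << DEMO_PAGE_SHIFT) / DEMO_PTE_BYTES;
}

/* Page offset bits plus PTE index bits. */
static inline unsigned demo_pgdir_shift(void)
{
    return DEMO_PAGE_SHIFT + demo_log2(demo_ptrs_per_pte());
}

/* Whatever is left of the virtual address indexes the PGD. */
static inline unsigned long demo_ptrs_per_pgd(void)
{
    return 1ul << (DEMO_VADDR_BITS - demo_pgdir_shift());
}
```

With these inputs the PTE level gets 9 bits (512 entries), PGDIR_SHIFT comes out as 21, and the remaining 10 bits give 1024 PGD entries, so a PTE table fills exactly one 4K page and the BUILD_BUG_ON condition holds.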




Re: Pavel steping up for ARC glibc maintenance

2023-02-01 Thread Pavel Kozlov
Hi Carlos,
Hi all,

Thanks for your greeting.
I'm glad to be here.

Can you please add me to the wiki EditorGroup?
My wiki account: PavelKozlov
I have some test results that I want to post for the coming release.

I intend to help maintain the ARC glibc port, so feel free to ask if
any code, HW verification or review is required for anything concerning ARC.

I've requested the sourceware.org account.
Looking forward to the next steps.

Thank you,
Pavel


From: Carlos O'Donell 
Sent: Wednesday, February 1, 2023 12:33 AM
To: Vineet Gupta 
Cc: arcml ; Pavel Kozlov 
; GNU C Library 
Subject: Re: Pavel steping up for ARC glibc maintenance 
 
On 1/30/23 22:07, Vineet Gupta wrote:
> Hi Carlos,
> 
> I'd like to introduce Pavel who intends to step up to maintain ARC glibc port 
> and do periodic wiki updates and such.
> I've pointed him to links [1] and [2].

Sounds great! Welcome Pavel!
 
> To begin with can he be granted wiki edit access and subsequently also 
> approve his write access to glibc sourceware repo.

We can absolutely help with the next steps.

Please follow [2] and if you get stuck anywhere just reach out on IRC or by 
email.

> Thx,
> -Vineet
> 
> 
> [1] 
> https://sourceware.org/glibc/wiki/MAINTAINERS#Becoming_a_maintainer_.28developer.29
>  
> [2] 
> https://sourceware.org/glibc/wiki/MAINTAINERS#AccountsOnSourceware
>  
> 

-- 
Cheers,
Carlos.



Re: [PATCH] ARC: align child stack in clone

2023-01-25 Thread Pavel Kozlov
> LGTM, although I can't really test it since the Synopsys qemu tree does not
> have qemu-user support [1].

Thank you for the review. I've checked this patch on QEMU with a Linux system
and on the ARC HSDK board.
Also, it is possible to run ARC binaries on QEMU [1] in
user mode. Currently not all instructions are supported in ARC QEMU, so it
is recommended to build binaries with the extra -mcpu=archs compiler option
to reduce the instruction set used. Maybe [2] will also be useful.
It would be great to have this patch in the coming 2.37 release.

[1]
https://github.com/foss-for-synopsys-dwc-arc-processors/qemu
[2]
https://github.com/foss-for-synopsys-dwc-arc-processors/glibc/wiki/Glibc-test-suite-for-a-target-without-native-GNU-toolchain#glibc-test-suite-execution-with-qemu-user-mode-emulation

--
Pavel




Re: [PATCH] ARC:fpu: add extra capability check before use of sqrt and fma builtins

2023-01-17 Thread Pavel Kozlov
> This is wrong, sqrt use macro do not belong for the fma switch file.

Thank you for the review and for pointing this out. I was inattentive when
moving changes from the branch I had. This has been fixed in
v2 of the patch [1].
I've manually checked (by reviewing the objdump -d output) that the
expected code for libm is now generated in the different cases (when the compiler
provides support for the extra instructions and defines the macros
__ARC_FPU_DP_DIV__, __ARC_FPU_SP_DIV__, __ARC_FPU_DP_FMA__,
__ARC_FPU_SP_FMA__, and when it does not).

[1]
http://lists.infradead.org/pipermail/linux-snps-arc/2023-January/006771.html

--
Pavel




[PATCH v2] ARC:fpu: add extra capability check before use of sqrt and fma builtins

2023-01-17 Thread Pavel . Kozlov
From: Pavel Kozlov 

Add an extra check for compiler definitions to ensure that the compiler provides
the sqrt and fma hw FPU instructions; otherwise use the software implementation.

As divide/sqrt and FMA hw support on the CPU side is optional,
the compiler can be configured by options to generate hw FPU instructions,
but without use of the FDDIV, FDSQRT, FSDIV, FSSQRT, FDMADD and FSMADD
instructions. In this case __builtin_sqrt and __builtin_sqrtf provided by the
compiler can't be used inside the glibc code, as these builtins are used
in the implementations of the sqrt() and sqrtf() functions but at the same time
these builtins unfold to calls to sqrt() and sqrtf(). So it is possible to get
code like this:

0001c4b4 <__ieee754_sqrtf>:
   1c4b4:0001   b 0 ;1c4b4 <__ieee754_sqrtf>

The same is also true for __builtin_fma and __builtin_fmaf.

---
Changes in v2:
 - Fixed macros definitions for FMA

 sysdeps/arc/fpu/math-use-builtins-fma.h  | 14 --
 sysdeps/arc/fpu/math-use-builtins-sqrt.h | 14 --
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/sysdeps/arc/fpu/math-use-builtins-fma.h 
b/sysdeps/arc/fpu/math-use-builtins-fma.h
index eede75aa41be..2acd8113ce2c 100644
--- a/sysdeps/arc/fpu/math-use-builtins-fma.h
+++ b/sysdeps/arc/fpu/math-use-builtins-fma.h
@@ -1,4 +1,14 @@
-#define USE_FMA_BUILTIN 1
-#define USE_FMAF_BUILTIN 1
+#if defined __ARC_FPU_DP_FMA__
+# define USE_FMA_BUILTIN 1
+#else
+# define USE_FMA_BUILTIN 0
+#endif
+
+#if defined __ARC_FPU_SP_FMA__
+# define USE_FMAF_BUILTIN 1
+#else
+# define USE_FMAF_BUILTIN 0
+#endif
+
 #define USE_FMAL_BUILTIN 0
 #define USE_FMAF128_BUILTIN 0
diff --git a/sysdeps/arc/fpu/math-use-builtins-sqrt.h 
b/sysdeps/arc/fpu/math-use-builtins-sqrt.h
index e94c915ba66a..a449bc609295 100644
--- a/sysdeps/arc/fpu/math-use-builtins-sqrt.h
+++ b/sysdeps/arc/fpu/math-use-builtins-sqrt.h
@@ -1,4 +1,14 @@
-#define USE_SQRT_BUILTIN 1
-#define USE_SQRTF_BUILTIN 1
+#if defined __ARC_FPU_DP_DIV__
+# define USE_SQRT_BUILTIN 1
+#else
+# define USE_SQRT_BUILTIN 0
+#endif
+
+#if defined __ARC_FPU_SP_DIV__
+# define USE_SQRTF_BUILTIN 1
+#else
+# define USE_SQRTF_BUILTIN 0
+#endif
+
 #define USE_SQRTL_BUILTIN 0
 #define USE_SQRTF128_BUILTIN 0
-- 
2.25.1
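
The gating pattern from the patch can be tried on any host compiler. The sketch below is an illustration, not glibc code: DEMO_USE_SQRT_BUILTIN and demo_sqrt are hypothetical names, and the Newton iteration is a simple stand-in for glibc's software sqrt that is only meant for small, finite, non-negative inputs. On a compiler that does not define __ARC_FPU_DP_DIV__ the software path is taken, which is the point: the builtin is used only when the compiler promises a hardware instruction, so it can never expand into a call back to the function being defined.

```c
#if defined __ARC_FPU_DP_DIV__
# define DEMO_USE_SQRT_BUILTIN 1
#else
# define DEMO_USE_SQRT_BUILTIN 0
#endif

/* Demo only: intended for small, finite x >= 0. */
static double demo_sqrt(double x)
{
#if DEMO_USE_SQRT_BUILTIN
    /* Safe here: the compiler emits the hardware sqrt instruction. */
    return __builtin_sqrt(x);
#else
    /* Software fallback (Newton's method), no call to sqrt(). */
    if (x <= 0.0)
        return 0.0;
    double g = x > 1.0 ? x : 1.0;
    for (int i = 0; i < 64; i++)
        g = 0.5 * (g + x / g);
    return g;
#endif
}
```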




[PATCH] ARC:fpu: add extra capability check before use of sqrt and fma builtins

2022-12-21 Thread Pavel . Kozlov
From: Pavel Kozlov 

Add an extra check for compiler definitions to ensure that the compiler provides
the sqrt and fma hw FPU instructions; otherwise use the software implementation.

As divide/sqrt and FMA hw support on the CPU side is optional,
the compiler can be configured by options to generate hw FPU instructions,
but without use of the FDDIV, FDSQRT, FSDIV, FSSQRT, FDMADD and FSMADD
instructions. In this case __builtin_sqrt and __builtin_sqrtf provided by the
compiler can't be used inside the glibc code, as these builtins are used
in the implementations of the sqrt() and sqrtf() functions but at the same time
these builtins unfold to calls to sqrt() and sqrtf(). So it is possible to get
code like this:

0001c4b4 <__ieee754_sqrtf>:
   1c4b4:0001   b 0 ;1c4b4 <__ieee754_sqrtf>

The same is also true for __builtin_fma and __builtin_fmaf.
---
 sysdeps/arc/fpu/math-use-builtins-fma.h  | 14 --
 sysdeps/arc/fpu/math-use-builtins-sqrt.h | 14 --
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/sysdeps/arc/fpu/math-use-builtins-fma.h 
b/sysdeps/arc/fpu/math-use-builtins-fma.h
index eede75aa41be..082badf48201 100644
--- a/sysdeps/arc/fpu/math-use-builtins-fma.h
+++ b/sysdeps/arc/fpu/math-use-builtins-fma.h
@@ -1,4 +1,14 @@
-#define USE_FMA_BUILTIN 1
-#define USE_FMAF_BUILTIN 1
+#if defined __ARC_FPU_DP_DIV__
+# define USE_SQRT_BUILTIN 1
+#else
+# define USE_SQRT_BUILTIN 0
+#endif
+
+#if defined __ARC_FPU_SP_DIV__
+# define USE_SQRTF_BUILTIN 1
+#else
+# define USE_SQRTF_BUILTIN 0
+#endif
+
 #define USE_FMAL_BUILTIN 0
 #define USE_FMAF128_BUILTIN 0
diff --git a/sysdeps/arc/fpu/math-use-builtins-sqrt.h 
b/sysdeps/arc/fpu/math-use-builtins-sqrt.h
index e94c915ba66a..a449bc609295 100644
--- a/sysdeps/arc/fpu/math-use-builtins-sqrt.h
+++ b/sysdeps/arc/fpu/math-use-builtins-sqrt.h
@@ -1,4 +1,14 @@
-#define USE_SQRT_BUILTIN 1
-#define USE_SQRTF_BUILTIN 1
+#if defined __ARC_FPU_DP_DIV__
+# define USE_SQRT_BUILTIN 1
+#else
+# define USE_SQRT_BUILTIN 0
+#endif
+
+#if defined __ARC_FPU_SP_DIV__
+# define USE_SQRTF_BUILTIN 1
+#else
+# define USE_SQRTF_BUILTIN 0
+#endif
+
 #define USE_SQRTL_BUILTIN 0
 #define USE_SQRTF128_BUILTIN 0
-- 
2.25.1




[PATCH] ARC: align child stack in clone

2022-12-21 Thread Pavel . Kozlov
From: Pavel Kozlov 

The ARCv2 ABI requires 4-byte stack pointer alignment. Don't allow the
use of an unaligned child stack in clone. As the stack grows down,
align it down.

This was pointed out by the misc/tst-misalign-clone-internal and
misc/tst-misalign-clone tests. Stack alignment fixes these test
failures.
---
 sysdeps/unix/sysv/linux/arc/clone.S | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sysdeps/unix/sysv/linux/arc/clone.S 
b/sysdeps/unix/sysv/linux/arc/clone.S
index bd924890844a..f32c83f17a65 100644
--- a/sysdeps/unix/sysv/linux/arc/clone.S
+++ b/sysdeps/unix/sysv/linux/arc/clone.S
@@ -41,6 +41,7 @@
 
 ENTRY (__clone)
cmp r0, 0   /* @fn can't be NULL.  */
+   and r1,r1,-4    /* @child_stack be 4 bytes aligned per ABI.  */
cmp.ne  r1, 0   /* @child_stack can't be NULL.  */
bz  L (__sys_err)
 
-- 
2.25.1
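
For reference, the single added instruction corresponds to the following C expression. This is a sketch with a hypothetical helper name; -4 in two's complement is the same bit pattern as ~3, so the AND clears the two low bits and rounds the pointer down.

```c
#include <stdint.h>

/* Round a stack pointer down to a 4-byte boundary, as "and r1,r1,-4" does. */
static inline uintptr_t demo_align_stack_down(uintptr_t sp)
{
    return sp & ~(uintptr_t)3;
}
```

Because the alignment happens before the NULL check in the patched code, a child_stack value between 1 and 3 becomes 0 and is rejected by the existing check.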




Re: [PATCH] ARC: mm: fix leakage of memory allocated for PTE

2022-10-19 Thread Pavel Kozlov
> Good catch. Curious how did you find it. KMEMCHECK or some such or just oom.

I ran the glibc tests with a 5.16 kernel and got many oom-killer messages and
even a kernel panic because of a lack of memory. These symptoms pointed to an
issue. kmemleak didn't show anything. I didn't try kmemcheck. I prepared a
small test and used git bisect to find the "blame" commit.

>> Cc:  # 4.15.x

> You meant 5.15.x

Yes, you're right, that's my typo.

> Added to for-curr.

Thanks!

Regards,
Pavel





[PATCH] ARC: mm: fix leakage of memory allocated for PTE

2022-10-17 Thread Pavel . Kozlov
From: Pavel Kozlov 

Since commit d9820ff ("ARC: mm: switch pgtable_t back to struct page *")
a memory leak occurs. Memory allocated for page table entries is
not released during process termination. This issue can be reproduced by
a small program that allocates a large amount of memory. After several
runs, you'll see that the amount of free memory has decreased and will
continue to decrease after each run. All ARC CPUs are affected by this
issue. The issue was introduced in kernel release v5.15-rc1.

As described in commit d9820ff, after switching pgtable_t back to struct
page *, a pointer to "struct page" and the appropriate functions are used to
allocate and free a memory page for PTEs, but the pmd_pgtable macro wasn't
changed and returns the direct virtual address from the PMD (PGD) entry.
This address is then used as a parameter to __pte_free(), and as a result
this function can't release the memory page allocated for PTEs.

Fix this issue by changing the pmd_pgtable macro to return a pointer to
struct page.

Fixes: d9820ff76f95 ("ARC: mm: switch pgtable_t back to struct page *")
Signed-off-by: Pavel Kozlov 
Cc: Vineet Gupta 
Cc: Mike Rapoport 
Cc:  # 4.15.x
---
 arch/arc/include/asm/pgtable-levels.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arc/include/asm/pgtable-levels.h 
b/arch/arc/include/asm/pgtable-levels.h
index 64ca25d199be..ef68758b69f7 100644
--- a/arch/arc/include/asm/pgtable-levels.h
+++ b/arch/arc/include/asm/pgtable-levels.h
@@ -161,7 +161,7 @@
 #define pmd_pfn(pmd)   ((pmd_val(pmd) & PAGE_MASK) >> PAGE_SHIFT)
 #define pmd_page(pmd)  virt_to_page(pmd_page_vaddr(pmd))
 #define set_pmd(pmdp, pmd) (*(pmdp) = pmd)
-#define pmd_pgtable(pmd)   ((pgtable_t) pmd_page_vaddr(pmd))
+#define pmd_pgtable(pmd)   ((pgtable_t) pmd_page(pmd))
 
 /*
  * 4th level paging: pte
-- 
2.25.1
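
The difference between the old and the new macro can be shown with a toy model of the page array. Everything here is an assumption made for illustration: the 8K page size, the 0x80000000 start of the linear mapping, the 16-entry demo_mem_map, and all demo_* names. The old variant reinterprets a virtual address as a struct page pointer; the new one looks the page up, which is what a free routine operating on struct page needs.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT  13                        /* assumption: 8K pages */
#define DEMO_PAGE_SIZE   ((uintptr_t)1 << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_OFFSET ((uintptr_t)0x80000000u)  /* assumption */
#define DEMO_NR_PAGES    16

struct demo_page { int refcount; };
typedef struct demo_page *demo_pgtable_t;

/* Toy equivalent of the kernel's array of page descriptors. */
static struct demo_page demo_mem_map[DEMO_NR_PAGES];

static inline struct demo_page *demo_virt_to_page(uintptr_t vaddr)
{
    return &demo_mem_map[(vaddr - DEMO_PAGE_OFFSET) >> DEMO_PAGE_SHIFT];
}

/* The PMD entry is modelled as the virtual address of the PTE page. */
static inline uintptr_t demo_pmd_page_vaddr(uintptr_t pmd)
{
    return pmd & ~(DEMO_PAGE_SIZE - 1);
}

/* Before the fix: a virtual address cast to a struct page pointer. */
static inline demo_pgtable_t demo_pmd_pgtable_old(uintptr_t pmd)
{
    return (demo_pgtable_t)demo_pmd_page_vaddr(pmd);
}

/* After the fix: the page descriptor that owns the PTE page. */
static inline demo_pgtable_t demo_pmd_pgtable_new(uintptr_t pmd)
{
    return demo_virt_to_page(demo_pmd_page_vaddr(pmd));
}
```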

