[PATCH V12 14/14] RISC-V: paravirt: pvqspinlock: Add trace point for pv_kick/wait

2023-12-25 Thread guoren
From: Guo Ren Add trace point for pv_kick, here is the output: ls /sys/kernel/debug/tracing/events/paravirt/ enable filter pv_kick pv_wait cat /sys/kernel/debug/tracing/trace entries-in-buffer/entries-written: 33927/33927 #P:12 _-=>

[PATCH V12 13/14] RISC-V: paravirt: pvqspinlock: Add kconfig entry

2023-12-25 Thread guoren
From: Guo Ren Add kconfig entry for paravirt_spinlock, an unfair qspinlock virtualization-friendly backend, by halting the virtual CPU rather than spinning. Reviewed-by: Leonardo Bras Signed-off-by: Guo Ren Signed-off-by: Guo Ren --- arch/riscv/Kconfig | 12

[PATCH V12 12/14] RISC-V: paravirt: pvqspinlock: Add nopvspin kernel parameter

2023-12-25 Thread guoren
From: Guo Ren Disables the qspinlock slow path using PV optimizations which allow the hypervisor to 'idle' the guest on lock contention. Reviewed-by: Leonardo Bras Signed-off-by: Guo Ren Signed-off-by: Guo Ren --- Documentation/admin-guide/kernel-parameters.txt | 2 +-

[PATCH V12 11/14] RISC-V: paravirt: pvqspinlock: Add SBI implementation

2023-12-25 Thread guoren
From: Guo Ren Implement pv_kick with SBI guest implementation, and add SBI_EXT_PVLOCK extension detection. The backend part is in the KVM pvqspinlock patch. Reviewed-by: Leonardo Bras Signed-off-by: Guo Ren Signed-off-by: Guo Ren --- arch/riscv/include/asm/sbi.h | 6 ++

[PATCH V12 10/14] RISC-V: paravirt: Add pvqspinlock frontend skeleton

2023-12-25 Thread guoren
From: Guo Ren Use static_call to switch between native_queued_spin_lock_slowpath() and __pv_queued_spin_lock_slowpath(), and between native_queued_spin_unlock() and __pv_queued_spin_unlock(). This finishes the pv_wait implementation, but pv_kick needs the SBI definition from the next patches.

[PATCH V12 09/14] RISC-V: paravirt: Add pvqspinlock KVM backend

2023-12-25 Thread guoren
From: Guo Ren Add the files and functions needed to support the SBI PVLOCK (paravirt qspinlock kick_cpu) extension. Implement kvm_sbi_ext_pvlock_kick_cpu(); it only needs to call kvm_vcpu_kick() to bring target_vcpu out of the halt state. No irq raised, no other request, just a pure

[PATCH V12 08/14] riscv: qspinlock: Force virt_spin_lock for KVM guests

2023-12-25 Thread guoren
From: Guo Ren Force-enable virt_spin_lock for KVM guests, because fair locks have horrible lock 'holder' preemption issues. Suggested-by: Leonardo Bras Link: https://lkml.kernel.org/kvm/zqk9-tn2mepxl...@redhat.com/ Signed-off-by: Guo Ren Signed-off-by: Guo Ren ---

[PATCH V12 07/14] riscv: qspinlock: Add virt_spin_lock() support for VM guest

2023-12-25 Thread guoren
From: Guo Ren Add a static key controlling whether virt_spin_lock() should be called or not. When running on bare metal set the new key to false. The VM guests should fall back to a Test-and-Set spinlock, because fair locks have horrible lock 'holder' preemption issues. The virt_spin_lock_key
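The test-and-set fallback described in this entry can be sketched in user-space C11. This is an illustration only, not the patch's code: `virt_spin_lock_enabled` is a hypothetical plain boolean standing in for the kernel static key, and `tas_virt_spin_lock()` / `tas_virt_spin_unlock()` are made-up names.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the static key: false on bare metal, true in a VM guest. */
static bool virt_spin_lock_enabled;

/*
 * Returns false when the key is off, so the caller would continue into the
 * normal queued slow path. Otherwise spins with test-and-set until the lock
 * word goes from 0 (free) to 1 (locked), then returns true.
 */
static inline bool tas_virt_spin_lock(_Atomic uint32_t *val)
{
    if (!virt_spin_lock_enabled)
        return false;

    for (;;) {
        uint32_t expected = 0;

        if (atomic_compare_exchange_weak_explicit(val, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return true;

        /* Wait with plain loads until the lock looks free, then retry. */
        while (atomic_load_explicit(val, memory_order_relaxed) != 0)
            ;
    }
}

static inline void tas_virt_spin_unlock(_Atomic uint32_t *val)
{
    atomic_store_explicit(val, 0, memory_order_release);
}
```

A test-and-set lock has no queue, so a preempted vCPU never blocks waiters that are queued behind it; the price is that the lock is unfair.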

[PATCH V12 06/14] riscv: qspinlock: Introduce combo spinlock

2023-12-25 Thread guoren
From: Guo Ren Combo spinlock could support queued and ticket in one Linux Image and select them during boot time via command line. Here is the func size (Bytes) comparison table below: TYPE: COMBO | TICKET | QUEUED arch_spin_lock : 106 | 60 | 50

[PATCH V12 05/14] riscv: qspinlock: Add basic queued_spinlock support

2023-12-25 Thread guoren
From: Guo Ren The requirements of qspinlock have been documented by commit: a8ad07e5240c ("asm-generic: qspinlock: Indicate the use of mixed-size atomics"). Although the RISC-V ISA gives a weaker forward-progress guarantee for LR/SC, which doesn't satisfy the requirements of qspinlock above, it won't

[PATCH V12 04/14] riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup

2023-12-25 Thread guoren
From: Guo Ren Early versions of T-Head C9xx cores have a store merge buffer delay problem. The store merge buffer can improve store queue performance by merging multiple store requests, but when no further store requests follow, the prior single store request would be waiting in the

[PATCH V12 03/14] riscv: errata: Move errata vendor func-id into vendorid_list.h

2023-12-25 Thread guoren
From: Guo Ren Move errata vendor func-id definitions from errata_list into vendorid_list.h. Unifying these definitions also prepares for the following rwonce errata implementation. Suggested-by: Leonardo Bras Link: https://lore.kernel.org/linux-riscv/zqlfj1cmq8pao...@redhat.com/ Signed-off-by: Guo Ren

[PATCH V12 02/14] asm-generic: ticket-lock: Add separate ticket-lock.h

2023-12-25 Thread guoren
From: Guo Ren Add a separate ticket-lock.h to include multiple spinlock versions and select one at compile time or runtime. Reviewed-by: Leonardo Bras Suggested-by: Arnd Bergmann Link: https://lore.kernel.org/linux-riscv/cak8p3a2rnz9mqqhn6-e0cguuv9rntrelfdxt_weid7fxh7f...@mail.gmail.com/

[PATCH V12 01/14] asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock

2023-12-25 Thread guoren
From: Guo Ren The arch_spinlock_t of qspinlock already contains the atomic_t val, which satisfies the ticket-lock requirement. Thus, unify the arch_spinlock_t into qspinlock_types.h. This is preparation for the upcoming combo spinlock. Reviewed-by: Leonardo Bras Suggested-by: Arnd Bergmann Link:

[PATCH V12 00/14] riscv: Add Native/Paravirt qspinlock support

2023-12-25 Thread guoren
From: Guo Ren patch[1-8]: Native qspinlock; patch[9-14]: Paravirt qspinlock. This series is based on: - v6.7-rc7 - Rework & improve riscv cmpxchg.h and atomic.h https://lore.kernel.org/linux-riscv/20230810040349.92279-2-leob...@redhat.com/ You can directly try it:

[PATCH] set_thread_area.2: Add C-SKY document

2023-10-15 Thread guoren
From: Guo Ren C-SKY only needs set_thread_area, no need for get_thread_area, the same as MIPS. Signed-off-by: Guo Ren Signed-off-by: Guo Ren --- man2/set_thread_area.2 | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/man2/set_thread_area.2

[PATCH 3/3] riscv: Cleanup deprecated function strlen_user

2021-04-20 Thread guoren
From: Guo Ren $ grep strlen_user * -r arch/csky/include/asm/uaccess.h:#define strlen_user(str) strnlen_user(str, 32767) arch/csky/lib/usercopy.c: * strlen_user: - Get the size of a string in user space. arch/ia64/lib/strlen.S: // Please note that in the case of strlen() as opposed to

[PATCH 2/3] nios2: Cleanup deprecated function strlen_user

2021-04-20 Thread guoren
From: Guo Ren $ grep strlen_user * -r arch/csky/include/asm/uaccess.h:#define strlen_user(str) strnlen_user(str, 32767) arch/csky/lib/usercopy.c: * strlen_user: - Get the size of a string in user space. arch/ia64/lib/strlen.S: // Please note that in the case of strlen() as opposed to

[PATCH 1/3] nds32: Cleanup deprecated function strlen_user

2021-04-20 Thread guoren
From: Guo Ren $ grep strlen_user * -r arch/csky/include/asm/uaccess.h:#define strlen_user(str) strnlen_user(str, 32767) arch/csky/lib/usercopy.c: * strlen_user: - Get the size of a string in user space. arch/ia64/lib/strlen.S: // Please note that in the case of strlen() as opposed to

[PATCH v2 (RESEND) 2/2] riscv: atomic: Using ARCH_ATOMIC in asm/atomic.h

2021-04-16 Thread guoren
From: Guo Ren The linux/atomic-arch-fallback.h has been there for a while, but only x86 & arm64 support it. Let's make riscv follow the linux/arch/* development trend and make the code more readable and maintainable. This patch also cleans up some code: - Add atomic_andnot_* operation -

[PATCH v2 (RESEND) 1/2] locking/atomics: Fixup GENERIC_ATOMIC64 conflict with atomic-arch-fallback.h

2021-04-16 Thread guoren
From: Guo Ren The current GENERIC_ATOMIC64 in atomic-arch-fallback.h is broken. When a 32-bit arch uses atomic-arch-fallback.h, it causes a compile error. In file included from include/linux/atomic.h:81, from include/linux/rcupdate.h:25, from

[PATCH v2 2/2] riscv: atomic: Using ARCH_ATOMIC in asm/atomic.h

2021-04-16 Thread guoren
From: Guo Ren The linux/atomic-arch-fallback.h has been there for a while, but only x86 & arm64 support it. Let's make riscv follow the linux/arch/* development trend and make the code more readable and maintainable. This patch also cleans up some code: - Add atomic_andnot_* operation -

[PATCH v2 1/2] locking/atomics: Fixup GENERIC_ATOMIC64 conflict with atomic-arch-fallback.h

2021-04-16 Thread guoren
From: Guo Ren The current GENERIC_ATOMIC64 in atomic-arch-fallback.h is broken. When a 32-bit arch uses atomic-arch-fallback.h, it causes a compile error. In file included from include/linux/atomic.h:81, from include/linux/rcupdate.h:25, from

[PATCH] riscv: atomic: Using ARCH_ATOMIC in asm/atomic.h

2021-04-15 Thread guoren
From: Guo Ren The linux/atomic-arch-fallback.h has been there for a while, but only x86 & arm64 support it. Let's make riscv follow the linux/arch/* development trend and make the code more readable and maintainable. This patch also cleans up some code: - Add atomic_andnot_* operation -

[PATCH v6 9/9] powerpc/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

2021-03-31 Thread guoren
From: Guo Ren We don't have a native hw xchg16 instruction, so let the qspinlock generic code deal with it. Using full-word atomic xchg instructions to implement xchg16 carries a semantic risk for atomic operations. This patch cancels the dependency of the qspinlock generic code on architecture's

[PATCH v6 8/9] xtensa: qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

2021-03-31 Thread guoren
From: Guo Ren We don't have a native hw xchg16 instruction, so let the qspinlock generic code deal with it. Using full-word atomic xchg instructions to implement xchg16 carries a semantic risk for atomic operations. This patch cancels the dependency of the qspinlock generic code on architecture's

[PATCH v6 7/9] sparc: qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

2021-03-31 Thread guoren
From: Guo Ren We don't have a native hw xchg16 instruction, so let the qspinlock generic code deal with it. Using full-word atomic xchg instructions to implement xchg16 carries a semantic risk for atomic operations. This patch cancels the dependency of the qspinlock generic code on architecture's

[PATCH v6 6/9] openrisc: qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

2021-03-31 Thread guoren
From: Guo Ren We don't have a native hw xchg16 instruction, so let the qspinlock generic code deal with it. Using full-word atomic xchg instructions to implement xchg16 carries a semantic risk for atomic operations. This patch cancels the dependency of the qspinlock generic code on architecture's

[PATCH v6 5/9] csky: Convert custom spinlock/rwlock to generic qspinlock/qrwlock

2021-03-31 Thread guoren
From: Guo Ren Update the C-SKY port to use the generic qspinlock and qrwlock. C-SKY only supports ldex.w/stex.w with word (double word) size & aligned access, so it must select XCHG32 to let qspinlock use only the word-atomic xchg_tail. Default is still ticket lock. Signed-off-by: Guo Ren Cc: Waiman

[PATCH v6 3/9] riscv: locks: Introduce ticket-based spinlock implementation

2021-03-31 Thread guoren
From: Guo Ren This patch introduces a ticket lock implementation for riscv, along the same lines as the implementation for arch/arm & arch/csky. We still use qspinlock as default. Signed-off-by: Guo Ren Cc: Peter Zijlstra Cc: Anup Patel Cc: Arnd Bergmann --- arch/riscv/Kconfig

[PATCH v6 4/9] csky: locks: Optimize coding convention

2021-03-31 Thread guoren
From: Guo Ren - Using smp_cond_load_acquire in arch_spin_lock on Peter's advice. - Using __smp_acquire_fence in arch_spin_trylock - Using smp_store_release in arch_spin_unlock All of the above are just coding conventions and won't affect the function. TODO in smp_cond_load_acquire for
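The acquire/release pattern named in this entry (spin with an acquire load, unlock with a release store) can be sketched as a user-space C11 ticket lock. This is a minimal sketch and not the csky or riscv code: `struct ticket_lock` and the `ticket_lock_*` functions are hypothetical names, and C11 atomics stand in for the kernel's smp_cond_load_acquire() and smp_store_release().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ticket lock: 'next' is the next ticket to hand out,
 * 'owner' is the ticket currently allowed to hold the lock. */
struct ticket_lock {
    _Atomic uint16_t next;
    _Atomic uint16_t owner;
};

static inline void ticket_lock_acquire(struct ticket_lock *l)
{
    /* Take a ticket, then spin with acquire loads until it is served. */
    uint16_t me = atomic_fetch_add_explicit(&l->next, 1, memory_order_relaxed);

    while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
        ;
}

static inline bool ticket_lock_tryacquire(struct ticket_lock *l)
{
    uint16_t owner = atomic_load_explicit(&l->owner, memory_order_relaxed);
    uint16_t expected = owner;

    /* Succeeds only if next == owner, i.e. nobody holds or waits for the lock. */
    return atomic_compare_exchange_strong_explicit(&l->next, &expected,
                                                   (uint16_t)(owner + 1),
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}

static inline void ticket_lock_release(struct ticket_lock *l)
{
    /* Only the holder writes 'owner', so a plain read-modify-write is enough. */
    uint16_t owner = atomic_load_explicit(&l->owner, memory_order_relaxed);

    atomic_store_explicit(&l->owner, (uint16_t)(owner + 1), memory_order_release);
}
```

Tickets are served in order, which is what makes the lock fair: each waiter spins until `owner` reaches the ticket it drew.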

[PATCH v6 2/9] riscv: Convert custom spinlock/rwlock to generic qspinlock/qrwlock

2021-03-31 Thread guoren
From: Michael Clark Update the RISC-V port to use the generic qspinlock and qrwlock. This patch requires support for xchg_xtail for full-word, which is added by a previous patch: Guo added select ARCH_USE_QUEUED_SPINLOCKS_XCHG32 in Kconfig Guo fixed up a compile error which was caused by the below

[PATCH v6 1/9] locking/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

2021-03-31 Thread guoren
From: Guo Ren Some architectures don't have a sub-word atomic swap instruction; they only have the full-word one. The sub-word swap only improves performance when: NR_CPUS < 16K * 0- 7: locked byte * 8: pending * 9-15: not used * 16-17: tail index * 18-31: tail cpu (+1) The 9-15
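The 32-bit lock word layout listed in this entry (NR_CPUS < 16K) can be written out as masks and accessors. The sketch below only restates the bit positions from the list; the `Q_*` macro names and helper functions are hypothetical and are not claimed to match the kernel's own definitions.

```c
#include <stdint.h>

/* Bit positions from the layout in the entry above (NR_CPUS < 16K). */
#define Q_LOCKED_MASK      0x000000ffu  /* bits  0-7:  locked byte  */
#define Q_PENDING_OFFSET   8
#define Q_PENDING_MASK     0x00000100u  /* bit   8:    pending      */
#define Q_TAIL_IDX_OFFSET  16
#define Q_TAIL_IDX_MASK    0x00030000u  /* bits 16-17: tail index   */
#define Q_TAIL_CPU_OFFSET  18
#define Q_TAIL_CPU_MASK    0xfffc0000u  /* bits 18-31: tail cpu + 1 */

/* Pack a CPU number and per-CPU node index into the tail field.
 * The CPU is stored as cpu + 1 so that a tail of 0 means "no tail". */
static inline uint32_t q_encode_tail(uint32_t cpu, uint32_t idx)
{
    return ((cpu + 1) << Q_TAIL_CPU_OFFSET) | (idx << Q_TAIL_IDX_OFFSET);
}

/* Returns the CPU number, or -1 when the tail field is empty. */
static inline int q_tail_cpu(uint32_t val)
{
    return (int)((val & Q_TAIL_CPU_MASK) >> Q_TAIL_CPU_OFFSET) - 1;
}

static inline uint32_t q_tail_idx(uint32_t val)
{
    return (val & Q_TAIL_IDX_MASK) >> Q_TAIL_IDX_OFFSET;
}

static inline uint32_t q_locked(uint32_t val)
{
    return val & Q_LOCKED_MASK;
}

static inline uint32_t q_pending(uint32_t val)
{
    return (val & Q_PENDING_MASK) >> Q_PENDING_OFFSET;
}
```

With this layout the tail occupies exactly the upper 16 bits, which is why a 16-bit exchange is enough to update it on architectures that have one.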

[PATCH v6 0/9] riscv: Add qspinlock/qrwlock

2021-03-31 Thread guoren
From: Guo Ren riscv currently still uses a baby spinlock implementation, which causes fairness and cache line bouncing problems. Many people have been involved and have put effort into improving it: - The first version of patch was made in 2019.1:

[PATCH v5 7/7] xtensa: qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

2021-03-28 Thread guoren
From: Guo Ren We don't have a native hw xchg16 instruction, so let the qspinlock generic code deal with it. Using full-word atomic xchg instructions to implement xchg16 carries a semantic risk for atomic operations. This patch cancels the dependency of the qspinlock generic code on architecture's

[PATCH v5 6/7] sparc: qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

2021-03-28 Thread guoren
From: Guo Ren We don't have a native hw xchg16 instruction, so let the qspinlock generic code deal with it. Using full-word atomic xchg instructions to implement xchg16 carries a semantic risk for atomic operations. This patch cancels the dependency of the qspinlock generic code on architecture's

[PATCH v5 5/7] openrisc: qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

2021-03-28 Thread guoren
From: Guo Ren We don't have a native hw xchg16 instruction, so let the qspinlock generic code deal with it. Using full-word atomic xchg instructions to implement xchg16 carries a semantic risk for atomic operations. This patch cancels the dependency of the qspinlock generic code on architecture's

[PATCH v5 3/7] csky: Convert custom spinlock/rwlock to generic qspinlock/qrwlock

2021-03-28 Thread guoren
From: Guo Ren Update the C-SKY port to use the generic qspinlock and qrwlock. C-SKY only supports ldex.w/stex.w with word (double word) size & aligned access, so it must select XCHG32 to let qspinlock use only the word-atomic xchg_tail. Signed-off-by: Guo Ren Cc: Waiman Long Cc: Peter Zijlstra Cc:

[PATCH v5 4/7] powerpc/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

2021-03-28 Thread guoren
From: Guo Ren We don't have a native hw xchg16 instruction, so let the qspinlock generic code deal with it. Using full-word atomic xchg instructions to implement xchg16 carries a semantic risk for atomic operations. This patch cancels the dependency of the qspinlock generic code on architecture's

[PATCH v5 2/7] riscv: Convert custom spinlock/rwlock to generic qspinlock/qrwlock

2021-03-28 Thread guoren
From: Michael Clark Update the RISC-V port to use the generic qspinlock and qrwlock. This patch requires support for xchg_xtail for full-word, which is added by a previous patch: Guo added select ARCH_USE_QUEUED_SPINLOCKS_XCHG32 in Kconfig Guo fixed up a compile error which was caused by the below

[PATCH v5 1/7] locking/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

2021-03-28 Thread guoren
From: Guo Ren Some architectures don't have a sub-word atomic swap instruction; they only have the full-word one. The sub-word swap only improves performance when: NR_CPUS < 16K * 0- 7: locked byte * 8: pending * 9-15: not used * 16-17: tail index * 18-31: tail cpu (+1) The 9-15

[PATCH v5 0/7] riscv: Add qspinlock/qrwlock

2021-03-28 Thread guoren
From: Guo Ren riscv currently still uses a baby spinlock implementation, which causes fairness and cache line bouncing problems. Many people have been involved and have put effort into improving it: - The first version of patch was made in 2019.1:

[PATCH v4 3/4] locking/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

2021-03-27 Thread guoren
From: Guo Ren Some architectures don't have a sub-word atomic swap instruction; they only have the full-word one. The sub-word swap only improves performance when: NR_CPUS < 16K * 0- 7: locked byte * 8: pending * 9-15: not used * 16-17: tail index * 18-31: tail cpu (+1) The 9-15

[PATCH v4 4/4] riscv: Convert custom spinlock/rwlock to generic qspinlock/qrwlock

2021-03-27 Thread guoren
From: Michael Clark Update the RISC-V port to use the generic qspinlock and qrwlock. This patch requires support for xchg for short, which is added by a previous patch. Guo fixed up a compile error caused by the below include sequence: +#include +#include Signed-off-by: Michael Clark

[PATCH v4 2/4] riscv: cmpxchg.h: Merge macros

2021-03-27 Thread guoren
From: Guo Ren To reduce assembly code, let's merge the duplicated code into one (xchg_acquire, xchg_release, cmpxchg_release). Signed-off-by: Guo Ren Link: https://lore.kernel.org/linux-riscv/CAJF2gTT1_mP-wiK2HsCpTeU61NqZVKZX1A5ye=twqvgn4tp...@mail.gmail.com/ Cc: Peter Zijlstra Cc: Michael

[PATCH v4 1/4] riscv: cmpxchg.h: Cleanup unused code

2021-03-27 Thread guoren
From: Guo Ren Remove unnecessary macros; they are unused or handled by generic files (atomic-fallback.h, asm-generic/cmpxchg*). Signed-off-by: Guo Ren Link: https://lore.kernel.org/linux-riscv/CAJF2gTT1_mP-wiK2HsCpTeU61NqZVKZX1A5ye=twqvgn4tp...@mail.gmail.com/ Cc: Peter Zijlstra Cc: Michael

[PATCH v4 0/4] riscv: Add qspinlock/qrwlock

2021-03-27 Thread guoren
From: Guo Ren riscv currently still uses a baby spinlock implementation, which causes fairness and cache line bouncing problems. Many people have been involved and have put effort into improving it: - The first version of patch was made in 2019.1:

[PATCH v3 1/4] riscv: cmpxchg.h: Cleanup unused code

2021-03-25 Thread guoren
From: Guo Ren Remove unnecessary macros; they are unused or handled by generic files (atomic-fallback.h, asm-generic/cmpxchg*). Signed-off-by: Guo Ren Link: https://lore.kernel.org/linux-riscv/CAJF2gTT1_mP-wiK2HsCpTeU61NqZVKZX1A5ye=twqvgn4tp...@mail.gmail.com/ Cc: Peter Zijlstra Cc: Michael

[PATCH v3 2/4] riscv: cmpxchg.h: Merge macros

2021-03-25 Thread guoren
From: Guo Ren To reduce assembly code, let's merge the duplicated code into one (xchg_acquire, xchg_release, cmpxchg_release). Signed-off-by: Guo Ren Link: https://lore.kernel.org/linux-riscv/CAJF2gTT1_mP-wiK2HsCpTeU61NqZVKZX1A5ye=twqvgn4tp...@mail.gmail.com/ Cc: Peter Zijlstra Cc: Michael

[PATCH v3 3/4] riscv: cmpxchg.h: Implement xchg for short

2021-03-25 Thread guoren
From: Guo Ren riscv only supports lr.w(d)/sc.w(d) with word (double word) size & aligned access. There are no lr.h/sc.h instructions. But qspinlock.c needs xchg with a short type variable: xchg_tail -> xchg_relaxed(&lock->tail, ... typedef struct qspinlock { union { atomic_t val;
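One common way to emulate a 16-bit exchange when only word-sized atomics exist is a compare-and-swap loop on the containing 32-bit word. The sketch below shows that general technique in user-space C11; it is not the code from this patch, and `xchg16_emulated()` and its `shift` parameter are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper: atomically replace one 16-bit half of a 32-bit word
 * and return the previous value of that half. 'shift' selects the half:
 * 0 for bits 0-15, 16 for bits 16-31. The other half is preserved because
 * the CAS only succeeds if the whole word is unchanged since it was read.
 */
static inline uint16_t xchg16_emulated(_Atomic uint32_t *word, unsigned shift,
                                       uint16_t newval)
{
    uint32_t mask = (uint32_t)0xffff << shift;
    uint32_t old = atomic_load_explicit(word, memory_order_relaxed);
    uint32_t desired;

    do {
        desired = (old & ~mask) | ((uint32_t)newval << shift);
        /* On failure, 'old' is refreshed with the current word value. */
    } while (!atomic_compare_exchange_weak_explicit(word, &old, desired,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed));

    return (uint16_t)((old & mask) >> shift);
}
```

Unlike a native sub-word swap, the loop can be forced to retry by unrelated updates to the other half of the word, so forward progress depends on contention across the whole word.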

[PATCH v3 4/4] riscv: Convert custom spinlock/rwlock to generic qspinlock/qrwlock

2021-03-25 Thread guoren
From: Michael Clark Update the RISC-V port to use the generic qspinlock and qrwlock. This patch requires support for xchg for short, which is added by a previous patch. Guo fixed up a compile error caused by the below include sequence: +#include +#include Signed-off-by: Michael Clark

[PATCH v3 0/4] riscv: Add qspinlock/qrwlock

2021-03-25 Thread guoren
From: Guo Ren riscv currently still uses a baby spinlock implementation, which causes fairness and cache line bouncing problems. Many people have been involved and have put effort into improving it: - The first version of patch was made in 2019.1:

[PATCH] riscv: locks: introduce ticket-based spinlock implementation

2021-03-24 Thread guoren
From: Guo Ren This patch introduces a ticket lock implementation for riscv, along the same lines as the implementation for arch/arm & arch/csky. Signed-off-by: Guo Ren Cc: Catalin Marinas Cc: Will Deacon Cc: Peter Zijlstra Cc: Palmer Dabbelt Cc: Anup Patel Cc: Arnd Bergmann ---

[PATCH 2/2] riscv: Enable generic clockevent broadcast

2021-03-06 Thread guoren
From: Guo Ren When percpu-timers are stopped by deep power saving mode, we need system timer help to broadcast IPI_TIMER. This was first introduced for broken x86 hardware, where the local apic timer stops in C3 state. But many other architectures (powerpc, mips, arm, hexagon, openrisc, sh) have

[PATCH 1/2] csky: Enable generic clockevent broadcast

2021-03-06 Thread guoren
From: Guo Ren When percpu-timers are stopped by deep power saving mode, we need system timer help to broadcast IPI_TIMER. This was first introduced for broken x86 hardware, where the local apic timer stops in C3 state. But many other architectures (powerpc, mips, arm, hexagon, openrisc, sh) have

[PATCH 4/4] perf: csky: Using CPUHP_AP_ONLINE_DYN

2021-03-01 Thread guoren
From: Guo Ren Remove C-SKY perf custom definitions in hotplug.h: - CPUHP_AP_PERF_CSKY_ONLINE For coding convention. Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Arnd Bergmann Cc: Linus Torvalds Tested-by: Guo Ren Signed-off-by: Guo Ren Link:

[PATCH 3/4] clocksource: csky: Using CPUHP_AP_ONLINE_DYN

2021-03-01 Thread guoren
From: Guo Ren Remove C-SKY clocksource custom definitions in hotplug.h: - CPUHP_AP_CSKY_TIMER_STARTING For coding convention. Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Arnd Bergmann Cc: Linus Torvalds Tested-by: Guo Ren Signed-off-by: Guo Ren Link:

[PATCH 1/4] irqchip: riscv: Using CPUHP_AP_ONLINE_DYN

2021-03-01 Thread guoren
From: Guo Ren Remove RISC-V irqchip custom definitions in hotplug.h: - CPUHP_AP_IRQ_RISCV_STARTING - CPUHP_AP_IRQ_SIFIVE_PLIC_STARTING For coding convention. Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Arnd Bergmann Cc: Linus Torvalds Cc: Palmer Dabbelt Cc: Anup Patel Cc: Atish Patra

[PATCH 2/4] clocksource: riscv: Using CPUHP_AP_ONLINE_DYN

2021-03-01 Thread guoren
From: Guo Ren Remove RISC-V clocksource custom definitions in hotplug.h: - CPUHP_AP_RISCV_TIMER_STARTING For coding convention. Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Arnd Bergmann Cc: Linus Torvalds Cc: Anup Patel Cc: Christoph Hellwig Cc: Palmer Dabbelt Tested-by: Guo Ren

[GIT PULL] csky changes for v5.12-rc1

2021-02-27 Thread guoren
Hi Linus, The following changes since commit 7c53f6b671f4aba70ff15e1b05148b10d58c2837: Linux 5.11-rc3 (2021-01-10 14:34:50 -0800) are available in the Git repository at: https://github.com/c-sky/csky-linux.git tags/csky-for-linus-5.12-rc1 for you to fetch changes up to

[PATCH v2 2/2] drivers/clocksource: Fixup csky,mptimer compile error with CPU_CK610

2021-02-03 Thread guoren
From: Guo Ren The timer-mp-csky.c can only support CPU_CK860 and fails to compile with CPU_CK610. It has been selected in arch/csky/Kconfig. Signed-off-by: Guo Ren Cc: Daniel Lezcano Cc: Thomas Gleixner Cc: Marc Zyngier --- drivers/clocksource/Kconfig | 2 +- 1 file changed, 1

[PATCH v2 1/2] drivers/irqchip: Fixup csky,mpintc compile error with CPU_CK610

2021-02-03 Thread guoren
From: Guo Ren The irq-csky-mpintc.c can only support CPU_CK860 and fails to compile with CPU_CK610. It has been selected in arch/csky/Kconfig. Signed-off-by: Guo Ren Cc: Marc Zyngier --- drivers/irqchip/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- v2: Drop the

[PATCH 1/2] drivers/irqchip: Fixup csky,mpintc compile error with CPU_CK610

2021-02-03 Thread guoren
From: Guo Ren The irq-csky-mpintc.c can only support CPU_CK860 and fails to compile with CPU_CK610. Signed-off-by: Guo Ren Cc: Marc Zyngier --- drivers/irqchip/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig

[PATCH 2/2] drivers/clocksource: Fixup csky,mpintc compile error with CPU_CK610

2021-02-03 Thread guoren
From: Guo Ren The timer-mp-csky.c can only support CPU_CK860 and fails to compile with CPU_CK610. Signed-off-by: Guo Ren Cc: Daniel Lezcano Cc: Thomas Gleixner --- drivers/clocksource/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

[PATCH 15/29] csky: Fixup update_mmu_cache called with user io mapping

2021-01-21 Thread guoren
From: Guo Ren The function update_mmu_cache could be called by user-io mapping. There is no struct page in mem_map for the pte, so just ignore the user-io mapping in update_mmu_cache. Signed-off-by: Guo Ren --- arch/csky/abiv2/cacheflush.c | 3 +++ 1 file changed, 3 insertions(+) diff

[PATCH 09/29] csky: Fixup PTE global for 2.5:1.5 virtual memory

2021-01-21 Thread guoren
From: Guo Ren Fixup commit c2d1adfa9a24 "csky: Add memory layout 2.5G(user):1.5G (kernel)". That patch broke the global bit in PTE. C-SKY TLB's entry contain two pages: vpn, vpn + 1 -> ppn0, ppn1 All PPN's attributes contain global bit and final global is PPN0.G & PPN1.G. So we must keep

[PATCH 06/29] csky: Fixup futex SMP implementation

2021-01-20 Thread guoren
From: Guo Ren Arnd said: I would guess that for csky, this is a mistake, as the architecture is fairly new and should be able to implement it. Guo replied: The c610, c807, c810 don't support SMP, so futex_cmpxchg_enabled = 1 with asm-generic's implementation. For c860, there is no

[PATCH 11/29] csky: Add kmemleak support

2021-01-20 Thread guoren
From: Guo Ren Here is the log after enabling it: [1.798972] kmemleak: Kernel memory leak detector initialized (mem pool available: 15851) [1.798983] kmemleak: Automatic memory scanning thread started Signed-off-by: Guo Ren --- arch/csky/Kconfig | 1 + 1 file changed, 1 insertion(+)

[PATCH 08/29] csky: Cleanup asm/spinlock.h

2021-01-20 Thread guoren
From: Guo Ren There are two implementations of spinlock in arch/csky: - simple one (NR_CPU = 1,2) - ticket one (NR_CPU = 3,4) Remove the simple one. There is already smp_mb in spinlock, so remove the definition of smp_mb__after_spinlock. Link:

[PATCH 07/29] csky: Fixup asm/cmpxchg.h with correct ordering barrier

2021-01-20 Thread guoren
From: Guo Ren Optimize the performance of cmpxchg by using more fine-grained acquire/release barriers. Signed-off-by: Guo Ren Cc: Peter Zijlstra Cc: Arnd Bergmann Cc: Paul E. McKenney --- arch/csky/include/asm/cmpxchg.h | 27 +-- 1 file changed, 17 insertions(+), 10

[PATCH 10/29] csky: Remove prologue of page fault handler in entry.S

2021-01-20 Thread guoren
From: Guo Ren There is a prologue in the page fault handler which marks pages dirty and/or accessed in page attributes, but all of these have been handled in handle_pte_fault. - Add flush_tlb_one in vmalloc page fault instead of prologue. - Using cmxchg_fixup C codes in do_page_fault instead of

[PATCH 13/29] csky: Add show_tlb for CPU_CK860 debug

2021-01-20 Thread guoren
From: Guo Ren Print all 1024 jtlb entries and 16 iutlb entries and 16 dutlb entries in show_regs. Signed-off-by: Guo Ren --- arch/csky/kernel/ptrace.c | 121 ++ 1 file changed, 121 insertions(+) diff --git a/arch/csky/kernel/ptrace.c

[PATCH 14/29] csky: Fixup FAULT_FLAG_XXX param for handle_mm_fault

2021-01-20 Thread guoren
From: Guo Ren The past code only passes FAULT_FLAG_WRITE into handle_mm_fault and misses USER & DEFAULT & RETRY. The patch refers to arch/riscv/mm/fault.c, but there is no FAULT_FLAG_INSTRUCTION in csky hw. Signed-off-by: Guo Ren --- arch/csky/mm/fault.c | 23 +--

[PATCH 12/29] csky: Fix TLB maintenance synchronization problem

2021-01-20 Thread guoren
From: Guo Ren TLB invalidate didn't contain a barrier operation in csky cpu and we need to prevent previous PTW response after TLB invalidation instruction. Of course, the ASID changing also needs to take care of the issue. CPU0 CPU1 === ===

[PATCH 17/29] csky: Fixup do_page_fault parent irq status

2021-01-20 Thread guoren
From: Guo Ren We must inherit the parent context's irq status in the page fault handler. Signed-off-by: Guo Ren --- arch/csky/kernel/entry.S | 2 +- arch/csky/mm/fault.c | 4 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/csky/kernel/entry.S b/arch/csky/kernel/entry.S

[PATCH 20/29] csky: Reconstruct VDSO framework

2021-01-20 Thread guoren
From: Guo Ren Reconstruct vdso framework to support future vsyscall, vgettimeofday features. These are very important features to reduce system calls into the kernel for performance improvement. The patch is reference RISC-V's Signed-off-by: Guo Ren Cc: Palmer Dabbelt ---

[PATCH 29/29] csky: Fixup pfn_valid error with wrong max_mapnr

2021-01-20 Thread guoren
From: Guo Ren The max_mapnr is the number of PFNs, not absolute PFN offset. Signed-off-by: Guo Ren --- arch/csky/mm/init.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/csky/mm/init.c b/arch/csky/mm/init.c index 03970f4408f5..81e4e5e78f38 100644 ---
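The difference between a PFN count and an absolute PFN can be shown with a small model. The sketch below assumes a flat-memory style validity check of the form `(pfn - offset) < max_mapnr`; whether csky's pfn_valid() has exactly this shape is an assumption, and `struct flat_mem_model` and `model_pfn_valid()` are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical model of a flat memory map that starts at a non-zero PFN. */
struct flat_mem_model {
    unsigned long pfn_offset; /* first PFN covered by mem_map    */
    unsigned long max_mapnr;  /* number of PFNs covered (a count) */
};

/*
 * Valid when the PFN lies within [pfn_offset, pfn_offset + max_mapnr).
 * PFNs below pfn_offset wrap around to a huge unsigned value and fail.
 */
static inline bool model_pfn_valid(const struct flat_mem_model *m,
                                   unsigned long pfn)
{
    return (pfn - m->pfn_offset) < m->max_mapnr;
}
```

For RAM covering PFNs 0x80000 to 0x8ffff, the count is 0x10000. Storing the absolute end PFN 0x90000 in `max_mapnr` instead would make PFNs past the end of RAM, such as 0x95000, look valid.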

[PATCH 26/29] csky: kprobe: fix code in simulate without 'long'

2021-01-20 Thread guoren
From: Guo Ren The type of 'val' is 'unsigned long' in simulate_blz32, so 'val < 0' can't be true. Cast 'val' to 'long' here to determine whether the branch is taken or not. Fixup instructions: bnezad32, bhsz32, bhz32, blsz32, blz32 Link:
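The bug class described in this entry is easy to reproduce in isolation: comparing an unsigned value with `< 0` is always false, so the sign test has to be done on a signed view of the register. A minimal sketch, with the hypothetical name `branch_if_negative()` standing in for the sign test in the simulate_* functions:

```c
#include <stdbool.h>

/*
 * 'val' holds a raw register value as unsigned long. Testing 'val < 0'
 * directly can never be true; casting to long makes the top bit act as
 * the sign. (The unsigned-to-signed conversion is implementation-defined
 * in ISO C, but gcc and clang define it as two's-complement wraparound.)
 */
static inline bool branch_if_negative(unsigned long val)
{
    return (long)val < 0;
}
```

For example, a register holding -1 (all bits set) now reports a taken branch, while 0 and small positive values do not.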

[PATCH 25/29] csky: Fixup swapon

2021-01-20 Thread guoren
From: Guo Ren Current csky's swapon is broken by a wrong swap PTE entry format. Now redesign the format for abiv1 & abiv2 and make swapon + zram work properly on csky machines. C-SKY PTE has VALID, DIRTY to emulate PRESENT, READ, WRITE, EXEC attributes. GLOBAL bit is shared by two pages in

[PATCH 19/29] csky: mm: abort uaccess retries upon fatal signal

2021-01-20 Thread guoren
From: Guo Ren Pick up the patch from the 'Link' made by Mark Rutland. Keep the same with x86, arm, arm64, arc, sh, power. Link: https://lore.kernel.org/linux-arm-kernel/1499782763-31418-1-git-send-email-mark.rutl...@arm.com/ Signed-off-by: Guo Ren Cc: Mark Rutland --- arch/csky/mm/fault.c |

[PATCH 27/29] csky: Add VDSO with GENERIC_GETTIMEOFDAY, GENERIC_TIME_VSYSCALL, HAVE_GENERIC_VDSO

2021-01-20 Thread guoren
From: Guo Ren It could help to reduce the latency of the time-related functions in user space. We have referenced arm's and riscv's implementation for the patch. Signed-off-by: Guo Ren Cc: Vincent Chen Cc: Arnd Bergmann --- arch/csky/Kconfig | 4 +

[PATCH 21/29] csky: Fix a size determination in gpr_get()

2021-01-20 Thread guoren
From: Zhenzhong Duan "*" is missed in size determination as we are passing register set rather than a pointer. Fixes: dcad7854fcce ("sky: switch to ->regset_get()") Signed-off-by: Zhenzhong Duan Signed-off-by: Guo Ren --- arch/csky/kernel/ptrace.c | 2 +- 1 file changed, 1 insertion(+), 1

[PATCH 28/29] csky: Using set_max_mapnr api

2021-01-20 Thread guoren
From: Guo Ren Using set_max_mapnr API instead of setting the value directly. Signed-off-by: Guo Ren --- arch/csky/mm/init.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/csky/mm/init.c b/arch/csky/mm/init.c index bc05a3be9d57..03970f4408f5 100644 ---

[PATCH 18/29] csky: Sync riscv mm/fault.c for easy maintenance

2021-01-20 Thread guoren
From: Guo Ren Sync arch/riscv/mm/fault.c into arch/csky for easy maintenance. Here are the patches related to the modification: cac4d1d "riscv/mm/fault: Move no context handling to no_context()" ac416a7 "riscv/mm/fault: Move vmalloc fault handling to vmalloc_fault()" 6c11ffb "riscv/mm/fault:

[PATCH 24/29] csky: Coding convention del unnecessary definition

2021-01-20 Thread guoren
From: Guo Ren Remove _PAGE_IOREMAP, __READABLE, __WRITEABLE, abi/pgtable-bits.h definition; they are not used at all. Signed-off-by: Guo Ren Cc: Arnd Bergmann --- arch/csky/abiv1/inc/abi/pgtable-bits.h | 17 ++-- arch/csky/abiv2/inc/abi/pgtable-bits.h | 14 +-

[PATCH 23/29] csky: Fixup _PAGE_ACCESSED for default pgprot

2021-01-20 Thread guoren
From: Guo Ren When the system memory is exhausted, linux will trigger kswapd to shrink memory page cache. We found the csky's .text file mapping pages would be reclaimed earlier than arm's elf. Because csky doesn't give _PAGE_ACCESSED for default pgprot and in zap_pte_range if (pte_young(ptent)

[PATCH 22/29] csky: remove unused including

2021-01-20 Thread guoren
From: Tian Tao Remove an include that is not needed. Signed-off-by: Tian Tao Signed-off-by: Guo Ren --- arch/csky/include/asm/thread_info.h | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/csky/include/asm/thread_info.h b/arch/csky/include/asm/thread_info.h index

[PATCH 16/29] csky: Add faulthandler_disabled() check

2021-01-20 Thread guoren
From: Guo Ren As on other architectures, in addition to in_atomic() we also need to check pagefault_disabled(). Signed-off-by: Guo Ren --- arch/csky/mm/fault.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/csky/mm/fault.c b/arch/csky/mm/fault.c index
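A user-space sketch of why both conditions matter, assuming the check combines them as "pagefault disabled OR in atomic context" (struct demo_ctx and the demo_* helpers are hypothetical models, not kernel code):

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-task state the kernel consults. */
struct demo_ctx {
	int preempt_count;      /* non-zero models in_atomic() */
	int pagefault_disabled; /* non-zero models pagefault_disabled() */
};

static bool demo_in_atomic(const struct demo_ctx *c)
{
	return c->preempt_count != 0;
}

static bool demo_pagefault_disabled(const struct demo_ctx *c)
{
	return c->pagefault_disabled != 0;
}

/* Combined check: the fault handler must not sleep if either one holds. */
static bool demo_faulthandler_disabled(const struct demo_ctx *c)
{
	return demo_pagefault_disabled(c) || demo_in_atomic(c);
}
```

A context with page faults disabled but a zero preempt count is the case that an in_atomic()-only check would miss.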

[PATCH 02/29] csky: Fixup perf probe failed

2021-01-20 Thread guoren
From: Guo Ren Current perf init fails with: [1.452433] csky-pmu: probe of soc:pmu failed with error -16 This patch fixes it by adding CPUHP_AP_PERF_CSKY_ONLINE in cpuhotplug.h. Signed-off-by: Guo Ren Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo ---

[PATCH 03/29] csky: Fixup show_regs doesn't contain regs->usp

2021-01-20 Thread guoren
From: Guo Ren The current show_regs doesn't display regs->usp, which makes debugging confusing. So fix up the wrong SP display and add PT_REGS. Signed-off-by: Guo Ren --- arch/csky/kernel/ptrace.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/csky/kernel/ptrace.c

[PATCH 05/29] csky: Fixup barrier design

2021-01-20 Thread guoren
From: Guo Ren Remove the shareable bit from the ordering barrier; keeping ordering within the current hart is enough for SMP. Use three consecutive sync.is instructions as a PTW barrier to prevent speculative PTW on the 860 microarchitecture. Signed-off-by: Guo Ren --- arch/csky/include/asm/barrier.h | 82

[PATCH 04/29] csky: Remove custom asm/atomic.h implementation

2021-01-20 Thread guoren
From: Guo Ren Use the generic atomic implementation based on cmpxchg, and so remove csky's asm/atomic.h. Signed-off-by: Guo Ren Cc: Peter Zijlstra Cc: Arnd Bergmann Cc: Paul E. McKenney --- arch/csky/include/asm/atomic.h | 212 - 1 file changed, 212 deletions(-)
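To illustrate what "atomic operations based on cmpxchg" means, here is a user-space sketch using C11 atomics rather than the kernel's own primitives (demo_atomic_add_return is a hypothetical name; this is not the kernel's generic implementation):

```c
#include <stdatomic.h>

/*
 * Add `delta` to `*v` using only compare-and-exchange: read the current
 * value, try to swap in the sum, and retry if another thread got in first.
 */
static int demo_atomic_add_return(atomic_int *v, int delta)
{
	int old = atomic_load(v);

	/* On failure, `old` is refreshed with the current value of *v. */
	while (!atomic_compare_exchange_weak(v, &old, old + delta))
		;

	return old + delta;
}
```

Any read-modify-write operation can be built from the same retry loop, which is why an architecture that provides cmpxchg may not need a hand-written atomic.h.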

[PATCH 01/29] csky: Add memory layout 2.5G(user):1.5G(kernel)

2021-01-20 Thread guoren
From: Guo Ren There are two ways of translating va to pa on csky: - Use the TLB (Translation Lookaside Buffer) and PTW (Page Table Walk) - Use SSEG0/1 (Simple Segment Mapping) We use TLB mapping for the 0-2G and 3G-4G virtual address areas, and SSEG0/1 are for 2G-2.5G and 2.5G-3G translation. We could disable

[PATCH] riscv: Fixup pfn_valid error with wrong max_mapnr

2021-01-20 Thread guoren
From: Guo Ren max_mapnr is the number of PFNs, not an absolute PFN offset. Signed-off-by: Guo Ren Cc: Palmer Dabbelt --- arch/riscv/mm/init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index bf5379135e39..1576ee9e5e94
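A small model of why the distinction matters, assuming a flat memory map whose validity check subtracts the first PFN of RAM before comparing against max_mapnr (DEMO_PFN_OFFSET, demo_max_mapnr and demo_pfn_valid are hypothetical names and values):

```c
#include <stdbool.h>

/* Hypothetical first page frame number of RAM. */
#define DEMO_PFN_OFFSET 0x80000UL

/* Models max_mapnr: meant to hold a count of page frames. */
static unsigned long demo_max_mapnr;

/* A PFN is valid if it lies within [offset, offset + count). */
static bool demo_pfn_valid(unsigned long pfn)
{
	return pfn >= DEMO_PFN_OFFSET &&
	       (pfn - DEMO_PFN_OFFSET) < demo_max_mapnr;
}
```

If demo_max_mapnr is set to the absolute end PFN instead of the page count, frames beyond the end of RAM pass the check.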

[PATCH] riscv: Remove duplicate definition in pagtable.h

2021-01-11 Thread guoren
From: Guo Ren PAGE_KERNEL_EXEC has been defined above. Signed-off-by: Guo Ren Cc: Palmer Dabbelt Cc: Pekka Enberg --- arch/riscv/include/asm/pgtable.h | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index

[PATCH] riscv: Fixup CONFIG_GENERIC_TIME_VSYSCALL

2021-01-02 Thread guoren
From: Guo Ren This patch fixes commit ad5d112 ("riscv: use vDSO common flow to reduce the latency of the time-related functions"). GENERIC_TIME_VSYSCALL should be CONFIG_GENERIC_TIME_VSYSCALL, or vgettimeofday won't work. Signed-off-by: Guo Ren Cc: Atish Patra Cc: Palmer Dabbelt Cc:
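Kconfig symbols reach C code only with the CONFIG_ prefix, so a preprocessor test on the bare name silently compiles out the guarded code. A minimal sketch of the effect (DEMO_FEATURE is a hypothetical symbol standing in for the real option):

```c
/* Stands in for what the build system would define for an enabled option. */
#define CONFIG_DEMO_FEATURE 1

/* Bare name: never defined, so the guarded branch is always compiled out. */
static int demo_seen_without_prefix(void)
{
#ifdef DEMO_FEATURE
	return 1;
#else
	return 0;
#endif
}

/* Prefixed name: matches the generated define, so the branch is kept. */
static int demo_seen_with_prefix(void)
{
#ifdef CONFIG_DEMO_FEATURE
	return 1;
#else
	return 0;
#endif
}
```

No compiler warning is produced for the first form, which is why this kind of mistake tends to show up only as missing functionality at run time.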

[PATCH] riscv: mm: abort uaccess retries upon fatal signal

2020-12-30 Thread guoren
From: Guo Ren Pick up the patch by Mark Rutland from the 'Link' below. This keeps riscv the same as x86, arm, arm64, arc, sh, power. Link: https://lore.kernel.org/linux-arm-kernel/1499782763-31418-1-git-send-email-mark.rutl...@arm.com/ Signed-off-by: Guo Ren Cc: Mark Rutland Cc: Pekka Enberg Cc:

[PATCH] mm: page-flags.h: Typo fix (It -> If)

2020-12-25 Thread guoren
From: Guo Ren The "If" was wrongly spelled as "It". Signed-off-by: Guo Ren Cc: Andrew Morton Cc: Oscar Salvador Cc: Alexander Duyck Cc: David Hildenbrand Cc: Steven Price --- include/linux/page-flags.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git

[PATCH v2 5/5] csky: Cleanup asm/spinlock.h

2020-12-20 Thread guoren
From: Guo Ren There are two implementations of spinlock in arch/csky: - simple one (NR_CPU = 1,2) - ticket one (NR_CPU = 3,4) Remove the simple one. There is already smp_mb in spinlock, so remove the definition of smp_mb__after_spinlock. Link:

[PATCH v2 3/5] csky: Fixup futex SMP implementation

2020-12-20 Thread guoren
From: Guo Ren Arnd said: I would guess that for csky, this is a mistake, as the architecture is fairly new and should be able to implement it. Guo's reply: The c610, c807, c810 don't support SMP, so futex_cmpxchg_enabled = 1 with asm-generic's implementation. For c860, there is no
