Re: [PATCH V11 07/17] riscv: qspinlock: Introduce qspinlock param for command line

2023-09-14 Thread Waiman Long
On 9/14/23 03:32, Leonardo Bras wrote: On Tue, Sep 12, 2023 at 09:08:34AM +0800, Guo Ren wrote: On Mon, Sep 11, 2023 at 11:34 PM Waiman Long wrote: On 9/10/23 04:29, guo...@kernel.org wrote: From: Guo Ren Allow cmdline to force the kernel to use queued_spinlock when

Re: [PATCH V10 07/19] riscv: qspinlock: errata: Introduce ERRATA_THEAD_QSPINLOCK

2023-09-13 Thread Waiman Long
On 9/13/23 14:54, Palmer Dabbelt wrote: On Sun, 06 Aug 2023 22:23:34 PDT (-0700), sor...@fastmail.com wrote: On Wed, Aug 2, 2023, at 12:46 PM, guo...@kernel.org wrote: From: Guo Ren According to qspinlock requirements, RISC-V gives out a weak LR/SC forward progress guarantee which does not

Re: [PATCH V11 04/17] locking/qspinlock: Improve xchg_tail for number of cpus >= 16k

2023-09-13 Thread Waiman Long
On 9/13/23 08:52, Guo Ren wrote: On Wed, Sep 13, 2023 at 4:55 PM Leonardo Bras wrote: On Tue, Sep 12, 2023 at 09:10:08AM +0800, Guo Ren wrote: On Mon, Sep 11, 2023 at 9:03 PM Waiman Long wrote: On 9/10/23 23:09, Guo Ren wrote: On Mon, Sep 11, 2023 at 10:35 AM Waiman Long wrote: On 9/10

Re: [PATCH V11 07/17] riscv: qspinlock: Introduce qspinlock param for command line

2023-09-11 Thread Waiman Long
On 9/10/23 04:29, guo...@kernel.org wrote: From: Guo Ren Allow cmdline to force the kernel to use queued_spinlock when CONFIG_RISCV_COMBO_SPINLOCKS=y. Signed-off-by: Guo Ren Signed-off-by: Guo Ren --- Documentation/admin-guide/kernel-parameters.txt | 2 ++ arch/riscv/kernel/setup.c
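
For context, the boot-parameter plumbing in a patch like this is tiny; a minimal sketch assuming the mainline early_param() machinery (flag and function names are approximations, not the literal patch):

    static bool enable_qspinlock __ro_after_init;

    static int __init queued_spinlock_setup(char *p)
    {
            enable_qspinlock = true;
            return 0;
    }
    early_param("qspinlock", queued_spinlock_setup);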

Re: [PATCH V11 04/17] locking/qspinlock: Improve xchg_tail for number of cpus >= 16k

2023-09-11 Thread Waiman Long
On 9/10/23 23:09, Guo Ren wrote: On Mon, Sep 11, 2023 at 10:35 AM Waiman Long wrote: On 9/10/23 04:28, guo...@kernel.org wrote: From: Guo Ren The target of xchg_tail is to write the tail to the lock value, so adding prefetchw could help the next cmpxchg step, which may decrease the cmpxchg

Re: [PATCH V11 04/17] locking/qspinlock: Improve xchg_tail for number of cpus >= 16k

2023-09-10 Thread Waiman Long
On 9/10/23 04:28, guo...@kernel.org wrote: From: Guo Ren The target of xchg_tail is to write the tail to the lock value, so adding prefetchw could help the next cmpxchg step, which may decrease the cmpxchg retry loops of xchg_tail. Some processors may utilize this feature to give a forward
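
A minimal sketch of the idea, based on mainline's NR_CPUS >= 16K xchg_tail() with the prefetch from this thread added (needs <linux/prefetch.h>; not the literal patch):

    static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
    {
            u32 old, new, val;

            prefetchw(&lock->val);          /* pull the line in for write early */
            val = atomic_read(&lock->val);

            for (;;) {
                    new = (val & _Q_LOCKED_PENDING_MASK) | tail;
                    old = atomic_cmpxchg_relaxed(&lock->val, val, new);
                    if (old == val)
                            break;
                    val = old;              /* lost the race; retry with fresh value */
            }
            return old;
    }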

Re: [PATCH V10 18/19] locking/qspinlock: Move pv_ops into x86 directory

2023-08-11 Thread Waiman Long
On 8/11/23 20:24, Guo Ren wrote: On Sat, Aug 12, 2023 at 4:42 AM Waiman Long wrote: On 8/2/23 12:47, guo...@kernel.org wrote: From: Guo Ren The pv_ops belongs to x86 custom infrastructure and cleans up the cna_configure_spin_lock_slowpath() with standard code. This is preparation for riscv

Re: [PATCH V10 18/19] locking/qspinlock: Move pv_ops into x86 directory

2023-08-11 Thread Waiman Long
On 8/2/23 12:47, guo...@kernel.org wrote: From: Guo Ren The pv_ops belongs to x86 custom infrastructure and cleans up the cna_configure_spin_lock_slowpath() with standard code. This is preparation for riscv support CNA qspinlock. CNA qspinlock has not been merged into mainline yet. I will

Re: [PATCH V10 05/19] riscv: qspinlock: Introduce combo spinlock

2023-08-11 Thread Waiman Long
On 8/2/23 12:46, guo...@kernel.org wrote: From: Guo Ren Combo spinlock could support queued and ticket in one Linux Image and select them during boot time via errata mechanism. Here is the func size (Bytes) comparison table below: TYPE: COMBO | TICKET | QUEUED
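
The boot-time selection boils down to a static-branch dispatch; a hypothetical sketch (key and helper names invented here, the posted patch may differ):

    DECLARE_STATIC_KEY_TRUE(combo_qspinlock_key);   /* flipped once by the errata probe */

    static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
    {
            if (static_branch_likely(&combo_qspinlock_key))
                    queued_spin_lock(lock);
            else
                    ticket_spin_lock(lock);
    }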

Re: [PATCH V10 04/19] riscv: qspinlock: Add basic queued_spinlock support

2023-08-11 Thread Waiman Long
On 8/2/23 12:46, guo...@kernel.org wrote: \ diff --git a/arch/riscv/include/asm/spinlock.h b/arch/riscv/include/asm/spinlock.h new file mode 100644 index ..c644a92d4548 --- /dev/null +++ b/arch/riscv/include/asm/spinlock.h @@ -0,0 +1,17 @@ +/* SPDX-License-Identifier:

Re: [PATCH v7] x86/paravirt: useless assignment instructions cause Unixbench full core performance degradation

2022-06-28 Thread Waiman Long
u(node->prev + vcpu_is_preempted_node(node))) return true; /* unqueue */ Reviewed-by: Waiman Long ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/m

Re: [PATCH v6] x86/paravirt: useless assignment instructions cause Unixbench full core performance degradation

2022-06-28 Thread Waiman Long
On 6/28/22 08:54, Guo Hui wrote: The instructions assigned to the vcpu_is_preempted function parameter in the X86 architecture physical machine are redundant instructions, causing the multi-core performance of Unixbench to drop by about 4% to 5%. The C function is as follows: static bool
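
The fix pattern discussed in this thread hides the per-node cpu lookup behind a static key so bare metal never sets up the argument; a sketch with names approximating the thread (see the DECLARE_STATIC_KEY_FALSE() quoted further down):

    DECLARE_STATIC_KEY_FALSE(preempted_key);        /* enabled only under a hypervisor */

    static inline bool vcpu_is_preempted_node(struct optimistic_spin_node *node)
    {
            if (!static_branch_unlikely(&preempted_key))
                    return false;           /* bare metal: skip node_cpu() entirely */
            return vcpu_is_preempted(node_cpu(node->prev));
    }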

Re: [PATCH v4] x86/paravirt: useless assignment instructions cause Unixbench full core performance degradation

2022-06-27 Thread Waiman Long
On 6/27/22 10:27, Guo Hui wrote: The instructions assigned to the vcpu_is_preempted function parameter in the X86 architecture physical machine are redundant instructions, causing the multi-core performance of Unixbench to drop by about 4% to 5%. The C function is as follows: static bool

Re: [PATCH v2] x86/paravirt: useless assignment instructions cause Unixbench full core performance degradation

2022-06-27 Thread Waiman Long
On 6/27/22 01:54, Guo Hui wrote: Thank you very much Longman, my patch is as you said, only disable node_cpu on X86, enable node_cpu on arm64, powerpc, s390 architectures; the code is in file arch/x86/kernel/paravirt-spinlocks.c: DECLARE_STATIC_KEY_FALSE(preemted_key);

Re: [PATCH v2] x86/paravirt: useless assignment instructions cause Unixbench full core performance degradation

2022-06-26 Thread Waiman Long
On 6/26/22 22:13, Guo Hui wrote: The instructions assigned to the vcpu_is_preempted function parameter in the X86 architecture physical machine are redundant instructions, causing the multi-core performance of Unixbench to drop by about 4% to 5%. The C function is as follows: static bool

Re: [PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR

2020-07-25 Thread Waiman Long
On 7/25/20 1:26 PM, Peter Zijlstra wrote: On Fri, Jul 24, 2020 at 03:10:59PM -0400, Waiman Long wrote: On 7/24/20 4:16 AM, Will Deacon wrote: On Thu, Jul 23, 2020 at 08:47:59PM +0200, pet...@infradead.org wrote: On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote: BTW, do you have

Re: [PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR

2020-07-24 Thread Waiman Long
On 7/24/20 3:10 PM, Waiman Long wrote: On 7/24/20 4:16 AM, Will Deacon wrote: On Thu, Jul 23, 2020 at 08:47:59PM +0200, pet...@infradead.org wrote: On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote: BTW, do you have any comment on my v2 lock holder cpu info qspinlock patch? I

Re: [PATCH v4 0/6] powerpc: queued spinlocks and rwlocks

2020-07-24 Thread Waiman Long
/powerpc/include/asm/simple_spinlock.h create mode 100644 arch/powerpc/include/asm/simple_spinlock_types.h That patch series looks good to me. Thanks for working on this. For the series, Acked-by: Waiman Long ___ Virtualization mailing list

Re: [PATCH v4 6/6] powerpc: implement smp_cond_load_relaxed

2020-07-24 Thread Waiman Long
On 7/24/20 9:14 AM, Nicholas Piggin wrote: This implements smp_cond_load_relaed with the slowpath busy loop using the Nit: "smp_cond_load_relaxed" Cheers, Longman ___ Virtualization mailing list Virtualization@lists.linux-foundation.org

Re: [PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR

2020-07-24 Thread Waiman Long
On 7/24/20 4:16 AM, Will Deacon wrote: On Thu, Jul 23, 2020 at 08:47:59PM +0200, pet...@infradead.org wrote: On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote: BTW, do you have any comment on my v2 lock holder cpu info qspinlock patch? I will have to update the patch to fix

Re: [PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR

2020-07-23 Thread Waiman Long
On 7/23/20 3:58 PM, pet...@infradead.org wrote: On Thu, Jul 23, 2020 at 03:04:13PM -0400, Waiman Long wrote: On 7/23/20 2:47 PM, pet...@infradead.org wrote: On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote: BTW, do you have any comment on my v2 lock holder cpu info qspinlock patch

Re: [PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR

2020-07-23 Thread Waiman Long
On 7/23/20 2:47 PM, pet...@infradead.org wrote: On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote: BTW, do you have any comment on my v2 lock holder cpu info qspinlock patch? I will have to update the patch to fix the reported 0-day test problem, but I want to collect other feedback

Re: [PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR

2020-07-23 Thread Waiman Long
On 7/23/20 10:00 AM, Peter Zijlstra wrote: On Thu, Jul 09, 2020 at 12:06:13PM -0400, Waiman Long wrote: We don't really need to do a pv_spinlocks_init() if pv_kick() isn't supported. Waiman, if you cannot explain how not having kick is a sane thing, what are you saying here? The current PPC

Re: [PATCH v3 0/6] powerpc: queued spinlocks and rwlocks

2020-07-23 Thread Waiman Long
On 7/23/20 9:30 AM, Nicholas Piggin wrote: I would prefer to extract out the pending bit handling code out into a separate helper function which can be overridden by the arch code instead of breaking the slowpath into 2 pieces. You mean have the arch provide a queued_spin_lock_slowpath_pending

Re: [PATCH v3 0/6] powerpc: queued spinlocks and rwlocks

2020-07-21 Thread Waiman Long
On 7/21/20 7:08 AM, Nicholas Piggin wrote: diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h index b752d34517b3..26d8766a1106 100644 --- a/arch/powerpc/include/asm/qspinlock.h +++ b/arch/powerpc/include/asm/qspinlock.h @@ -31,16 +31,57 @@ static inline

Re: [PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR

2020-07-09 Thread Waiman Long
++ arch/powerpc/platforms/pseries/Kconfig| 5 ++ arch/powerpc/platforms/pseries/setup.c| 6 +- include/asm-generic/qspinlock.h | 2 + Another ack? I am OK with adding the #ifdef around queued_spin_lock(). Acked-by: Waiman Long diff --git a/arch/powerpc

Re: [PATCH v3 0/6] powerpc: queued spinlocks and rwlocks

2020-07-08 Thread Waiman Long
On 7/8/20 7:50 PM, Waiman Long wrote: On 7/8/20 1:10 AM, Nicholas Piggin wrote: Excerpts from Waiman Long's message of July 8, 2020 1:33 pm: On 7/7/20 1:57 AM, Nicholas Piggin wrote: Yes, powerpc could certainly get more performance out of the slow paths, and then there are a few parameters

Re: [PATCH v3 0/6] powerpc: queued spinlocks and rwlocks

2020-07-08 Thread Waiman Long
On 7/8/20 4:41 AM, Peter Zijlstra wrote: On Tue, Jul 07, 2020 at 03:57:06PM +1000, Nicholas Piggin wrote: Yes, powerpc could certainly get more performance out of the slow paths, and then there are a few parameters to tune. Can you clarify? The slow path is already in use on ARM64 which is

Re: [PATCH v3 0/6] powerpc: queued spinlocks and rwlocks

2020-07-08 Thread Waiman Long
On 7/8/20 4:32 AM, Peter Zijlstra wrote: On Tue, Jul 07, 2020 at 11:33:45PM -0400, Waiman Long wrote: From 5d7941a498935fb225b2c7a3108cbf590114c3db Mon Sep 17 00:00:00 2001 From: Waiman Long Date: Tue, 7 Jul 2020 22:29:16 -0400 Subject: [PATCH 2/9] locking/pvqspinlock: Introduce

Re: [PATCH v3 0/6] powerpc: queued spinlocks and rwlocks

2020-07-08 Thread Waiman Long
On 7/8/20 1:10 AM, Nicholas Piggin wrote: Excerpts from Waiman Long's message of July 8, 2020 1:33 pm: On 7/7/20 1:57 AM, Nicholas Piggin wrote: Yes, powerpc could certainly get more performance out of the slow paths, and then there are a few parameters to tune. We don't have a good alternate

Re: [PATCH v3 0/6] powerpc: queued spinlocks and rwlocks

2020-07-07 Thread Waiman Long
rom 161e545523a7eb4c42c145c04e9a5a15903ba3d9 Mon Sep 17 00:00:00 2001 From: Waiman Long Date: Tue, 7 Jul 2020 20:46:51 -0400 Subject: [PATCH 1/9] locking/pvqspinlock: Code relocation and extraction Move pv_kick_node() and the unlock functions up and extract out the hash and lock code from pv_wait_head_or_lock() into pv_hash_l

Re: [PATCH v2 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR

2020-07-05 Thread Waiman Long
On 7/3/20 3:35 AM, Nicholas Piggin wrote: Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/paravirt.h | 28 ++ arch/powerpc/include/asm/qspinlock.h | 55 +++ arch/powerpc/include/asm/qspinlock_paravirt.h | 5 ++

Re: [PATCH 6/8] powerpc/pseries: implement paravirt qspinlocks for SPLPAR

2020-07-02 Thread Waiman Long
On 7/2/20 12:15 PM, kernel test robot wrote: Hi Nicholas, I love your patch! Yet something to improve: [auto build test ERROR on powerpc/next] [also build test ERROR on tip/locking/core v5.8-rc3 next-20200702] [If your patch is applied to the wrong git tree, kindly drop us a note. And when

Re: [PATCH 6/8] powerpc/pseries: implement paravirt qspinlocks for SPLPAR

2020-07-02 Thread Waiman Long
On 7/2/20 3:48 AM, Nicholas Piggin wrote: Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/paravirt.h | 23 arch/powerpc/include/asm/qspinlock.h | 55 +++ arch/powerpc/include/asm/qspinlock_paravirt.h | 5 ++

Re: [PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-16 Thread Waiman Long
On 6/16/20 2:53 PM, Joe Perches wrote: On Mon, 2020-06-15 at 21:57 -0400, Waiman Long wrote: v4: - Break out the memzero_explicit() change as suggested by Dan Carpenter so that it can be backported to stable. - Drop the "crypto: Remove unnecessary memzero_explicit()&q

Re: [PATCH v5 2/2] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-16 Thread Waiman Long
On 6/16/20 2:09 PM, Andrew Morton wrote: On Tue, 16 Jun 2020 11:43:11 -0400 Waiman Long wrote: As said by Linus: A symmetric naming is only helpful if it implies symmetries in use. Otherwise it's actively misleading. In "kzalloc()", the z is meaningful and an important pa

[PATCH v5 2/2] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-16 Thread Waiman Long
ked-by: Michal Hocko Acked-by: Johannes Weiner Signed-off-by: Waiman Long --- arch/s390/crypto/prng.c | 4 +-- arch/x86/power/hibernate.c| 2 +- crypto/adiantum.c | 2 +- crypto/ahash.c

[PATCH v5 1/2] mm/slab: Use memzero_explicit() in kzfree()

2020-06-16 Thread Waiman Long
.org Acked-by: Michal Hocko Signed-off-by: Waiman Long --- mm/slab_common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/slab_common.c b/mm/slab_common.c index 9e72ba224175..37d48a56431d 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -1726,7 +1726,7 @@ void kz

[PATCH v5 0/2] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-16 Thread Waiman Long
especially if LTO is used. Instead, the new kfree_sensitive() uses memzero_explicit() which won't get compiled out. Waiman Long (2): mm/slab: Use memzero_explicit() in kzfree() mm, treewide: Rename kzfree() to kfree_sensitive() arch/s390/crypto/prng.c | 4 +-- arch

Re: [PATCH v4 2/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-16 Thread Waiman Long
On 6/16/20 10:26 AM, Dan Carpenter wrote: Last time you sent this we couldn't decide which tree it should go through. Either the crypto tree or through Andrew seems like the right thing to me. Also the other issue is that it risks breaking things if people add new kzfree() instances while we

Re: [PATCH v4 3/3] btrfs: Use kfree() in btrfs_ioctl_get_subvol_info()

2020-06-16 Thread Waiman Long
On 6/16/20 10:48 AM, David Sterba wrote: On Mon, Jun 15, 2020 at 09:57:18PM -0400, Waiman Long wrote: In btrfs_ioctl_get_subvol_info(), there is a classic case where kzalloc() was incorrectly paired with kzfree(). According to David Sterba, there isn't any sensitive information
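
The mispairing is easy to picture; an illustrative fragment (shape inferred from fs/btrfs/ioctl.c, not a verbatim quote):

    subvol_info = kzalloc(sizeof(*subvol_info), GFP_KERNEL);  /* nothing sensitive */
    if (!subvol_info)
            return -ENOMEM;
    /* ... fill it in and copy_to_user() ... */
    kfree(subvol_info);     /* was kzfree(); clearing before free buys nothing here */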

Re: [PATCH v4 1/3] mm/slab: Use memzero_explicit() in kzfree()

2020-06-16 Thread Waiman Long
On 6/15/20 11:30 PM, Eric Biggers wrote: On Mon, Jun 15, 2020 at 09:57:16PM -0400, Waiman Long wrote: The kzfree() function is normally used to clear some sensitive information, like encryption keys, in the buffer before freeing it back to the pool. Memset() is currently used for the buffer

[PATCH v4 3/3] btrfs: Use kfree() in btrfs_ioctl_get_subvol_info()

2020-06-15 Thread Waiman Long
. Reported-by: David Sterba Signed-off-by: Waiman Long --- fs/btrfs/ioctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index f1dd9e4271e9..e8f7c5f00894 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -2692,7 +2692,7 @@ static

[PATCH v4 2/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-15 Thread Waiman Long
ked-by: Michal Hocko Acked-by: Johannes Weiner Signed-off-by: Waiman Long --- arch/s390/crypto/prng.c | 4 +-- arch/x86/power/hibernate.c| 2 +- crypto/adiantum.c | 2 +- crypto/ahash.c

[PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-15 Thread Waiman Long
ring isn't totally safe either as compiler may compile out the clearing in their optimizer especially if LTO is used. Instead, the new kfree_sensitive() uses memzero_explicit() which won't get compiled out. Waiman Long (3): mm/slab: Use memzero_explicit() in kzfree() mm, treewide: Ren

[PATCH v4 1/3] mm/slab: Use memzero_explicit() in kzfree()

2020-06-15 Thread Waiman Long
especially if LTO is being used. To make sure that this optimization will not happen, memzero_explicit(), which is introduced in v3.18, is now used in kzfree() to do the clearing. Fixes: 3ef0e5ba4673 ("slab: introduce kzfree()") Cc: sta...@vger.kernel.org Signed-off-by: Waiman Lon
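
The resulting function is short; roughly as it landed in mm/slab_common.c (reproduced from memory of the mainline commit):

    void kzfree(const void *p)
    {
            size_t ks;
            void *mem = (void *)p;

            if (unlikely(ZERO_OR_NULL_PTR(mem)))
                    return;
            ks = ksize(mem);
            memzero_explicit(mem, ks);      /* a plain memset() here can be elided by DSE */
            kfree(mem);
    }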

Re: [PATCH 1/2] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-15 Thread Waiman Long
On 6/15/20 2:07 PM, Dan Carpenter wrote: On Mon, Apr 13, 2020 at 05:15:49PM -0400, Waiman Long wrote: diff --git a/mm/slab_common.c b/mm/slab_common.c index 23c7500eea7d..c08bc7eb20bd 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -1707,17 +1707,17 @@ void *krealloc(const void *p

Re: [PATCH v2 2/2] crypto: Remove unnecessary memzero_explicit()

2020-04-14 Thread Waiman Long
On 4/14/20 3:16 PM, Michal Suchánek wrote: > On Tue, Apr 14, 2020 at 12:24:36PM -0400, Waiman Long wrote: >> On 4/14/20 2:08 AM, Christophe Leroy wrote: >>> >>> On 14/04/2020 at 00:28, Waiman Long wrote: >>>> Since kfree_sensitive() will do an implicit me

Re: [PATCH 1/2] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-04-14 Thread Waiman Long
On 4/14/20 8:48 AM, David Sterba wrote: > On Mon, Apr 13, 2020 at 05:15:49PM -0400, Waiman Long wrote: >> fs/btrfs/ioctl.c | 2 +- > >> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c >> index 40b729dce91c..eab3f8510426 100644 >> ---

Re: [PATCH v2 2/2] crypto: Remove unnecessary memzero_explicit()

2020-04-14 Thread Waiman Long
On 4/14/20 2:08 AM, Christophe Leroy wrote: > > > On 14/04/2020 at 00:28, Waiman Long wrote: >> Since kfree_sensitive() will do an implicit memzero_explicit(), there >> is no need to call memzero_explicit() before it. Eliminate those >> memzero_explicit() and simplify

[PATCH v2 2/2] crypto: Remove unnecessary memzero_explicit()

2020-04-13 Thread Waiman Long
-by: Waiman Long --- .../allwinner/sun8i-ce/sun8i-ce-cipher.c | 19 +- .../allwinner/sun8i-ss/sun8i-ss-cipher.c | 20 +-- drivers/crypto/amlogic/amlogic-gxl-cipher.c | 12 +++ drivers/crypto/inside-secure/safexcel_hash.c | 3 +-- 4 files changed, 14

Re: [PATCH 2/2] crypto: Remove unnecessary memzero_explicit()

2020-04-13 Thread Waiman Long
On 4/13/20 5:31 PM, Joe Perches wrote: > On Mon, 2020-04-13 at 17:15 -0400, Waiman Long wrote: >> Since kfree_sensitive() will do an implicit memzero_explicit(), there >> is no need to call memzero_explicit() before it. Eliminate those >> memzero_explicit() and simplify the

[PATCH 1/2] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-04-13 Thread Waiman Long
ng is done by using the command sequence: git grep -w --name-only kzfree |\ xargs sed -i 's/\bkzfree\b/kfree_sensitive/' followed by some editing of the kfree_sensitive() kerneldoc and the use of memzero_explicit() instead of memset(). Suggested-by: Joe Perches Signed-off-by: W

[PATCH 2/2] crypto: Remove unnecessary memzero_explicit()

2020-04-13 Thread Waiman Long
Since kfree_sensitive() will do an implicit memzero_explicit(), there is no need to call memzero_explicit() before it. Eliminate those memzero_explicit() and simplify the call sites. Signed-off-by: Waiman Long --- .../crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c | 15 +++ .../crypto
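
The simplification at each call site is mechanical; an illustrative before/after (field names are placeholders):

    /* before: the explicit clear duplicates what kfree_sensitive() already does */
    memzero_explicit(op->key, op->keylen);
    kfree_sensitive(op->key);

    /* after */
    kfree_sensitive(op->key);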

[PATCH 0/2] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-04-13 Thread Waiman Long
compile out the clearing in their optimizer. Instead, the new kfree_sensitive() uses memzero_explicit() which won't get compiled out. Waiman Long (2): mm, treewide: Rename kzfree() to kfree_sensitive() crypto: Remove unnecessary memzero_explicit() arch/s390/crypto/prng.c

Re: [PATCH] x86/paravirt: Guard against invalid cpu # in pv_vcpu_is_preempted()

2019-04-01 Thread Waiman Long
On 04/01/2019 02:38 AM, Juergen Gross wrote: > On 25/03/2019 19:03, Waiman Long wrote: >> On 03/25/2019 12:40 PM, Juergen Gross wrote: >>> On 25/03/2019 16:57, Waiman Long wrote: >>>> It was found that passing an invalid cpu number to pv_vcpu_is_preempted() >&

Re: [PATCH] x86/paravirt: Guard against invalid cpu # in pv_vcpu_is_preempted()

2019-03-25 Thread Waiman Long
On 03/25/2019 12:40 PM, Juergen Gross wrote: > On 25/03/2019 16:57, Waiman Long wrote: >> It was found that passing an invalid cpu number to pv_vcpu_is_preempted() >> might panic the kernel in a VM guest. For example, >> >> [2.531077] Oops: [#1] SMP PTI >&g

[PATCH] x86/paravirt: Guard against invalid cpu # in pv_vcpu_is_preempted()

2019-03-25 Thread Waiman Long
:__raw_callee_save___kvm_vcpu_is_preempted+0x0/0x20 To guard against this kind of kernel panic, check is added to pv_vcpu_is_preempted() to make sure that no invalid cpu number will be used. Signed-off-by: Waiman Long --- arch/x86/include/asm/paravirt.h | 6 ++ 1 file changed, 6 insertions(+) diff --git a/arch/x86/include/asm
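
A minimal sketch of such a guard, assuming the post-4.20 pv_ops layout (the posted patch may differ in detail):

    static inline bool pv_vcpu_is_preempted(long cpu)
    {
            if ((unsigned long)cpu >= nr_cpu_ids)
                    return false;   /* never index per-cpu data with a bogus cpu */

            return PVOP_CALLEE1(bool, lock.vcpu_is_preempted, cpu);
    }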

Re: [PATCH-tip v2 2/2] x86/xen: Deprecate xen_nopvspin

2017-11-02 Thread Waiman Long
On 11/01/2017 06:01 PM, Boris Ostrovsky wrote: > On 11/01/2017 04:58 PM, Waiman Long wrote: >> +/* TODO: To be removed in a future kernel version */ >> static __init int xen_parse_nopvspin(char *arg) >> { >> -xen_pvspin = false; >> +pr_warn(&qu

[PATCH-tip v2 2/2] x86/xen: Deprecate xen_nopvspin

2017-11-01 Thread Waiman Long
With the new pvlock_type kernel parameter, xen_nopvspin is no longer needed. This patch deprecates the xen_nopvspin parameter by removing its documentation and treating it as an alias of "pvlock_type=queued". Signed-off-by: Waiman Long <long...@redhat.com> --- Documentation/ad
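
Judging from the hunk quoted in the follow-up above, the deprecation shim reduces to something like this (the warning text and the pvlock_type setter are assumptions, since the quote is cut off):

    /* TODO: To be removed in a future kernel version */
    static __init int xen_parse_nopvspin(char *arg)
    {
            pr_warn("xen_nopvspin is deprecated, use pvlock_type=queued instead\n");
            pvlock_type = locktype_queued;  /* hypothetical setter from patch 1/2 */
            return 0;
    }
    early_param("xen_nopvspin", xen_parse_nopvspin);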

[PATCH-tip v2 0/2] x86/paravirt: Enable users to choose PV lock type

2017-11-01 Thread Waiman Long
h 2 deprecates Xen's xen_nopvspin parameter as it is no longer needed. Waiman Long (2): x86/paravirt: Add kernel parameter to choose paravirt lock type x86/xen: Deprecate xen_nopvspin Documentation/admin-guide/kernel-parameters.txt | 11 --- arch/x86/include/asm/paravirt.h |

[PATCH-tip v2 1/2] x86/paravirt: Add kernel parameter to choose paravirt lock type

2017-11-01 Thread Waiman Long
this new parameter in determining if pvqspinlock should be used. The parameter, however, will override Xen's xen_nopvspin in term of disabling unfair lock. Signed-off-by: Waiman Long <long...@redhat.com> --- Documentation/admin-guide/kernel-parameters.txt | 7 + arch/x86/include/asm/para

Re: [PATCH] x86/paravirt: Add kernel parameter to choose paravirt lock type

2017-11-01 Thread Waiman Long
On 11/01/2017 03:01 PM, Boris Ostrovsky wrote: > On 11/01/2017 12:28 PM, Waiman Long wrote: >> On 11/01/2017 11:51 AM, Juergen Gross wrote: >>> On 01/11/17 16:32, Waiman Long wrote: >>>> Currently, there are 3 different lock types that can be chosen

Re: [PATCH] x86/paravirt: Add kernel parameter to choose paravirt lock type

2017-11-01 Thread Waiman Long
On 11/01/2017 11:51 AM, Juergen Gross wrote: > On 01/11/17 16:32, Waiman Long wrote: >> Currently, there are 3 different lock types that can be chosen for >> the x86 architecture: >> >> - qspinlock >> - pvqspinlock >> - unfair lock >> >> One

[PATCH] x86/paravirt: Add kernel parameter to choose paravirt lock type

2017-11-01 Thread Waiman Long
this new parameter in determining if pvqspinlock should be used. The parameter, however, will override Xen's xen_nopvspin in term of disabling unfair lock. Signed-off-by: Waiman Long <long...@redhat.com> --- Documentation/admin-guide/kernel-parameters.txt | 7 + arch/x86/include/asm/para

Re: [PATCH v3 0/2] guard virt_spin_lock() with a static key

2017-09-25 Thread Waiman Long
t can decide whether to use paravirtualized >> spinlocks, the current fallback to the unfair test-and-set scheme, or >> to mimic the bare metal behavior. >> >> V3: >> - remove test for hypervisor environment from virt_spin_lock() as >> suggested by Waiman Long

Re: [PATCH v2 1/2] paravirt/locks: use new static key for controlling call of virt_spin_lock()

2017-09-06 Thread Waiman Long
On 09/06/2017 12:04 PM, Peter Zijlstra wrote: > On Wed, Sep 06, 2017 at 11:49:49AM -0400, Waiman Long wrote: >>> #define virt_spin_lock virt_spin_lock >>> static inline bool virt_spin_lock(struct qspinlock *lock) >>> { >>> + if (!static_branch_likely(

Re: [PATCH v2 2/2] paravirt,xen: correct xen_nopvspin case

2017-09-06 Thread Waiman Long
t xen_init_spinlocks(void) > > if (!xen_pvspin) { > printk(KERN_DEBUG "xen: PV spinlocks disabled\n"); > + static_branch_disable(&virt_spin_lock_key); > return; > } > printk(KERN_DEBUG "xen: PV spinlocks enable

Re: [PATCH v2 1/2] paravirt/locks: use new static key for controlling call of virt_spin_lock()

2017-09-06 Thread Waiman Long
> diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c > index 294294c71ba4..838d235b87ef 100644 > --- a/kernel/locking/qspinlock.c > +++ b/kernel/locking/qspinlock.c > @@ -76,6 +76,10 @@ > #define MAX_NODES4 > #endif > > +#ifdef CONFIG_PARAVIRT > +DEFIN
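
The truncated hunk adds the key itself; as it later landed in mainline it reads roughly:

    #ifdef CONFIG_PARAVIRT
    DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key);     /* on by default; Xen/KVM disable it */
    #endif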

Re: [PATCH 3/4] paravirt: add virt_spin_lock pvops function

2017-09-06 Thread Waiman Long
On 09/06/2017 03:08 AM, Peter Zijlstra wrote: > Guys, please trim email. > > On Tue, Sep 05, 2017 at 10:31:46AM -0400, Waiman Long wrote: >> For clarification, I was actually asking if you consider just adding one >> more jump label to skip it for Xen/KVM instead of making >

Re: [PATCH 3/4] paravirt: add virt_spin_lock pvops function

2017-09-05 Thread Waiman Long
On 09/05/2017 10:24 AM, Waiman Long wrote: > On 09/05/2017 10:18 AM, Juergen Gross wrote: >> On 05/09/17 16:10, Waiman Long wrote: >>> On 09/05/2017 09:24 AM, Juergen Gross wrote: >>>> There are cases where a guest tries to switch spinlocks to bare metal

Re: [PATCH 3/4] paravirt: add virt_spin_lock pvops function

2017-09-05 Thread Waiman Long
On 09/05/2017 10:18 AM, Juergen Gross wrote: > On 05/09/17 16:10, Waiman Long wrote: >> On 09/05/2017 09:24 AM, Juergen Gross wrote: >>> There are cases where a guest tries to switch spinlocks to bare metal >>> behavior (e.g. by setting "xen_nopvspin"

Re: [PATCH 3/4] paravirt: add virt_spin_lock pvops function

2017-09-05 Thread Waiman Long
On 09/05/2017 10:08 AM, Peter Zijlstra wrote: > On Tue, Sep 05, 2017 at 10:02:57AM -0400, Waiman Long wrote: >> On 09/05/2017 09:24 AM, Juergen Gross wrote: >>> +static inline bool native_virt_spin_lock(struct qspinlock *lock) >>> +{ >>> + if (!s

Re: [PATCH 3/4] paravirt: add virt_spin_lock pvops function

2017-09-05 Thread Waiman Long
On 09/05/2017 09:24 AM, Juergen Gross wrote: > There are cases where a guest tries to switch spinlocks to bare metal > behavior (e.g. by setting "xen_nopvspin" boot parameter). Today this > has the downside of falling back to unfair test and set scheme for > qspinlocks due to virt_spin_lock()
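
For reference, the x86 virt_spin_lock() that causes the unfair fallback looked essentially like this at the time (shape from arch/x86/include/asm/qspinlock.h around v4.13):

    #define virt_spin_lock virt_spin_lock
    static inline bool virt_spin_lock(struct qspinlock *lock)
    {
            if (!static_cpu_has(X86_FEATURE_HYPERVISOR))
                    return false;

            /*
             * On a hypervisor without PV spinlock support, fair queueing
             * hurts: a preempted vCPU in the queue stalls everyone behind
             * it, so fall back to a simple, unfair test-and-set lock.
             */
            do {
                    while (atomic_read(&lock->val) != 0)
                            cpu_relax();
            } while (atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL) != 0);

            return true;
    }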

[PATCH v5 2/2] x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64

2017-02-20 Thread Waiman Long
__raw_callee_save___kvm_vcpu_is_preempt() to verify that they matched. Suggested-by: Peter Zijlstra <pet...@infradead.org> Signed-off-by: Waiman Long <long...@redhat.com> --- arch/x86/kernel/asm-offsets_64.c | 9 + arch/x86/kernel/kvm.c| 24 2 f
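
For reference, the hand-written replacement as it landed in arch/x86/kernel/kvm.c (reproduced from memory of the mainline commit, so treat the details as approximate):

    asm(
    ".pushsection .text;"
    ".global __raw_callee_save___kvm_vcpu_is_preempted;"
    ".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
    "__raw_callee_save___kvm_vcpu_is_preempted:"
    "movq   __per_cpu_offset(,%rdi,8), %rax;"
    "cmpb   $0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);"
    "setne  %al;"
    "ret;"
    ".popsection");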

[PATCH v5 0/2] x86/kvm: Reduce vcpu_is_preempted() overhead

2017-02-20 Thread Waiman Long
ds to reduce this performance overhead by replacing the C __kvm_vcpu_is_preempted() function by an optimized version of __raw_callee_save___kvm_vcpu_is_preempted() written in assembly. Waiman Long (2): x86/paravirt: Change vcp_is_preempted() arg type to long x86/kvm: Provide optim

[PATCH v5 1/2] x86/paravirt: Change vcp_is_preempted() arg type to long

2017-02-20 Thread Waiman Long
number won't exceed 32 bits. Signed-off-by: Waiman Long <long...@redhat.com> --- arch/x86/include/asm/paravirt.h | 2 +- arch/x86/include/asm/qspinlock.h | 2 +- arch/x86/kernel/kvm.c| 2 +- arch/x86/kernel/paravirt-spinlocks.c | 2 +- 4 files changed, 4 insertions

Re: [PATCH v4 1/2] x86/paravirt: Change vcp_is_preempted() arg type to long

2017-02-16 Thread Waiman Long
On 02/16/2017 11:09 AM, Peter Zijlstra wrote: > On Wed, Feb 15, 2017 at 04:37:49PM -0500, Waiman Long wrote: >> The cpu argument in the function prototype of vcpu_is_preempted() >> is changed from int to long. That makes it easier to provide a better >> optimized assembly ver

Re: [PATCH v4 2/2] x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64

2017-02-16 Thread Waiman Long
On 02/16/2017 11:48 AM, Peter Zijlstra wrote: > On Wed, Feb 15, 2017 at 04:37:50PM -0500, Waiman Long wrote: >> +/* >> + * Hand-optimize version for x86-64 to avoid 8 64-bit register saving and >> + * restoring to/from the stack. It is assumed that the preempted value >>

[PATCH v4 2/2] x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64

2017-02-15 Thread Waiman Long
__raw_callee_save___kvm_vcpu_is_preempt() to verify that they matched. Suggested-by: Peter Zijlstra <pet...@infradead.org> Signed-off-by: Waiman Long <long...@redhat.com> --- arch/x86/kernel/kvm.c | 30 ++ 1 file changed, 30 insertions(+) diff --git a/arch/x86/ke

[PATCH v4 1/2] x86/paravirt: Change vcp_is_preempted() arg type to long

2017-02-15 Thread Waiman Long
number won't exceed 32 bits. Signed-off-by: Waiman Long <long...@redhat.com> --- arch/x86/include/asm/paravirt.h | 2 +- arch/x86/include/asm/qspinlock.h | 2 +- arch/x86/kernel/kvm.c| 2 +- arch/x86/kernel/paravirt-spinlocks.c | 2 +- 4 files changed, 4 insertions

[PATCH v4 0/2] x86/kvm: Reduce vcpu_is_preempted() overhead

2017-02-15 Thread Waiman Long
ted() can have some impact on system performance on a VM guest, especially of x86-64 guest, this patch set intends to reduce this performance overhead by replacing the C __kvm_vcpu_is_preempted() function by an optimized version of __raw_callee_save___kvm_vcpu_is_preempted() written in assembly. Wa

[PATCH v3 1/2] x86/paravirt: Change vcp_is_preempted() arg type to long

2017-02-15 Thread Waiman Long
number won't exceed 32 bits. Signed-off-by: Waiman Long <long...@redhat.com> --- arch/x86/include/asm/paravirt.h | 2 +- arch/x86/include/asm/qspinlock.h | 2 +- arch/x86/kernel/kvm.c| 2 +- arch/x86/kernel/paravirt-spinlocks.c | 2 +- 4 files changed, 4 insertions

[PATCH v3 0/2] x86/kvm: Reduce vcpu_is_preempted() overhead

2017-02-15 Thread Waiman Long
rmance on a VM guest, especially of x86-64 guest, this patch set intends to reduce this performance overhead by replacing the C __kvm_vcpu_is_preempted() function by an optimized version of __raw_callee_save___kvm_vcpu_is_preempted() written in assembly. Waiman Long (2): x86/paravirt:

[PATCH v3 2/2] x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64

2017-02-15 Thread Waiman Long
__raw_callee_save___kvm_vcpu_is_preempt() to verify that they matched. Suggested-by: Peter Zijlstra <pet...@infradead.org> Signed-off-by: Waiman Long <long...@redhat.com> --- arch/x86/kernel/kvm.c | 28 1 file changed, 28 insertions(+) diff --git a/arch/x86/ke

Re: [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-14 Thread Waiman Long
On 02/14/2017 04:39 AM, Peter Zijlstra wrote: > On Mon, Feb 13, 2017 at 05:34:01PM -0500, Waiman Long wrote: >> It is the address of &steal_time that will exceed the 32-bit limit. > That seems extremely unlikely. That would mean we have more than 4G > worth of per-cpu variables declare

Re: [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-13 Thread Waiman Long
On 02/13/2017 04:52 PM, Peter Zijlstra wrote: > On Mon, Feb 13, 2017 at 03:12:45PM -0500, Waiman Long wrote: >> On 02/13/2017 02:42 PM, Waiman Long wrote: >>> On 02/13/2017 05:53 AM, Peter Zijlstra wrote: >>>> On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra

Re: [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-13 Thread Waiman Long
On 02/13/2017 03:06 PM, h...@zytor.com wrote: > On February 13, 2017 2:53:43 AM PST, Peter Zijlstra > wrote: >> On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote: >>> That way we'd end up with something like: >>> >>> asm(" >>> push %rdi; >>> movslq %edi, %rdi;

Re: [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-13 Thread Waiman Long
On 02/13/2017 02:42 PM, Waiman Long wrote: > On 02/13/2017 05:53 AM, Peter Zijlstra wrote: >> On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote: >>> That way we'd end up with something like: >>> >>> asm(" >>> push %rdi; >>>

Re: [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-13 Thread Waiman Long
On 02/13/2017 05:53 AM, Peter Zijlstra wrote: > On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote: >> That way we'd end up with something like: >> >> asm(" >> push %rdi; >> movslq %edi, %rdi; >> movq __per_cpu_offset(,%rdi,8), %rax; >> cmpb $0, %[offset](%rax); >> setne %al; >> pop

Re: [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-13 Thread Waiman Long
On 02/13/2017 05:47 AM, Peter Zijlstra wrote: > On Fri, Feb 10, 2017 at 12:00:43PM -0500, Waiman Long wrote: > >>>> +asm( >>>> +".pushsection .text;" >>>> +".global __raw_callee_save___kvm_vcpu_is_preempted;" >

Re: [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-10 Thread Waiman Long
On 02/10/2017 11:35 AM, Waiman Long wrote: > On 02/10/2017 11:19 AM, Peter Zijlstra wrote: >> On Fri, Feb 10, 2017 at 10:43:09AM -0500, Waiman Long wrote: >>> It was found when running fio sequential write test with a XFS ramdisk >>> on a VM running on a 2-socket x8

Re: [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-10 Thread Waiman Long
On 02/10/2017 11:19 AM, Peter Zijlstra wrote: > On Fri, Feb 10, 2017 at 10:43:09AM -0500, Waiman Long wrote: >> It was found when running fio sequential write test with a XFS ramdisk >> on a VM running on a 2-socket x86-64 system, the %CPU times as reported >> by perf were as

[PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-10 Thread Waiman Long
[k] osq_lock 10.14% 10.14% fio [k] __kvm_vcpu_is_preempted On bare metal, the patch doesn't introduce any performance regression. On KVM guest, it produces noticeable performance improvement (up to 7%). Signed-off-by: Waiman Long <long...@redhat.com> --- v1->v2: - Rerun

Re: [PATCH 1/2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-08 Thread Waiman Long
On 02/08/2017 02:05 PM, Peter Zijlstra wrote: > On Wed, Feb 08, 2017 at 01:00:24PM -0500, Waiman Long wrote: >> It was found when running fio sequential write test with a XFS ramdisk >> on a 2-socket x86-64 system, the %CPU times as reported by perf were >> as follows: >

Re: [PATCH 2/2] locking/mutex,rwsem: Reduce vcpu_is_preempted() calling frequency

2017-02-08 Thread Waiman Long
On 02/08/2017 02:05 PM, Peter Zijlstra wrote: > On Wed, Feb 08, 2017 at 01:00:25PM -0500, Waiman Long wrote: >> As the vcpu_is_preempted() call is pretty costly compared with other >> checks within mutex_spin_on_owner() and rwsem_spin_on_owner(), they >> are done at a red

[PATCH 1/2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

2017-02-08 Thread Waiman Long
and rwsem slowpaths, there isn't much to gain by making it callee-save. So it is now changed to a normal function call instead. With this patch applied, the aggregate bandwidth of the fio sequential write test increased slightly from 2563.3MB/s to 2588.1MB/s (about 1%). Signed-off-by: Waiman
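
The gist of the change, with field and helper names per the 2017 pv_lock_ops layout (a sketch, not the full diff):

    /* before: a callee-save thunk preserves every register around the C body */
    pv_lock_ops.vcpu_is_preempted = PV_CALLEE_SAVE(__kvm_vcpu_is_preempted);

    /* after: an ordinary function pointer; the compiler spills only what's live */
    pv_lock_ops.vcpu_is_preempted = __kvm_vcpu_is_preempted;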
