struct mutex *lock)
+{
+ return devm_add_action_or_reset(dev, devm_mutex_release, lock);
+}
+
/**
* mutex_destroy - mark a mutex unusable
* @lock: the mutex to be destroyed
Acked-by: Waiman Long
On 3/11/24 19:47, George Stark wrote:
Hello Waiman, Marek
Thanks for the review.
I've never used lockdep for debugging, but it seems preferable to
keep that feature working. It could look like this:
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index f7611c092db7..574f6de6084d
On 3/7/24 04:56, Marek Behún wrote:
On Thu, Mar 07, 2024 at 05:40:26AM +0300, George Stark wrote:
Use of the devm API leads to a certain order of releasing resources,
so all dependent resources which are not devm-wrapped should be deleted
with respect to the devm release order. A mutex is one such
On 12/15/23 10:58, Andy Shevchenko wrote:
On Fri, Dec 15, 2023 at 8:23 AM Christophe Leroy
wrote:
From: George Stark
Use of the devm API leads to a certain order of releasing resources,
so all dependent resources which are not devm-wrapped should be deleted
with respect to the devm release order.
On 12/14/23 14:53, Christophe Leroy wrote:
Le 14/12/2023 à 19:48, Waiman Long a écrit :
On 12/14/23 12:36, George Stark wrote:
Use of the devm API leads to a certain order of releasing resources,
so all dependent resources which are not devm-wrapped should be deleted
with respect to devm
On 12/14/23 12:36, George Stark wrote:
Use of the devm API leads to a certain order of releasing resources,
so all dependent resources which are not devm-wrapped should be deleted
with respect to the devm release order. A mutex is one such object that
is often bound to other resources and has no
function, not a NOP */
+#define mutex_destroy mutex_destroy
+
#else
# define __DEBUG_MUTEX_INITIALIZER(lockname)
Acked-by: Waiman Long
On 12/6/23 16:02, Waiman Long wrote:
On 12/6/23 14:55, Hans de Goede wrote:
Hi,
On 12/6/23 19:58, George Stark wrote:
Hello Hans
Thanks for the review.
On 12/6/23 18:01, Hans de Goede wrote:
Hi George,
On 12/4/23 19:05, George Stark wrote:
Use of the devm API leads to a certain order
On 12/6/23 19:37, George Stark wrote:
Hello Waiman
Thanks for the review.
On 12/7/23 00:02, Waiman Long wrote:
On 12/6/23 14:55, Hans de Goede wrote:
Hi,
On 12/6/23 19:58, George Stark wrote:
Hello Hans
Thanks for the review.
On 12/6/23 18:01, Hans de Goede wrote:
Hi George
On 12/6/23 14:55, Hans de Goede wrote:
Hi,
On 12/6/23 19:58, George Stark wrote:
Hello Hans
Thanks for the review.
On 12/6/23 18:01, Hans de Goede wrote:
Hi George,
On 12/4/23 19:05, George Stark wrote:
Use of the devm API leads to a certain order of releasing resources,
so all dependent
On 5/20/22 04:36, Maninder Singh wrote:
As of now sprint_* APIs don't pass buffer size as an argument
and use sprintf directly.
To replace dangerous sprintf API to scnprintf,
buffer size is required in arguments.
Co-developed-by: Onkarnath
Signed-off-by: Onkarnath
Signed-off-by: Maninder
On 11/8/21 20:46, Nicholas Piggin wrote:
Excerpts from Michael Ellerman's message of November 9, 2021 11:04 am:
Waiman Long writes:
It was found that the following warning message could be printed out when
booting the kernel on PowerPC systems that support LPAR:
[0.129584] WARNING: CPU
On 11/8/21 18:06, Nathan Lynch wrote:
Waiman Long writes:
It was found that the following warning message could be printed out when
booting the kernel on PowerPC systems that support LPAR:
[0.129584] WARNING: CPU: 0 PID: 1 at mm/memblock.c:1451
memblock_alloc_internal+0x5c/0x104
On 11/8/21 20:04, Michael Ellerman wrote:
Waiman Long writes:
It was found that the following warning message could be printed out when
booting the kernel on PowerPC systems that support LPAR:
[0.129584] WARNING: CPU: 0 PID: 1 at mm/memblock.c:1451
memblock_alloc_internal+0x5c/0x104
("powerpc/pseries: Prevent free CPU ids being reused on
another node")
Signed-off-by: Waiman Long
---
arch/powerpc/platforms/pseries/hotplug-cpu.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c
b/arch/powerpc
On 10/25/21 11:44 AM, Arnd Bergmann wrote:
On Mon, Oct 25, 2021 at 5:28 PM Waiman Long wrote:
On 10/25/21 9:06 AM, Arnd Bergmann wrote:
On s390, we pick between the cmpxchg() based directed-yield when
running on virtualized CPUs, and a normal qspinlock when running on a
dedicated CPU.
I am
On 10/25/21 9:06 AM, Arnd Bergmann wrote:
On Mon, Oct 25, 2021 at 11:57 AM Peter Zijlstra wrote:
On Sat, Oct 23, 2021 at 06:04:57PM +0200, Arnd Bergmann wrote:
On Sat, Oct 23, 2021 at 3:37 AM Waiman Long wrote:
On 10/22/21 7:59 AM, Arnd Bergmann wrote:
From: Arnd Bergmann
As this is all
On 10/22/21 7:59 AM, Arnd Bergmann wrote:
From: Arnd Bergmann
parisc, ia64 and powerpc32 are the only remaining architectures that
provide custom arch_{spin,read,write}_lock_flags() functions, which are
meant to re-enable interrupts while waiting for a spinlock.
However, none of these can
On 4/6/21 7:52 PM, Stafford Horne wrote:
For OpenRISC I did ack the patch to convert to
CONFIG_ARCH_USE_QUEUED_SPINLOCKS_XCHG32=y. But I think you are right, the
generic code in xchg_tail and the xchg16 emulation code in produced by OpenRISC
using xchg32 would produce very similar code. I
: Michael Ellerman
Cc: Nicholas Piggin
Cc: Nathan Lynch
Cc: Gautham R Shenoy
Cc: Peter Zijlstra
Cc: Valentin Schneider
Cc: Juri Lelli
Cc: Waiman Long
Cc: Phil Auld
Srikar Dronamraju (4):
powerpc: Refactor is_kvm_guest declaration to new header
powerpc: Rename is_kvm_guest to check_kvm_guest
On 7/25/20 1:26 PM, Peter Zijlstra wrote:
On Fri, Jul 24, 2020 at 03:10:59PM -0400, Waiman Long wrote:
On 7/24/20 4:16 AM, Will Deacon wrote:
On Thu, Jul 23, 2020 at 08:47:59PM +0200, pet...@infradead.org wrote:
On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
BTW, do you have
On 7/24/20 3:10 PM, Waiman Long wrote:
On 7/24/20 4:16 AM, Will Deacon wrote:
On Thu, Jul 23, 2020 at 08:47:59PM +0200, pet...@infradead.org wrote:
On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
BTW, do you have any comment on my v2 lock holder cpu info
qspinlock patch?
I
/powerpc/include/asm/simple_spinlock.h
create mode 100644 arch/powerpc/include/asm/simple_spinlock_types.h
That patch series looks good to me. Thanks for working on this.
For the series,
Acked-by: Waiman Long
On 7/24/20 9:14 AM, Nicholas Piggin wrote:
This implements smp_cond_load_relaed with the slowpath busy loop using the
Nit: "smp_cond_load_relaxed"
Cheers,
Longman
On 7/24/20 4:16 AM, Will Deacon wrote:
On Thu, Jul 23, 2020 at 08:47:59PM +0200, pet...@infradead.org wrote:
On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
BTW, do you have any comment on my v2 lock holder cpu info qspinlock patch?
I will have to update the patch to fix
On 7/23/20 3:58 PM, pet...@infradead.org wrote:
On Thu, Jul 23, 2020 at 03:04:13PM -0400, Waiman Long wrote:
On 7/23/20 2:47 PM, pet...@infradead.org wrote:
On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
BTW, do you have any comment on my v2 lock holder cpu info qspinlock patch
On 7/23/20 2:47 PM, pet...@infradead.org wrote:
On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
BTW, do you have any comment on my v2 lock holder cpu info qspinlock patch?
I will have to update the patch to fix the reported 0-day test problem, but
I want to collect other feedback
On 7/23/20 10:00 AM, Peter Zijlstra wrote:
On Thu, Jul 09, 2020 at 12:06:13PM -0400, Waiman Long wrote:
We don't really need to do a pv_spinlocks_init() if pv_kick() isn't
supported.
Waiman, if you cannot explain how not having kick is a sane thing, what
are you saying here?
The current PPC
On 7/23/20 9:30 AM, Nicholas Piggin wrote:
I would prefer to extract out the pending bit handling code out into a
separate helper function which can be overridden by the arch code
instead of breaking the slowpath into 2 pieces.
You mean have the arch provide a queued_spin_lock_slowpath_pending
On 7/21/20 7:08 AM, Nicholas Piggin wrote:
diff --git a/arch/powerpc/include/asm/qspinlock.h
b/arch/powerpc/include/asm/qspinlock.h
index b752d34517b3..26d8766a1106 100644
--- a/arch/powerpc/include/asm/qspinlock.h
+++ b/arch/powerpc/include/asm/qspinlock.h
@@ -31,16 +31,57 @@ static inline
++
arch/powerpc/platforms/pseries/Kconfig | 5 ++
arch/powerpc/platforms/pseries/setup.c | 6 +-
include/asm-generic/qspinlock.h | 2 +
Another ack?
I am OK with adding the #ifdef around queued_spin_lock().
Acked-by: Waiman Long
diff --git a/arch/powerpc
On 7/8/20 7:50 PM, Waiman Long wrote:
On 7/8/20 1:10 AM, Nicholas Piggin wrote:
Excerpts from Waiman Long's message of July 8, 2020 1:33 pm:
On 7/7/20 1:57 AM, Nicholas Piggin wrote:
Yes, powerpc could certainly get more performance out of the slow
paths, and then there are a few parameters
On 7/8/20 4:41 AM, Peter Zijlstra wrote:
On Tue, Jul 07, 2020 at 03:57:06PM +1000, Nicholas Piggin wrote:
Yes, powerpc could certainly get more performance out of the slow
paths, and then there are a few parameters to tune.
Can you clarify? The slow path is already in use on ARM64 which is
On 7/8/20 4:32 AM, Peter Zijlstra wrote:
On Tue, Jul 07, 2020 at 11:33:45PM -0400, Waiman Long wrote:
From 5d7941a498935fb225b2c7a3108cbf590114c3db Mon Sep 17 00:00:00 2001
From: Waiman Long
Date: Tue, 7 Jul 2020 22:29:16 -0400
Subject: [PATCH 2/9] locking/pvqspinlock: Introduce
On 7/8/20 1:10 AM, Nicholas Piggin wrote:
Excerpts from Waiman Long's message of July 8, 2020 1:33 pm:
On 7/7/20 1:57 AM, Nicholas Piggin wrote:
Yes, powerpc could certainly get more performance out of the slow
paths, and then there are a few parameters to tune.
We don't have a good alternate
From 161e545523a7eb4c42c145c04e9a5a15903ba3d9 Mon Sep 17 00:00:00 2001
From: Waiman Long
Date: Tue, 7 Jul 2020 20:46:51 -0400
Subject: [PATCH 1/9] locking/pvqspinlock: Code relocation and extraction
Move pv_kick_node() and the unlock functions up and extract out the hash
and lock code from pv_wait_head_or_lock() into pv_hash_l
On 7/6/20 12:35 AM, Nicholas Piggin wrote:
v3 is updated to use __pv_queued_spin_unlock, noticed by Waiman (thank you).
Thanks,
Nick
Nicholas Piggin (6):
powerpc/powernv: must include hvcall.h to get PAPR defines
powerpc/pseries: move some PAPR paravirt functions to their own file
On 7/3/20 3:35 AM, Nicholas Piggin wrote:
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/paravirt.h | 28 ++
arch/powerpc/include/asm/qspinlock.h | 55 +++
arch/powerpc/include/asm/qspinlock_paravirt.h | 5 ++
On 7/2/20 12:15 PM, kernel test robot wrote:
Hi Nicholas,
I love your patch! Yet something to improve:
[auto build test ERROR on powerpc/next]
[also build test ERROR on tip/locking/core v5.8-rc3 next-20200702]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when
On 7/2/20 3:48 AM, Nicholas Piggin wrote:
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/paravirt.h | 23
arch/powerpc/include/asm/qspinlock.h | 55 +++
arch/powerpc/include/asm/qspinlock_paravirt.h | 5 ++
On 6/16/20 2:53 PM, Joe Perches wrote:
On Mon, 2020-06-15 at 21:57 -0400, Waiman Long wrote:
v4:
- Break out the memzero_explicit() change as suggested by Dan Carpenter
so that it can be backported to stable.
- Drop the "crypto: Remove unnecessary memzero_explicit()"
On 6/16/20 2:09 PM, Andrew Morton wrote:
On Tue, 16 Jun 2020 11:43:11 -0400 Waiman Long wrote:
As said by Linus:
A symmetric naming is only helpful if it implies symmetries in use.
Otherwise it's actively misleading.
In "kzalloc()", the z is meaningful and an important pa
.org
Acked-by: Michal Hocko
Signed-off-by: Waiman Long
---
mm/slab_common.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 9e72ba224175..37d48a56431d 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1726,7 +1726,7 @@ void kz
Acked-by: Michal Hocko
Acked-by: Johannes Weiner
Signed-off-by: Waiman Long
---
arch/s390/crypto/prng.c | 4 +--
arch/x86/power/hibernate.c| 2 +-
crypto/adiantum.c | 2 +-
crypto/ahash.c
especially if LTO is
used. Instead, the new kfree_sensitive() uses memzero_explicit() which
won't get compiled out.
Waiman Long (2):
mm/slab: Use memzero_explicit() in kzfree()
mm, treewide: Rename kzfree() to kfree_sensitive()
arch/s390/crypto/prng.c | 4 +--
arch
On 6/16/20 10:26 AM, Dan Carpenter wrote:
Last time you sent this we couldn't decide which tree it should go
through. Either the crypto tree or through Andrew seems like the right
thing to me.
Also the other issue is that it risks breaking things if people add
new kzfree() instances while we
On 6/16/20 10:48 AM, David Sterba wrote:
On Mon, Jun 15, 2020 at 09:57:18PM -0400, Waiman Long wrote:
In btrfs_ioctl_get_subvol_info(), there is a classic case where kzalloc()
was incorrectly paired with kzfree(). According to David Sterba, there
isn't any sensitive information
On 6/15/20 11:30 PM, Eric Biggers wrote:
On Mon, Jun 15, 2020 at 09:57:16PM -0400, Waiman Long wrote:
The kzfree() function is normally used to clear some sensitive
information, like encryption keys, in the buffer before freeing it back
to the pool. Memset() is currently used for the buffer
.
Reported-by: David Sterba
Signed-off-by: Waiman Long
---
fs/btrfs/ioctl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index f1dd9e4271e9..e8f7c5f00894 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2692,7 +2692,7 @@ static
especially if LTO is being used. To make sure that this
optimization will not happen, memzero_explicit(), which was introduced
in v3.18, is now used in kzfree() to do the clearing.
Fixes: 3ef0e5ba4673 ("slab: introduce kzfree()")
Cc: sta...@vger.kernel.org
Signed-off-by: Waiman Long
clearing isn't totally safe either, as the compiler
may compile out the clearing in its optimizer, especially if LTO is
used. Instead, the new kfree_sensitive() uses memzero_explicit() which
won't get compiled out.
Waiman Long (3):
mm/slab: Use memzero_explicit() in kzfree()
mm, treewide: Ren
On 6/15/20 2:07 PM, Dan Carpenter wrote:
On Mon, Apr 13, 2020 at 05:15:49PM -0400, Waiman Long wrote:
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 23c7500eea7d..c08bc7eb20bd 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1707,17 +1707,17 @@ void *krealloc(const void *p
On 4/14/20 3:16 PM, Michal Suchánek wrote:
> On Tue, Apr 14, 2020 at 12:24:36PM -0400, Waiman Long wrote:
>> On 4/14/20 2:08 AM, Christophe Leroy wrote:
>>>
>>> Le 14/04/2020 à 00:28, Waiman Long a écrit :
>>>> Since kfree_sensitive() will do an implicit me
On 4/14/20 8:48 AM, David Sterba wrote:
> On Mon, Apr 13, 2020 at 05:15:49PM -0400, Waiman Long wrote:
>> fs/btrfs/ioctl.c | 2 +-
>
>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>> index 40b729dce91c..eab3f8510426 100644
>> ---
On 4/14/20 2:08 AM, Christophe Leroy wrote:
>
>
> Le 14/04/2020 à 00:28, Waiman Long a écrit :
>> Since kfree_sensitive() will do an implicit memzero_explicit(), there
>> is no need to call memzero_explicit() before it. Eliminate those
>> memzero_explicit() and simplify
-by: Waiman Long
---
.../allwinner/sun8i-ce/sun8i-ce-cipher.c | 19 +-
.../allwinner/sun8i-ss/sun8i-ss-cipher.c | 20 +--
drivers/crypto/amlogic/amlogic-gxl-cipher.c | 12 +++
drivers/crypto/inside-secure/safexcel_hash.c | 3 +--
4 files changed, 14
On 4/13/20 5:31 PM, Joe Perches wrote:
> On Mon, 2020-04-13 at 17:15 -0400, Waiman Long wrote:
>> Since kfree_sensitive() will do an implicit memzero_explicit(), there
>> is no need to call memzero_explicit() before it. Eliminate those
>> memzero_explicit() and simplify the
Since kfree_sensitive() will do an implicit memzero_explicit(), there
is no need to call memzero_explicit() before it. Eliminate those
memzero_explicit() and simplify the call sites.
Signed-off-by: Waiman Long
---
.../crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c | 15 +++
.../crypto
The renaming is done by using the command sequence:
git grep -w --name-only kzfree |\
xargs sed -i 's/\bkzfree\b/kfree_sensitive/'
followed by some editing of the kfree_sensitive() kerneldoc and the
use of memzero_explicit() instead of memset().
Suggested-by: Joe Perches
Signed-off-by: W
compile out the clearing in their optimizer. Instead, the new
kfree_sensitive() uses memzero_explicit() which won't get compiled out.
Waiman Long (2):
mm, treewide: Rename kzfree() to kfree_sensitive()
crypto: Remove unnecessary memzero_explicit()
arch/s390/crypto/prng.c
99.9000th: 82
> min=0, max=9887 min=0, max=121
>
> Performance counter stats for 'system wide' (5 runs):
>
> context-switches 43,373 ( +- 0.40% ) 44,597 ( +- 0.55% )
> cpu-migrations 1,211 ( +- 5.04% ) 220 ( +- 6.23% )
>
On 12/5/19 3:32 AM, Srikar Dronamraju wrote:
> With the static key shared processor available, is_shared_processor()
> can return without having to query the lppaca structure.
>
> Cc: Parth Shah
> Cc: Ihor Pasichnyk
> Cc: Juri Lelli
> Cc: Phil Auld
> Cc: Waiman Long
00th: 70
> 98.9000th: 8136 99.9000th: 100
> min=-1, max=10008 min=0, max=142
>
> Performance counter stats for 'system wide' (4 runs):
>
> context-switches 42,604 ( +- 0.87% ) 45,397 ( +- 0.25% )
>
ude/asm/Kbuild | 1 -
> arch/riscv/include/asm/Kbuild | 1 -
> arch/sparc/include/asm/Kbuild | 1 -
> drivers/pci/Kconfig | 2 +-
> include/asm-generic/Kbuild | 1 +
> 9 files changed, 2 insertions(+), 8 deletions(-)
>
That looks OK.
Acked-by: Waiman Long
On 04/10/2019 04:15 AM, huang ying wrote:
> Hi, Waiman,
>
> What's the status of this patchset? And its merging plan?
>
> Best Regards,
> Huang, Ying
I have broken the patch into 3 parts (0/1/2) and rewritten some of them.
Part 0 has been merged into tip. Parts 1 and 2 are still under testing.
On 04/04/2019 12:44 PM, Josh Poimboeuf wrote:
> Keeping track of the number of mitigations for all the CPU speculation
> bugs has become overwhelming for many users. It's getting more and more
> complicated to decide which mitigations are needed for a given
> architecture. Complicating matters
On 03/22/2019 03:30 PM, Davidlohr Bueso wrote:
> On Fri, 22 Mar 2019, Linus Torvalds wrote:
>> Some of them _might_ be performance-critical. There's the one on
>> mmap_sem in the fault handling path, for example. And yes, I'd expect
>> the normal case to very much be "no other readers or writers"
On 03/22/2019 01:25 PM, Russell King - ARM Linux admin wrote:
> On Fri, Mar 22, 2019 at 10:30:08AM -0400, Waiman Long wrote:
>> Modify __down_read_trylock() to optimize for an unlocked rwsem and make
>> it generate slightly better code.
>>
>> Before th
On 03/22/2019 01:01 PM, Linus Torvalds wrote:
> On Fri, Mar 22, 2019 at 7:30 AM Waiman Long wrote:
>> 19 files changed, 133 insertions(+), 930 deletions(-)
> Lovely. And it all looks sane to me.
>
> So ack.
>
> The only comment I have is about __down_read_trylock()
case (1 thread), the new down_read_trylock() is a
little bit faster. For the contended cases, the new down_read_trylock()
performs pretty well on x86-64, but performance degrades at high
contention levels on ARM64.
Suggested-by: Linus Torvalds
Signed-off-by: Waiman Long
---
kernel/locking/rwse
-spinlock.c and make all
architectures use a single implementation of rwsem - rwsem-xadd.c.
All references to RWSEM_GENERIC_SPINLOCK and RWSEM_XCHGADD_ALGORITHM
in the code are removed.
Suggested-by: Peter Zijlstra
Signed-off-by: Waiman Long
---
arch/alpha/Kconfig | 7 -
arch/arc/Kconfig
to access the internal rwsem macros and functions.
Signed-off-by: Waiman Long
---
MAINTAINERS | 1 -
arch/alpha/include/asm/rwsem.h | 211
arch/arm/include/asm/Kbuild | 1 -
arch/arm64/include/asm/Kbuild | 1 -
arch/hexagon/include/asm
the architectures use one single implementation of rwsem - rwsem-xadd.c.
Waiman Long (3):
locking/rwsem: Remove arch specific rwsem files
locking/rwsem: Remove rwsem-spinlock.c & use rwsem-xadd.c for all
archs
locking/rwsem: Optimize down_read_trylock()
MAINTAI
On 02/21/2019 09:14 AM, Will Deacon wrote:
> On Wed, Feb 13, 2019 at 05:00:17PM -0500, Waiman Long wrote:
>> Modify __down_read_trylock() to optimize for an unlocked rwsem and make
>> it generate slightly better code.
>>
>> Before this patch, down_read_trylock:
>
On 02/15/2019 01:40 PM, Will Deacon wrote:
> On Thu, Feb 14, 2019 at 11:37:15AM +0100, Peter Zijlstra wrote:
>> On Wed, Feb 13, 2019 at 05:00:14PM -0500, Waiman Long wrote:
>>> v4:
>>> - Remove rwsem-spinlock.c and make all archs use rwsem-xadd.c.
>>>
>>
On 02/14/2019 05:37 AM, Peter Zijlstra wrote:
> On Wed, Feb 13, 2019 at 05:00:14PM -0500, Waiman Long wrote:
>> v4:
>> - Remove rwsem-spinlock.c and make all archs use rwsem-xadd.c.
>>
>> v3:
>> - Optimize __down_read_trylock() for the uncontended case as s
On 02/14/2019 01:02 PM, Will Deacon wrote:
> On Thu, Feb 14, 2019 at 11:33:33AM +0100, Peter Zijlstra wrote:
>> On Wed, Feb 13, 2019 at 03:32:12PM -0500, Waiman Long wrote:
>>> Modify __down_read_trylock() to optimize for an unlocked rwsem and make
>>> it ge
On 02/14/2019 12:04 PM, Christoph Hellwig wrote:
> On Thu, Feb 14, 2019 at 10:26:52AM -0500, Waiman Long wrote:
>> Would you mind dropping just patch 3 from your series?
> Sure, we can just drop this patch.
Thanks,
Longman
On 02/14/2019 05:52 AM, Geert Uytterhoeven wrote:
> On Thu, Feb 14, 2019 at 12:08 AM Christoph Hellwig wrote:
>> Introduce one central definition of RWSEM_XCHGADD_ALGORITHM and
>> RWSEM_GENERIC_SPINLOCK in kernel/Kconfig.locks and let architectures
>> select RWSEM_XCHGADD_ALGORITHM if they want
On 02/14/2019 08:23 AM, Davidlohr Bueso wrote:
> On Fri, 08 Feb 2019, Waiman Long wrote:
>> I am planning to run more performance test and post the data sometimes
>> next week. Davidlohr is also going to run some of his rwsem performance
>> test on this patchset.
>
> S
On 02/14/2019 05:33 AM, Peter Zijlstra wrote:
> On Wed, Feb 13, 2019 at 03:32:12PM -0500, Waiman Long wrote:
>> Modify __down_read_trylock() to optimize for an unlocked rwsem and make
>> it generate slightly better code.
>>
>> Before this patch, down_read_trylock:
>
case (1 thread), the new down_read_trylock() is a
little bit faster. For the contended cases, the new down_read_trylock()
performs pretty well on x86-64, but performance degrades at high
contention levels on ARM64.
Suggested-by: Linus Torvalds
Signed-off-by: Waiman Long
---
kernel/locking/rw
to access the internal rwsem macros and functions.
Signed-off-by: Waiman Long
---
MAINTAINERS | 1 -
arch/alpha/include/asm/rwsem.h | 211 ---
arch/arm/include/asm/Kbuild | 1 -
arch/arm64/include/asm/Kbuild | 1 -
arch/hexagon/include
of this patchset is to remove the architecture specific files
for rwsem-xadd to make it easer to add enhancements in the later rwsem
patches. It also removes the legacy rwsem-spinlock.c file and make all
the architectures use one single implementation of rwsem - rwsem-xadd.c.
Waiman Long (3):
locking
ch 2 for arm64.
Waiman Long (2):
locking/rwsem: Remove arch specific rwsem files
locking/rwsem: Optimize down_read_trylock()
MAINTAINERS | 1 -
arch/alpha/include/asm/rwsem.h | 211 ---
arch/arm/include/asm/Kbuild | 1 -
arch/arm64/include
On 02/13/2019 02:45 AM, Ingo Molnar wrote:
> * Waiman Long wrote:
>
>> I looked at the assembly code in arch/x86/include/asm/rwsem.h. For both
>> trylocks (read & write), the count is read first before attempting to
>> lock it. We did the same for all tryl
On 02/12/2019 02:58 PM, Linus Torvalds wrote:
> On Mon, Feb 11, 2019 at 11:31 AM Waiman Long wrote:
>> Modify __down_read_trylock() to make it generate slightly better code
>> (smaller and maybe a tiny bit faster).
> This looks good, but I would ask you to try one slightly
On 02/12/2019 01:36 PM, Waiman Long wrote:
> On 02/12/2019 08:25 AM, Peter Zijlstra wrote:
>> On Tue, Feb 12, 2019 at 02:24:04PM +0100, Peter Zijlstra wrote:
>>> On Mon, Feb 11, 2019 at 02:31:26PM -0500, Waiman Long wrote:
>>>> Modify __down_read_trylock() to make it
On 02/12/2019 08:25 AM, Peter Zijlstra wrote:
> On Tue, Feb 12, 2019 at 02:24:04PM +0100, Peter Zijlstra wrote:
>> On Mon, Feb 11, 2019 at 02:31:26PM -0500, Waiman Long wrote:
>>> Modify __down_read_trylock() to make it generate slightly better code
>>> (smaller
1 27,787 28,259
28,359 9,234
On an ARM64 system, the performance results were:
               Before Patch    After Patch
# of Threads      rlock           rlock
- -
1 24,155
platforms that I have tested on (arm64 & ppc) are
both using the generic C codes, the rwsem performance shouldn't be
affected by this patch except the down_read_trylock() code which was
included in patch 2 for arm64.
Waiman Long (2):
locking/rwsem: Remove arch specific rwsem files
locking/r
On 02/11/2019 06:58 AM, Peter Zijlstra wrote:
> Which is clearly worse. Now we can write that as:
>
> int __down_read_trylock2(unsigned long *l)
> {
> long tmp = READ_ONCE(*l);
>
> while (tmp >= 0) {
> if (try_cmpxchg(l, &tmp, tmp + 1))
>
On 02/11/2019 05:39 AM, Ingo Molnar wrote:
> * Ingo Molnar wrote:
>
>> Sounds good to me - I've merged this patch, will push it out after
>> testing.
> Based on Peter's feedback I'm delaying this - performance testing on at
> least one key ll/sc arch would be nice indeed.
>
> Thanks,
>
>
On 02/10/2019 09:00 PM, Waiman Long wrote:
> As the generic rwsem-xadd code is using the appropriate acquire and
> release versions of the atomic operations, the arch specific rwsem.h
> files will not be that much faster than the generic code as long as the
> atomic functions