On 03/22/2019 03:30 PM, Davidlohr Bueso wrote:
> On Fri, 22 Mar 2019, Linus Torvalds wrote:
>> Some of them _might_ be performance-critical. There's the one on
>> mmap_sem in the fault handling path, for example. And yes, I'd expect
>> the normal case to very much be "no other readers or writers"
On 03/22/2019 01:25 PM, Russell King - ARM Linux admin wrote:
> On Fri, Mar 22, 2019 at 10:30:08AM -0400, Waiman Long wrote:
>> Modify __down_read_trylock() to optimize for an unlocked rwsem and make
>> it generate slightly better code.
>>
>> Before th
On 03/22/2019 01:01 PM, Linus Torvalds wrote:
> On Fri, Mar 22, 2019 at 7:30 AM Waiman Long wrote:
>> 19 files changed, 133 insertions(+), 930 deletions(-)
> Lovely. And it all looks sane to me.
>
> So ack.
>
> The only comment I have is about __down_read_trylock()
to access the internal rwsem macros and functions.
Signed-off-by: Waiman Long
---
MAINTAINERS | 1 -
arch/alpha/include/asm/rwsem.h | 211
arch/arm/include/asm/Kbuild | 1 -
arch/arm64/include/asm/Kbuild | 1 -
arch/hexagon/include/asm
case (1 thread), the new down_read_trylock() is a
little bit faster. For the contended cases, the new down_read_trylock()
performs pretty well on x86-64, but performance degrades at high
contention levels on ARM64.
Suggested-by: Linus Torvalds
Signed-off-by: Waiman Long
---
kernel/locking/rwse
-spinlock.c and make all
architectures use a single implementation of rwsem - rwsem-xadd.c.
All references to RWSEM_GENERIC_SPINLOCK and RWSEM_XCHGADD_ALGORITHM
in the code are removed.
Suggested-by: Peter Zijlstra
Signed-off-by: Waiman Long
---
arch/alpha/Kconfig | 7 -
arch/arc/Kconfig
the architectures use a single implementation of rwsem - rwsem-xadd.c.
Waiman Long (3):
locking/rwsem: Remove arch specific rwsem files
locking/rwsem: Remove rwsem-spinlock.c & use rwsem-xadd.c for all
archs
locking/rwsem: Optimize down_read_trylock()
MAINTAI
On 02/21/2019 09:14 AM, Will Deacon wrote:
> On Wed, Feb 13, 2019 at 05:00:17PM -0500, Waiman Long wrote:
>> Modify __down_read_trylock() to optimize for an unlocked rwsem and make
>> it generate slightly better code.
>>
>> Before this patch, down_read_trylock:
>
On 02/15/2019 01:40 PM, Will Deacon wrote:
> On Thu, Feb 14, 2019 at 11:37:15AM +0100, Peter Zijlstra wrote:
>> On Wed, Feb 13, 2019 at 05:00:14PM -0500, Waiman Long wrote:
>>> v4:
>>> - Remove rwsem-spinlock.c and make all archs use rwsem-xadd.c.
>>>
>>
On 02/14/2019 05:37 AM, Peter Zijlstra wrote:
> On Wed, Feb 13, 2019 at 05:00:14PM -0500, Waiman Long wrote:
>> v4:
>> - Remove rwsem-spinlock.c and make all archs use rwsem-xadd.c.
>>
>> v3:
>> - Optimize __down_read_trylock() for the uncontended case as s
On 02/14/2019 01:02 PM, Will Deacon wrote:
> On Thu, Feb 14, 2019 at 11:33:33AM +0100, Peter Zijlstra wrote:
>> On Wed, Feb 13, 2019 at 03:32:12PM -0500, Waiman Long wrote:
>>> Modify __down_read_trylock() to optimize for an unlocked rwsem and make
>>> it ge
On 02/14/2019 08:23 AM, Davidlohr Bueso wrote:
> On Fri, 08 Feb 2019, Waiman Long wrote:
>> I am planning to run more performance tests and post the data sometime
>> next week. Davidlohr is also going to run some of his rwsem performance
>> test on this patchset.
>
> S
On 02/14/2019 05:33 AM, Peter Zijlstra wrote:
> On Wed, Feb 13, 2019 at 03:32:12PM -0500, Waiman Long wrote:
>> Modify __down_read_trylock() to optimize for an unlocked rwsem and make
>> it generate slightly better code.
>>
>> Before this patch, down_read_trylock:
>
case (1 thread), the new down_read_trylock() is a
little bit faster. For the contended cases, the new down_read_trylock()
performs pretty well on x86-64, but performance degrades at high
contention levels on ARM64.
Suggested-by: Linus Torvalds
Signed-off-by: Waiman Long
---
kernel/locking/rw
-spinlock.c and make all
architectures use a single implementation of rwsem - rwsem-xadd.c.
All references to RWSEM_GENERIC_SPINLOCK and RWSEM_XCHGADD_ALGORITHM
in the code are removed.
Suggested-by: Peter Zijlstra
Signed-off-by: Waiman Long
---
arch/alpha/Kconfig | 7 -
arch/arc/Kconfig
to access the internal rwsem macros and functions.
Signed-off-by: Waiman Long
---
MAINTAINERS | 1 -
arch/alpha/include/asm/rwsem.h | 211 ---
arch/arm/include/asm/Kbuild | 1 -
arch/arm64/include/asm/Kbuild | 1 -
arch/hexagon/include
of this patchset is to remove the architecture specific files
for rwsem-xadd to make it easier to add enhancements in the later rwsem
patches. It also removes the legacy rwsem-spinlock.c file and makes all
the architectures use a single implementation of rwsem - rwsem-xadd.c.
Waiman Long (3):
locking
case (1 thread), the new down_read_trylock() is a
little bit faster. For the contended cases, the new down_read_trylock()
performs pretty well on x86-64, but performance degrades at high
contention levels on ARM64.
Suggested-by: Linus Torvalds
Signed-off-by: Waiman Long
---
kernel/locking/rw
to access the internal rwsem macros and functions.
Signed-off-by: Waiman Long
---
MAINTAINERS | 1 -
arch/alpha/include/asm/rwsem.h | 211 ---
arch/arm/include/asm/Kbuild | 1 -
arch/arm64/include/asm/Kbuild | 1 -
arch/hexagon/include
ch 2 for arm64.
Waiman Long (2):
locking/rwsem: Remove arch specific rwsem files
locking/rwsem: Optimize down_read_trylock()
MAINTAINERS | 1 -
arch/alpha/include/asm/rwsem.h | 211 ---
arch/arm/include/asm/Kbuild | 1 -
arch/arm64/include
On 02/13/2019 02:45 AM, Ingo Molnar wrote:
> * Waiman Long wrote:
>
>> I looked at the assembly code in arch/x86/include/asm/rwsem.h. For both
>> trylocks (read & write), the count is read first before attempting to
>> lock it. We did the same for all tryl
On 02/12/2019 02:58 PM, Linus Torvalds wrote:
> On Mon, Feb 11, 2019 at 11:31 AM Waiman Long wrote:
>> Modify __down_read_trylock() to make it generate slightly better code
>> (smaller and maybe a tiny bit faster).
> This looks good, but I would ask you to try one slightly
On 02/12/2019 01:36 PM, Waiman Long wrote:
> On 02/12/2019 08:25 AM, Peter Zijlstra wrote:
>> On Tue, Feb 12, 2019 at 02:24:04PM +0100, Peter Zijlstra wrote:
>>> On Mon, Feb 11, 2019 at 02:31:26PM -0500, Waiman Long wrote:
>>>> Modify __down_read_trylock() to make it
platforms that I can test on (arm64 & ppc) are
both using the generic C code, the rwsem performance shouldn't be
affected by this patch except for the down_read_trylock() code which was
included in patch 2 for arm64.
Waiman Long (2):
locking/rwsem: Remove arch specific rwsem files
locking/r
1 27,787 28,259
28,359 9,234
On an ARM64 system, the performance results were:

  # of Threads    Before Patch    After Patch
                     rlock           rlock
  ------------    ------------    -----------
        1           24,155
to access the internal rwsem macros and functions.
Signed-off-by: Waiman Long
---
MAINTAINERS | 1 -
arch/alpha/include/asm/rwsem.h | 211 ---
arch/arm/include/asm/Kbuild | 1 -
arch/arm64/include/asm/Kbuild | 1 -
arch/hexagon/include
On 02/11/2019 06:58 AM, Peter Zijlstra wrote:
> Which is clearly worse. Now we can write that as:
>
> int __down_read_trylock2(unsigned long *l)
> {
> 	long tmp = READ_ONCE(*l);
>
> 	while (tmp >= 0) {
> 		if (try_cmpxchg(l, &tmp, tmp + 1))
> 			return 1;
> 	}
>
> 	return 0;
> }
On 02/10/2019 09:00 PM, Waiman Long wrote:
> As the generic rwsem-xadd code is using the appropriate acquire and
> release versions of the atomic operations, the arch specific rwsem.h
> files will not be that much faster than the generic code as long as the
> atomic functions
/locking needs to access
the internal rwsem macros and functions.
Signed-off-by: Waiman Long
---
MAINTAINERS | 1 -
arch/alpha/include/asm/rwsem.h | 211 ---
arch/arm/include/asm/Kbuild | 1 -
arch/arm64/include/asm/Kbuild | 1 -
arch
On 02/08/2019 02:50 PM, Linus Torvalds wrote:
> On Thu, Feb 7, 2019 at 11:08 AM Waiman Long wrote:
>> This patchset revamps the current rwsem-xadd implementation to make
>> it saner and easier to work with. This patchset removes all the
>> architecture specific assembly co
On 02/07/2019 03:54 PM, Waiman Long wrote:
> On 02/07/2019 03:08 PM, Peter Zijlstra wrote:
>> On Thu, Feb 07, 2019 at 02:07:19PM -0500, Waiman Long wrote:
>>> On 32-bit architectures, there aren't enough bits to hold both.
>>> 64-bit architectures, however,
On 02/07/2019 03:08 PM, Peter Zijlstra wrote:
> On Thu, Feb 07, 2019 at 02:07:19PM -0500, Waiman Long wrote:
>> On 32-bit architectures, there aren't enough bits to hold both.
>> 64-bit architectures, however, can have enough bits to do that. For
>> x86-64, the physical add
On 02/07/2019 02:51 PM, Davidlohr Bueso wrote:
> On Thu, 07 Feb 2019, Waiman Long wrote:
>> 30 files changed, 1197 insertions(+), 1594 deletions(-)
>
> Performance numbers on numerous workloads, pretty please.
>
> I'll go and throw this at my mmap_sem intensive workl
On 02/07/2019 02:36 PM, Peter Zijlstra wrote:
> On Thu, Feb 07, 2019 at 02:07:08PM -0500, Waiman Long wrote:
>
>> +static inline int __down_read_trylock(struct rw_semaphore *sem)
>> +{
>> +	long tmp;
>> +
>> +	while ((tmp = atomic_long_read(&sem->count)) >= 0) {
() calls are replaced by either lockevent_inc() or
lockevent_cond_inc() calls.
The qstat_hop() call is renamed to lockevent_pv_hop(). The "reset_counters"
debugfs file is also renamed to ".reset_counts".
Signed-off-by: Waiman Long
---
kernel/locking/lock_events.h| 55
-by: Waiman Long
---
kernel/locking/lock_events_list.h | 1 +
kernel/locking/rwsem-xadd.c | 80 ++-
kernel/locking/rwsem-xadd.h | 3 ++
3 files changed, 74 insertions(+), 10 deletions(-)
diff --git a/kernel/locking/lock_events_list.h
b/kernel
after sleeping.
Signed-off-by: Waiman Long
---
arch/Kconfig | 2 +-
kernel/locking/lock_events_list.h | 17 +
kernel/locking/rwsem-xadd.c | 12
3 files changed, 30 insertions(+), 1 deletion(-)
diff --git a/arch/Kconfig b/arch/Kconfig
index
() are also moved over to rwsem-xadd.h.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.h | 12 ++--
kernel/locking/rwsem.c | 3 ---
2 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/kernel/locking/rwsem-xadd.h b/kernel/locking/rwsem-xadd.h
index 64e7d62..77151c3
directory.
Signed-off-by: Waiman Long
---
arch/Kconfig| 10 +++
arch/x86/Kconfig| 8 ---
kernel/locking/Makefile | 1 +
kernel/locking/lock_events.c| 153
kernel/locking/lock_events.h| 6 +-
kernel
of the rwsem count and owner fields to give more information
about what is wrong with the rwsem.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.h | 19 ---
kernel/locking/rwsem.c | 5 +++--
2 files changed, 15 insertions(+), 9 deletions(-)
diff --git a/kernel/locking/rwsem
The rwsem_down_read_failed*() functions were relocated from above the
optimistic spinning section to below that section. This enables the
reader functions to use optimistic spinning in future patches. There
is no code change.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.c | 172
-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.c | 145 +++-
kernel/locking/rwsem-xadd.h | 85 +-
2 files changed, 89 insertions(+), 141 deletions(-)
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index
()/up_write()")
will have to be reverted.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.c | 74 -
1 file changed, 74 deletions(-)
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 12b1d61..5f74bae 100644
--- a/ker
.
The generic asm rwsem.h can also be merged into kernel/locking/rwsem.h
as no other code other than those under kernel/locking needs to access
the internal rwsem macros and functions.
Signed-off-by: Waiman Long
---
MAINTAINERS | 1 -
arch/alpha/include/asm/rwsem.h | 211
to deadlock. So we have to make sure that an RT task
will not spin on a reader-owned rwsem.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.c | 16 ++--
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index
, the extra constant argument to
rwsem_try_write_lock() and rwsem_try_write_lock_unqueued() should be
optimized out by the compiler.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.c | 22 --
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/kernel
on both read and write lock performance.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.c | 20 +++--
kernel/locking/rwsem-xadd.h | 105 +++-
2 files changed, 110 insertions(+), 15 deletions(-)
diff --git a/kernel/locking/rwsem-xadd.c b
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.c | 20
1 file changed, 16 insertions(+), 4 deletions(-)
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 16dc7a1..21d462f 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem
as reader-owned when the functions return. That
is currently true except in the transient case that the waiter queue
just becomes empty. So a rwsem_set_reader_owned() call is added for
this case. The __rwsem_set_reader_owned() call in __rwsem_mark_wake()
is now necessary.
Signed-off-by: Waiman Long
We don't need to expose rwsem internal functions which are not supposed
to be called directly from other kernel code.
Signed-off-by: Waiman Long
---
include/linux/rwsem.h | 7 ---
kernel/locking/rwsem-xadd.h | 7 +++
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git
The content of kernel/locking/rwsem.h is now specific to rwsem-xadd only.
Rename it to rwsem-xadd.h to indicate that it is specific to rwsem-xadd
and include it only when CONFIG_RWSEM_XCHGADD_ALGORITHM is set.
Signed-off-by: Waiman Long
---
kernel/locking/percpu-rwsem.c | 4 +-
kernel/locking
.
Waiman Long (22):
locking/qspinlock_stat: Introduce a generic lockevent counting APIs
locking/lock_events: Make lock_events available for all archs & other
locks
locking/rwsem: Relocate rwsem_down_read_failed()
locking/rwsem: Remove arch specific rwsem files
locking/rwsem:
On 10/11/2017 04:50 PM, Dave Chinner wrote:
> On Wed, Oct 11, 2017 at 02:01:51PM -0400, Waiman Long wrote:
>> In term of rwsem performance, a rwsem microbenchmark and fio randrw
>> test with a xfs filesystem on a ramdisk were used to verify the
>> performance changes due t
On 10/11/2017 02:58 PM, Waiman Long wrote:
> On 10/11/2017 02:40 PM, Peter Zijlstra wrote:
>> On Wed, Oct 11, 2017 at 02:01:53PM -0400, Waiman Long wrote:
>>> +/*
>>> + * The definition of the atomic counter in the semaphore:
>>> + *
>>> + *
than xadd in ppc, the elimination of the atomic
count reversal in the slowpath helps the contended performance, though.
Signed-off-by: Waiman Long <long...@redhat.com>
---
include/asm-generic/rwsem.h | 129 -
include/linux/rwsem.h | 12 ++--
if the owner is properly
set first.
Signed-off-by: Waiman Long <long...@redhat.com>
---
kernel/locking/rwsem-xadd.c | 7 +++
kernel/locking/rwsem-xadd.h | 19 ---
kernel/locking/rwsem.c | 17 ++---
kernel/locking/rwsem.h | 11 ---
4 files c
The rwsem_down_read_failed*() functions were relocated from above the
optimistic spinning section to below that section. This enables
them to use functions in that section in future patches. There is no
code change.
Signed-off-by: Waiman Long <long...@redhat.com>
---
kernel/locking/rwsem-
.
Signed-off-by: Waiman Long <long...@redhat.com>
---
arch/alpha/include/asm/rwsem.h | 195 ---
arch/arm/include/asm/Kbuild | 1 -
arch/arm64/include/asm/Kbuild | 1 -
arch/hexagon/include/asm/Kbuild | 1 -
arch/ia64/include/asm/rwsem.h
done more than
646k of them.
For the patched kernel, the locking rate dropped to 12,590 kop/s. The
number of locking operations done per thread had a range of 14,450 -
22,648. The rwsem became much fairer, at the cost of lower
overall throughput.
Signed-off-by: Waiman Long <long...@redhat.
.
Signed-off-by: Waiman Long <long...@redhat.com>
---
kernel/locking/rwsem-xadd.c | 66 ++---
1 file changed, 57 insertions(+), 9 deletions(-)
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index bca412f..52305c3 100644
--- a/
We don't need to expose rwsem internal functions which are not supposed
to be called directly from other kernel code.
Signed-off-by: Waiman Long <long...@redhat.com>
---
include/linux/rwsem.h | 7 ---
kernel/locking/rwsem-xadd.h | 7 +++
2 files changed, 7 insertions
alternatively, the resulting locking total rates on a 4.14 based
kernel were 927 kop/s and 3218 kop/s without and with the patch
respectively. That was an increase of about 247%.
Signed-off-by: Waiman Long <long...@redhat.com>
---
kernel/locking/rwsem-xadd.
This patch modifies rwsem_spin_on_owner() to return a tri-state value
to better reflect the state of the lock holder, which enables us to make
a better decision about what to do next.
Signed-off-by: Waiman Long <long...@redhat.com>
---
kernel/locking/rwsem-xadd.c | 14 +-
1 file chan
  9    211,276/  509,712/1,134,007     4,894/221,839/246,818
 11    884,513/1,043,989/1,252,533     9,604/ 11,105/ 25,225
It can be seen that rwsem changes from writer-preferring to
reader-preferring.
Waiman Long (11):
locking/rwsem: relocate rwsem_down_read_failed()
locking/rwsem: Impl
+ RWSEM_ACTIVE_READ_BIAS
Signed-off-by: Waiman Long <long...@redhat.com>
---
arch/alpha/include/asm/rwsem.h| 3 ++-
arch/ia64/include/asm/rwsem.h | 2 +-
arch/s390/include/asm/rwsem.h | 2 +-
arch/x86/include/asm/rwsem.h | 3 ++-
include/asm-generic/rwsem.h
The rwsem_down_read_failed() function was relocated from above the
optimistic spinning section to below that section. This enables
it to use functions in that section in future patches. There is no
code change.
Signed-off-by: Waiman Long <long...@redhat.com>
---
kernel/locking/rwsem-xadd.
or spinning writers.
This patch provides the helper functions to facilitate the use of
that bit.
Signed-off-by: Waiman Long <long...@redhat.com>
---
kernel/locking/rwsem.h | 66 ++
1 file changed, 56 insertions(+), 10 deletions(-)
diff
stealing on a rwsem as long as
the lock is reader-owned and optimistic spinning hasn't been disabled
because of long writer wait. This will improve overall performance
without running the risk of writer lock starvation.
Signed-off-by: Waiman Long <long...@redhat.com>
---
kernel/locking
ed-off-by: Waiman Long <long...@redhat.com>
---
arch/alpha/include/asm/rwsem.h| 8 +---
arch/ia64/include/asm/rwsem.h | 7 ++-
arch/s390/include/asm/rwsem.h | 7 +--
arch/x86/include/asm/rwsem.h | 19 +--
include/asm-generic/rwsem.
runs) on a 4.12 based kernel were 1760.2 Mop/s and
5439.0 Mop/s without and with the patch respectively. That was an
increase of about 209%.
Signed-off-by: Waiman Long <long...@redhat.com>
---
kernel/locking/rwsem-xadd.c | 72 ++---
1 file chang
This patch modifies rwsem_spin_on_owner() to return a tri-state value
to better reflect the state of the lock holder, which enables us to make
a better decision about what to do next.
Signed-off-by: Waiman Long <long...@redhat.com>
---
kernel/locking/rwsem-xadd.c | 14 +-
1 file chan
On 10/05/2016 08:19 AM, Waiman Long wrote:
On 10/04/2016 03:06 PM, Davidlohr Bueso wrote:
On Thu, 18 Aug 2016, Waiman Long wrote:
The osq_lock() and osq_unlock() function may not provide the necessary
acquire and release barrier in some cases. This patch makes sure
that the proper barriers
On 08/24/2016 12:00 AM, Davidlohr Bueso wrote:
On Thu, 18 Aug 2016, Waiman Long wrote:
The default reader spinning threshold is currently set to 4096. However,
the right reader spinning threshold may vary from one system to
another and among the different architectures. This patch adds a new
On 08/19/2016 01:57 AM, Wanpeng Li wrote:
2016-08-19 5:11 GMT+08:00 Waiman Long<waiman.l...@hpe.com>:
When the count value is in between 0 and RWSEM_WAITING_BIAS, there
are 2 possibilities.
Either a writer is present and there is no waiter
count = 0x0001
or there are waiters and r
of different systems as well as for
testing purposes.
Signed-off-by: Waiman Long <waiman.l...@hpe.com>
---
Documentation/kernel-parameters.txt |3 +++
kernel/locking/rwsem-xadd.c | 14 +-
2 files changed, 16 insertions(+), 1 deletions(-)
diff --git a/Documen
  Workload    BW before patch    BW after patch    % change
  --------    ---------------    --------------    --------
  randrw          1352 MB/s          2164 MB/s        +60%
  randwrite       1710 MB/s          2550 MB/s        +49%
Signed-off-by: Waiman Long <waiman.l...@hpe.com>
---
kernel/locking/rwsem-xadd.c
+ RWSEM_ACTIVE_READ_BIAS
Signed-off-by: Waiman Long <waiman.l...@hpe.com>
---
arch/alpha/include/asm/rwsem.h|3 ++-
arch/ia64/include/asm/rwsem.h |2 +-
arch/s390/include/asm/rwsem.h |2 +-
arch/x86/include/asm/rwsem.h |3 ++-
include/asm-generic/rwsem.h |4 ++--
i
. If
there are sufficiently more successful spin attempts than failed ones,
it will try to reactivate reader spinning.
Signed-off-by: Waiman Long <waiman.l...@hpe.com>
---
include/linux/rwsem.h | 12
kernel/locking/rwsem-xadd.c | 27 +--
2 files chang
  Workload    BW before patch    BW after patch    % change
  --------    ---------------    --------------    --------
  randrw          1210 MB/s          1352 MB/s        +12%
  randwrite       1622 MB/s          1710 MB/s        +5.4%
The write-only microbench also showed improvement because some read
locking was done by the XFS code.
Signed-off-by: Waiman Long <waiman.l...@hpe.com>
---
new boot parameter to change the reader spinning
threshold which can be system specific.
Waiman Long (10):
locking/osq: Make lock/unlock proper acquire/release barrier
locking/rwsem: Stop active read lock ASAP
locking/rwsem: Make rwsem_spin_on_owner() return a tri-state value
locking/rwse
org>
Signed-off-by: Waiman Long <waiman.l...@hpe.com>
---
kernel/locking/osq_lock.c | 24 ++--
1 files changed, 18 insertions(+), 6 deletions(-)
diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 05a3785..3da0b97 100644
--- a/kernel/locking/osq_
ed-off-by: Waiman Long <waiman.l...@hpe.com>
---
arch/alpha/include/asm/rwsem.h|8 +---
arch/ia64/include/asm/rwsem.h |7 ++-
arch/s390/include/asm/rwsem.h |7 +--
arch/x86/include/asm/rwsem.h | 19 +--
include/asm-generic/rwsem.h
.
Both the spinning threshold and the default value for rspin_enabled
can be overridden by an architecture-specific rwsem.h header file.
Signed-off-by: Waiman Long <waiman.l...@hpe.com>
---
include/linux/rwsem.h | 19 +++-
kernel/locking/rwsem-xadd.c
Move the rwsem_down_read_failed() function down to below the
optimistic spinning section before enabling optimistic spinning for
the readers. This is because the rwsem_down_read_failed() function will
call rwsem_optimistic_spin() in a later patch.
There is no code change.
Signed-off-by: Waiman
This patch modifies rwsem_spin_on_owner() to return a tri-state value
to better reflect the state of the lock holder, which enables us to make
a better decision about what to do next.
Signed-off-by: Waiman Long <waiman.l...@hpe.com>
---
kernel/locking/rwsem-xadd.c | 14 +-
1 files c
On 06/17/2016 11:45 AM, Will Deacon wrote:
On Fri, Jun 17, 2016 at 11:26:41AM -0400, Waiman Long wrote:
On 06/16/2016 08:48 PM, Boqun Feng wrote:
On Thu, Jun 16, 2016 at 05:35:54PM -0400, Waiman Long wrote:
If you look into the actual code:
next = xchg_release(&node->next, NULL);
  Workload    BW before patch    BW after patch    % change
  --------    ---------------    --------------    --------
  randrw          1210 MB/s          1352 MB/s        +12%
  randwrite       1622 MB/s          1710 MB/s        +5.4%
The write-only microbench also showed improvement because some read
locking was done by the XFS code.
Signed-off-by: Waiman Long <waiman.l...@hpe.com>
---
On 06/16/2016 08:48 PM, Boqun Feng wrote:
On Thu, Jun 16, 2016 at 05:35:54PM -0400, Waiman Long wrote:
On 06/15/2016 10:19 PM, Boqun Feng wrote:
On Wed, Jun 15, 2016 at 03:01:19PM -0400, Waiman Long wrote:
On 06/15/2016 04:04 AM, Boqun Feng wrote:
Hi Waiman,
On Tue, Jun 14, 2016 at 06:48
Move the rwsem_down_read_failed() function down to below the
optimistic spinning section before enabling optimistic spinning for
the readers. This is because the rwsem_down_read_failed() function will
call rwsem_optimistic_spin() in a later patch.
There is no code change.
Signed-off-by: Waiman
.
Both the spinning threshold and the default value for rspin_enabled
can be overridden by an architecture-specific rwsem.h header file.
Signed-off-by: Waiman Long <waiman.l...@hpe.com>
---
include/linux/rwsem.h | 19 +++-
kernel/locking/rwsem-xadd.c
k code.
Patch 8 enables readers to do optimistic spinning.
Patch 9 allows reactivation of reader spinning when a lot of
writer-on-writer spins are successful.
Patch 10 adds a new boot parameter to change the reader spinning
threshold which can be system specific.
Waiman Long (10):
locking/osq: Mak
This patch modifies rwsem_spin_on_owner() to return a tri-state value
to better reflect the state of the lock holder, which enables us to make
a better decision about what to do next.
Signed-off-by: Waiman Long <waiman.l...@hpe.com>
---
kernel/locking/rwsem-xadd.c | 14 +-
1 files c
. If
there are sufficiently more successful spin attempts than failed ones,
it will try to reactivate reader spinning.
Signed-off-by: Waiman Long <waiman.l...@hpe.com>
---
include/linux/rwsem.h | 12
kernel/locking/rwsem-xadd.c | 27 +--
2 files chang
of different systems as well as for
testing purposes.
Signed-off-by: Waiman Long <waiman.l...@hpe.com>
---
Documentation/kernel-parameters.txt |3 +++
kernel/locking/rwsem-xadd.c | 14 +-
2 files changed, 16 insertions(+), 1 deletions(-)
diff --git a/Documen
ed-off-by: Waiman Long <waiman.l...@hpe.com>
---
arch/alpha/include/asm/rwsem.h|8 +---
arch/ia64/include/asm/rwsem.h |7 ++-
arch/s390/include/asm/rwsem.h |7 +--
arch/x86/include/asm/rwsem.h | 19 +--
include/asm-generic/rwsem.h
On 06/15/2016 10:19 PM, Boqun Feng wrote:
On Wed, Jun 15, 2016 at 03:01:19PM -0400, Waiman Long wrote:
On 06/15/2016 04:04 AM, Boqun Feng wrote:
Hi Waiman,
On Tue, Jun 14, 2016 at 06:48:04PM -0400, Waiman Long wrote:
The osq_lock() and osq_unlock() function may not provide the necessary
On 06/15/2016 10:14 PM, Davidlohr Bueso wrote:
On Wed, 15 Jun 2016, Waiman Long wrote:
I think there will be a little bit of performance impact for a
workload that produces just the right amount of rwsem contention.
I'm not saying the change doesn't make sense, but this is the sort of
thing
On 06/15/2016 01:45 PM, Peter Zijlstra wrote:
On Tue, Jun 14, 2016 at 06:48:08PM -0400, Waiman Long wrote:
+++ b/arch/alpha/include/asm/rwsem.h
@@ -17,9 +17,9 @@
#define RWSEM_UNLOCKED_VALUE		0x0000000000000000L
#define RWSEM_ACTIVE_BIAS		0x0000000000000001L
#define
On 06/15/2016 01:43 PM, Peter Zijlstra wrote:
On Tue, Jun 14, 2016 at 06:48:08PM -0400, Waiman Long wrote:
even the reduced maximum of about 16k (32-bit) or 1G (64-bit) should
be more than enough for the foreseeable future.
So what happens if I manage to create 16k+ threads on my 32bit kernel
On 06/15/2016 01:40 PM, Peter Zijlstra wrote:
On Tue, Jun 14, 2016 at 06:48:07PM -0400, Waiman Long wrote:
Move the rwsem_down_read_failed() function down to below the optimistic
spinning section before enabling optimistic spinning for the readers.
newline
There is no change in code