---
2 1,179 9,436
4 1,505 8,268
8 721 7,041
16 575 7,652
32 70 2,189
64 39 534
Waiman Long (12):
924
32 78 300
64 38 195
240 50 149
There is no performance gain at low contention level. At high contention
level, however, this patch gives a pretty decent performance boost.
Signed-off-by: Waiman Long
write()")
will have to be reverted.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.c | 74 -
1 file changed, 74 deletions(-)
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 58b3a64e6f2c..4f036bda9063 100644
--- a/ker
.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.c | 40 ++---
kernel/locking/rwsem.h | 5 +
2 files changed, 38 insertions(+), 7 deletions(-)
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 4f036bda9063..35891c53338b
On 04/04/2019 05:38 AM, Peter Zijlstra wrote:
> On Thu, Apr 04, 2019 at 07:05:24AM +0200, Juergen Gross wrote:
>
>> Without PARAVIRT_SPINLOCK this would be just an alternative() then?
> That could maybe work yes. This is all early enough.
Yes, alternative() should work as it is done before SMP boot.
On bare metal, the pvqspinlock event counts will always be 0. So there
is no point in showing their corresponding debugfs files, and they are
skipped in this case.
Signed-off-by: Waiman Long
Acked-by: Davidlohr Bueso
---
kernel/locking/lock_events.c | 28 +++-
1 file
directory.
Signed-off-by: Waiman Long
Acked-by: Davidlohr Bueso
---
arch/Kconfig| 10 +++
arch/x86/Kconfig| 8 --
kernel/locking/Makefile | 1 +
kernel/locking/lock_events.c| 153
kernel/locking/lock_events.h
visualize how frequently a code path is being
used as well as spotting abnormal behavior due to bugs in the code
without noticeably affecting kernel performance and hence behavior.
4) Reorganize rwsem structure to optimize for the uncontended case.
Both (2) and (3) are useful debugging ai
rwsem_down_read_failed() returns, for instance.
Signed-off-by: Waiman Long
Acked-by: Davidlohr Bueso
---
kernel/locking/rwsem-xadd.c | 6 +++---
kernel/locking/rwsem.c | 19 ++-
kernel/locking/rwsem.h | 17 +++--
3 files changed, 20 insertions(+), 22 deletions
() calls are replaced by either lockevent_inc() or
lockevent_cond_inc() calls.
The qstat_hop() call is renamed to lockevent_pv_hop(). The "reset_counters"
debugfs file is also renamed to ".reset_counts".
Signed-off-by: Waiman Long
Acked-by: Davidlohr Bueso
---
kernel/locking/lock_e
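The lockevent_inc()/lockevent_cond_inc() renaming described above can be sketched in userspace C. This is a hedged illustration only: the kernel uses per-cpu counters, and the event names and macro shapes here are modeled on the patch description, not the exact kernel API.

```c
#include <stdatomic.h>

/* Hypothetical event list; the real one lives in lock_events_list.h. */
enum lock_events {
        LOCKEVENT_rwsem_sleep_reader,
        LOCKEVENT_rwsem_sleep_writer,
        LOCKEVENT_NUM,
};

static atomic_ulong lockevents[LOCKEVENT_NUM];

/* In the kernel these are per-cpu counters, so the increment is nearly
 * free; a relaxed atomic add is the closest userspace analogue. */
#define lockevent_inc(ev) \
        atomic_fetch_add_explicit(&lockevents[LOCKEVENT_##ev], 1, \
                                  memory_order_relaxed)

/* Only count the event when the condition holds. */
#define lockevent_cond_inc(ev, cond) \
        do { if (cond) lockevent_inc(ev); } while (0)
```

The token-pasting macro keeps call sites terse (`lockevent_inc(rwsem_sleep_reader)`) while the enum keeps the counter table dense.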
() are also moved over to rwsem-xadd.h.
Signed-off-by: Waiman Long
Acked-by: Davidlohr Bueso
---
kernel/locking/rwsem.c | 3 ---
kernel/locking/rwsem.h | 12 ++--
2 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index
ng"s apart. No performance drop was observed when only a single rwsem
was used (hot cache). So the drop is more likely an idiosyncrasy of the
cache architecture of this chip than an inherent problem with the patch.
Suggested-by: Linus Torvalds
Signed-off-by: Waiman Long
---
include/linux/rwsem.h |
The atomic_long_cmpxchg_acquire() in rwsem_try_read_lock_unqueued() is
replaced by atomic_long_try_cmpxchg_acquire() to simplify the code and
generate slightly better assembly code. There is no functional change.
Signed-off-by: Waiman Long
Acked-by: Will Deacon
Acked-by: Davidlohr Bueso
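The cmpxchg-to-try_cmpxchg conversion described above can be sketched with C11 atomics. This is a simplified userspace model, not the kernel's rwsem code: the read bias of 1 and the "negative count means writer" convention are assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define READ_BIAS 1L    /* hypothetical per-reader increment */

/* Old pattern: the kernel's cmpxchg() returns the previous value, so
 * the caller must compare it and reload the expected value by hand. */
static bool try_read_lock_cmpxchg(atomic_long *count)
{
        long old = atomic_load(count);

        while (old >= 0) {              /* no writer present */
                long prev = old;
                if (atomic_compare_exchange_strong(count, &prev,
                                                   old + READ_BIAS))
                        return true;
                old = prev;             /* manual reload on failure */
        }
        return false;
}

/* New pattern: try_cmpxchg updates 'old' in place on failure, so the
 * loop needs no separate reload and compiles to tighter code. */
static bool try_read_lock_try_cmpxchg(atomic_long *count)
{
        long old = atomic_load(count);

        do {
                if (old < 0)
                        return false;   /* writer present, give up */
        } while (!atomic_compare_exchange_weak(count, &old,
                                               old + READ_BIAS));
        return true;
}
```

Both functions are semantically equivalent; the second form is what atomic_long_try_cmpxchg_acquire() enables in the kernel.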
slowpath were
write-locks in the optimistic spinning code path with no sleeping at
all. For this system, over 97% of the locks are acquired via optimistic
spinning. It illustrates the importance of optimistic spinning in
improving the performance of rwsem.
Signed-off-by: Waiman Long
Acked-by: Davidlohr
The rwsem_down_read_failed*() functions were relocated from above the
optimistic spinning section to below that section. This enables the
reader functions to use optimistic spinning in future patches. There
is no code change.
Signed-off-by: Waiman Long
Acked-by: Will Deacon
Acked-by: Davidlohr
We don't need to expose rwsem internal functions which are not supposed
to be called directly from other kernel code.
Signed-off-by: Waiman Long
Acked-by: Will Deacon
Acked-by: Davidlohr Bueso
---
include/linux/rwsem.h | 7 ---
kernel/locking/rwsem.h | 7 +++
2 files chang
f the rwsem count and owner fields to give more information
about what is wrong with the rwsem. The debug_locks_off() function is
called as is done inside DEBUG_LOCKS_WARN_ON().
Signed-off-by: Waiman Long
Acked-by: Davidlohr Bueso
---
kernel/locking/rwsem.c | 3 ++-
kernel/locking/rwsem.h
On 04/04/2019 12:44 PM, Josh Poimboeuf wrote:
> Keeping track of the number of mitigations for all the CPU speculation
> bugs has become overwhelming for many users. It's getting more and more
> complicated to decide which mitigations are needed for a given
> architecture. Complicating matters is
On 04/03/2019 01:16 PM, Peter Zijlstra wrote:
> On Wed, Apr 03, 2019 at 12:33:20PM -0400, Waiman Long wrote:
>> static inline void queued_spin_lock_slowpath(struct qspinlock *lock, u32
>> val)
>> {
>> if (static_branch_
On 04/02/2019 05:43 AM, Peter Zijlstra wrote:
> On Mon, Apr 01, 2019 at 10:36:19AM -0400, Waiman Long wrote:
>> On 03/29/2019 11:20 AM, Alex Kogan wrote:
>>> +config NUMA_AWARE_SPINLOCKS
>>> + bool "Numa-aware spinlocks"
>>> + depends on NUMA
>
On 04/03/2019 08:59 AM, Peter Zijlstra wrote:
> On Thu, Mar 28, 2019 at 02:10:54PM -0400, Waiman Long wrote:
>> This is part 2 of a 3-part (0/1/2) series to rearchitect the internal
>> operation of rwsem.
>>
>> part 0: https://lkml.org/lkml/2019/3/22/1662
>> part 1
On 04/03/2019 11:39 AM, Alex Kogan wrote:
> Peter, Longman, many thanks for your detailed comments!
>
> A few follow-up questions are inlined below.
>
>> On Apr 2, 2019, at 5:43 AM, Peter Zijlstra wrote:
>>
>> On Mon, Apr 01, 2019 at 10:36:19AM -0400, Waiman Long wro
On 04/03/2019 09:12 AM, Peter Zijlstra wrote:
> On Thu, Feb 28, 2019 at 02:09:41PM -0500, Waiman Long wrote:
>> For an uncontended rwsem, count and owner are the only fields a task
>> needs to touch when acquiring the rwsem. So they are put next to each
>> other to increase
On 04/03/2019 09:09 AM, Peter Zijlstra wrote:
> On Thu, Feb 28, 2019 at 02:09:36PM -0500, Waiman Long wrote:
>> diff --git a/kernel/locking/rwsem.h b/kernel/locking/rwsem.h
>> index 1d8f722..c8fd3f1 100644
>> --- a/kernel/locking/rwsem.h
>> +++ b/kernel/locking
On 04/02/2019 05:18 PM, Johannes Weiner wrote:
> On Tue, Apr 02, 2019 at 03:38:10PM -0400, Waiman Long wrote:
>> The output of the PSI files show a bunch of numbers with no unit.
>> The psi.txt documentation file also does not indicate what units
>> are used. One can only f
Commit-ID: ddb20d1d3aed8f130519c0a29cd5392efcc067b8
Gitweb: https://git.kernel.org/tip/ddb20d1d3aed8f130519c0a29cd5392efcc067b8
Author: Waiman Long
AuthorDate: Fri, 22 Mar 2019 10:30:08 -0400
Committer: Ingo Molnar
CommitDate: Wed, 3 Apr 2019 14:50:52 +0200
locking/rwsem: Optimize
Commit-ID: 390a0c62c23cb026cd4664a66f6f45fed3a215f6
Gitweb: https://git.kernel.org/tip/390a0c62c23cb026cd4664a66f6f45fed3a215f6
Author: Waiman Long
AuthorDate: Fri, 22 Mar 2019 10:30:07 -0400
Committer: Ingo Molnar
CommitDate: Wed, 3 Apr 2019 14:50:52 +0200
locking/rwsem: Remove rwsem
Commit-ID: 46ad0840b1584b92b5ff2cc3ed0b011dd6b8e0f1
Gitweb: https://git.kernel.org/tip/46ad0840b1584b92b5ff2cc3ed0b011dd6b8e0f1
Author: Waiman Long
AuthorDate: Fri, 22 Mar 2019 10:30:06 -0400
Committer: Ingo Molnar
CommitDate: Wed, 3 Apr 2019 14:50:50 +0200
locking/rwsem: Remove arch
Commit-ID: 0975e3df30eb5849284c01be66c2ec16d8a48114
Gitweb: https://git.kernel.org/tip/0975e3df30eb5849284c01be66c2ec16d8a48114
Author: Waiman Long
AuthorDate: Fri, 22 Mar 2019 10:30:08 -0400
Committer: Ingo Molnar
CommitDate: Wed, 3 Apr 2019 11:42:35 +0200
locking/rwsem: Optimize
Commit-ID: 701fd16f3b4e3e5f317a051b36962b8cc756c138
Gitweb: https://git.kernel.org/tip/701fd16f3b4e3e5f317a051b36962b8cc756c138
Author: Waiman Long
AuthorDate: Fri, 22 Mar 2019 10:30:06 -0400
Committer: Ingo Molnar
CommitDate: Wed, 3 Apr 2019 11:42:33 +0200
locking/rwsem: Remove arch
Commit-ID: 79407a77fe0ea11c0d38c5f4a3936bf35a994965
Gitweb: https://git.kernel.org/tip/79407a77fe0ea11c0d38c5f4a3936bf35a994965
Author: Waiman Long
AuthorDate: Fri, 22 Mar 2019 10:30:07 -0400
Committer: Ingo Molnar
CommitDate: Wed, 3 Apr 2019 11:42:34 +0200
locking/rwsem: Remove rwsem
On 04/02/2019 03:17 PM, Jan Harkes wrote:
> On Sun, Mar 31, 2019 at 03:13:47PM -0400, Jan Harkes wrote:
>> On Sun, Mar 31, 2019 at 02:14:13PM -0400, Waiman Long wrote:
>>> One possibility is that there is a previous reference to the memory
>>> currently occupied by
On 03/29/2019 11:20 AM, Alex Kogan wrote:
> In CNA, spinning threads are organized in two queues, a main queue for
> threads running on the same node as the current lock holder, and a
> secondary queue for threads running on other nodes. At the unlock time,
> the lock holder scans the main queue lo
On 04/01/2019 02:38 AM, Juergen Gross wrote:
> On 25/03/2019 19:03, Waiman Long wrote:
>> On 03/25/2019 12:40 PM, Juergen Gross wrote:
>>> On 25/03/2019 16:57, Waiman Long wrote:
>>>> It was found that passing an invalid cpu number to pv_vcpu_is_preempted()
>
On 03/31/2019 12:00 AM, Jan Harkes wrote:
> On Fri, Mar 29, 2019 at 05:53:22PM +0000, Waiman Long wrote:
>> On 03/29/2019 12:10 PM, Jan Harkes wrote:
>>> I knew I definitely had never seen this problem with the stable kernel
>>> on Ubuntu xenial (4.4) so I bisected be
On 03/29/2019 12:10 PM, Jan Harkes wrote:
> I was testing Coda on the 5.1-rc2 kernel and noticed that when I run a
> binary out of /coda, the binary would never exit and the system would
> detect a soft lockup. I narrowed it down to a very simple reproducible
> case of running a statically linked e
On 03/28/2019 04:56 PM, Linus Torvalds wrote:
> On Thu, Mar 28, 2019 at 1:47 PM Linus Torvalds
> wrote:
>> On Thu, Mar 28, 2019 at 11:12 AM Waiman Long wrote:
>>> With the merging of owner into count for x86-64, there is only 16 bits
>>> left for reader count. It is
On 03/28/2019 04:47 PM, Linus Torvalds wrote:
> On Thu, Mar 28, 2019 at 11:12 AM Waiman Long wrote:
>> With the merging of owner into count for x86-64, there is only 16 bits
>> left for reader count. It is theoretically possible for an application to
>> cause more than 64k
, the extra constant argument to
rwsem_try_write_lock() and rwsem_try_write_lock_unqueued() should be
optimized out by the compiler.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.c | 25 ++---
1 file changed, 14 insertions(+), 11 deletions(-)
diff --git a/kernel
16 1,727 1,918
32 1,263 1,956
64 889 1,343
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.c | 38 ++---
1 file changed, 31 insertions(+), 7 deletions(-)
diff --gi
Before combining owner and count, we are adding two new helpers for
accessing the owner value in the rwsem.
1) struct task_struct *rwsem_get_owner(struct rw_semaphore *sem)
2) bool is_rwsem_reader_owned(struct rw_semaphore *sem)
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.c | 15
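A minimal sketch of the two owner-access helpers named above, assuming a low-bit "reader owned" marker packed into the owner word. The field layout, the marker bit, and the struct shown are illustrative assumptions, not the kernel's actual definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define RWSEM_READER_OWNED 1UL  /* hypothetical low-bit marker */

struct task_struct;             /* opaque for this sketch */

struct rw_semaphore {
        long count;
        uintptr_t owner;        /* task pointer; low bit = reader-owned */
};

/* Return the owning task with any marker bits masked off. */
static struct task_struct *rwsem_get_owner(struct rw_semaphore *sem)
{
        return (struct task_struct *)(sem->owner & ~RWSEM_READER_OWNED);
}

/* True if the rwsem is currently marked as owned by readers. */
static bool is_rwsem_reader_owned(struct rw_semaphore *sem)
{
        return sem->owner & RWSEM_READER_OWNED;
}
```

Centralizing these accesses behind helpers is what lets a later patch merge owner information into the count word without touching every call site.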
maximum reader count to 32k.
A limit of 256 is also imposed on the number of readers that can be woken
up in one wakeup function call. This will eliminate the possibility of
waking up more than 64k readers and overflowing the count.
Signed-off-by: Waiman Long
---
kernel/locking/lock_events_list.h
The performance is roughly the same before and after the patch. There
are run-to-run variations in performance. Runs with higher variances
usually have higher throughput.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.c | 147
kernel/locking
rwsem_sleep_reader=308201
rwsem_sleep_writer=72281
So a lot more threads acquired the lock in the slowpath and more threads
went to sleep.
Signed-off-by: Waiman Long
---
kernel/locking/lock_events_list.h | 1 +
kernel/locking/rwsem-xadd.c | 62 ---
kernel/locking/rwsem.h
.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.c | 40 ++---
kernel/locking/rwsem.h | 5 +
2 files changed, 38 insertions(+), 7 deletions(-)
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 4f036bda9063..35891c53338b
wasn't significant in this case, but this change
is required by a follow-on patch.
Signed-off-by: Waiman Long
---
kernel/locking/lock_events_list.h | 1 +
kernel/locking/rwsem-xadd.c | 88 ++-
kernel/locking/rwsem.h| 3 ++
3 files change
write()")
will have to be reverted.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.c | 74 -
1 file changed, 74 deletions(-)
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 58b3a64e6f2c..4f036bda9063 100644
--- a/ker
32 2,388 530 3,717 359
64 1,424 322 4,060 401
128 1,642 510 4,488 628
It is obvious that RT tasks can benefit pretty significantly with this set
of patches.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xa
924
32 78 300
64 38 195
240 50 149
There is no performance gain at low contention level. At high contention
level, however, this patch gives a pretty decent performance boost.
Signed-off-by: Waiman Long
became much more fair, though there was a drop of about 26% in the
mean locking operations done, a trade-off for the better fairness.
Signed-off-by: Waiman Long
---
kernel/locking/lock_events_list.h | 2 +
kernel/locking/rwsem-xadd.c | 154 ++
kernel
--
2 1,179 9,436
4 1,505 8,268
8 721 7,041
16 575 7,652
32 70 2,189
64 39 534
Waiman Long (
On 03/22/2019 01:50 PM, Christopher Lameter wrote:
> On Fri, 22 Mar 2019, Waiman Long wrote:
>
>> I am looking forward to it.
> There is also already rcu being used in these paths. kfree_rcu() would not
> be enough? It is an established mechanism that is mature and well
> under
On 03/22/2019 07:16 AM, Oleg Nesterov wrote:
> On 03/21, Matthew Wilcox wrote:
>> On Thu, Mar 21, 2019 at 05:45:10PM -0400, Waiman Long wrote:
>>
>>> To avoid this dire condition and reduce lock hold time of tasklist_lock,
>>> flush_sigqueue() is modified to pass i
On 03/21/2019 06:00 PM, Peter Zijlstra wrote:
> On Thu, Mar 21, 2019 at 05:45:12PM -0400, Waiman Long wrote:
>> If the freeing queue has many objects, freeing all of them consecutively
>> may cause soft lockup especially on a debug kernel. So kmem_free_up_q()
>> is modified
Add a new free_uid_to_q() function to put the user structure on
freeing queue instead of freeing it directly. That new function is then
called from __sigqueue_free() with a free_q parameter.
Signed-off-by: Waiman Long
---
include/linux/sched/user.h | 3 +++
kernel/signal.c| 2
If the freeing queue has many objects, freeing all of them consecutively
may cause soft lockup especially on a debug kernel. So kmem_free_up_q()
is modified to call cond_resched() if running in the process context.
Signed-off-by: Waiman Long
---
mm/slab_common.c | 11 ++-
1 file changed
the actual freeing of memory objects can be deferred until after the
tasklist_lock is released and irq re-enabled.
Signed-off-by: Waiman Long
---
include/linux/signal.h | 4 +++-
kernel/exit.c| 12
kernel/signal.c | 27 ---
securi
, kmem_free_up_q() can be called to free all the memory
objects in the freeing queue after releasing the lock.
Signed-off-by: Waiman Long
---
include/linux/slab.h | 28
mm/slab_common.c | 41 +
2 files changed, 69 insertions
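The deferred-freeing idea described in this series can be sketched in userspace C. The struct layout and function names below are assumptions modeled on the patch descriptions (the kernel version would avoid allocating queue nodes and, per a later patch in the series, call cond_resched() periodically while draining a long queue in process context).

```c
#include <stdlib.h>

struct kmem_free_q_node {
        struct kmem_free_q_node *next;
        void *object;
};

struct kmem_free_q {
        struct kmem_free_q_node *head;
};

/* Queue an object for later freeing while a lock is held; nothing is
 * actually freed here, so this is safe with irqs disabled. */
static void kmem_free_q_add(struct kmem_free_q *q, void *object)
{
        struct kmem_free_q_node *n = malloc(sizeof(*n));

        if (!n)
                return;         /* sketch only; kernel would not allocate */
        n->object = object;
        n->next = q->head;
        q->head = n;
}

/* Drain the queue after the lock is released and irqs are re-enabled;
 * returns the number of objects freed. */
static int kmem_free_up_q(struct kmem_free_q *q)
{
        int freed = 0;

        while (q->head) {
                struct kmem_free_q_node *n = q->head;

                q->head = n->next;
                free(n->object);
                free(n);
                freed++;
        }
        return freed;
}
```

The point of the split is that all lock-hold-time cost is a pointer push; the expensive frees happen outside the critical section.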
g kernel.
Waiman Long (4):
mm: Implement kmem objects freeing queue
signal: Make flush_sigqueue() use free_q to release memory
signal: Add free_uid_to_q()
mm: Do periodic rescheduling when freeing objects in kmem_free_up_q()
include/linux/sched/user.h | 3 +++
include/linux/signal.h
On 03/18/2019 04:44 AM, Zhenzhong Duan wrote:
>
> On 2019/3/15 22:17, Waiman Long wrote:
>> On 03/15/2019 05:25 AM, Peter Zijlstra wrote:
>>> On Thu, Mar 14, 2019 at 04:42:12PM +0800, Zhenzhong Duan wrote:
>>>> This reverts commit f99fd22e4d4bc84880a8a31
On 03/15/2019 05:25 AM, Peter Zijlstra wrote:
> On Thu, Mar 14, 2019 at 04:42:12PM +0800, Zhenzhong Duan wrote:
>> This reverts commit f99fd22e4d4bc84880a8a3117311bbf0e3a6a9dc.
>>
>> It's unnecessary after commit "acpi_pm: Fix bootup softlockup due to PMTMR
>> counter read contention", the simple H
concurrent ipc_obtain_object_check()
will not incorrectly match a deleted IPC id to a new one.
Reported-by: Manfred Spraul
Signed-off-by: Waiman Long
---
ipc/util.c | 25 ++---
1 file changed, 22 insertions(+), 3 deletions(-)
diff --git a/ipc/util.c b/ipc/util.c
index 78
For an uncontended rwsem, count and owner are the only fields a task
needs to touch when acquiring the rwsem. So they are put next to each
other to increase the chance that they will share the same cacheline.
Suggested-by: Linus Torvalds
Signed-off-by: Waiman Long
---
include/linux/rwsem.h
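The field reordering described above can be sketched as follows. This is a simplified stand-in for the kernel's struct rw_semaphore: the contended-path fields are placeholders, and the adjacency of count and owner is the only property being illustrated.

```c
#include <stdatomic.h>
#include <stddef.h>

struct rw_semaphore {
        atomic_long count;      /* touched on every acquire/release */
        atomic_long owner;      /* written right after a successful acquire */
        /* Fields below are only touched in the contended slow path, so
         * keeping them off the hot cache line is harmless. */
        int wait_lock;          /* stand-in for raw_spinlock_t */
        void *wait_list;        /* stand-in for the waiter list head */
};
```

With count and owner adjacent, an uncontended lock/unlock pair typically touches a single cache line instead of two.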
f the rwsem count and owner fields to give more information
about what is wrong with the rwsem.
Signed-off-by: Waiman Long
Acked-by: Davidlohr Bueso
---
kernel/locking/rwsem.c | 3 ++-
kernel/locking/rwsem.h | 19 ---
2 files changed, 14 insertions(+), 8 deletions(-)
diff --
On bare metal, the pvqspinlock event counts will always be 0. So there
is no point in showing their corresponding debugfs files, and they are
skipped in this case.
Signed-off-by: Waiman Long
Acked-by: Davidlohr Bueso
---
kernel/locking/lock_events.c | 28 +++-
1 file
slowpath were
write-locks in the optimistic spinning code path with no sleeping at
all. For this system, over 97% of the locks are acquired via optimistic
spinning. It illustrates the importance of optimistic spinning in
improving the performance of rwsem.
Signed-off-by: Waiman Long
Acked-by: Davidlohr
() calls are replaced by either lockevent_inc() or
lockevent_cond_inc() calls.
The qstat_hop() call is renamed to lockevent_pv_hop(). The "reset_counters"
debugfs file is also renamed to ".reset_counts".
Signed-off-by: Waiman Long
Acked-by: Davidlohr Bueso
---
kernel/locking/lock_e
The atomic_long_cmpxchg_acquire() in rwsem_try_read_lock_unqueued() is
replaced by atomic_long_try_cmpxchg_acquire() to simplify the code and
generate slightly better assembly code. There is no functional change.
Signed-off-by: Waiman Long
Acked-by: Will Deacon
Acked-by: Davidlohr Bueso
The rwsem_down_read_failed*() functions were relocated from above the
optimistic spinning section to below that section. This enables the
reader functions to use optimistic spinning in future patches. There
is no code change.
Signed-off-by: Waiman Long
Acked-by: Will Deacon
Acked-by: Davidlohr
() are also moved over to rwsem-xadd.h.
Signed-off-by: Waiman Long
Acked-by: Davidlohr Bueso
---
kernel/locking/rwsem.c | 3 ---
kernel/locking/rwsem.h | 12 ++--
2 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 59e5848
directory.
Signed-off-by: Waiman Long
Acked-by: Davidlohr Bueso
---
arch/Kconfig| 10 +++
arch/x86/Kconfig| 8 ---
kernel/locking/Makefile | 1 +
kernel/locking/lock_events.c| 153
kernel/locking
We don't need to expose rwsem internal functions which are not supposed
to be called directly from other kernel code.
Signed-off-by: Waiman Long
Acked-by: Will Deacon
Acked-by: Davidlohr Bueso
---
include/linux/rwsem.h | 7 ---
kernel/locking/rwsem.h | 7 +++
2 files chang
frequently a code path is being
used as well as spotting abnormal behavior due to bugs in the code
without noticeably affecting kernel performance and hence behavior.
Both (2) and (3) are useful debugging aids.
Waiman Long (11):
locking/rwsem: Relocate rwsem_down_read_failed()
locking/
rwsem_down_read_failed() returns, for instance.
Signed-off-by: Waiman Long
Acked-by: Davidlohr Bueso
---
kernel/locking/rwsem-xadd.c | 6 +++---
kernel/locking/rwsem.c | 19 ++-
kernel/locking/rwsem.h | 17 +++--
3 files changed, 20 insertions(+), 22 deletions
don't need more than 32k
IPC identifiers.
Signed-off-by: Waiman Long
---
Documentation/admin-guide/kernel-parameters.txt | 5 -
ipc/ipc_sysctl.c| 2 ++
ipc/util.c | 7 ++-
ipc/util.h
ce numbers from 64k down to 128. So it is a trade-off.
The computation of a new IPC id is not done in the performance critical
path. So a little bit of additional overhead shouldn't have any real
performance impact.
Signed-off-by: Waiman Long
Acked-by: Manfred Spraul
---
Documentation/adm
is being done irrespective of the ipcmni mode.
Suggested-by: Matthew Wilcox
Signed-off-by: Waiman Long
---
include/linux/ipc_namespace.h | 1 +
ipc/util.c| 12 +---
2 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/include/linux/ipc_namespace.h b/include
erhead.
The cyclical id allocation isn't done for non-ipcmni_extend mode as the
potential memory and performance overhead may be problematic on systems
with slow CPUs and little memory. Systems that run applications which need
more than 32k IPC identifiers can certainly afford the extra overhead.
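The cyclical id allocation mentioned above can be sketched with a tiny bitmap allocator. This is a hedged model of the idea only: the kernel uses idr_alloc_cyclic() over an IDR, not an array scan, and the table size and function name here are illustrative.

```c
#include <stdbool.h>

#define IPCMNI 8        /* tiny table for illustration */

struct ipc_ids {
        bool in_use[IPCMNI];
        int next;       /* cursor: where the next search starts */
};

/* Allocate the next free slot starting from the cursor rather than
 * from 0, so a just-deleted id is not immediately handed out again
 * and stale ids are more likely to be detected as invalid. */
static int ipc_idr_alloc_cyclic(struct ipc_ids *ids)
{
        for (int i = 0; i < IPCMNI; i++) {
                int idx = (ids->next + i) % IPCMNI;

                if (!ids->in_use[idx]) {
                        ids->in_use[idx] = true;
                        ids->next = idx + 1;
                        return idx;
                }
        }
        return -1;      /* table full */
}
```

A lowest-free-slot allocator would return 0 again right after id 0 is freed; the cyclic cursor instead moves on, which is the behavior the series wants for ipcmni_extend mode.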
Commit-ID: 733000c7ffd9d9c8c4fdfd82f0d41956c8cf0537
Gitweb: https://git.kernel.org/tip/733000c7ffd9d9c8c4fdfd82f0d41956c8cf0537
Author: Waiman Long
AuthorDate: Sun, 24 Feb 2019 20:14:13 -0500
Committer: Ingo Molnar
CommitDate: Thu, 28 Feb 2019 07:55:38 +0100
locking/qspinlock: Remove
On 02/27/2019 08:18 PM, Huang, Ying wrote:
> Waiman Long writes:
>
>> On 02/26/2019 12:30 PM, Linus Torvalds wrote:
>>> On Tue, Feb 26, 2019 at 12:17 AM Huang, Ying wrote:
>>>> As for fixing. Should we care about the cache line alignment of struct
>>&g
On 02/26/2019 12:30 PM, Linus Torvalds wrote:
> On Tue, Feb 26, 2019 at 12:17 AM Huang, Ying wrote:
>> As for fixing. Should we care about the cache line alignment of struct
>> inode? Or its size is considered more important because there may be a
>> huge number of struct inode in the system?
>
With the > 4 nesting levels case handled by commit d682b596d993
("locking/qspinlock: Handle > 4 slowpath nesting levels"), the BUG_ON()
call in encode_tail() will never be triggered. Remove it.
Signed-off-by: Waiman Long
---
kernel/locking/qspinlock.c | 3 ---
1 file chan
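The encode_tail() simplification described above can be modeled in userspace. The bit offsets follow the qspinlock tail layout (2 idx bits above the 16-bit locked+pending word), but treat this as an illustrative sketch rather than the kernel's exact code.

```c
#include <stdint.h>

#define _Q_TAIL_IDX_OFFSET 16
#define _Q_TAIL_IDX_BITS   2
#define _Q_TAIL_CPU_OFFSET (_Q_TAIL_IDX_OFFSET + _Q_TAIL_IDX_BITS)

/* Pack (cpu, per-cpu queue-node index) into the lock word's tail.
 * cpu is stored +1 so that tail == 0 means "no tail". With more than
 * 4 nesting levels diverted to a fallback path earlier, idx can never
 * exceed 3 here, so no BUG_ON(idx > 3) check is needed. */
static uint32_t encode_tail(int cpu, int idx)
{
        uint32_t tail;

        tail  = (uint32_t)(cpu + 1) << _Q_TAIL_CPU_OFFSET;
        tail |= (uint32_t)idx << _Q_TAIL_IDX_OFFSET;
        return tail;
}
```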
On 02/21/2019 09:15 AM, Will Deacon wrote:
> Hi Waiman,
>
> On Fri, Feb 15, 2019 at 03:50:02PM -0500, Waiman Long wrote:
>> Moves all the owner setting code closer to the rwsem-xadd fast paths
>> directly within rwsem.h file.
>>
>> For __down_read() and __down_re
On 02/21/2019 09:14 AM, Will Deacon wrote:
> On Wed, Feb 13, 2019 at 05:00:17PM -0500, Waiman Long wrote:
>> Modify __down_read_trylock() to optimize for an unlocked rwsem and make
>> it generate slightly better code.
>>
>> Before this patch, down_read_trylock:
>
On 02/15/2019 01:40 PM, Will Deacon wrote:
> On Thu, Feb 14, 2019 at 11:37:15AM +0100, Peter Zijlstra wrote:
>> On Wed, Feb 13, 2019 at 05:00:14PM -0500, Waiman Long wrote:
>>> v4:
>>> - Remove rwsem-spinlock.c and make all archs use rwsem-xadd.c.
>>>
>>
The rwsem_down_read_failed*() functions were relocated from above the
optimistic spinning section to below that section. This enables the
reader functions to use optimistic spinning in future patches. There
is no code change.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.c | 172
f the rwsem count and owner fields to give more information
about what is wrong with the rwsem.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem.c | 3 ++-
kernel/locking/rwsem.h | 19 ---
2 files changed, 14 insertions(+), 8 deletions(-)
diff --git a/kernel/locking/rwsem.c b/k
() are also moved over to rwsem-xadd.h.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem.c | 3 ---
kernel/locking/rwsem.h | 12 ++--
2 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 59e5848..90de5f1 100644
--- a/kernel
The atomic_long_cmpxchg_acquire() in rwsem_try_read_lock_unqueued() is
replaced by atomic_long_try_cmpxchg_acquire() to simplify the code and
generate slightly better assembly code. There is no functional change.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.c | 15 +--
1
queue
just becomes empty. So a rwsem_set_reader_owned() call is added for
this case. The __rwsem_set_reader_owned() call in __rwsem_mark_wake()
is now necessary.
Signed-off-by: Waiman Long
---
kernel/locking/rwsem-xadd.c | 6 +++---
kernel/locking/rwsem.c | 19 ++-
kernel
() calls are replaced by either lockevent_inc() or
lockevent_cond_inc() calls.
The qstat_hop() call is renamed to lockevent_pv_hop(). The "reset_counters"
debugfs file is also renamed to ".reset_counts".
Signed-off-by: Waiman Long
---
kernel/locking/lock_events.h| 55
directory.
Signed-off-by: Waiman Long
---
arch/Kconfig| 10 +++
arch/x86/Kconfig| 8 ---
kernel/locking/Makefile | 1 +
kernel/locking/lock_events.c| 153
kernel/locking/lock_events.h| 10 ++-
kernel
On bare metal, the pvqspinlock event counts will always be 0. So there
is no point in showing their corresponding debugfs files, and they are
skipped in this case.
Signed-off-by: Waiman Long
---
kernel/locking/lock_events.c | 28 +++-
1 file changed, 27 insertions(+), 1
e and hence behavior.
Both (2) and (3) are useful debugging aids.
Waiman Long (10):
locking/rwsem: Relocate rwsem_down_read_failed()
locking/rwsem: Move owner setting code from rwsem.c to rwsem.h
locking/rwsem: Move rwsem internal function declarations to
rwsem-xadd.h
locking/r
slowpath were
write-locks in the optimistic spinning code path with no sleeping at
all. For this system, over 97% of the locks are acquired via optimistic
spinning. It illustrates the importance of optimistic spinning in
improving the performance of rwsem.
Signed-off-by: Waiman Long
---
arch/Kconfig
We don't need to expose rwsem internal functions which are not supposed
to be called directly from other kernel code.
Signed-off-by: Waiman Long
---
include/linux/rwsem.h | 7 ---
kernel/locking/rwsem.h | 7 +++
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/in
On 02/15/2019 01:49 PM, Will Deacon wrote:
> On Tue, Feb 12, 2019 at 07:26:57PM -0500, Waiman Long wrote:
>> This is part 1 of a 3-part (0/1/2) series to rearchitect the internal
>> operation of rwsem. This depends on the part 0 patches sent out previously
>>
>> h
On 02/14/2019 05:37 AM, Peter Zijlstra wrote:
> On Wed, Feb 13, 2019 at 05:00:14PM -0500, Waiman Long wrote:
>> v4:
>> - Remove rwsem-spinlock.c and make all archs use rwsem-xadd.c.
>>
>> v3:
>> - Optimize __down_read_trylock() for the uncontended case as s
On 02/14/2019 01:02 PM, Will Deacon wrote:
> On Thu, Feb 14, 2019 at 11:33:33AM +0100, Peter Zijlstra wrote:
>> On Wed, Feb 13, 2019 at 03:32:12PM -0500, Waiman Long wrote:
>>> Modify __down_read_trylock() to optimize for an unlocked rwsem and make
>>> it ge
On 02/14/2019 12:04 PM, Christoph Hellwig wrote:
> On Thu, Feb 14, 2019 at 10:26:52AM -0500, Waiman Long wrote:
>> Would you mind dropping just patch 3 from your series?
> Sure, we can just drop this patch.
Thanks,
Longman