[PATCH] lib/plist: rename DEBUG_PI_LIST to DEBUG_PLIST

2019-03-17 Thread Davidlohr Bueso
This is a lot more appropriate than PI_LIST, which in the kernel one would assume has to do with priority inheritance, which it does not. Furthermore, futexes make use of plists, so this can be even more confusing, despite the debug nature of the config option. Signed-off-by: Davidlohr Bueso

Re: [LSF/MM TOPIC] Using XArray to manage the VMA

2019-03-14 Thread Davidlohr Bueso
On Wed, 13 Mar 2019, Matthew Wilcox wrote: It's probably worth listing the advantages of the Maple Tree over the rbtree. I'm not familiar with maple trees, are they referred to by another name? (is this some sort of B-tree?). Google just shows me real trees. - Shallower tree. A 1000-entry

Re: [LSF/MM TOPIC] Using XArray to manage the VMA

2019-03-13 Thread Davidlohr Bueso
On Wed, 13 Mar 2019, Laurent Dufour wrote: If this is not too late and if there is still place available, I would like to attend the MM track and propose a topic about using the XArray to replace the VMA's RB tree and list. Using the XArray in place of the VMA's tree and list seems to be a

Re: [PATCH][v2] ipc: prevent lockup on alloc_msg and free_msg

2019-03-12 Thread Davidlohr Bueso
er releasing the info->lock the thing is freed anyway so it should not change things. Feel free to add my: Reviewed-by: Davidlohr Bueso + list_for_each_entry_safe(msg, nmsg, &tmp_msg, m_list) { + list_del(&msg->m_list); + free_msg(msg); + } + /

Re: [PATCH] mm/debug: add a cast to u64 for atomic64_read()

2019-03-10 Thread Davidlohr Bueso
"pinned_vm %llx data_vm %lx exec_vm %lx stack_vm %lx\n" ~~~^ %lx Fixes: 70f8a3ca68d3 ("mm: make mm->pinned_vm an atomic64 counter") Signed-off-by: Qian Cai Acked-by: Davidlohr Bueso
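
The warning quoted in this message comes from printing a 64-bit counter with a format specifier that does not match its underlying type on every architecture. A minimal user-space sketch of the same kind of fix follows. The helper format_pinned() is hypothetical, and int64_t stands in for the value returned by the kernel's atomic64_read():

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, not kernel code. int64_t may be 'long' or
 * 'long long' depending on the platform, so passing it straight to
 * %llx can trigger a -Wformat warning. The explicit cast makes the
 * argument type match the specifier everywhere.
 */
static int format_pinned(char *buf, size_t len, int64_t pinned_vm)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "pinned_vm %llx",
			(unsigned long long)pinned_vm);
}
```

The cast costs nothing at run time; it only pins down the argument type that the variadic call sees.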

Re: [PATCH] tools/perf-bench: Add basic syscall benchmark

2019-03-08 Thread Davidlohr Bueso
off-by: Davidlohr Bueso --- tools/perf/Documentation/perf-bench.txt | 11 + tools/perf/bench/Build | 1 + tools/perf/bench/bench.h| 1 + tools/perf/bench/syscall.c | 78 + tools/perf/builtin-bench.c | 8 +++

[PATCH] tools/perf-bench: Add basic syscall benchmark

2019-03-07 Thread Davidlohr Bueso
throughput compatible with 'perf-bench' via getppid(2), yet without any of the additional template stuff from Ingo's version (based on numa.c). The code is identical to what mmtests uses. https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1067469.html Signed-off-by: Davidlohr Bueso --- tools

Re: [PATCH-tip v2 00/10] locking/rwsem: Rwsem rearchitecture part 1

2019-02-26 Thread Davidlohr Bueso
y affecting kernel performance and hence behavior. Both (2) and (3) are useful debugging aids. Yes, this will come in handy in the future. Feel free to add my: Acked-by: Davidlohr Bueso Thanks.

Re: [PATCH-tip 00/22] locking/rwsem: Rework rwsem-xadd & enable new rwsem features

2019-02-14 Thread Davidlohr Bueso
On Fri, 08 Feb 2019, Waiman Long wrote: I am planning to run more performance tests and post the data sometime next week. Davidlohr is also going to run some of his rwsem performance tests on this patchset. So I ran this series on a 40-core IB 2 socket with various workloads in mmtests. Below

Re: [PATCH 0/3] Add gup fast + longterm and use it in HFI1

2019-02-11 Thread Davidlohr Bueso
On Mon, 11 Feb 2019, ira.we...@intel.com wrote: Ira Weiny (3): mm/gup: Change "write" parameter to flags mm/gup: Introduce get_user_pages_fast_longterm() IB/HFI1: Use new get_user_pages_fast_longterm() Out of curiosity, are you planning on having all rdma drivers use

[PATCH v2] xsk: share the mmap_sem for page pinning

2019-02-11 Thread Davidlohr Bueso
Holding mmap_sem exclusively for a gup() is overkill. Let's share the lock and replace the gup call with gup_longterm(), as it is better suited for the lifetime of the pinning. Cc: David S. Miller Cc: Bjorn Topel Cc: Magnus Karlsson CC: net...@vger.kernel.org Signed-off-by: Davidlohr Bueso

Re: [patch V2 0/2] genirq, proc: Speedup /proc/stat interrupt statistics

2019-02-08 Thread Davidlohr Bueso
| 29 ++--- include/linux/irqdesc.h |1 + kernel/irq/chip.c | 12 ++-- kernel/irq/internals.h |8 +++- kernel/irq/irqdesc.c|7 ++- 5 files changed, 50 insertions(+), 7 deletions(-) Reviewed-by: Davidlohr Bueso

[tip:locking/urgent] futex: Fix barrier comment

2019-02-08 Thread tip-bot for Davidlohr Bueso
Commit-ID: 6f568ebe2afefdc33a6fb06ef20a94f8b96455f1 Gitweb: https://git.kernel.org/tip/6f568ebe2afefdc33a6fb06ef20a94f8b96455f1 Author: Davidlohr Bueso AuthorDate: Wed, 6 Feb 2019 10:56:02 -0800 Committer: Thomas Gleixner CommitDate: Fri, 8 Feb 2019 13:00:35 +0100 futex: Fix barrier

Re: [PATCH-tip 00/22] locking/rwsem: Rework rwsem-xadd & enable new rwsem features

2019-02-07 Thread Davidlohr Bueso
On Thu, 07 Feb 2019, Waiman Long wrote: 30 files changed, 1197 insertions(+), 1594 deletions(-) Performance numbers on numerous workloads, pretty please. I'll go and throw this at my mmap_sem intensive workloads I've collected. Thanks, Davidlohr

Re: [tip:locking/core] sched/wake_q: Reduce reference counting for special users

2019-02-07 Thread Davidlohr Bueso
Could this change be pushed to v5.0 (tip/urgent) just like the wake_q fixes that are already in Linus' tree? This will help backporting efforts as most distros will want to avoid the performance hit and include this patch. Thanks, Davidlohr On Mon, 04 Feb 2019, tip-bot for Davidlohr Bueso wrote

Re: [PATCH 2/2] MIPS/c-r4k: do no use mmap_sem for gup_fast()

2019-02-07 Thread Davidlohr Bueso
On Thu, 07 Feb 2019, Paul Burton wrote: Hi Davidlohr, On Wed, Feb 06, 2019 at 09:37:40PM -0800, Davidlohr Bueso wrote: It is well known that because the mm can internally call the regular gup_unlocked if the lockless approach fails and take the sem there, the caller must not hold the mmap_sem

Re: [PATCH -tip 0/2] more get_user_pages mmap_sem cleanups

2019-02-06 Thread Davidlohr Bueso
Unlike what the subject says, this is not against -tip, it applies on today's -next. On Wed, 06 Feb 2019, Davidlohr Bueso wrote: Hi, Here are two more patchlets that clean up mmap_sem and gup abusers. The second is also a fixlet. Compile-tested only. Please consider for v5.1 Thanks

[PATCH -tip 0/2] more get_user_pages mmap_sem cleanups

2019-02-06 Thread Davidlohr Bueso
Hi, Here are two more patchlets that clean up mmap_sem and gup abusers. The second is also a fixlet. Compile-tested only. Please consider for v5.1 Thanks! Davidlohr Bueso (2): xsk: do not use mmap_sem MIPS/c-r4k: do no use mmap_sem for gup_fast() arch/mips/mm/c-r4k.c | 6 +- net/xdp

[PATCH 1/2] xsk: do not use mmap_sem

2019-02-06 Thread Davidlohr Bueso
Holding mmap_sem exclusively for a gup() is overkill. Let's replace the call with gup_fast() and let the mm take it if necessary. Cc: David S. Miller Cc: Bjorn Topel Cc: Magnus Karlsson CC: net...@vger.kernel.org Signed-off-by: Davidlohr Bueso --- net/xdp/xdp_umem.c | 6 ++ 1 file

[PATCH 2/2] MIPS/c-r4k: do no use mmap_sem for gup_fast()

2019-02-06 Thread Davidlohr Bueso
Hogan Cc: linux-m...@vger.kernel.org Signed-off-by: Davidlohr Bueso --- arch/mips/mm/c-r4k.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/arch/mips/mm/c-r4k.c b/arch/mips/mm/c-r4k.c index cc4e17caeb26..38fe86928837 100644 --- a/arch/mips/mm/c-r4k.c +++ b/arch/mips/mm

[PATCH 7/6] Documentation/infiniband: update from locked to pinned_vm

2019-02-06 Thread Davidlohr Bueso
We are really talking about pinned_vm here. Signed-off-by: Davidlohr Bueso --- Documentation/infiniband/user_verbs.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/infiniband/user_verbs.txt b/Documentation/infiniband/user_verbs.txt index df049b9f5b6e

[PATCH] kernel/futex: Fix barrier comment

2019-02-06 Thread Davidlohr Bueso
The current comment for the barrier that guarantees that waiter increment is always before taking the hb spinlock (barrier (A)) needs to be fixed. We are obviously referring to hb_waiters_inc, which is a full barrier. Reported-by: Peter Zijlstra Signed-off-by: Davidlohr Bueso --- kernel

[PATCH 1/6] mm: make mm->pinned_vm an atomic64 counter

2019-02-06 Thread Davidlohr Bueso
-by: Christoph Lameter Reviewed-by: Daniel Jordan Reviewed-by: Jan Kara Signed-off-by: Davidlohr Bueso --- drivers/infiniband/core/umem.c | 12 ++-- drivers/infiniband/hw/hfi1/user_pages.c| 6 +++--- drivers/infiniband/hw/qib/qib_user_pages.c | 4 ++-- drivers/infiniband/hw

[PATCH 5/6] drivers/IB,usnic: reduce scope of mmap_sem

2019-02-06 Thread Davidlohr Bueso
atomic. We also share the lock. Cc: be...@cisco.com Cc: neesc...@cisco.com Acked-by: Parvi Kaustubhi Reviewed-by: Ira Weiny Signed-off-by: Davidlohr Bueso --- drivers/infiniband/hw/usnic/usnic_ib_main.c | 2 - drivers/infiniband/hw/usnic/usnic_uiom.c| 58

[PATCH 6/6] drivers/IB,core: reduce scope of mmap_sem

2019-02-06 Thread Davidlohr Bueso
ib_umem_get() uses gup_longterm() and relies on the lock to stabilize the vma_list, so we cannot really get rid of mmap_sem altogether, but now that the counter is atomic, we can get rid of some complexity that mmap_sem brings with only pinned_vm. Reviewed-by: Ira Weiny Signed-off-by: Davidlohr Bueso

[PATCH 2/6] drivers/mic/scif: do not use mmap_sem

2019-02-06 Thread Davidlohr Bueso
-off-by: Davidlohr Bueso --- drivers/misc/mic/scif/scif_rma.c | 36 +++- 1 file changed, 11 insertions(+), 25 deletions(-) diff --git a/drivers/misc/mic/scif/scif_rma.c b/drivers/misc/mic/scif/scif_rma.c index 2448368f181e..263b8ad507ea 100644 --- a/drivers/misc

[PATCH v3 0/6] mm: make pinned_vm atomic and simplify users

2019-02-06 Thread Davidlohr Bueso
//lkml.org/lkml/2018/11/5/854 Davidlohr Bueso (6): mm: make mm->pinned_vm an atomic64 counter drivers/mic/scif: do not use mmap_sem drivers/IB,qib: optimize mmap_sem usage drivers/IB,hfi1: do not se mmap_sem drivers/IB,usnic: reduce scope of mmap_sem drivers/IB,core: reduce scop

[PATCH 4/6] drivers/IB,hfi1: do not se mmap_sem

2019-02-06 Thread Davidlohr Bueso
hed. Reviewed-by: Ira Weiny Signed-off-by: Davidlohr Bueso --- drivers/infiniband/hw/hfi1/user_pages.c | 6 -- 1 file changed, 6 deletions(-) diff --git a/drivers/infiniband/hw/hfi1/user_pages.c b/drivers/infiniband/hw/hfi1/user_pages.c index 40a6e434190f..24b592c6522e 100644 --- a/driv

[PATCH 3/6] drivers/IB,qib: optimize mmap_sem usage

2019-02-06 Thread Davidlohr Bueso
can therefore be converted to reader. This also fixes a bug that __qib_get_user_pages was not taking into account the current value of pinned_vm. Cc: dennis.dalessan...@intel.com Cc: mike.marcinis...@intel.com Reviewed-by: Ira Weiny Signed-off-by: Davidlohr Bueso --- drivers/infiniband/hw/qib

[tip:locking/core] sched/wake_q: Reduce reference counting for special users

2019-02-04 Thread tip-bot for Davidlohr Bueso
Commit-ID: 07879c6a3740fbbf3c8891a0ab484c20a12794d8 Gitweb: https://git.kernel.org/tip/07879c6a3740fbbf3c8891a0ab484c20a12794d8 Author: Davidlohr Bueso AuthorDate: Tue, 18 Dec 2018 11:53:52 -0800 Committer: Ingo Molnar CommitDate: Mon, 4 Feb 2019 09:03:28 +0100 sched/wake_q: Reduce

Re: [PATCH 3/6] drivers/IB,qib: do not use mmap_sem

2019-01-29 Thread Davidlohr Bueso
On Mon, 28 Jan 2019, Jason Gunthorpe wrote: .. and I'm looking at some of the other conversions here.. *most likely* any caller that is manipulating rlimit for get_user_pages should really be calling get_user_pages_longterm, so they should not be converted to use _fast? Yeah this was

[tip:perf/core] perf sched: Use cached rbtrees

2019-01-26 Thread tip-bot for Davidlohr Bueso
Commit-ID: cb4c13a5137766c3666ae106e1a5549316992379 Gitweb: https://git.kernel.org/tip/cb4c13a5137766c3666ae106e1a5549316992379 Author: Davidlohr Bueso AuthorDate: Thu, 6 Dec 2018 11:18:19 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Fri, 25 Jan 2019 15:12:10 +0100 perf sched

[tip:perf/core] perf hist: Use cached rbtrees

2019-01-26 Thread tip-bot for Davidlohr Bueso
Commit-ID: 2eb3d6894ae3b9cc8a94c91458a041c45773f23d Gitweb: https://git.kernel.org/tip/2eb3d6894ae3b9cc8a94c91458a041c45773f23d Author: Davidlohr Bueso AuthorDate: Thu, 6 Dec 2018 11:18:18 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Fri, 25 Jan 2019 15:12:10 +0100 perf hist

[tip:perf/core] perf util: Use cached rbtree for rblists

2019-01-26 Thread tip-bot for Davidlohr Bueso
Commit-ID: ca2270292e6c3415102242bf9dc3d05f622b7b28 Gitweb: https://git.kernel.org/tip/ca2270292e6c3415102242bf9dc3d05f622b7b28 Author: Davidlohr Bueso AuthorDate: Thu, 6 Dec 2018 11:18:16 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Fri, 25 Jan 2019 15:12:10 +0100 perf util

[tip:perf/core] perf symbols: Use cached rbtrees

2019-01-26 Thread tip-bot for Davidlohr Bueso
Commit-ID: 7137ff50b68a48bc28270c91b1c313259ab0c1c4 Gitweb: https://git.kernel.org/tip/7137ff50b68a48bc28270c91b1c313259ab0c1c4 Author: Davidlohr Bueso AuthorDate: Thu, 6 Dec 2018 11:18:17 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Fri, 25 Jan 2019 15:12:10 +0100 perf

[tip:perf/core] perf callchain: Use cached rbtrees

2019-01-26 Thread tip-bot for Davidlohr Bueso
Commit-ID: 55ecd6310f9fe48cf7e435be408862da1e0e6baa Gitweb: https://git.kernel.org/tip/55ecd6310f9fe48cf7e435be408862da1e0e6baa Author: Davidlohr Bueso AuthorDate: Thu, 6 Dec 2018 11:18:15 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Fri, 25 Jan 2019 15:12:09 +0100 perf

[tip:perf/core] perf machine: Use cached rbtrees

2019-01-26 Thread tip-bot for Davidlohr Bueso
Commit-ID: f3acb3a8a2081344801974ac5ec8e1b0d6f0ef36 Gitweb: https://git.kernel.org/tip/f3acb3a8a2081344801974ac5ec8e1b0d6f0ef36 Author: Davidlohr Bueso AuthorDate: Thu, 6 Dec 2018 11:18:14 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Fri, 25 Jan 2019 15:12:09 +0100 perf

[tip:perf/core] tools: Update rbtree implementation

2019-01-26 Thread tip-bot for Davidlohr Bueso
Commit-ID: 3aef2cad5d51ee66d2a614dd2f70cb34c74caf77 Gitweb: https://git.kernel.org/tip/3aef2cad5d51ee66d2a614dd2f70cb34c74caf77 Author: Davidlohr Bueso AuthorDate: Thu, 6 Dec 2018 11:18:13 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Fri, 25 Jan 2019 15:12:09 +0100 tools

Re: [PATCH 6/7] perf hist: Use cached rbtrees

2019-01-22 Thread Davidlohr Bueso
On Tue, 22 Jan 2019, Arnaldo Carvalho de Melo wrote: Em Thu, Dec 06, 2018 at 11:18:18AM -0800, Davidlohr Bueso escreveu: At the cost of an extra pointer, we can avoid the O(logN) cost of finding the first element in the tree (smallest node), which is something heavily required for histograms

Re: [PATCH 6/6] drivers/IB,core: reduce scope of mmap_sem

2019-01-21 Thread Davidlohr Bueso
On Mon, 21 Jan 2019, Jason Gunthorpe wrote: On Mon, Jan 21, 2019 at 09:42:20AM -0800, Davidlohr Bueso wrote: ib_umem_get() uses gup_longterm() and relies on the lock to stabilize the vma_list, so we cannot really get rid of mmap_sem altogether, but now that the counter is atomic, we can get

[PATCH 6/6] drivers/IB,core: reduce scope of mmap_sem

2019-01-21 Thread Davidlohr Bueso
ib_umem_get() uses gup_longterm() and relies on the lock to stabilize the vma_list, so we cannot really get rid of mmap_sem altogether, but now that the counter is atomic, we can get rid of some complexity that mmap_sem brings with only pinned_vm. Reviewed-by: Ira Weiny Signed-off-by: Davidlohr Bueso

[PATCH 3/6] drivers/IB,qib: do not use mmap_sem

2019-01-21 Thread Davidlohr Bueso
that __qib_get_user_pages was not taking into account the current value of pinned_vm. Cc: dennis.dalessan...@intel.com Cc: mike.marcinis...@intel.com Reviewed-by: Ira Weiny Signed-off-by: Davidlohr Bueso --- drivers/infiniband/hw/qib/qib_user_pages.c | 67 ++ 1 file changed, 22 insertions

[PATCH 4/6] drivers/IB,hfi1: do not se mmap_sem

2019-01-21 Thread Davidlohr Bueso
hed. Cc: mike.marcinis...@intel.com Cc: dennis.dalessan...@intel.com Reviewed-by: Ira Weiny Signed-off-by: Davidlohr Bueso --- drivers/infiniband/hw/hfi1/user_pages.c | 6 -- 1 file changed, 6 deletions(-) diff --git a/drivers/infiniband/hw/hfi1/user_pages.c b/drivers/infiniband/hw/h

[PATCH 5/6] drivers/IB,usnic: reduce scope of mmap_sem

2019-01-21 Thread Davidlohr Bueso
atomic. Cc: be...@cisco.com Cc: neesc...@cisco.com Cc: pkaus...@cisco.com Reviewed-by: Ira Weiny Signed-off-by: Davidlohr Bueso --- drivers/infiniband/hw/usnic/usnic_ib_main.c | 2 -- drivers/infiniband/hw/usnic/usnic_uiom.c| 54 +++-- drivers/infiniband/hw/usnic

[PATCH 2/6] mic/scif: do not use mmap_sem

2019-01-21 Thread Davidlohr Bueso
-off-by: Davidlohr Bueso --- drivers/misc/mic/scif/scif_rma.c | 36 +++- 1 file changed, 11 insertions(+), 25 deletions(-) diff --git a/drivers/misc/mic/scif/scif_rma.c b/drivers/misc/mic/scif/scif_rma.c index 2448368f181e..263b8ad507ea 100644 --- a/drivers/misc

[PATCH 1/6] mm: make mm->pinned_vm an atomic64 counter

2019-01-21 Thread Davidlohr Bueso
not possible to acquire it. By making the counter atomic we no longer need to hold the mmap_sem and can simplify some code around it for pinned_vm users. The counter is 64-bit such that we need not worry about overflows such as rdma user input controlled from userspace. Signed-off-by: Davidlohr Bueso
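
The idea in this message, a counter that can be updated without holding a lock, can be illustrated in user space with C11 atomics. This is only a sketch under stated assumptions: struct mm_sketch and every function name below are hypothetical stand-ins, and the kernel itself would use its own atomic64 primitives rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the one field of interest. */
struct mm_sketch {
	_Atomic(int64_t) pinned_vm;	/* number of pinned pages */
};

static void mm_sketch_init(struct mm_sketch *mm)
{
	atomic_init(&mm->pinned_vm, 0);
}

static int64_t pinned_pages(struct mm_sketch *mm)
{
	return atomic_load(&mm->pinned_vm);
}

/* Account npages as pinned and return the new total. No lock needed. */
static int64_t account_pinned(struct mm_sketch *mm, int64_t npages)
{
	assert(npages >= 0);
	return atomic_fetch_add(&mm->pinned_vm, npages) + npages;
}

/* Undo the accounting when the pages are released. */
static void unaccount_pinned(struct mm_sketch *mm, int64_t npages)
{
	assert(npages >= 0);
	atomic_fetch_sub(&mm->pinned_vm, npages);
}

/*
 * Pin against a limit: add first, then back out if the new total
 * exceeds it. Returns 0 on success, -1 if the limit would be exceeded.
 */
static int try_pin(struct mm_sketch *mm, int64_t npages, int64_t limit)
{
	if (account_pinned(mm, npages) > limit) {
		unaccount_pinned(mm, npages);
		return -1;
	}
	return 0;
}
```

Because the limit check uses the value returned by the atomic add, it always takes the current total into account, including concurrent pins by other threads.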

[PATCH v2 -next 0/6] mm: make pinned_vm atomic and simplify users

2019-01-21 Thread Davidlohr Bueso
ll present. Also encapsulating internal mm logic via mm[un]pin() instead of drivers having to know about internals and playing nice with compaction are all wins. Thanks! [1] https://lkml.org/lkml/2018/11/5/854 Davidlohr Bueso (6): mm: make mm->pinned_vm an atomic64 counter mic/scif: do not

Re: [PATCH v4] sched/wake_q: Reduce reference counting for special users

2019-01-21 Thread Davidlohr Bueso
Hi - considering that the wake_q patches were picked up for tip/urgent, can this one make it in as well? Thanks, Davidlohr On Tue, 18 Dec 2018, Waiman Long wrote: On 12/18/2018 02:53 PM, Davidlohr Bueso wrote: Some users, specifically futexes and rwsems, required fixes that allowed

[tip:locking/core] sched/wake_q: Add branch prediction hint to wake_q_add() cmpxchg

2019-01-21 Thread tip-bot for Davidlohr Bueso
Commit-ID: 87ff19cb2f1aa55a5d8b691e6690cc059a59d2ec Gitweb: https://git.kernel.org/tip/87ff19cb2f1aa55a5d8b691e6690cc059a59d2ec Author: Davidlohr Bueso AuthorDate: Sun, 2 Dec 2018 21:31:30 -0800 Committer: Ingo Molnar CommitDate: Mon, 21 Jan 2019 11:18:50 +0100 sched/wake_q: Add

Re: [PATCH -next 0/6] mm: make pinned_vm atomic and simplify users

2019-01-15 Thread Davidlohr Bueso
Also Ccing lkml, sorry. On Tue, 15 Jan 2019, Davidlohr Bueso wrote: Hi, The following patches aim to provide cleanups to users that pin pages (mostly infiniband) by converting the counter to atomic -- note that Daniel Jordan also has patches[1] for the locked_vm counterpart and vfio. Apart

Re: [PATCH 1/1] epoll: remove wrong assert that ep_poll_callback is always called with irqs off

2019-01-08 Thread Davidlohr Bueso
On 2019-01-08 04:42, Roman Penyaev wrote: What we can do: a) disable irqs if we are not in interrupt. b) revert the patch completely. David, is it really crucial in terms of performance to avoid double local_irq_save() on Xen on this ep_poll_callback() hot path? Note that such optimizations

[PATCH v4] sched/wake_q: Reduce reference counting for special users

2018-12-18 Thread Davidlohr Bueso
the task is 'safe' from wake_q point of view (in that it requires reference throughout the entire queue/wakeup cycle). In the one case it has internal reference counting, in the other case it consumes the reference counting. Signed-off-by: Davidlohr Bueso --- - Changes from v3: fixed wake_q_add_s
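
The two reference-counting behaviours described in this message can be sketched in user space. Everything here is an assumption made for illustration: the *_sketch types and functions are hypothetical, a plain atomic flag replaces the kernel's cmpxchg on the wake_q node, and the "wakeup" is only a counter. The get/put placement follows the wake_q_add() and wake_q_add_safe() fragments quoted elsewhere in this thread list.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct task_sketch {
	atomic_int usage;		/* reference count */
	atomic_bool queued;		/* already on a wake queue? */
	struct task_sketch *next;
};

struct wake_q_sketch {
	struct task_sketch *first;
	struct task_sketch **lastp;
};

static void task_init(struct task_sketch *t)
{
	atomic_init(&t->usage, 1);
	atomic_init(&t->queued, false);
	t->next = NULL;
}

static void wake_q_init(struct wake_q_sketch *q)
{
	q->first = NULL;
	q->lastp = &q->first;
}

/* Try to claim the task for this queue; false if it is already queued. */
static bool wake_q_claim(struct wake_q_sketch *q, struct task_sketch *t)
{
	bool expected = false;

	if (!atomic_compare_exchange_strong(&t->queued, &expected, true))
		return false;
	t->next = NULL;
	*q->lastp = t;
	q->lastp = &t->next;
	return true;
}

/* Regular variant: the queue takes its own reference on success. */
static void wake_q_add_sketch(struct wake_q_sketch *q, struct task_sketch *t)
{
	if (wake_q_claim(q, t))
		atomic_fetch_add(&t->usage, 1);
}

/*
 * "Safe" variant: the caller already holds a reference and hands it
 * over. On success the queue consumes it; on failure it is dropped.
 */
static void wake_q_add_safe_sketch(struct wake_q_sketch *q,
				   struct task_sketch *t)
{
	if (!wake_q_claim(q, t))
		atomic_fetch_sub(&t->usage, 1);
}

/* Drain the queue, dropping the queue's reference on each task. */
static int wake_up_q_sketch(struct wake_q_sketch *q)
{
	struct task_sketch *t = q->first;
	int woken = 0;

	while (t) {
		struct task_sketch *next = t->next;

		t->next = NULL;
		atomic_store(&t->queued, false);
		/* A real implementation would wake the task here. */
		assert(atomic_load(&t->usage) > 0);
		atomic_fetch_sub(&t->usage, 1);
		woken++;
		t = next;
	}
	wake_q_init(q);
	return woken;
}
```

With the safe variant a caller that already took a reference avoids a second increment and decrement per wakeup, which is the double refcounting overhead the patch aims to remove.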

Re: [PATCH v2] sched/wake_q: Reduce reference counting for special users

2018-12-18 Thread Davidlohr Bueso
On Tue, 18 Dec 2018, Davidlohr Bueso wrote: +void wake_q_add_safe(struct wake_q_head *head, struct task_struct *task) +{ + if (!__wake_q_add(head, task)) + get_task_struct(task); *sigh* and this should be put().

Re: [PATCH v2] sched/wake_q: Reduce reference counting for special users

2018-12-18 Thread Davidlohr Bueso
in that it requires reference throughout the entire queue/wakeup cycle). In the one case it has internal reference counting, in the other case it consumes the reference counting. Signed-off-by: Davidlohr Bueso --- Changes from v2: got rid of some bogus/incomplete leftover comments in wake_q_add(). include

[PATCH v2] sched/wake_q: Reduce reference counting for special users

2018-12-18 Thread Davidlohr Bueso
the task is 'safe' from wake_q point of view (in that it requires reference throughout the entire queue/wakeup cycle). In the one case it has internal reference counting, in the other case it consumes the reference counting. Signed-off-by: Davidlohr Bueso --- Changes from v1: - Simplify s

Re: [RFC] locking/rwsem: Avoid issuing wakeup before setting the reader waiter to nil

2018-12-18 Thread Davidlohr Bueso
On Tue, 18 Dec 2018, Peter Zijlstra wrote: I'd rather do it like so, except I'm still conflicted on the naming. +void wake_q_add(struct wake_q_head *head, struct task_struct *task) +{ + if (__wake_q_add(head, task)) + get_task_struct(task); +} + +void

Re: [RFC] locking/rwsem: Avoid issuing wakeup before setting the reader waiter to nil

2018-12-17 Thread Davidlohr Bueso
urn value of the operation and do the put() if necessary when the cmpxchg() fails. Regular users of wake_q_add() that don't care about when the wakeup actually happens can just ignore the return value. Signed-off-by: Davidlohr Bueso --- include/linux/sched/wake_q.h | 7 -- kernel/fute

Re: [PATCH 0/3] use rwlock in order to reduce ep_poll_callback() contention

2018-12-17 Thread Davidlohr Bueso
On 2018-12-17 03:49, Roman Penyaev wrote: On 2018-12-13 19:13, Davidlohr Bueso wrote: Yes, good idea. But frankly I do not want to bloat epoll-wait.c with my multi-writers-single-reader test case, because soon epoll-wait.c will become unmaintainable with all possible loads and set of different

Re: [PATCH 1/3] epoll: make sure all elements in ready list are in FIFO order

2018-12-13 Thread Davidlohr Bueso
pefully the same will be for this case. With that: Reviewed-by: Davidlohr Bueso Signed-off-by: Roman Penyaev Cc: Davidlohr Bueso Cc: Jason Baron Cc: Al Viro Cc: Andrew Morton Cc: Linus Torvalds Cc: linux-fsde...@vger.kernel.org Cc: linux-kernel@vger.kernel.org --- fs/eventpoll.c |

Re: [PATCH 0/3] use rwlock in order to reduce ep_poll_callback() contention

2018-12-13 Thread Davidlohr Bueso
On 2018-12-12 03:03, Roman Penyaev wrote: The last patch targets the contention problem in ep_poll_callback(), which can be very well reproduced by generating events (write to pipe or eventfd) from many threads, while consumer thread does polling. The following are some microbenchmark results

Re: [PATCH] percpu_rwsem: fix missed wakeup due to reordering of load

2018-12-12 Thread Davidlohr Bueso
On 2018-12-12 06:26, Prateek Sood wrote: Please confirm if the suspicion of smp_rmb is correct. IMO, it should be smp_mb() translating to dmb ish. Feel free to add my ack. This should also be Cc to stable as of v4.11. Fixes: 8f95c90ceb54 (sched/wait, RCU: Introduce rcuwait machinery) Thanks,

[PATCH 2/7] perf machine: Use cached rbtrees

2018-12-06 Thread Davidlohr Bueso
ticing that the rb_erase_init() calls have been replaced by rb_erase_cached() which has no _init() flavor, however, the node is explicitly cleared next anyway, which was redundant until now. Signed-off-by: Davidlohr Bueso --- tools/perf/builtin-report.c | 3 ++- tools/perf/util/build-id.c

[PATCH 4/7] perf util: Use cached rbtree for rblists

2018-12-06 Thread Davidlohr Bueso
probes, and buildid. Signed-off-by: Davidlohr Bueso --- tools/perf/util/intlist.h | 2 +- tools/perf/util/metricgroup.c | 2 +- tools/perf/util/rb_resort.h | 2 +- tools/perf/util/rblist.c | 28 ++-- tools/perf/util/rblist.h | 2 +- tools/perf/util/stat

[PATCH 5/7] perf symbols: Use cached rbtrees

2018-12-06 Thread Davidlohr Bueso
At the cost of an extra pointer, we can avoid the O(logN) cost of finding the first element in the tree (smallest node). Signed-off-by: Davidlohr Bueso --- tools/perf/builtin-annotate.c| 2 +- tools/perf/util/dso.c| 4 +- tools/perf/util/dso.h| 6 +-- tools/perf
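
The trade-off named in this message, one extra pointer in exchange for constant-time access to the smallest node, can be sketched as follows. This is a simplified illustration, not the kernel's rbtree: the tree below is a plain unbalanced binary search tree, and struct root_cached, insert_cached() and the other names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
	int key;
	struct node *left, *right;
};

struct root_cached {
	struct node *root;
	struct node *leftmost;	/* the extra pointer: smallest node */
};

/* Insert n, updating the cache if n ends up as the leftmost node. */
static void insert_cached(struct root_cached *t, struct node *n)
{
	struct node **link = &t->root;
	bool leftmost = true;

	n->left = n->right = NULL;
	while (*link) {
		if (n->key < (*link)->key) {
			link = &(*link)->left;
		} else {
			link = &(*link)->right;
			leftmost = false;	/* went right at least once */
		}
	}
	*link = n;
	if (leftmost)
		t->leftmost = n;
}

/* O(1): just read the cached pointer. */
static struct node *first_cached(const struct root_cached *t)
{
	return t->leftmost;
}

/* Walks down the left spine; O(log N) only if the tree is balanced. */
static struct node *first_walk(const struct root_cached *t)
{
	struct node *n = t->root;

	while (n && n->left)
		n = n->left;
	return n;
}

/* Sanity check: the cache must always agree with the walk. */
static void verify_cache(const struct root_cached *t)
{
	assert(first_cached(t) == first_walk(t));
}
```

A node is the new minimum exactly when the descent never turns right, so tracking a single boolean during insertion is enough to keep the cache valid. Erasing the cached node would also have to update the pointer, which this sketch leaves out.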

[PATCH 1/7] tools/perf: Update rbtree implementation

2018-12-06 Thread Davidlohr Bueso
There have been a number of changes in the kernel's rbtree implementation, including loose lockless searching guarantees and rb_root_cached, which later patches will use as an optimization. Signed-off-by: Davidlohr Bueso --- tools/include/linux/rbtree.h | 52 -- tools/include

[PATCH 7/7] perf sched: Use cached rbtrees

2018-12-06 Thread Davidlohr Bueso
At the cost of an extra pointer, we can avoid the O(logN) cost of finding the first element in the tree (smallest node), which is something heavily required for perf-sched. Signed-off-by: Davidlohr Bueso --- tools/perf/builtin-sched.c | 45 + 1 file

[PATCH 3/7] perf callchain: Use cached rbtrees

2018-12-06 Thread Davidlohr Bueso
At the cost of an extra pointer, we can avoid the O(logN) cost of finding the first element in the tree (smallest node), which is something required for nearly every in/srcline callchain node deletion (in/srcline__tree_delete()). Signed-off-by: Davidlohr Bueso --- tools/perf/util/dso.c | 4

[PATCH v2 -tip 0/7] tools/perf: Update rbtree implementation and optimize users

2018-12-06 Thread Davidlohr Bueso
tried to split them the best I could. Applies on today's -tip tree. Please consider for v4.21. Thanks! Davidlohr Bueso (7): tools/perf: Update rbtree implementation perf machine: Use cached rbtrees perf callchain: Use cached rbtrees perf util: Use cached rbtree for rblists perf symbols

[PATCH 6/7] perf hist: Use cached rbtrees

2018-12-06 Thread Davidlohr Bueso
hist::entries hist::entries_collapsed hist_entry::hroot_in hist_entry::hroot_out Signed-off-by: Davidlohr Bueso --- tools/perf/builtin-annotate.c | 2 +- tools/perf/builtin-c2c.c | 6 +- tools/perf/builtin-diff.c | 10 +- tools/perf/builtin-top.c | 2 +- tools

Re: [RFC PATCH 1/1] epoll: use rwlock in order to reduce ep_poll_callback() contention

2018-12-05 Thread Davidlohr Bueso
On 12/3/18 6:02 AM, Roman Penyaev wrote: The main change is in replacement of the spinlock with a rwlock, which is taken on read in ep_poll_callback(), and then by adding poll items to the tail of the list using xchg atomic instruction. Write lock is taken everywhere else in order to stop list

Re: [RFC PATCH 1/1] epoll: use rwlock in order to reduce ep_poll_callback() contention

2018-12-05 Thread Davidlohr Bueso
On 12/3/18 6:02 AM, Roman Penyaev wrote: if (!ep_is_linked(epi)) { - list_add_tail(&epi->rdllink, &ep->rdllist); + /* Reverse ->ovflist, events should be in FIFO */ + list_add(&epi->rdllink, &ep->rdllist);

Re: [RFC PATCH 1/1] epoll: use rwlock in order to reduce ep_poll_callback() contention

2018-12-05 Thread Davidlohr Bueso
+ akpm. Also, there are some epoll patches queued for -next, and as such this patch does not apply against linux-next. Thanks, Davidlohr On Tue, 04 Dec 2018, Jason Baron wrote: On 12/3/18 6:02 AM, Roman Penyaev wrote: Hi all, The goal of this patch is to reduce contention of

Re: [PATCH] percpu_rwsem: fix missed wakeup due to reordering of load

2018-12-02 Thread Davidlohr Bueso
On 2018-11-30 07:10, Prateek Sood wrote: In a scenario where cpu_hotplug_lock percpu_rw_semaphore is already acquired for read operation by P1 using percpu_down_read(). Now we have P1 in the path of releasing the cpu_hotplug_lock and P2 is in the process of acquiring cpu_hotplug_lock. P1

[PATCH -tip] kernel/sched,wake_q: Branch predict wake_q_add() cmpxchg

2018-12-02 Thread Davidlohr Bueso
1-2%. Signed-off-by: Davidlohr Bueso --- kernel/sched/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 091e089063be..f7747cf6e427 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -408,7 +408,7 @@ void wake_q_add
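
The subject of this patch is a branch prediction hint on the compare-and-exchange that claims a wake queue node. A rough user-space sketch of that pattern is shown here, assuming a GCC or Clang compiler for __builtin_expect. The macro unlikely_sketch(), struct node_sketch and claim_node() are hypothetical names, and the sketch is not the patch itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hint that the condition is expected to be false most of the time. */
#define unlikely_sketch(x) __builtin_expect(!!(x), 0)

struct node_sketch {
	_Atomic(struct node_sketch *) next;	/* NULL when not queued */
};

/* Sentinel: any non-NULL 'next' means the node is already queued. */
static struct node_sketch tail_marker;

static void node_init(struct node_sketch *node)
{
	atomic_init(&node->next, (struct node_sketch *)NULL);
}

/*
 * Claim the node by swinging 'next' from NULL to the sentinel.
 * Returns false if some other queue already owns the node, which
 * the hint marks as the rare path.
 */
static bool claim_node(struct node_sketch *node)
{
	struct node_sketch *expected = NULL;

	assert(node != NULL);
	if (unlikely_sketch(!atomic_compare_exchange_strong(&node->next,
							    &expected,
							    &tail_marker)))
		return false;
	return true;
}
```

The hint does not change behaviour. It only lets the compiler lay out the successful claim as the fall-through path.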

Re: [RFC] locking/rwsem: Avoid issuing wakeup before setting the reader waiter to nil

2018-11-29 Thread Davidlohr Bueso
I messed up something such that waiman was not in the thread. Ccing. On Thu, 29 Nov 2018, Waiman Long wrote: That can be costly for x86 which will now have 2 locked instructions. Yeah, and when used as an actual queue we should really start to notice. Some users just have a single task in

Re: [RFC] locking/rwsem: Avoid issuing wakeup before setting the reader waiter to nil

2018-11-29 Thread Davidlohr Bueso
On Thu, 29 Nov 2018, Waiman Long wrote: That can be costly for x86 which will now have 2 locked instructions. Yeah, and when used as an actual queue we should really start to notice. Some users just have a single task in the wake_q because avoiding the cost of wake_up_process() with locks

Re: [RFC] locking/rwsem: Avoid issuing wakeup before setting the reader waiter to nil

2018-11-29 Thread Davidlohr Bueso
On Thu, 29 Nov 2018, Peter Zijlstra wrote: On Thu, Nov 29, 2018 at 02:12:32PM +0100, Peter Zijlstra wrote: Yes, I think this is real, and worse, I think we need to go audit all wake_q_add() users and document this behaviour. In the ideal case we'd delay the actual wakeup to the last

Re: [RFC] locking/rwsem: Avoid issuing wakeup before setting the reader waiter to nil

2018-11-29 Thread Davidlohr Bueso
On Thu, 29 Nov 2018, Peter Zijlstra wrote: On Thu, Nov 29, 2018 at 12:58:26PM -0500, Waiman Long wrote: OK, you convinced me. However, that can still lead to anonymous wakeups that can be problematic if it happens in certain places. Should we try to reduce anonymous wakeup as much as possible?

[tip:perf/core] perf bench: Add epoll_ctl(2) benchmark

2018-11-21 Thread tip-bot for Davidlohr Bueso
Commit-ID: 231457ec707475c71d4e538a3253f1ed9e294cf0 Gitweb: https://git.kernel.org/tip/231457ec707475c71d4e538a3253f1ed9e294cf0 Author: Davidlohr Bueso AuthorDate: Tue, 6 Nov 2018 07:22:26 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Wed, 21 Nov 2018 22:39:55 -0300 perf bench

[tip:perf/core] perf bench: Add epoll parallel epoll_wait benchmark

2018-11-21 Thread tip-bot for Davidlohr Bueso
Commit-ID: 121dd9ea0116de3e79a4903a84018190c595e2b6 Gitweb: https://git.kernel.org/tip/121dd9ea0116de3e79a4903a84018190c595e2b6 Author: Davidlohr Bueso AuthorDate: Tue, 6 Nov 2018 07:22:25 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Wed, 21 Nov 2018 22:38:47 -0300 perf bench

[tip:perf/core] perf bench: Move HAVE_PTHREAD_ATTR_SETAFFINITY_NP into bench.h

2018-11-21 Thread tip-bot for Davidlohr Bueso
Commit-ID: d47d77c3f008d3cf02c6ce92ef4f6e32ca270351 Gitweb: https://git.kernel.org/tip/d47d77c3f008d3cf02c6ce92ef4f6e32ca270351 Author: Davidlohr Bueso AuthorDate: Fri, 9 Nov 2018 13:07:19 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Wed, 21 Nov 2018 12:00:32 -0300 perf bench
