This is a lot more appropriate than PI_LIST, which in the kernel
one would assume has to do with priority-inheritance; it does
not. Furthermore, futexes make use of plists, so the old name can
be even more confusing, despite the debug nature of the config
option.
Signed-off-by: Davidlohr Bueso
On Wed, 13 Mar 2019, Matthew Wilcox wrote:
It's probably worth listing the advantages of the Maple Tree over the
rbtree.
I'm not familiar with maple trees, are they referred to by another name?
(is this some sort of B-tree?). Google just shows me real trees.
- Shallower tree. A 1000-entry
On Wed, 13 Mar 2019, Laurent Dufour wrote:
If this is not too late and if there is still room available, I would
like to attend the MM track and propose a topic about using the XArray
to replace the VMA's RB tree and list.
Using the XArray in place of the VMA's tree and list seems to be a
er releasing the info->lock the thing is
freed anyway so it should not change things.
Feel free to add my:
Reviewed-by: Davidlohr Bueso
+ list_for_each_entry_safe(msg, nmsg, &tmp_msg, m_list) {
+ 	list_del(&msg->m_list);
+ 	free_msg(msg);
+ }
+
/
"pinned_vm %llx data_vm %lx exec_vm %lx stack_vm %lx\n"
            ~~~^
            %lx
Fixes: 70f8a3ca68d3 ("mm: make mm->pinned_vm an atomic64 counter")
Signed-off-by: Qian Cai
Acked-by: Davidlohr Bueso
off-by: Davidlohr Bueso
---
tools/perf/Documentation/perf-bench.txt | 11 +
tools/perf/bench/Build | 1 +
tools/perf/bench/bench.h| 1 +
tools/perf/bench/syscall.c | 78 +
tools/perf/builtin-bench.c | 8 +++
throughput compatible with 'perf-bench' via getppid(2), yet
without any of the additional template stuff from Ingo's version (based
on numa.c). The code is identical to what mmtests uses.
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1067469.html
Signed-off-by: Davidlohr Bueso
---
tools
y affecting kernel performance and hence behavior.
Both (2) and (3) are useful debugging aids.
Yes, this will come in handy in the future. Feel free to add my:
Acked-by: Davidlohr Bueso
Thanks.
On Fri, 08 Feb 2019, Waiman Long wrote:
I am planning to run more performance test and post the data sometimes
next week. Davidlohr is also going to run some of his rwsem performance
test on this patchset.
So I ran this series on a 2-socket, 40-core IB machine with various
workloads in mmtests. Below
On Mon, 11 Feb 2019, ira.we...@intel.com wrote:
Ira Weiny (3):
mm/gup: Change "write" parameter to flags
mm/gup: Introduce get_user_pages_fast_longterm()
IB/HFI1: Use new get_user_pages_fast_longterm()
Out of curiosity, are you planning on having all rdma drivers
use
Holding mmap_sem exclusively for a gup() is overkill.
Let's share the lock and replace the gup call with gup_longterm(),
as it is better suited for the lifetime of the pinning.
Cc: David S. Miller
Cc: Bjorn Topel
Cc: Magnus Karlsson
CC: net...@vger.kernel.org
Signed-off-by: Davidlohr Bueso
| 29 ++---
include/linux/irqdesc.h |1 +
kernel/irq/chip.c | 12 ++--
kernel/irq/internals.h |8 +++-
kernel/irq/irqdesc.c|7 ++-
5 files changed, 50 insertions(+), 7 deletions(-)
Reviewed-by: Davidlohr Bueso
Commit-ID: 6f568ebe2afefdc33a6fb06ef20a94f8b96455f1
Gitweb: https://git.kernel.org/tip/6f568ebe2afefdc33a6fb06ef20a94f8b96455f1
Author: Davidlohr Bueso
AuthorDate: Wed, 6 Feb 2019 10:56:02 -0800
Committer: Thomas Gleixner
CommitDate: Fri, 8 Feb 2019 13:00:35 +0100
futex: Fix barrier
On Thu, 07 Feb 2019, Waiman Long wrote:
30 files changed, 1197 insertions(+), 1594 deletions(-)
Performance numbers on numerous workloads, pretty please.
I'll go and throw this at my mmap_sem intensive workloads
I've collected.
Thanks,
Davidlohr
Could this change be pushed to v5.0 (tip/urgent) just like the wake_q fixes
that are already in Linus' tree? This will help backporting efforts as
most distros will want to avoid the performance hit and include this
patch.
Thanks,
Davidlohr
On Mon, 04 Feb 2019, tip-bot for Davidlohr Bueso wrote
On Thu, 07 Feb 2019, Paul Burton wrote:
Hi Davidlohr,
On Wed, Feb 06, 2019 at 09:37:40PM -0800, Davidlohr Bueso wrote:
It is well known that because the mm can internally
call the regular gup_unlocked if the lockless approach
fails and take the sem there, the caller must not hold
the mmap_sem
Unlike what the subject says, this is not against -tip, it applies
on today's -next.
On Wed, 06 Feb 2019, Davidlohr Bueso wrote:
Hi,
Here are two more patchlets that cleanup mmap_sem and gup abusers.
The second is also a fixlet.
Compile-tested only. Please consider for v5.1
Thanks
Davidlohr Bueso (2):
xsk: do not use mmap_sem
MIPS/c-r4k: do not use mmap_sem for gup_fast()
arch/mips/mm/c-r4k.c | 6 +-
net/xdp
Holding mmap_sem exclusively for a gup() is overkill.
Let's replace the call with gup_fast() and let the mm take
the lock if necessary.
Cc: David S. Miller
Cc: Bjorn Topel
Cc: Magnus Karlsson
CC: net...@vger.kernel.org
Signed-off-by: Davidlohr Bueso
---
net/xdp/xdp_umem.c | 6 ++
1 file
Hogan
Cc: linux-m...@vger.kernel.org
Signed-off-by: Davidlohr Bueso
---
arch/mips/mm/c-r4k.c | 6 +-
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/arch/mips/mm/c-r4k.c b/arch/mips/mm/c-r4k.c
index cc4e17caeb26..38fe86928837 100644
--- a/arch/mips/mm/c-r4k.c
+++ b/arch/mips/mm
We are really talking about pinned_vm here.
Signed-off-by: Davidlohr Bueso
---
Documentation/infiniband/user_verbs.txt | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Documentation/infiniband/user_verbs.txt
b/Documentation/infiniband/user_verbs.txt
index df049b9f5b6e
The current comment for the barrier that guarantees that waiter
increment is always before taking the hb spinlock (barrier (A))
needs to be fixed. We are obviously referring to hb_waiters_inc,
which is a full barrier.
Reported-by: Peter Zijlstra
Signed-off-by: Davidlohr Bueso
---
kernel
-by: Christoph Lameter
Reviewed-by: Daniel Jordan
Reviewed-by: Jan Kara
Signed-off-by: Davidlohr Bueso
---
drivers/infiniband/core/umem.c | 12 ++--
drivers/infiniband/hw/hfi1/user_pages.c| 6 +++---
drivers/infiniband/hw/qib/qib_user_pages.c | 4 ++--
drivers/infiniband/hw
atomic. We
also share the lock.
Cc: be...@cisco.com
Cc: neesc...@cisco.com
Acked-by: Parvi Kaustubhi
Reviewed-by: Ira Weiny
Signed-off-by: Davidlohr Bueso
---
drivers/infiniband/hw/usnic/usnic_ib_main.c | 2 -
drivers/infiniband/hw/usnic/usnic_uiom.c| 58
ib_umem_get() uses gup_longterm() and relies on the lock to
stabilize the vma_list, so we cannot really get rid of mmap_sem
altogether, but now that the counter is atomic, we can get rid of
some of the complexity that mmap_sem brings for pinned_vm alone.
Reviewed-by: Ira Weiny
Signed-off-by: Davidlohr Bueso
-off-by: Davidlohr Bueso
---
drivers/misc/mic/scif/scif_rma.c | 36 +++-
1 file changed, 11 insertions(+), 25 deletions(-)
diff --git a/drivers/misc/mic/scif/scif_rma.c b/drivers/misc/mic/scif/scif_rma.c
index 2448368f181e..263b8ad507ea 100644
--- a/drivers/misc
https://lkml.org/lkml/2018/11/5/854
Davidlohr Bueso (6):
mm: make mm->pinned_vm an atomic64 counter
drivers/mic/scif: do not use mmap_sem
drivers/IB,qib: optimize mmap_sem usage
drivers/IB,hfi1: do not use mmap_sem
drivers/IB,usnic: reduce scope of mmap_sem
drivers/IB,core: reduce scop
hed.
Reviewed-by: Ira Weiny
Signed-off-by: Davidlohr Bueso
---
drivers/infiniband/hw/hfi1/user_pages.c | 6 --
1 file changed, 6 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/user_pages.c
b/drivers/infiniband/hw/hfi1/user_pages.c
index 40a6e434190f..24b592c6522e 100644
--- a/driv
can therefore be converted to a reader.
This also fixes a bug where __qib_get_user_pages was not
taking into account the current value of pinned_vm.
Cc: dennis.dalessan...@intel.com
Cc: mike.marcinis...@intel.com
Reviewed-by: Ira Weiny
Signed-off-by: Davidlohr Bueso
---
drivers/infiniband/hw/qib
Commit-ID: 07879c6a3740fbbf3c8891a0ab484c20a12794d8
Gitweb: https://git.kernel.org/tip/07879c6a3740fbbf3c8891a0ab484c20a12794d8
Author: Davidlohr Bueso
AuthorDate: Tue, 18 Dec 2018 11:53:52 -0800
Committer: Ingo Molnar
CommitDate: Mon, 4 Feb 2019 09:03:28 +0100
sched/wake_q: Reduce
On Mon, 28 Jan 2019, Jason Gunthorpe wrote:
.. and I'm looking at some of the other conversions here.. *most
likely* any caller that is manipulating rlimit for get_user_pages
should really be calling get_user_pages_longterm, so they should not
be converted to use _fast?
Yeah this was
Commit-ID: cb4c13a5137766c3666ae106e1a5549316992379
Gitweb: https://git.kernel.org/tip/cb4c13a5137766c3666ae106e1a5549316992379
Author: Davidlohr Bueso
AuthorDate: Thu, 6 Dec 2018 11:18:19 -0800
Committer: Arnaldo Carvalho de Melo
CommitDate: Fri, 25 Jan 2019 15:12:10 +0100
perf sched
Commit-ID: 2eb3d6894ae3b9cc8a94c91458a041c45773f23d
Gitweb: https://git.kernel.org/tip/2eb3d6894ae3b9cc8a94c91458a041c45773f23d
Author: Davidlohr Bueso
AuthorDate: Thu, 6 Dec 2018 11:18:18 -0800
Committer: Arnaldo Carvalho de Melo
CommitDate: Fri, 25 Jan 2019 15:12:10 +0100
perf hist
Commit-ID: ca2270292e6c3415102242bf9dc3d05f622b7b28
Gitweb: https://git.kernel.org/tip/ca2270292e6c3415102242bf9dc3d05f622b7b28
Author: Davidlohr Bueso
AuthorDate: Thu, 6 Dec 2018 11:18:16 -0800
Committer: Arnaldo Carvalho de Melo
CommitDate: Fri, 25 Jan 2019 15:12:10 +0100
perf util
Commit-ID: 7137ff50b68a48bc28270c91b1c313259ab0c1c4
Gitweb: https://git.kernel.org/tip/7137ff50b68a48bc28270c91b1c313259ab0c1c4
Author: Davidlohr Bueso
AuthorDate: Thu, 6 Dec 2018 11:18:17 -0800
Committer: Arnaldo Carvalho de Melo
CommitDate: Fri, 25 Jan 2019 15:12:10 +0100
perf
Commit-ID: 55ecd6310f9fe48cf7e435be408862da1e0e6baa
Gitweb: https://git.kernel.org/tip/55ecd6310f9fe48cf7e435be408862da1e0e6baa
Author: Davidlohr Bueso
AuthorDate: Thu, 6 Dec 2018 11:18:15 -0800
Committer: Arnaldo Carvalho de Melo
CommitDate: Fri, 25 Jan 2019 15:12:09 +0100
perf
Commit-ID: f3acb3a8a2081344801974ac5ec8e1b0d6f0ef36
Gitweb: https://git.kernel.org/tip/f3acb3a8a2081344801974ac5ec8e1b0d6f0ef36
Author: Davidlohr Bueso
AuthorDate: Thu, 6 Dec 2018 11:18:14 -0800
Committer: Arnaldo Carvalho de Melo
CommitDate: Fri, 25 Jan 2019 15:12:09 +0100
perf
Commit-ID: 3aef2cad5d51ee66d2a614dd2f70cb34c74caf77
Gitweb: https://git.kernel.org/tip/3aef2cad5d51ee66d2a614dd2f70cb34c74caf77
Author: Davidlohr Bueso
AuthorDate: Thu, 6 Dec 2018 11:18:13 -0800
Committer: Arnaldo Carvalho de Melo
CommitDate: Fri, 25 Jan 2019 15:12:09 +0100
tools
On Tue, 22 Jan 2019, Arnaldo Carvalho de Melo wrote:
Em Thu, Dec 06, 2018 at 11:18:18AM -0800, Davidlohr Bueso escreveu:
At the cost of an extra pointer, we can avoid the O(logN) cost
of finding the first element in the tree (smallest node), which
is something heavily required for histograms
On Mon, 21 Jan 2019, Jason Gunthorpe wrote:
On Mon, Jan 21, 2019 at 09:42:20AM -0800, Davidlohr Bueso wrote:
ib_umem_get() uses gup_longterm() and relies on the lock to
stabilize the vma_list, so we cannot really get rid of mmap_sem
altogether, but now that the counter is atomic, we can get
ib_umem_get() uses gup_longterm() and relies on the lock to
stabilize the vma_list, so we cannot really get rid of mmap_sem
altogether, but now that the counter is atomic, we can get rid of
some of the complexity that mmap_sem brings for pinned_vm alone.
Reviewed-by: Ira Weiny
Signed-off-by: Davidlohr Bueso
that __qib_get_user_pages was not taking into
account the current value of pinned_vm.
Cc: dennis.dalessan...@intel.com
Cc: mike.marcinis...@intel.com
Reviewed-by: Ira Weiny
Signed-off-by: Davidlohr Bueso
---
drivers/infiniband/hw/qib/qib_user_pages.c | 67 ++
1 file changed, 22 insertions
hed.
Cc: mike.marcinis...@intel.com
Cc: dennis.dalessan...@intel.com
Reviewed-by: Ira Weiny
Signed-off-by: Davidlohr Bueso
---
drivers/infiniband/hw/hfi1/user_pages.c | 6 --
1 file changed, 6 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/user_pages.c
b/drivers/infiniband/hw/h
atomic.
Cc: be...@cisco.com
Cc: neesc...@cisco.com
Cc: pkaus...@cisco.com
Reviewed-by: Ira Weiny
Signed-off-by: Davidlohr Bueso
---
drivers/infiniband/hw/usnic/usnic_ib_main.c | 2 --
drivers/infiniband/hw/usnic/usnic_uiom.c| 54 +++--
drivers/infiniband/hw/usnic
not possible to acquire it.
By making the counter atomic we no longer need to hold the mmap_sem
and can simplify some code around it for pinned_vm users. The counter
is 64-bit so that we need not worry about overflows, such as from rdma
user input controlled from userspace.
Signed-off-by: Davidlohr Bueso
ll
present. Also, encapsulating internal mm logic via mm[un]pin() instead of
drivers having to know about internals, and playing nice with compaction,
are all wins.
Thanks!
[1] https://lkml.org/lkml/2018/11/5/854
Davidlohr Bueso (6):
mm: make mm->pinned_vm an atomic64 counter
mic/scif: do not
Hi - considering that the wake_q patches were picked up for tip/urgent, can
this one make it in as well?
Thanks,
Davidlohr
On Tue, 18 Dec 2018, Waiman Long wrote:
On 12/18/2018 02:53 PM, Davidlohr Bueso wrote:
Some users, specifically futexes and rwsems, required fixes
that allowed
Commit-ID: 87ff19cb2f1aa55a5d8b691e6690cc059a59d2ec
Gitweb: https://git.kernel.org/tip/87ff19cb2f1aa55a5d8b691e6690cc059a59d2ec
Author: Davidlohr Bueso
AuthorDate: Sun, 2 Dec 2018 21:31:30 -0800
Committer: Ingo Molnar
CommitDate: Mon, 21 Jan 2019 11:18:50 +0100
sched/wake_q: Add
Also Ccing lkml, sorry.
On Tue, 15 Jan 2019, Davidlohr Bueso wrote:
Hi,
The following patches aim to provide cleanups to users that pin pages
(mostly infiniband) by converting the counter to atomic -- note that
Daniel Jordan also has patches[1] for the locked_vm counterpart and vfio.
Apart
On 2019-01-08 04:42, Roman Penyaev wrote:
What we can do:
a) disable irqs if we are not in interrupt.
b) revert the patch completely.
David, is it really crucial in terms of performance to avoid double
local_irq_save() on Xen on this ep_poll_callback() hot path?
Note that such optimizations
the
task is 'safe' from a wake_q point of view (in that it requires a
reference throughout the entire queue/wakeup cycle). In the one
case it has internal reference counting; in the other, it
consumes the reference.
Signed-off-by: Davidlohr Bueso
---
- Changes from v3: fixed wake_q_add_s
On Tue, 18 Dec 2018, Davidlohr Bueso wrote:
+void wake_q_add_safe(struct wake_q_head *head, struct task_struct *task)
+{
+ if (!__wake_q_add(head, task))
+ get_task_struct(task);
*sigh* and this should be put().
in that it requires a
reference throughout the entire queue/wakeup cycle). In the one
case it has internal reference counting; in the other, it
consumes the reference.
Signed-off-by: Davidlohr Bueso
---
Changes from v2: got rid of some bogus/incomplete leftover comments
in wake_q_add().
include
the
task is 'safe' from a wake_q point of view (in that it requires a
reference throughout the entire queue/wakeup cycle). In the one
case it has internal reference counting; in the other, it
consumes the reference.
Signed-off-by: Davidlohr Bueso
---
Changes from v1:
- Simplify s
On Tue, 18 Dec 2018, Peter Zijlstra wrote:
I'd rather do it like so, except I'm still conflicted on the naming.
+void wake_q_add(struct wake_q_head *head, struct task_struct *task)
+{
+ if (__wake_q_add(head, task))
+ get_task_struct(task);
+}
+
+void
urn value of the operation and
do the put() if necessary when the cmpxchg() fails. Regular users
of wake_q_add() that don't care about when the wakeup actually
happens can just ignore the return value.
Signed-off-by: Davidlohr Bueso
---
include/linux/sched/wake_q.h | 7 --
kernel/fute
On 2018-12-17 03:49, Roman Penyaev wrote:
On 2018-12-13 19:13, Davidlohr Bueso wrote:
Yes, good idea. But frankly I do not want to bloat epoll-wait.c with
my multi-writers-single-reader test case, because soon epoll-wait.c
will become unmaintainable with all possible loads and set of
different
pefully the same
will be for this case.
With that:
Reviewed-by: Davidlohr Bueso
Signed-off-by: Roman Penyaev
Cc: Davidlohr Bueso
Cc: Jason Baron
Cc: Al Viro
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: linux-fsde...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
fs/eventpoll.c |
On 2018-12-12 03:03, Roman Penyaev wrote:
The last patch targets the contention problem in ep_poll_callback(),
which can be readily reproduced by generating events (writes to a pipe
or eventfd) from many threads while a consumer thread polls.
The following are some microbenchmark results
On 2018-12-12 06:26, Prateek Sood wrote:
Please confirm if the suspicion of smp_rmb is correct.
IMO, it should be smp_mb() translating to dmb ish.
Feel free to add my ack. This should also be Cc to stable as of v4.11.
Fixes: 8f95c90ceb54 (sched/wait, RCU: Introduce rcuwait machinery)
Thanks,
ticing
that the rb_erase_init() calls have been replaced by rb_erase_cached(),
which has no _init() flavor; however, the node is explicitly
cleared next anyway, which was redundant until now.
Signed-off-by: Davidlohr Bueso
---
tools/perf/builtin-report.c | 3 ++-
tools/perf/util/build-id.c
probes, and buildid.
Signed-off-by: Davidlohr Bueso
---
tools/perf/util/intlist.h | 2 +-
tools/perf/util/metricgroup.c | 2 +-
tools/perf/util/rb_resort.h | 2 +-
tools/perf/util/rblist.c | 28 ++--
tools/perf/util/rblist.h | 2 +-
tools/perf/util/stat
At the cost of an extra pointer, we can avoid the O(logN) cost
of finding the first element in the tree (smallest node).
Signed-off-by: Davidlohr Bueso
---
tools/perf/builtin-annotate.c| 2 +-
tools/perf/util/dso.c| 4 +-
tools/perf/util/dso.h| 6 +--
tools/perf
There have been a number of changes in the kernel's rbtree
implementation, including loose lockless searching guarantees
and rb_root_cached, which later patches will use as an
optimization.
Signed-off-by: Davidlohr Bueso
---
tools/include/linux/rbtree.h | 52 --
tools/include
At the cost of an extra pointer, we can avoid the O(logN) cost
of finding the first element in the tree (smallest node), which
is something heavily required for perf-sched.
Signed-off-by: Davidlohr Bueso
---
tools/perf/builtin-sched.c | 45 +
1 file
At the cost of an extra pointer, we can avoid the O(logN) cost
of finding the first element in the tree (smallest node), which
is something required for nearly every in/srcline callchain node
deletion (in/srcline__tree_delete()).
Signed-off-by: Davidlohr Bueso
---
tools/perf/util/dso.c | 4
tried to split them the best I could.
Applies on today's -tip tree. Please consider for v4.21.
Thanks!
Davidlohr Bueso (7):
tools/perf: Update rbtree implementation
perf machine: Use cached rbtrees
perf callchain: Use cached rbtrees
perf util: Use cached rbtree for rblists
perf symbols
hist::entries
hist::entries_collapsed
hist_entry::hroot_in
hist_entry::hroot_out
Signed-off-by: Davidlohr Bueso
---
tools/perf/builtin-annotate.c | 2 +-
tools/perf/builtin-c2c.c | 6 +-
tools/perf/builtin-diff.c | 10 +-
tools/perf/builtin-top.c | 2 +-
tools
On 12/3/18 6:02 AM, Roman Penyaev wrote:
The main change is the replacement of the spinlock with a rwlock, which is
taken for read in ep_poll_callback(); poll items are then added to the
tail of the list using the xchg atomic instruction. The write lock is taken
everywhere else in order to stop list
On 12/3/18 6:02 AM, Roman Penyaev wrote:
if (!ep_is_linked(epi)) {
- 	list_add_tail(&epi->rdllink, &ep->rdllist);
+ 	/* Reverse ->ovflist, events should be in FIFO */
+ 	list_add(&epi->rdllink, &ep->rdllist);
+ akpm.
Also, there are some epoll patches queued for -next, and as such
this patch does not apply against linux-next.
Thanks,
Davidlohr
On Tue, 04 Dec 2018, Jason Baron wrote:
On 12/3/18 6:02 AM, Roman Penyaev wrote:
Hi all,
The goal of this patch is to reduce contention of
On 2018-11-30 07:10, Prateek Sood wrote:
In a scenario where cpu_hotplug_lock percpu_rw_semaphore is already
acquired for read operation by P1 using percpu_down_read().
Now we have P1 in the path of releasing the cpu_hotplug_lock and P2
is in the process of acquiring cpu_hotplug_lock.
P1
1-2%.
Signed-off-by: Davidlohr Bueso
---
kernel/sched/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 091e089063be..f7747cf6e427 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -408,7 +408,7 @@ void wake_q_add
I messed up something such that waiman was not in the thread. Ccing.
On Thu, 29 Nov 2018, Waiman Long wrote:
That can be costly for x86 which will now have 2 locked instructions.
Yeah, and when used as an actual queue we should really start to notice.
Some users just have a single task in
On Thu, 29 Nov 2018, Waiman Long wrote:
That can be costly for x86 which will now have 2 locked instructions.
Yeah, and when used as an actual queue we should really start to notice.
Some users just have a single task in the wake_q because avoiding the cost
of wake_up_process() with locks
On Thu, 29 Nov 2018, Peter Zijlstra wrote:
On Thu, Nov 29, 2018 at 02:12:32PM +0100, Peter Zijlstra wrote:
Yes, I think this is real, and worse, I think we need to go audit all
wake_q_add() users and document this behaviour.
In the ideal case we'd delay the actual wakeup to the last
On Thu, 29 Nov 2018, Peter Zijlstra wrote:
On Thu, Nov 29, 2018 at 12:58:26PM -0500, Waiman Long wrote:
OK, you convinced me. However, that can still lead to anonymous wakeups
that can be problematic if it happens in certain places. Should we try
to reduce anonymous wakeup as much as possible?
Commit-ID: 231457ec707475c71d4e538a3253f1ed9e294cf0
Gitweb: https://git.kernel.org/tip/231457ec707475c71d4e538a3253f1ed9e294cf0
Author: Davidlohr Bueso
AuthorDate: Tue, 6 Nov 2018 07:22:26 -0800
Committer: Arnaldo Carvalho de Melo
CommitDate: Wed, 21 Nov 2018 22:39:55 -0300
perf bench
Commit-ID: 121dd9ea0116de3e79a4903a84018190c595e2b6
Gitweb: https://git.kernel.org/tip/121dd9ea0116de3e79a4903a84018190c595e2b6
Author: Davidlohr Bueso
AuthorDate: Tue, 6 Nov 2018 07:22:25 -0800
Committer: Arnaldo Carvalho de Melo
CommitDate: Wed, 21 Nov 2018 22:38:47 -0300
perf bench
Commit-ID: d47d77c3f008d3cf02c6ce92ef4f6e32ca270351
Gitweb: https://git.kernel.org/tip/d47d77c3f008d3cf02c6ce92ef4f6e32ca270351
Author: Davidlohr Bueso
AuthorDate: Fri, 9 Nov 2018 13:07:19 -0800
Committer: Arnaldo Carvalho de Melo
CommitDate: Wed, 21 Nov 2018 12:00:32 -0300
perf bench