Re: [v2] locking/percpu-rwsem: Optimize readers and reduce global impact

2016-08-31 Thread Guenter Roeck
On 08/31/2016 06:41 AM, Guenter Roeck wrote:
> On 08/31/2016 01:09 AM, Peter Zijlstra wrote:
>> On Tue, Aug 30, 2016 at 10:21:02PM -0700, Guenter Roeck wrote:
>>> Peter,
>>>
>>> The call to rcu_sync_is_idle() causes the following build error when building
>>> x86_64:allmodconfig.
>>>
>>> ERROR: "rcu_sync_lockdep_assert"

Re: [v2] locking/percpu-rwsem: Optimize readers and reduce global impact

2016-08-31 Thread Guenter Roeck
On 08/31/2016 01:09 AM, Peter Zijlstra wrote:
> On Tue, Aug 30, 2016 at 10:21:02PM -0700, Guenter Roeck wrote:
>> Peter,
>>
>> The call to rcu_sync_is_idle() causes the following build error when building
>> x86_64:allmodconfig.
>>
>> ERROR: "rcu_sync_lockdep_assert" [kernel/locking/locktorture.ko] undefined!
>> ERR

Re: [v2] locking/percpu-rwsem: Optimize readers and reduce global impact

2016-08-31 Thread Peter Zijlstra
On Tue, Aug 30, 2016 at 10:21:02PM -0700, Guenter Roeck wrote:
> Peter,
>
> The call to rcu_sync_is_idle() causes the following build error when building
> x86_64:allmodconfig.
>
> ERROR: "rcu_sync_lockdep_assert" [kernel/locking/locktorture.ko] undefined!
> ERROR: "rcu_sync_lockdep_assert" [fs/e

Re: [v2] locking/percpu-rwsem: Optimize readers and reduce global impact

2016-08-30 Thread Guenter Roeck
Peter,
On Tue, Aug 09, 2016 at 11:51:12AM +0200, Peter Zijlstra wrote:
> Currently the percpu-rwsem switches to (global) atomic ops while a
> writer is waiting; which could be quite a while and slows down
> releasing the readers.
>
> This patch cures this problem by ordering the reader-state vs
>

Re: [PATCH v2] locking/percpu-rwsem: Optimize readers and reduce global impact

2016-08-26 Thread Om Dhyade
On 8/26/2016 9:47 AM, Dmitry Shmidt wrote:
> On Fri, Aug 26, 2016 at 5:51 AM, Tejun Heo wrote:
>> Hello, John.
>>
>> On Thu, Aug 25, 2016 at 07:14:07PM -0700, John Stultz wrote:
>>> Hey! Good news. This patch along with Peter's locking changes pushes
>>> the latencies down to an apparently acceptable level!

Re: [PATCH v2] locking/percpu-rwsem: Optimize readers and reduce global impact

2016-08-26 Thread Dmitry Shmidt
On Fri, Aug 26, 2016 at 5:51 AM, Tejun Heo wrote:
> Hello, John.
>
> On Thu, Aug 25, 2016 at 07:14:07PM -0700, John Stultz wrote:
>> Hey! Good news. This patch along with Peter's locking changes pushes
>> the latencies down to an apparently acceptable level!
>
> Ah, that's good to hear. Please fe

Re: [PATCH v2] locking/percpu-rwsem: Optimize readers and reduce global impact

2016-08-26 Thread Tejun Heo
Hello, John.
On Thu, Aug 25, 2016 at 07:14:07PM -0700, John Stultz wrote:
> Hey! Good news. This patch along with Peter's locking changes pushes
> the latencies down to an apparently acceptable level!
Ah, that's good to hear. Please feel free to ping me if you guys wanna talk about cgroup usage

Re: [PATCH v2] locking/percpu-rwsem: Optimize readers and reduce global impact

2016-08-25 Thread John Stultz
On Wed, Aug 24, 2016 at 2:30 PM, Tejun Heo wrote:
> Hello, John.
>
> On Wed, Aug 24, 2016 at 02:16:52PM -0700, John Stultz wrote:
>> Hey Peter, Tejun, Oleg,
>> So while your tweaks for the percpu-rwsem have greatly helped the
>> regression folks were seeing (many thanks, by the way), as noted

Re: [PATCH v2] locking/percpu-rwsem: Optimize readers and reduce global impact

2016-08-24 Thread John Stultz
On Wed, Aug 24, 2016 at 2:30 PM, Tejun Heo wrote:
> On Wed, Aug 24, 2016 at 02:16:52PM -0700, John Stultz wrote:
>>
>> So I was wondering if patches to go back to the per signal_struct
>> locking would still be considered? Or is the global lock approach the
>> only way forward?
>
> We can't simply

Re: [PATCH v2] locking/percpu-rwsem: Optimize readers and reduce global impact

2016-08-24 Thread Tejun Heo
Hello, John.
On Wed, Aug 24, 2016 at 02:16:52PM -0700, John Stultz wrote:
> Hey Peter, Tejun, Oleg,
> So while your tweaks for the percpu-rwsem have greatly helped the
> regression folks were seeing (many thanks, by the way), as noted
> above, the performance regression with the global lock co

Re: [PATCH v2] locking/percpu-rwsem: Optimize readers and reduce global impact

2016-08-24 Thread John Stultz
On Fri, Aug 12, 2016 at 6:44 PM, Om Dhyade wrote:
> Update from my tests:
> Use-case: Android application launches.
>
> I tested the patches on Android N build; I see max latency ~7ms.
> In my tests, the wait is due to: copy_process(fork.c) blocks all threads in
> __cgroup_procs_write including th

Re: [PATCH v2] locking/percpu-rwsem: Optimize readers and reduce global impact

2016-08-12 Thread Om Dhyade
Thank you Dmitry for sharing the patches.
Update from my tests:
Use-case: Android application launches.
I tested the patches on Android N build; I see max latency ~7ms.
In my tests, the wait is due to: copy_process(fork.c) blocks all threads in
__cgroup_procs_write including threads which are n

Re: [PATCH v2] locking/percpu-rwsem: Optimize readers and reduce global impact

2016-08-10 Thread Peter Zijlstra
On Tue, Aug 09, 2016 at 04:47:38PM -0700, John Stultz wrote:
> On Tue, Aug 9, 2016 at 2:51 AM, Peter Zijlstra wrote:
> >
> > Currently the percpu-rwsem switches to (global) atomic ops while a
> > writer is waiting; which could be quite a while and slows down
> > releasing the readers.
> >
> > This

Re: [PATCH v2] locking/percpu-rwsem: Optimize readers and reduce global impact

2016-08-10 Thread Oleg Nesterov
On 08/10, Peter Zijlstra wrote:
>
> On Tue, Aug 09, 2016 at 04:47:38PM -0700, John Stultz wrote:
> > On Tue, Aug 9, 2016 at 2:51 AM, Peter Zijlstra wrote:
> > >
> > > Currently the percpu-rwsem switches to (global) atomic ops while a
> > > writer is waiting; which could be quite a while and slows

Re: [PATCH v2] locking/percpu-rwsem: Optimize readers and reduce global impact

2016-08-09 Thread John Stultz
On Tue, Aug 9, 2016 at 2:51 AM, Peter Zijlstra wrote:
>
> Currently the percpu-rwsem switches to (global) atomic ops while a
> writer is waiting; which could be quite a while and slows down
> releasing the readers.
>
> This patch cures this problem by ordering the reader-state vs
> reader-count (s

[PATCH v2] locking/percpu-rwsem: Optimize readers and reduce global impact

2016-08-09 Thread Peter Zijlstra
Currently the percpu-rwsem switches to (global) atomic ops while a writer is waiting; which could be quite a while and slows down releasing the readers. This patch cures this problem by ordering the reader-state vs reader-count (see the comments in __percpu_down_read() and percpu_down_write()). T