Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-09 Thread Peter Zijlstra
On Sun, Sep 01, 2013 at 01:13:06AM +0100, Al Viro wrote: > +static noinline_for_stack > +char *dentry_name(char *buf, char *end, const struct dentry *d, struct > printf_spec spec, > + int depth) > +{ > + int i, n = 0; > + const char *s; > + char *p = buf; > + const

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-09 Thread Peter Zijlstra
On Sun, Sep 01, 2013 at 01:13:06AM +0100, Al Viro wrote: +static noinline_for_stack +char *dentry_name(char *buf, char *end, const struct dentry *d, struct printf_spec spec, + int depth) +{ + int i, n = 0; + const char *s; + char *p = buf; + const struct

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Al Viro
On Sun, Sep 08, 2013 at 08:32:03PM -0700, Linus Torvalds wrote: > On Sun, Sep 8, 2013 at 5:03 PM, Al Viro wrote: > > > > There's one exception - basically, we decide to put duplicates of > > reference(s) we hold into (a bunch of) structures being created. If > > we decide that we'd failed and

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Ramkumar Ramachandra
Linus Torvalds wrote: > On Sun, Sep 8, 2013 at 5:03 PM, Al Viro wrote: >> >> There's one exception - basically, we decide to put duplicates of >> reference(s) we hold into (a bunch of) structures being created. If >> we decide that we'd failed and need to roll back, the structures >> need to be

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Linus Torvalds
On Sun, Sep 8, 2013 at 5:03 PM, Al Viro wrote: > > There's one exception - basically, we decide to put duplicates of > reference(s) we hold into (a bunch of) structures being created. If > we decide that we'd failed and need to roll back, the structures > need to be taken out of whatever lists,

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Ramkumar Ramachandra
Linus Torvalds wrote: > On Sun, Sep 8, 2013 at 5:03 PM, Al Viro wrote: >> >> Well... unlazy_walk() is always followed by terminate_walk() very shortly, >> but there's a minor problem - terminate_walk() uses "are we in RCU >> mode?" for two things: >> a) do we need to do path_put() here?

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Al Viro
On Sun, Sep 08, 2013 at 05:38:46PM -0700, Linus Torvalds wrote: > On Sun, Sep 8, 2013 at 5:35 PM, Al Viro wrote: > > > > That should also work, replacing the current tip of #for-next. Do you > > prefer to merge those two diffs of yours into a single commit? > > If you're ok with my patch (it's

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Linus Torvalds
On Sun, Sep 8, 2013 at 5:35 PM, Al Viro wrote: > > That should also work, replacing the current tip of #for-next. Do you > prefer to merge those two diffs of yours into a single commit? If you're ok with my patch (it's now also tested, I'm running with it and it looks fine), I'll commit that

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Al Viro
On Sun, Sep 08, 2013 at 05:25:40PM -0700, Linus Torvalds wrote: > On Sun, Sep 8, 2013 at 5:03 PM, Al Viro wrote: > > > > Well... unlazy_walk() is always followed by terminate_walk() very shortly, > > but there's a minor problem - terminate_walk() uses "are we in RCU > > mode?" for two things: >

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Al Viro
On Mon, Sep 09, 2013 at 01:03:00AM +0100, Al Viro wrote: > Well... unlazy_walk() is always followed by terminate_walk() very shortly, > but there's a minor problem - terminate_walk() uses "are we in RCU > mode?" for two things: > a) do we need to do path_put() here? > b) do we need

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Linus Torvalds
On Sun, Sep 8, 2013 at 5:03 PM, Al Viro wrote: > > Well... unlazy_walk() is always followed by terminate_walk() very shortly, > but there's a minor problem - terminate_walk() uses "are we in RCU > mode?" for two things: > a) do we need to do path_put() here? > b) do we need to

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Al Viro
On Sun, Sep 08, 2013 at 02:45:32PM -0700, Linus Torvalds wrote: > But that test never triggered any of my thinking, and it's racy > anyway, and the condition we care about is another race, which means > that it not only didn't trigger my thinking, it doesn't trigger in any > normal testing

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Linus Torvalds
Al, this is mainly a rambling question for you, but others left on the Cc just FYI.. On Thu, Aug 29, 2013 at 4:42 PM, Linus Torvalds wrote: > > So I fixed the VFS layer instead. With dput() and friends using > lockrefs, the only thing remaining in the hot RCU dentry lookup path > was the nasty

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Linus Torvalds
Al, this is mainly a rambling question for you, but others left on the Cc just FYI.. On Thu, Aug 29, 2013 at 4:42 PM, Linus Torvalds torva...@linux-foundation.org wrote: So I fixed the VFS layer instead. With dput() and friends using lockrefs, the only thing remaining in the hot RCU dentry

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Al Viro
On Sun, Sep 08, 2013 at 02:45:32PM -0700, Linus Torvalds wrote: But that test never triggered any of my thinking, and it's racy anyway, and the condition we care about is another race, which means that it not only didn't trigger my thinking, it doesn't trigger in any normal testing either.

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Linus Torvalds
On Sun, Sep 8, 2013 at 5:03 PM, Al Viro v...@zeniv.linux.org.uk wrote: Well... unlazy_walk() is always followed by terminate_walk() very shortly, but there's a minor problem - terminate_walk() uses are we in RCU mode? for two things: a) do we need to do path_put() here? b)

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Al Viro
On Mon, Sep 09, 2013 at 01:03:00AM +0100, Al Viro wrote: Well... unlazy_walk() is always followed by terminate_walk() very shortly, but there's a minor problem - terminate_walk() uses are we in RCU mode? for two things: a) do we need to do path_put() here? b) do we need to

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Al Viro
On Sun, Sep 08, 2013 at 05:25:40PM -0700, Linus Torvalds wrote: On Sun, Sep 8, 2013 at 5:03 PM, Al Viro v...@zeniv.linux.org.uk wrote: Well... unlazy_walk() is always followed by terminate_walk() very shortly, but there's a minor problem - terminate_walk() uses are we in RCU mode? for

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Linus Torvalds
On Sun, Sep 8, 2013 at 5:35 PM, Al Viro v...@zeniv.linux.org.uk wrote: That should also work, replacing the current tip of #for-next. Do you prefer to merge those two diffs of yours into a single commit? If you're ok with my patch (it's now also tested, I'm running with it and it looks fine),

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Al Viro
On Sun, Sep 08, 2013 at 05:38:46PM -0700, Linus Torvalds wrote: On Sun, Sep 8, 2013 at 5:35 PM, Al Viro v...@zeniv.linux.org.uk wrote: That should also work, replacing the current tip of #for-next. Do you prefer to merge those two diffs of yours into a single commit? If you're ok with

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Ramkumar Ramachandra
Linus Torvalds wrote: On Sun, Sep 8, 2013 at 5:03 PM, Al Viro v...@zeniv.linux.org.uk wrote: Well... unlazy_walk() is always followed by terminate_walk() very shortly, but there's a minor problem - terminate_walk() uses are we in RCU mode? for two things: a) do we need to do

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Linus Torvalds
On Sun, Sep 8, 2013 at 5:03 PM, Al Viro v...@zeniv.linux.org.uk wrote: There's one exception - basically, we decide to put duplicates of reference(s) we hold into (a bunch of) structures being created. If we decide that we'd failed and need to roll back, the structures need to be taken out

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Ramkumar Ramachandra
Linus Torvalds wrote: On Sun, Sep 8, 2013 at 5:03 PM, Al Viro v...@zeniv.linux.org.uk wrote: There's one exception - basically, we decide to put duplicates of reference(s) we hold into (a bunch of) structures being created. If we decide that we'd failed and need to roll back, the structures

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-08 Thread Al Viro
On Sun, Sep 08, 2013 at 08:32:03PM -0700, Linus Torvalds wrote: On Sun, Sep 8, 2013 at 5:03 PM, Al Viro v...@zeniv.linux.org.uk wrote: There's one exception - basically, we decide to put duplicates of reference(s) we hold into (a bunch of) structures being created. If we decide that we'd

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-05 Thread Ingo Molnar
* Waiman Long wrote: > On 09/05/2013 09:31 AM, Ingo Molnar wrote: > >* Waiman Long wrote: > > > > > >>The latest tty patches did work. The tty related spinlock contention > >>is now completely gone. The short workload can now reach over 8M JPM > >>which is the highest I have ever seen. > >> >

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-05 Thread Waiman Long
On 09/05/2013 09:31 AM, Ingo Molnar wrote: * Waiman Long wrote: The latest tty patches did work. The tty related spinlock contention is now completely gone. The short workload can now reach over 8M JPM which is the highest I have ever seen. The perf profile was: 5.85% reaim reaim

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-05 Thread Ingo Molnar
* Waiman Long wrote: > On 09/03/2013 03:09 PM, Linus Torvalds wrote: > >On Tue, Sep 3, 2013 at 8:34 AM, Linus Torvalds > > wrote: > >>I suspect the tty_ldisc_lock() could be made to go away if we care. > >Heh. I just pulled the tty patches from Greg, and the locking has > >changed completely.

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-05 Thread Ingo Molnar
* Waiman Long waiman.l...@hp.com wrote: On 09/03/2013 03:09 PM, Linus Torvalds wrote: On Tue, Sep 3, 2013 at 8:34 AM, Linus Torvalds torva...@linux-foundation.org wrote: I suspect the tty_ldisc_lock() could be made to go away if we care. Heh. I just pulled the tty patches from Greg, and

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-05 Thread Waiman Long
On 09/05/2013 09:31 AM, Ingo Molnar wrote: * Waiman Longwaiman.l...@hp.com wrote: The latest tty patches did work. The tty related spinlock contention is now completely gone. The short workload can now reach over 8M JPM which is the highest I have ever seen. The perf profile was: 5.85%

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-05 Thread Ingo Molnar
* Waiman Long waiman.l...@hp.com wrote: On 09/05/2013 09:31 AM, Ingo Molnar wrote: * Waiman Longwaiman.l...@hp.com wrote: The latest tty patches did work. The tty related spinlock contention is now completely gone. The short workload can now reach over 8M JPM which is the highest I

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-04 Thread Waiman Long
On 09/04/2013 05:34 PM, Linus Torvalds wrote: On Wed, Sep 4, 2013 at 12:25 PM, Waiman Long wrote: Yes, the perf profile was taking from an 80-core machine. There isn't any scalability issue hiding for the short workload on an 80-core machine. However, I am certain that more may pop up when

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-04 Thread Linus Torvalds
On Wed, Sep 4, 2013 at 12:25 PM, Waiman Long wrote: > > Yes, the perf profile was taking from an 80-core machine. There isn't any > scalability issue hiding for the short workload on an 80-core machine. > > However, I am certain that more may pop up when running in an even larger > machine like

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-04 Thread Waiman Long
On 09/04/2013 11:14 AM, Linus Torvalds wrote: On Wed, Sep 4, 2013 at 7:52 AM, Waiman Long wrote: The latest tty patches did work. The tty related spinlock contention is now completely gone. The short workload can now reach over 8M JPM which is the highest I have ever seen. Good. And this was

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-04 Thread Waiman Long
On 09/03/2013 03:09 PM, Linus Torvalds wrote: On Tue, Sep 3, 2013 at 8:34 AM, Linus Torvalds wrote: I suspect the tty_ldisc_lock() could be made to go away if we care. Heh. I just pulled the tty patches from Greg, and the locking has changed completely. It may actually fix your AIM7

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-04 Thread Linus Torvalds
On Wed, Sep 4, 2013 at 7:52 AM, Waiman Long wrote: > > The latest tty patches did work. The tty related spinlock contention is now > completely gone. The short workload can now reach over 8M JPM which is the > highest I have ever seen. Good. And this was with the 80-core machine, so there aren't

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-04 Thread Linus Torvalds
On Wed, Sep 4, 2013 at 7:52 AM, Waiman Long waiman.l...@hp.com wrote: The latest tty patches did work. The tty related spinlock contention is now completely gone. The short workload can now reach over 8M JPM which is the highest I have ever seen. Good. And this was with the 80-core machine,

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-04 Thread Waiman Long
On 09/03/2013 03:09 PM, Linus Torvalds wrote: On Tue, Sep 3, 2013 at 8:34 AM, Linus Torvalds torva...@linux-foundation.org wrote: I suspect the tty_ldisc_lock() could be made to go away if we care. Heh. I just pulled the tty patches from Greg, and the locking has changed completely. It may

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-04 Thread Waiman Long
On 09/04/2013 11:14 AM, Linus Torvalds wrote: On Wed, Sep 4, 2013 at 7:52 AM, Waiman Longwaiman.l...@hp.com wrote: The latest tty patches did work. The tty related spinlock contention is now completely gone. The short workload can now reach over 8M JPM which is the highest I have ever seen.

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-04 Thread Linus Torvalds
On Wed, Sep 4, 2013 at 12:25 PM, Waiman Long waiman.l...@hp.com wrote: Yes, the perf profile was taking from an 80-core machine. There isn't any scalability issue hiding for the short workload on an 80-core machine. However, I am certain that more may pop up when running in an even larger

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-04 Thread Waiman Long
On 09/04/2013 05:34 PM, Linus Torvalds wrote: On Wed, Sep 4, 2013 at 12:25 PM, Waiman Longwaiman.l...@hp.com wrote: Yes, the perf profile was taking from an 80-core machine. There isn't any scalability issue hiding for the short workload on an 80-core machine. However, I am certain that more

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Sedat Dilek
On Wed, Sep 4, 2013 at 1:15 AM, Dave Jones wrote: > On Wed, Sep 04, 2013 at 01:05:38AM +0200, Sedat Dilek wrote: > > On Wed, Sep 4, 2013 at 12:55 AM, Dave Jones wrote: > > > On Wed, Sep 04, 2013 at 12:37:25AM +0200, Sedat Dilek wrote: > > > > > > > > You're spending more time on the task

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Dave Jones
On Wed, Sep 04, 2013 at 01:05:38AM +0200, Sedat Dilek wrote: > On Wed, Sep 4, 2013 at 12:55 AM, Dave Jones wrote: > > On Wed, Sep 04, 2013 at 12:37:25AM +0200, Sedat Dilek wrote: > > > > > > You're spending more time on the task stats than on the actual lookup. > > > > Maybe you should

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Sedat Dilek
On Wed, Sep 4, 2013 at 12:41 AM, Sedat Dilek wrote: > On Tue, Sep 3, 2013 at 5:14 PM, Waiman Long wrote: >> On 09/03/2013 02:01 AM, Ingo Molnar wrote: >>> >>> * Waiman Long wrote: >>> Yes, that patch worked. It eliminated the lglock as a bottleneck in the AIM7 workload. The

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Sedat Dilek
On Wed, Sep 4, 2013 at 12:55 AM, Dave Jones wrote: > On Wed, Sep 04, 2013 at 12:37:25AM +0200, Sedat Dilek wrote: > > > > You're spending more time on the task stats than on the actual lookup. > > > Maybe you should turn off CONFIG_TASKSTATS..But why that whole > > > irq_return thing? Odd. >

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Sedat Dilek
On Tue, Sep 3, 2013 at 5:14 PM, Waiman Long wrote: > On 09/03/2013 02:01 AM, Ingo Molnar wrote: >> >> * Waiman Long wrote: >> >>> Yes, that patch worked. It eliminated the lglock as a bottleneck in the >>> AIM7 workload. The lg_global_lock did not show up in the perf profile, >>> whereas the

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Dave Jones
On Wed, Sep 04, 2013 at 12:37:25AM +0200, Sedat Dilek wrote: > > You're spending more time on the task stats than on the actual lookup. > > Maybe you should turn off CONFIG_TASKSTATS..But why that whole > > irq_return thing? Odd. > > > > [ init/Kconfig ] > ... > config TASKSTATS >

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Sedat Dilek
On Sun, Sep 1, 2013 at 5:32 PM, Linus Torvalds wrote: > On Sun, Sep 1, 2013 at 3:01 AM, Sedat Dilek wrote: >> >> Looks like this is now 10x faster: ~2.66Mloops (debug) VS. >> ~26.60Mloops (no-debug). > > Ok, that's getting to be in the right ballpark. > > But your profile is still odd. > >>

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 2:34 PM, Linus Torvalds wrote: > > I'll try to hack that up too, but it's looking like it really is just > the "lock xadd", not the memory dependency chain.. Yeah, no difference: Better code generation with my quick hack for a percpu spinlock: │81078e70

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 2:13 PM, Linus Torvalds wrote: > > This is the one that actually compiles. Whether it *works* is still a > total mystery. It generates ok code, and it booted, so it seems to work at least for my config. However, it seems to make no performance-difference what-so-ever,

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 2:05 PM, Linus Torvalds wrote: > > TOTALLY UNTESTED PATCH ATTACHED. Actually, that was the previous (broken) version of that patch - I hadn't regenerated it after fixing some stupid compile errors, and it had the DECLARE parts wrong. This is the one that actually

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Waiman Long
On 09/03/2013 03:09 PM, Linus Torvalds wrote: On Tue, Sep 3, 2013 at 8:34 AM, Linus Torvalds wrote: I suspect the tty_ldisc_lock() could be made to go away if we care. Heh. I just pulled the tty patches from Greg, and the locking has changed completely. It may actually fix your AIM7

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 12:19 PM, Ingo Molnar wrote: > > * Linus Torvalds wrote: > >> >> - the lglock data structure isn't a percpu data structure, it's this >> stupid global data structure that has a percpu pointer in it. So that >> first "mov (%rdi),%rdx" is purely to load what is effectively

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Ingo Molnar
* Linus Torvalds wrote: > On Tue, Sep 3, 2013 at 8:41 AM, Linus Torvalds > wrote: > > > > I've done that, and it matches the PEBS runs, except obviously with > > the instruction skew (so then depending on run it's 95% the > > instruction after the xadd). So the PEBS profiles are entirely > >

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 8:34 AM, Linus Torvalds wrote: > > I suspect the tty_ldisc_lock() could be made to go away if we care. Heh. I just pulled the tty patches from Greg, and the locking has changed completely. It may actually fix your AIM7 test-case, because while the global spinlock remains

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 8:41 AM, Linus Torvalds wrote: > > I've done that, and it matches the PEBS runs, except obviously with > the instruction skew (so then depending on run it's 95% the > instruction after the xadd). So the PEBS profiles are entirely > consistent with other data. So one thing

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 12:24 AM, Sedat Dilek wrote: > > Can someone summarize this thread (70+ postings)? > Which patches are needed? And fixing what? > ( Can people provide separate patches with a proper changelog? ) > Improvements? The core lockref part is now merged and available in my git

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 3:15 AM, Ingo Molnar wrote: > > One more thing to try would be a regular '-e cycles' non-PEBS run and see > whether there's still largish overhead visible around that instruction. I've done that, and it matches the PEBS runs, except obviously with the instruction skew (so

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 8:14 AM, Waiman Long wrote: > > Other than the global tty_ldisc_lock, there is no other major > bottleneck. I am not that worry about the tty_ldisc_lock bottleneck > as real world applications probably won't have that many calls to > set the tty driver. I suspect the

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Waiman Long
On 09/03/2013 02:01 AM, Ingo Molnar wrote: * Waiman Long wrote: Yes, that patch worked. It eliminated the lglock as a bottleneck in the AIM7 workload. The lg_global_lock did not show up in the perf profile, whereas the lg_local_lock was only 0.07%. Just curious: what's the worst bottleneck

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Pavel Machek
Hi! > It is entirely possible that it is just a "cycles:pp" oddity - because > the "lock xadd" is serializing, it can't retire until everything > around it has been sorted out, and maybe it just shows up in profiles > more than is really "fair" to the instruction itself, because it ends > up

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Ingo Molnar
* Linus Torvalds wrote: > On Mon, Sep 2, 2013 at 12:05 AM, Ingo Molnar wrote: > > > > The Haswell perf code isn't very widely tested yet as it took quite some > > time to get it ready for upstream and thus got merged late, but on its > > face this looks like a pretty good profile. > > Yes.

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Sedat Dilek
On Tue, Sep 3, 2013 at 8:01 AM, Ingo Molnar wrote: > > * Waiman Long wrote: > >> On 08/30/2013 10:42 PM, Al Viro wrote: >> >On Sat, Aug 31, 2013 at 03:35:16AM +0100, Al Viro wrote: >> > >> >>Aha... OK, I see what's going on. We end up with shm_mnt *not* marked >> >>as long-living vfsmount,

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Ingo Molnar
* Waiman Long wrote: > On 08/30/2013 10:42 PM, Al Viro wrote: > >On Sat, Aug 31, 2013 at 03:35:16AM +0100, Al Viro wrote: > > > >>Aha... OK, I see what's going on. We end up with shm_mnt *not* marked > >>as long-living vfsmount, even though it lives forever. See if the > >>following helps;

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Pavel Machek
Hi! It is entirely possible that it is just a cycles:pp oddity - because the lock xadd is serializing, it can't retire until everything around it has been sorted out, and maybe it just shows up in profiles more than is really fair to the instruction itself, because it ends up being that

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Waiman Long
On 09/03/2013 02:01 AM, Ingo Molnar wrote: * Waiman Longwaiman.l...@hp.com wrote: Yes, that patch worked. It eliminated the lglock as a bottleneck in the AIM7 workload. The lg_global_lock did not show up in the perf profile, whereas the lg_local_lock was only 0.07%. Just curious: what's

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 8:14 AM, Waiman Long waiman.l...@hp.com wrote: Other than the global tty_ldisc_lock, there is no other major bottleneck. I am not that worry about the tty_ldisc_lock bottleneck as real world applications probably won't have that many calls to set the tty driver. I

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 3:15 AM, Ingo Molnar mi...@kernel.org wrote: One more thing to try would be a regular '-e cycles' non-PEBS run and see whether there's still largish overhead visible around that instruction. I've done that, and it matches the PEBS runs, except obviously with the

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 12:24 AM, Sedat Dilek sedat.di...@gmail.com wrote: Can someone summarize this thread (70+ postings)? Which patches are needed? And fixing what? ( Can people provide separate patches with a proper changelog? ) Improvements? The core lockref part is now merged and

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 8:41 AM, Linus Torvalds torva...@linux-foundation.org wrote: I've done that, and it matches the PEBS runs, except obviously with the instruction skew (so then depending on run it's 95% the instruction after the xadd). So the PEBS profiles are entirely consistent with

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 8:34 AM, Linus Torvalds torva...@linux-foundation.org wrote: I suspect the tty_ldisc_lock() could be made to go away if we care. Heh. I just pulled the tty patches from Greg, and the locking has changed completely. It may actually fix your AIM7 test-case, because while

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Ingo Molnar
* Linus Torvalds torva...@linux-foundation.org wrote: On Tue, Sep 3, 2013 at 8:41 AM, Linus Torvalds torva...@linux-foundation.org wrote: I've done that, and it matches the PEBS runs, except obviously with the instruction skew (so then depending on run it's 95% the instruction after

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 12:19 PM, Ingo Molnar mi...@kernel.org wrote: * Linus Torvalds torva...@linux-foundation.org wrote: - the lglock data structure isn't a percpu data structure, it's this stupid global data structure that has a percpu pointer in it. So that first mov (%rdi),%rdx is

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Waiman Long
On 09/03/2013 03:09 PM, Linus Torvalds wrote: On Tue, Sep 3, 2013 at 8:34 AM, Linus Torvalds torva...@linux-foundation.org wrote: I suspect the tty_ldisc_lock() could be made to go away if we care. Heh. I just pulled the tty patches from Greg, and the locking has changed completely. It may

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 2:05 PM, Linus Torvalds torva...@linux-foundation.org wrote: TOTALLY UNTESTED PATCH ATTACHED. Actually, that was the previous (broken) version of that patch - I hadn't regenerated it after fixing some stupid compile errors, and it had the DECLARE parts wrong. This is the

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 2:13 PM, Linus Torvalds torva...@linux-foundation.org wrote: This is the one that actually compiles. Whether it *works* is still a total mystery. It generates ok code, and it booted, so it seems to work at least for my config. However, it seems to make no

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Linus Torvalds
On Tue, Sep 3, 2013 at 2:34 PM, Linus Torvalds torva...@linux-foundation.org wrote: I'll try to hack that up too, but it's looking like it really is just the lock xadd, not the memory dependency chain.. Yeah, no difference: Better code generation with my quick hack for a percpu spinlock:

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Sedat Dilek
On Sun, Sep 1, 2013 at 5:32 PM, Linus Torvalds torva...@linux-foundation.org wrote: On Sun, Sep 1, 2013 at 3:01 AM, Sedat Dilek sedat.di...@gmail.com wrote: Looks like this is now 10x faster: ~2.66Mloops (debug) VS. ~26.60Mloops (no-debug). Ok, that's getting to be in the right ballpark.

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Dave Jones
On Wed, Sep 04, 2013 at 12:37:25AM +0200, Sedat Dilek wrote: You're spending more time on the task stats than on the actual lookup. Maybe you should turn off CONFIG_TASKSTATS..But why that whole irq_return thing? Odd. [ init/Kconfig ] ... config TASKSTATS bool

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Sedat Dilek
On Tue, Sep 3, 2013 at 5:14 PM, Waiman Long waiman.l...@hp.com wrote: On 09/03/2013 02:01 AM, Ingo Molnar wrote: * Waiman Longwaiman.l...@hp.com wrote: Yes, that patch worked. It eliminated the lglock as a bottleneck in the AIM7 workload. The lg_global_lock did not show up in the perf

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Sedat Dilek
On Wed, Sep 4, 2013 at 12:55 AM, Dave Jones da...@redhat.com wrote: On Wed, Sep 04, 2013 at 12:37:25AM +0200, Sedat Dilek wrote: You're spending more time on the task stats than on the actual lookup. Maybe you should turn off CONFIG_TASKSTATS..But why that whole irq_return thing?

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Sedat Dilek
On Wed, Sep 4, 2013 at 12:41 AM, Sedat Dilek sedat.di...@gmail.com wrote: On Tue, Sep 3, 2013 at 5:14 PM, Waiman Long waiman.l...@hp.com wrote: On 09/03/2013 02:01 AM, Ingo Molnar wrote: * Waiman Longwaiman.l...@hp.com wrote: Yes, that patch worked. It eliminated the lglock as a bottleneck

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Dave Jones
On Wed, Sep 04, 2013 at 01:05:38AM +0200, Sedat Dilek wrote: On Wed, Sep 4, 2013 at 12:55 AM, Dave Jones da...@redhat.com wrote: On Wed, Sep 04, 2013 at 12:37:25AM +0200, Sedat Dilek wrote: You're spending more time on the task stats than on the actual lookup. Maybe you should

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Sedat Dilek
On Wed, Sep 4, 2013 at 1:15 AM, Dave Jones da...@redhat.com wrote: On Wed, Sep 04, 2013 at 01:05:38AM +0200, Sedat Dilek wrote: On Wed, Sep 4, 2013 at 12:55 AM, Dave Jones da...@redhat.com wrote: On Wed, Sep 04, 2013 at 12:37:25AM +0200, Sedat Dilek wrote: You're spending more

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Ingo Molnar
* Waiman Long waiman.l...@hp.com wrote: On 08/30/2013 10:42 PM, Al Viro wrote: On Sat, Aug 31, 2013 at 03:35:16AM +0100, Al Viro wrote: Aha... OK, I see what's going on. We end up with shm_mnt *not* marked as long-living vfsmount, even though it lives forever. See if the following

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Sedat Dilek
On Tue, Sep 3, 2013 at 8:01 AM, Ingo Molnar mi...@kernel.org wrote: * Waiman Long waiman.l...@hp.com wrote: On 08/30/2013 10:42 PM, Al Viro wrote: On Sat, Aug 31, 2013 at 03:35:16AM +0100, Al Viro wrote: Aha... OK, I see what's going on. We end up with shm_mnt *not* marked as

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-03 Thread Ingo Molnar
* Linus Torvalds torva...@linux-foundation.org wrote: On Mon, Sep 2, 2013 at 12:05 AM, Ingo Molnar mi...@kernel.org wrote: The Haswell perf code isn't very widely tested yet as it took quite some time to get it ready for upstream and thus got merged late, but on its face this looks like

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-02 Thread Waiman Long
On 08/30/2013 10:42 PM, Al Viro wrote: On Sat, Aug 31, 2013 at 03:35:16AM +0100, Al Viro wrote: Aha... OK, I see what's going on. We end up with shm_mnt *not* marked as long-living vfsmount, even though it lives forever. See if the following helps; if it does (and I very much expect it to),

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-02 Thread Linus Torvalds
On Mon, Sep 2, 2013 at 12:05 AM, Ingo Molnar wrote: > > The Haswell perf code isn't very widely tested yet as it took quite some > time to get it ready for upstream and thus got merged late, but on its > face this looks like a pretty good profile. Yes. And everything else looks fine too.

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-02 Thread David Ahern
On 9/2/13 4:30 AM, Sedat Dilek wrote: On Sun, Sep 1, 2013 at 5:55 PM, Linus Torvalds wrote: On Sun, Sep 1, 2013 at 8:45 AM, Sedat Dilek wrote: Samples: 160K of event 'cycles:pp', Event count (approx.): 77003901089 + 12,46% t_lockref_from- [kernel.kallsyms] [k] irq_return + 4,86%

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-02 Thread Sedat Dilek
On Sun, Sep 1, 2013 at 5:55 PM, Linus Torvalds wrote: > On Sun, Sep 1, 2013 at 8:45 AM, Sedat Dilek wrote: >> >> Samples: 160K of event 'cycles:pp', Event count (approx.): 77003901089 >> + 12,46% t_lockref_from- [kernel.kallsyms] [k] irq_return >> + 4,86% t_lockref_from-

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-02 Thread Ingo Molnar
* Linus Torvalds wrote: > On Sun, Sep 1, 2013 at 5:12 PM, Linus Torvalds > wrote: > > > > It *is* one of the few locked accesses remaining, and it's clearly > > getting called a lot (three calls per system call: two mntput's - one > > for the root path, one for the result path, and one from

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-02 Thread Ingo Molnar
* Linus Torvalds torva...@linux-foundation.org wrote: On Sun, Sep 1, 2013 at 5:12 PM, Linus Torvalds torva...@linux-foundation.org wrote: It *is* one of the few locked accesses remaining, and it's clearly getting called a lot (three calls per system call: two mntput's - one for the

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-02 Thread Sedat Dilek
On Sun, Sep 1, 2013 at 5:55 PM, Linus Torvalds torva...@linux-foundation.org wrote: On Sun, Sep 1, 2013 at 8:45 AM, Sedat Dilek sedat.di...@gmail.com wrote: Samples: 160K of event 'cycles:pp', Event count (approx.): 77003901089 + 12,46% t_lockref_from- [kernel.kallsyms] [k] irq_return

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-02 Thread David Ahern
On 9/2/13 4:30 AM, Sedat Dilek wrote: On Sun, Sep 1, 2013 at 5:55 PM, Linus Torvalds torva...@linux-foundation.org wrote: On Sun, Sep 1, 2013 at 8:45 AM, Sedat Dilek sedat.di...@gmail.com wrote: Samples: 160K of event 'cycles:pp', Event count (approx.): 77003901089 + 12,46% t_lockref_from-

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-02 Thread Linus Torvalds
On Mon, Sep 2, 2013 at 12:05 AM, Ingo Molnar mi...@kernel.org wrote: The Haswell perf code isn't very widely tested yet as it took quite some time to get it ready for upstream and thus got merged late, but on its face this looks like a pretty good profile. Yes. And everything else looks fine

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-02 Thread Waiman Long
On 08/30/2013 10:42 PM, Al Viro wrote: On Sat, Aug 31, 2013 at 03:35:16AM +0100, Al Viro wrote: Aha... OK, I see what's going on. We end up with shm_mnt *not* marked as long-living vfsmount, even though it lives forever. See if the following helps; if it does (and I very much expect it to),

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-01 Thread Linus Torvalds
On Sun, Sep 1, 2013 at 5:12 PM, Linus Torvalds wrote: > > It *is* one of the few locked accesses remaining, and it's clearly > getting called a lot (three calls per system call: two mntput's - one > for the root path, one for the result path, and one from path_init -> > rcu_walk_init), but with

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-01 Thread Linus Torvalds
On Sun, Sep 1, 2013 at 4:30 PM, Al Viro wrote: > > Hrm... It excludes sharing between the locks, all right. AFAICS, that > won't exclude sharing with plain per-cpu vars, will it? Yes it will. DEFINE_PER_CPU_SHARED_ALIGNED not only aligns the data, it also puts it in a separate section with

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-01 Thread Al Viro
On Sun, Sep 01, 2013 at 03:48:01PM -0700, Linus Torvalds wrote: > I made DEFINE_LGLOCK use DEFINE_PER_CPU_SHARED_ALIGNED for the > spinlock, so that each local lock gets its own cacheline, and the > total loops jumped to 62M (from 52-54M before). So when I looked at > the numbers, I thought "oh,

Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount

2013-09-01 Thread Linus Torvalds
On Sun, Sep 1, 2013 at 3:44 PM, Al Viro wrote: > > GRRR... I see something else: > void file_sb_list_del(struct file *file) > { > if (!list_empty(>f_u.fu_list)) { > lg_local_lock_cpu(_lglock, file_list_cpu(file)); > list_del_init(>f_u.fu_list); >

  1   2   3   >