Re: [patch] timer: Fix timers_update_migration(), and call it in tmigr_init()

2017-04-29 Thread Mike Galbraith
On Sun, 2017-04-30 at 06:20 +0200, Mike Galbraith wrote: > On Sat, 2017-04-29 at 20:43 -0700, Paul E. McKenney wrote: > > On Sun, Apr 30, 2017 at 03:21:58AM +0200, Mike Galbraith wrote: > > > On Sat, 2017-04-29 at 14:45 -0700, Paul E. McKenney wrote: > > > > On S

Re: [patch] timer: Fix timers_update_migration(), and call it in tmigr_init()

2017-04-29 Thread Mike Galbraith
On Sat, 2017-04-29 at 20:43 -0700, Paul E. McKenney wrote: > On Sun, Apr 30, 2017 at 03:21:58AM +0200, Mike Galbraith wrote: > > On Sat, 2017-04-29 at 14:45 -0700, Paul E. McKenney wrote: > > > On Sat, Apr 29, 2017 at 08:20:33PM +0200, Mike Galbraith wrote: > > > >

Re: [patch] timer: Fix timers_update_migration(), and call it in tmigr_init()

2017-04-29 Thread Mike Galbraith
On Sat, 2017-04-29 at 14:45 -0700, Paul E. McKenney wrote: > On Sat, Apr 29, 2017 at 08:20:33PM +0200, Mike Galbraith wrote: > > On Sat, 2017-04-29 at 11:06 -0700, Paul E. McKenney wrote: > > > > > If someone will either repost a fresh series or point me at exactly > >

Re: [patch] timer: Fix timers_update_migration(), and call it in tmigr_init()

2017-04-29 Thread Mike Galbraith
On Sat, 2017-04-29 at 11:06 -0700, Paul E. McKenney wrote: > If someone will either repost a fresh series or point me at exactly > the set of patches to use, I will run it through rcutorture again. Patchlet is against x86-tip/master.today. -Mike

[patch] timer: Fix timers_update_migration(), and call it in tmigr_init()

2017-04-29 Thread Mike Galbraith
in add_timer_on(). Remove redundant loop avoidance such that tick_nohz_activate() updates timer_bases[].nohz_active as intended, and call it in tmigr_init() to update timer_bases[].migration_enabled. Signed-off-by: Mike Galbraith Fixes: ec2206b91d43 timer: Implement the hierarchical pull model

Re: [patch] Re: x86-tip tsc/tick gripage

2017-04-28 Thread Mike Galbraith
On Fri, 2017-04-28 at 10:45 +0200, Mike Galbraith wrote: > Bah, nevermind. I forgot to restore command line. Well how 'bout that, it's not only old multi-socket boxen. I just reproduced on my i4790 desktop box. Boot virgin tip to init 3 with nowatchdog on command line, let box i

Re: [patch] Re: x86-tip tsc/tick gripage

2017-04-28 Thread Mike Galbraith
> On Fri, 2017-04-28 at 09:35 +0200, Mike Galbraith wrote: > ...which makes my DL980 a happy camper with tip. Bah, nevermind. I forgot to restore command line.

[patch] Re: x86-tip tsc/tick gripage

2017-04-28 Thread Mike Galbraith
On Wed, 2017-04-26 at 13:39 +0200, Mike Galbraith wrote: > On Wed, 2017-04-26 at 12:26 +0200, Peter Zijlstra wrote: > > On Wed, Apr 26, 2017 at 10:57:42AM +0200, Mike Galbraith wrote: > > > > > Both still lose their TSC. > > > > > > [ 11.982468

Re: TREE_SRCU slows hotplug by factor ~16

2017-04-26 Thread Mike Galbraith
On Wed, 2017-04-26 at 22:32 -0700, Paul E. McKenney wrote: > On Thu, Apr 27, 2017 at 06:15:56AM +0200, Mike Galbraith wrote: > > On Wed, 2017-04-26 at 21:11 -0700, Paul E. McKenney wrote: > > > > > This is with srcutree.exp_holdoff set to 25*1000? > > >

Re: x86-tip tsc/tick gripage

2017-04-26 Thread Mike Galbraith
On Wed, 2017-04-26 at 14:30 +0200, Mike Galbraith wrote: > On Wed, 2017-04-26 at 13:39 +0200, Mike Galbraith wrote: > > On Wed, 2017-04-26 at 12:26 +0200, Peter Zijlstra wrote: > > > On Wed, Apr 26, 2017 at 10:57:42AM +0200, Mike Galbraith wrote: > > > > >

Re: TREE_SRCU slows hotplug by factor ~16

2017-04-26 Thread Mike Galbraith
On Wed, 2017-04-26 at 21:11 -0700, Paul E. McKenney wrote: > This is with srcutree.exp_holdoff set to 25*1000? Yup.

Re: TREE_SRCU slows hotplug by factor ~16

2017-04-26 Thread Mike Galbraith
On Wed, 2017-04-26 at 20:12 +0200, Mike Galbraith wrote: > On Wed, 2017-04-26 at 10:56 -0700, Paul E. McKenney wrote: > > > > OK, I do need to do more work. My current guess is that I should have > > > set the default for srcutree.exp_holdoff to 25*1000 instead of 50*1

Re: x86-tip tsc/tick gripage

2017-04-26 Thread Mike Galbraith
On Wed, 2017-04-26 at 14:30 +0200, Mike Galbraith wrote: > On Wed, 2017-04-26 at 13:39 +0200, Mike Galbraith wrote: > > On Wed, 2017-04-26 at 12:26 +0200, Peter Zijlstra wrote: > > > On Wed, Apr 26, 2017 at 10:57:42AM +0200, Mike Galbraith wrote: > > > > >

Re: TREE_SRCU slows hotplug by factor ~16

2017-04-26 Thread Mike Galbraith
On Wed, 2017-04-26 at 10:56 -0700, Paul E. McKenney wrote: > > OK, I do need to do more work. My current guess is that I should have > > set the default for srcutree.exp_holdoff to 25*1000 instead of 50*1000. > > But I am sure that further data will show me the error of my ways. ;-) I can give

Re: TREE_SRCU slows hotplug by factor ~16

2017-04-26 Thread Mike Galbraith
On Wed, 2017-04-26 at 17:49 +0200, Mike Galbraith wrote: > On Wed, 2017-04-26 at 08:44 -0700, Paul E. McKenney wrote: > > Should I be comparing this with the 55s number from your initial email, > > or to the 39s number? > > Should be the 39... And 39 it is. -Mike

Re: TREE_SRCU slows hotplug by factor ~16

2017-04-26 Thread Mike Galbraith
On Wed, 2017-04-26 at 08:44 -0700, Paul E. McKenney wrote: > On Wed, Apr 26, 2017 at 05:26:20PM +0200, Mike Galbraith wrote: > > On Wed, 2017-04-26 at 07:31 -0700, Paul E. McKenney wrote: > > > > > And a sneak preview, semi-tested. If you get a chance to run this, plea

Re: TREE_SRCU slows hotplug by factor ~16

2017-04-26 Thread Mike Galbraith
On Wed, 2017-04-26 at 07:31 -0700, Paul E. McKenney wrote: > And a sneak preview, semi-tested. If you get a chance to run this, please > let me know now it goes. That took 'time stress-cpu-hotplug.sh' down to 48s, close to classic. -Mike

Re: x86-tip tsc/tick gripage

2017-04-26 Thread Mike Galbraith
On Wed, 2017-04-26 at 13:39 +0200, Mike Galbraith wrote: > On Wed, 2017-04-26 at 12:26 +0200, Peter Zijlstra wrote: > > On Wed, Apr 26, 2017 at 10:57:42AM +0200, Mike Galbraith wrote: > > > > > Both still lose their TSC. > > > > > > [ 11.982468

Re: x86-tip tsc/tick gripage

2017-04-26 Thread Mike Galbraith
On Wed, 2017-04-26 at 12:26 +0200, Peter Zijlstra wrote: > On Wed, Apr 26, 2017 at 10:57:42AM +0200, Mike Galbraith wrote: > > > Both still lose their TSC. > > > > [ 11.982468] tsc: Refined TSC clocksource calibration: 2260.999 MHz > > [ 11.994275] clocksource:

Re: x86-tip tsc/tick gripage

2017-04-26 Thread Mike Galbraith
On Wed, 2017-04-26 at 10:57 +0200, Mike Galbraith wrote: > On Wed, 2017-04-26 at 10:31 +0200, Mike Galbraith wrote: > > On Wed, 2017-04-26 at 10:21 +0200, Ingo Molnar wrote: > > > > > I have temporarily removed the current timers/urgent lineup from -tip: > > >

Re: x86-tip tsc/tick gripage

2017-04-26 Thread Mike Galbraith
On Wed, 2017-04-26 at 10:31 +0200, Mike Galbraith wrote: > On Wed, 2017-04-26 at 10:21 +0200, Ingo Molnar wrote: > > > I have temporarily removed the current timers/urgent lineup from -tip: > > > > 098991fccfc7: nohz: Print more debug info in tick_nohz_stop_sched_tick()

Re: x86-tip tsc/tick gripage

2017-04-26 Thread Mike Galbraith
On Wed, 2017-04-26 at 10:21 +0200, Ingo Molnar wrote: > I have temporarily removed the current timers/urgent lineup from -tip: > > 098991fccfc7: nohz: Print more debug info in tick_nohz_stop_sched_tick() > 22aa2ad45fd8: tick: Make sure tick timer is active when bypassing > reprogramming >

Re: x86-tip tsc/tick gripage

2017-04-26 Thread Mike Galbraith
On Wed, 2017-04-26 at 10:02 +0200, Mike Galbraith wrote: > tip v4.11-rc8-893-g8ec9e12aff06, trusty ole 8 socket (X7560) DL980 G7 Ew, DL980 then turned into unhappy RCU camper. [ 316.980923] basemono: 31695600 ts->next_tick: 31638000 dev->next_event: 316956005002 [ 689.893

x86-tip tsc/tick gripage

2017-04-26 Thread Mike Galbraith
Greetings, After picking up the pieces of my tip-rt tree, I'm seeing grumbling on two boxen, and it ain't me, the below is virgin tip. The second box has crap BIOS (replacement ready to be installed), but works fine if the sync code gets the things synchronized before giving up (bumping loop

Re: [PATCH] sched: topology: s/borken/broken

2017-04-24 Thread Mike Galbraith
On Mon, 2017-04-24 at 16:50 +0530, Viresh Kumar wrote: > Fix minor spelling mistake. That's a perfectly correct alternative spelling of b0rken :) > Signed-off-by: Viresh Kumar > --- > kernel/sched/topology.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git

Re: TREE_SRCU slows hotplug by factor ~16

2017-04-24 Thread Mike Galbraith
On Mon, 2017-04-24 at 09:35 +0200, Mike Galbraith wrote: > # tracer: nop > # > # entries-in-buffer/entries-written: 229332/229332 #P:8 > # > # _-=> irqs-off > # / _=> need-resched > #

Re: TREE_SRCU slows hotplug by factor ~16

2017-04-24 Thread Mike Galbraith
On Sun, 2017-04-23 at 23:22 -0700, Paul E. McKenney wrote: > Could you please collect an ftrace (or whatever) showing the timestamp > sequence of calls to synchronize_srcu(), synchronize_srcu_expedited(), > and call_srcu() during the execution of the stress script? If it is easy > to do, also

Re: TREE_SRCU slows hotplug by factor ~16

2017-04-23 Thread Mike Galbraith
On Sun, 2017-04-23 at 20:32 -0700, Paul E. McKenney wrote: > On Mon, Apr 24, 2017 at 04:48:09AM +0200, Mike Galbraith wrote: > > Greetings, > > > > Running Steven's hotplug stress script in tip w. CLASSIC_SRCU takes 55s > > in my i4790 box, whereas TREE_SR

TREE_SRCU slows hotplug by factor ~16

2017-04-23 Thread Mike Galbraith
Greetings, Running Steven's hotplug stress script in tip w. CLASSIC_SRCU takes 55s in my i4790 box, whereas TREE_SRCU takes over 16m. (Master with the same config does it in 39s.. but then lockdep isn't enabled in master) -Mike

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-10 Thread Mike Galbraith
On Tue, 2017-04-11 at 00:23 +0300, Michael S. Tsirkin wrote: > On Sat, Apr 08, 2017 at 07:01:34AM +0200, Mike Galbraith wrote: > > On Fri, 2017-04-07 at 21:56 +0300, Michael S. Tsirkin wrote: > > > > > OK. test3 and test4 are now pushed: test3 should fix your hang, >

Re: [PATCH -v6 13/13] futex: futex_lock_pi() vs PREEMPT_RT_FULL

2017-04-07 Thread Mike Galbraith
On Fri, 2017-04-07 at 19:26 -0700, Darren Hart wrote: > I would like to see more testing because... well... futexes. But, we don't > have > a futex torture suite yet, but that is something I'm hoping to be looking into > in the near future. What testing we do have available has passed between my

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-07 Thread Mike Galbraith
On Fri, 2017-04-07 at 21:56 +0300, Michael S. Tsirkin wrote: > OK. test3 and test4 are now pushed: test3 should fix your hang, > test4 is trying to fix a crash reported independently. test3 does not fix the post hibernate hang business that I can easily reproduce, those are NFS, and at least as

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-07 Thread Mike Galbraith
On Fri, 2017-04-07 at 16:35 +0300, Michael S. Tsirkin wrote: > Oh wait, I still put the ctx feature patches in there :( > Pls ignore, I'll update when I've fixed it up. Sorry about the noise. Both worked fine w/wo threadirqs. -Mike

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-07 Thread Mike Galbraith
On Fri, 2017-04-07 at 09:22 +0200, Mike Galbraith wrote: > On Fri, 2017-04-07 at 09:05 +0200, Mike Galbraith wrote: > > On Fri, 2017-04-07 at 08:44 +0200, Mike Galbraith wrote: > > > On Fri, 2017-04-07 at 09:24 +0300, Michael S. Tsirkin wrote: > > > > On Fri, Apr 07,

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-07 Thread Mike Galbraith
On Fri, 2017-04-07 at 09:05 +0200, Mike Galbraith wrote: > On Fri, 2017-04-07 at 08:44 +0200, Mike Galbraith wrote: > > On Fri, 2017-04-07 at 09:24 +0300, Michael S. Tsirkin wrote: > > > On Fri, Apr 07, 2017 at 08:03:19AM +0200, Mike Galbraith wrote: > > > > >

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-07 Thread Mike Galbraith
On Fri, 2017-04-07 at 08:44 +0200, Mike Galbraith wrote: > On Fri, 2017-04-07 at 09:24 +0300, Michael S. Tsirkin wrote: > > On Fri, Apr 07, 2017 at 08:03:19AM +0200, Mike Galbraith wrote: > > > > Test tag works fine here w/wo threadirqs, RT works as well. > > >

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-07 Thread Mike Galbraith
On Fri, 2017-04-07 at 09:24 +0300, Michael S. Tsirkin wrote: > On Fri, Apr 07, 2017 at 08:03:19AM +0200, Mike Galbraith wrote: > > Test tag works fine here w/wo threadirqs, RT works as well. > > > > -Mike > > Thanks a lot. > OK I pushed out two new tags >

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-07 Thread Mike Galbraith
On Thu, 2017-04-06 at 00:38 +0300, Michael S. Tsirkin wrote: > What I did is a revert the refactorings while keeping the affinity API - > we can safely postpone them until the next release without loss of > functionality. But that's on top of my testing tree so it has unrelated > stuff as well.

Re: [PATCH] sched: Fix numabalancing to work with isolated cpus

2017-04-06 Thread Mike Galbraith
On Tue, 2017-04-04 at 22:57 +0530, Srikar Dronamraju wrote: > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index f045a35..f853dc0 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -1666,6 +1666,10 @@ static void task_numa_find_cpu(struct task_numa_env > *env, > >

Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage

2017-04-05 Thread Mike Galbraith
On Wed, 2017-04-05 at 17:31 -0700, Stephen Hemminger wrote: > On Sun, 02 Apr 2017 06:28:41 +0200 > Mike Galbraith wrote: > > > Livelock can be triggered by setting kworkers to SCHED_FIFO, then > > suspend/resume.. you come back from sleepy-land with a spinning > > kw

Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage

2017-04-05 Thread Mike Galbraith
On Wed, 2017-04-05 at 16:55 -0700, Cong Wang wrote: > On Tue, Apr 4, 2017 at 11:12 PM, Mike Galbraith wrote: > > On Tue, 2017-04-04 at 22:25 -0700, Cong Wang wrote: > > > On Tue, Apr 4, 2017 at 8:20 PM, Mike Galbraith wrote: > > > > -

[tip:locking/core] rtmutex: Plug preempt count leak in rt_mutex_futex_unlock()

2017-04-05 Thread tip-bot for Mike Galbraith
Commit-ID: def34eaae5ce04b324e48e1bfac873091d945213 Gitweb: http://git.kernel.org/tip/def34eaae5ce04b324e48e1bfac873091d945213 Author: Mike Galbraith AuthorDate: Wed, 5 Apr 2017 10:08:27 +0200 Committer: Thomas Gleixner CommitDate: Wed, 5 Apr 2017 16:59:37 +0200 rtmutex: Plug preempt

[tip:locking/core] Retiplockingcore_rtmutex_Deboost_before_waking_up_the_top_waiter

2017-04-05 Thread tip-bot for Mike Galbraith
Commit-ID: 94247f76e7361afd85ba03a3f923bf3d07ba3017 Gitweb: http://git.kernel.org/tip/94247f76e7361afd85ba03a3f923bf3d07ba3017 Author: Mike Galbraith AuthorDate: Wed, 5 Apr 2017 10:08:27 +0200 Committer: Thomas Gleixner CommitDate: Wed, 5 Apr 2017 16:52:10 +0200

Re: [tip:locking/core] rtmutex: Deboost before waking up the top waiter

2017-04-05 Thread Mike Galbraith
locking/rtmutex: Fix preempt leak in __rt_mutex_futex_unlock() mark_wakeup_next_waiter() already disables preemption, doing so again leaves us with an unpaired preempt_disable(). Signed-off-by: Mike Galbraith --- kernel/locking/rtmutex.c | 10 +- 1 file changed, 5 insertions(+), 5

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-05 Thread Mike Galbraith
On Wed, 2017-04-05 at 08:29 +0200, Christoph Hellwig wrote: > Can you check where the issues appear? I'd like to do a pure revert > of the shared interrupts, but that three has a lot more in it.. Not immediately, one of my several pots is emitting black smoke. -Mike

Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage

2017-04-05 Thread Mike Galbraith
On Tue, 2017-04-04 at 22:25 -0700, Cong Wang wrote: > On Tue, Apr 4, 2017 at 8:20 PM, Mike Galbraith wrote: > > - while (some_qdisc_is_busy(dev)) > > - yield(); > > + swait_event_timeout(swait, > > !some_qdisc_is_busy(de

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-04 Thread Mike Galbraith
On Wed, 2017-04-05 at 06:51 +0300, Michael S. Tsirkin wrote: > Any issues at all left with this tree? > In particular any regressions? Nothing blatantly obvious in a testdrive that lasted a couple minutes. I'd have to beat on it a bit to look for things beyond the reported, but can't afford to

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-04 Thread Mike Galbraith
On Wed, 2017-04-05 at 05:24 +0200, Mike Galbraith wrote: > On Wed, 2017-04-05 at 06:13 +0300, Michael S. Tsirkin wrote: > > On Wed, Apr 05, 2017 at 05:09:09AM +0200, Mike Galbraith wrote: > > > On Tue, 2017-04-04 at 22:03 +0300, Michael S. Tsirkin wrote: > > > > >

Re: Random guest crashes since 5c34d002dcc7 ("virtio_pci: use shared interrupts for virtqueues")

2017-04-04 Thread Mike Galbraith
On Wed, 2017-04-05 at 06:13 +0300, Michael S. Tsirkin wrote: > On Wed, Apr 05, 2017 at 05:09:09AM +0200, Mike Galbraith wrote: > > On Tue, 2017-04-04 at 22:03 +0300, Michael S. Tsirkin wrote: > > > > > since I couldn't reproduce, I decided it's worth trying to see > >

Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage

2017-04-04 Thread Mike Galbraith
On Tue, 2017-04-04 at 15:39 -0700, Cong Wang wrote: > Thanks for the report! Looks like a quick solution here is to replace > this yield() with cond_resched(), it is harder to really wait for > all qdisc's to transmit all packets. No, cond_resched() won't help. What I did is below, but I
