Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2014-01-27 Thread Mike Galbraith
On Mon, 2014-01-27 at 08:54 -0800, Paul E. McKenney wrote: 
> On Mon, Jan 27, 2014 at 06:10:44AM +0100, Mike Galbraith wrote:
> > On Sat, 2014-01-25 at 06:12 +0100, Mike Galbraith wrote: 
> > > On Fri, 2014-01-24 at 20:50 +0100, Sebastian Andrzej Siewior wrote: 
> > > > * Mike Galbraith | 2014-01-18 04:25:14 [+0100]:
> > > > 
> > > > >> ># timers-do-not-raise-softirq-unconditionally.patch
> > > > >> ># rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch
> > > > >> >
> > > > >> >..those two out does seem to have stabilized the thing.
> > > > >> 
> > > > >> timers-do-not-raise-softirq-unconditionally.patch is on its way out.
> > > > >> 
> > > > >> rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch confuses me.
> > > > >> Didn't you report once that your box deadlocks without this patch? 
> > > > >> Now
> > > > >> your 64way box on the other hand does not work with it?
> > > > >
> > > > >If 'do not raise' is applied, 'use a trylock' won't save you.  If 'do
> > > > is this just an observation or you do know why it won't save me?
> > > 
> > > It's an observation from beyond the grave from the 64 core box that it
> > > repeatedly did NOT save :)  Autopsy photos below.
> > > 
> > > I've built 3.12.8-rt9 with Steven's v2 "timer: Raise softirq if there's
> > > irq_work" to see if it'll survive.
> > 
> > And it did, configured both as nohz_tick, and nohz_full_all.  The irqs
> > are enabled warning in can_stop_full_tick() fired for nohz_full_all, but
> > that's it.
> > 
> > For grins, I also applied Paul's v3 timer latency series while testing
> > nohz_full_all config.   The box was heavily loaded the vast majority of
> > the time, but it didn't explode or do anything obviously evil.
> 
> Cool!  May I add your Tested-by?

Certainly.

-Mike

Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2014-01-27 Thread Paul E. McKenney
On Mon, Jan 27, 2014 at 06:10:44AM +0100, Mike Galbraith wrote:
> On Sat, 2014-01-25 at 06:12 +0100, Mike Galbraith wrote: 
> > On Fri, 2014-01-24 at 20:50 +0100, Sebastian Andrzej Siewior wrote: 
> > > * Mike Galbraith | 2014-01-18 04:25:14 [+0100]:
> > > 
> > > >> ># timers-do-not-raise-softirq-unconditionally.patch
> > > >> ># rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch
> > > >> >
> > > >> >..those two out does seem to have stabilized the thing.
> > > >> 
> > > >> timers-do-not-raise-softirq-unconditionally.patch is on its way out.
> > > >> 
> > > >> rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch confuses me.
> > > >> Didn't you report once that your box deadlocks without this patch? Now
> > > >> your 64way box on the other hand does not work with it?
> > > >
> > > >If 'do not raise' is applied, 'use a trylock' won't save you.  If 'do
> > > is this just an observation or you do know why it won't save me?
> > 
> > It's an observation from beyond the grave from the 64 core box that it
> > repeatedly did NOT save :)  Autopsy photos below.
> > 
> > I've built 3.12.8-rt9 with Steven's v2 "timer: Raise softirq if there's
> > irq_work" to see if it'll survive.
> 
> And it did, configured both as nohz_tick, and nohz_full_all.  The irqs
> are enabled warning in can_stop_full_tick() fired for nohz_full_all, but
> that's it.
> 
> For grins, I also applied Paul's v3 timer latency series while testing
> nohz_full_all config.   The box was heavily loaded the vast majority of
> the time, but it didn't explode or do anything obviously evil.

Cool!  May I add your Tested-by?

Thanx, Paul

Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2014-01-26 Thread Mike Galbraith
On Sat, 2014-01-25 at 06:12 +0100, Mike Galbraith wrote: 
> On Fri, 2014-01-24 at 20:50 +0100, Sebastian Andrzej Siewior wrote: 
> > * Mike Galbraith | 2014-01-18 04:25:14 [+0100]:
> > 
> > >> ># timers-do-not-raise-softirq-unconditionally.patch
> > >> ># rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch
> > >> >
> > >> >..those two out does seem to have stabilized the thing.
> > >> 
> > >> timers-do-not-raise-softirq-unconditionally.patch is on its way out.
> > >> 
> > >> rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch confuses me.
> > >> Didn't you report once that your box deadlocks without this patch? Now
> > >> your 64way box on the other hand does not work with it?
> > >
> > >If 'do not raise' is applied, 'use a trylock' won't save you.  If 'do
> > is this just an observation or you do know why it won't save me?
> 
> It's an observation from beyond the grave from the 64 core box that it
> repeatedly did NOT save :)  Autopsy photos below.
> 
> I've built 3.12.8-rt9 with Steven's v2 "timer: Raise softirq if there's
> irq_work" to see if it'll survive.

And it did, configured both as nohz_tick, and nohz_full_all.  The irqs
are enabled warning in can_stop_full_tick() fired for nohz_full_all, but
that's it.

For grins, I also applied Paul's v3 timer latency series while testing
nohz_full_all config.   The box was heavily loaded the vast majority of
the time, but it didn't explode or do anything obviously evil.

-Mike

Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2014-01-24 Thread Mike Galbraith
On Fri, 2014-01-24 at 20:46 +0100, Sebastian Andrzej Siewior wrote: 
> * Mike Galbraith | 2013-12-23 06:12:39 [+0100]:
> 
> >P.S.
> >
> >virgin -rt7 doing tbench 64 + make -j64
> >
> >[   97.907960] perf samples too long (3138 > 2500), lowering 
> >kernel.perf_event_max_sample_rate to 5
> >[  103.047921] perf samples too long (5544 > 5000), lowering 
> >kernel.perf_event_max_sample_rate to 25000
> >[  181.561271] perf samples too long (10318 > 1), lowering 
> >kernel.perf_event_max_sample_rate to 13000
> >[  184.243750] INFO: NMI handler (perf_event_nmi_handler) took too long to 
> >run: 1.084 msecs
> >[  248.914422] perf samples too long (19719 > 19230), lowering 
> >kernel.perf_event_max_sample_rate to 7000
> >[  382.116674] NOHZ: local_softirq_pending 10
> This is block
> 
> >[  405.201593] perf samples too long (36824 > 35714), lowering 
> >kernel.perf_event_max_sample_rate to 4000
> >[  444.704185] NOHZ: local_softirq_pending 08
> >[  444.704208] NOHZ: local_softirq_pending 08
> >[  444.704579] NOHZ: local_softirq_pending 08
> >[  444.704678] NOHZ: local_softirq_pending 08
> >[  444.705100] NOHZ: local_softirq_pending 08
> >[  444.705980] NOHZ: local_softirq_pending 08
> >[  444.705994] NOHZ: local_softirq_pending 08
> >[  444.708315] NOHZ: local_softirq_pending 08
> >[  444.710348] NOHZ: local_softirq_pending 08
> 
> and this is RX. Is your testcase heavy disk-io or heavy disk-io +
> network?

Yeah.

-Mike

Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2014-01-24 Thread Mike Galbraith
On Fri, 2014-01-24 at 20:50 +0100, Sebastian Andrzej Siewior wrote: 
> * Mike Galbraith | 2014-01-18 04:25:14 [+0100]:
> 
> >> ># timers-do-not-raise-softirq-unconditionally.patch
> >> ># rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch
> >> >
> >> >..those two out does seem to have stabilized the thing.
> >> 
> >> timers-do-not-raise-softirq-unconditionally.patch is on its way out.
> >> 
> >> rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch confuses me.
> >> Didn't you report once that your box deadlocks without this patch? Now
> >> your 64way box on the other hand does not work with it?
> >
> >If 'do not raise' is applied, 'use a trylock' won't save you.  If 'do
> is this just an observation or you do know why it won't save me?

It's an observation from beyond the grave from the 64 core box that it
repeatedly did NOT save :)  Autopsy photos below.

I've built 3.12.8-rt9 with Steven's v2 "timer: Raise softirq if there's
irq_work" to see if it'll survive.

nohz_full_all:
PID: 508    TASK: 8802739ba340  CPU: 16  COMMAND: "ksoftirqd/16"
 #0 [880276806a40] machine_kexec at 8103bc07
 #1 [880276806aa0] crash_kexec at 810d56b3
 #2 [880276806b70] panic at 815bf8b0
 #3 [880276806bf0] watchdog_overflow_callback at 810fed3d
 #4 [880276806c10] __perf_event_overflow at 81131928
 #5 [880276806ca0] perf_event_overflow at 81132254
 #6 [880276806cb0] intel_pmu_handle_irq at 8102078f
 #7 [880276806de0] perf_event_nmi_handler at 815c5825
 #8 [880276806e10] nmi_handle at 815c4ed3
 #9 [880276806ea0] default_do_nmi at 815c5063
#10 [880276806ed0] do_nmi at 815c5388
#11 [880276806ef0] end_repeat_nmi at 815c4371
[exception RIP: _raw_spin_trylock+48]
RIP: 815c3790  RSP: 880276803e28  RFLAGS: 0002
RAX: 0010  RBX: 0010  RCX: 0002
RDX: 880276803e28  RSI: 0018  RDI: 0001
RBP: 815c3790   R8: 815c3790   R9: 0018
R10: 880276803e28  R11: 0002  R12: 
R13: 880273a0c000  R14: 8802739ba340  R15: 880273a03fd8
ORIG_RAX: 880273a03fd8  CS: 0010  SS: 0018
--- RT exception stack ---
#12 [880276803e28] _raw_spin_trylock at 815c3790
#13 [880276803e30] rt_spin_lock_slowunlock_hirq at 815c2cc8
#14 [880276803e50] rt_spin_unlock_after_trylock_in_irq at 815c3425
#15 [880276803e60] get_next_timer_interrupt at 810684a7
#16 [880276803ed0] tick_nohz_stop_sched_tick at 810c5f2e
#17 [880276803f50] tick_nohz_irq_exit at 810c6333
#18 [880276803f70] irq_exit at 81060065
#19 [880276803f90] smp_apic_timer_interrupt at 810358f5
#20 [880276803fb0] apic_timer_interrupt at 815cbf9d
--- IRQ stack ---
#21 [880273a03b28] apic_timer_interrupt at 815cbf9d
[exception RIP: _raw_spin_lock+50]
RIP: 815c3642  RSP: 880273a03bd8  RFLAGS: 0202
RAX: 8b49  RBX: 880272157290  RCX: 8802739ba340
RDX: 8b4a  RSI: 0010  RDI: 880273a0c000
RBP: 880273a03bd8   R8: 0001   R9: 
R10:   R11: 0001  R12: 810927b5
R13: 880273a03b68  R14: 0010  R15: 0010
ORIG_RAX: ff10  CS: 0010  SS: 0018
#22 [880273a03be0] rt_spin_lock_slowlock at 815c2591
#23 [880273a03cc0] rt_spin_lock at 815c3362
#24 [880273a03cd0] run_timer_softirq at 81069002
#25 [880273a03d70] handle_softirq at 81060d0f
#26 
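
What the two stacks above suggest: ksoftirqd/16 was interrupted in the middle of
rt_spin_lock_slowlock() on the timer base lock, and the interrupt's exit path
(get_next_timer_interrupt() -> rt_spin_unlock_after_trylock_in_irq()) then sat in a
trylock loop on the same lock's internals, while the interrupted owner cannot run
again until the interrupt returns.  A toy user-space model of that general shape
(hypothetical names, not kernel code, and deliberately bounded so it terminates):

#include <pthread.h>
#include <stdio.h>

/* Stands in for the timer base lock state both contexts are fighting over. */
static pthread_mutex_t base_lock = PTHREAD_MUTEX_INITIALIZER;

/* Models the irq-exit path: it can only trylock, and the owner is the very
 * context it interrupted, so the trylock can never succeed while we spin. */
static void timer_irq(void)
{
	unsigned long spins = 0;

	while (pthread_mutex_trylock(&base_lock) != 0) {
		if (++spins == 100000000UL) {
			printf("irq: owner never runs -> lockup, NMI watchdog fires\n");
			return;
		}
	}
	pthread_mutex_unlock(&base_lock);
}

int main(void)
{
	pthread_mutex_lock(&base_lock);	/* "ksoftirqd" takes the lock...          */
	timer_irq();			/* ...and is interrupted on the same CPU  */
	pthread_mutex_unlock(&base_lock);
	return 0;
}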

Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2014-01-24 Thread Sebastian Andrzej Siewior
* Mike Galbraith | 2014-01-18 04:25:14 [+0100]:

>> ># timers-do-not-raise-softirq-unconditionally.patch
>> ># rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch
>> >
>> >..those two out does seem to have stabilized the thing.
>> 
>> timers-do-not-raise-softirq-unconditionally.patch is on its way out.
>> 
>> rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch confuses me.
>> Didn't you report once that your box deadlocks without this patch? Now
>> your 64way box on the other hand does not work with it?
>
>If 'do not raise' is applied, 'use a trylock' won't save you.  If 'do
is this just an observation or you do know why it won't save me?
Currently I am thinking of going back to the version where the waiter_lock was
taken with irqs off. However, I would prefer to trigger this myself so I
would know what is going on, instead of blindly applying patches.

>-Mike

Sebastian

Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2014-01-24 Thread Sebastian Andrzej Siewior
* Mike Galbraith | 2013-12-23 06:12:39 [+0100]:

>P.S.
>
>virgin -rt7 doing tbench 64 + make -j64
>
>[   97.907960] perf samples too long (3138 > 2500), lowering 
>kernel.perf_event_max_sample_rate to 5
>[  103.047921] perf samples too long (5544 > 5000), lowering 
>kernel.perf_event_max_sample_rate to 25000
>[  181.561271] perf samples too long (10318 > 1), lowering 
>kernel.perf_event_max_sample_rate to 13000
>[  184.243750] INFO: NMI handler (perf_event_nmi_handler) took too long to 
>run: 1.084 msecs
>[  248.914422] perf samples too long (19719 > 19230), lowering 
>kernel.perf_event_max_sample_rate to 7000
>[  382.116674] NOHZ: local_softirq_pending 10
This is block

>[  405.201593] perf samples too long (36824 > 35714), lowering 
>kernel.perf_event_max_sample_rate to 4000
>[  444.704185] NOHZ: local_softirq_pending 08
>[  444.704208] NOHZ: local_softirq_pending 08
>[  444.704579] NOHZ: local_softirq_pending 08
>[  444.704678] NOHZ: local_softirq_pending 08
>[  444.705100] NOHZ: local_softirq_pending 08
>[  444.705980] NOHZ: local_softirq_pending 08
>[  444.705994] NOHZ: local_softirq_pending 08
>[  444.708315] NOHZ: local_softirq_pending 08
>[  444.710348] NOHZ: local_softirq_pending 08

and this is RX. Is your testcase heavy disk-io or heavy disk-io +
network?
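
(For reference, the pending masks decode by softirq bit; a tiny stand-alone
program, assuming mainline's softirq enum order is unchanged in -rt, confirms
10 is BLOCK and 08 is NET_RX, matching the two remarks above.)

#include <stdio.h>

/* softirq vectors in mainline's enum order (assumed unchanged by -rt) */
static const char * const names[] = {
	"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK",
	"BLOCK_IOPOLL", "TASKLET", "SCHED", "HRTIMER", "RCU",
};

static void decode(unsigned int pending)
{
	printf("local_softirq_pending %02x =", pending);
	for (unsigned int i = 0; i < sizeof(names) / sizeof(names[0]); i++)
		if (pending & (1u << i))
			printf(" %s", names[i]);
	printf("\n");
}

int main(void)
{
	decode(0x10);	/* BLOCK  -> "This is block"   */
	decode(0x08);	/* NET_RX -> "and this is RX"  */
	return 0;
}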

Sebastian

Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2014-01-17 Thread Mike Galbraith
On Fri, 2014-01-17 at 18:23 +0100, Sebastian Andrzej Siewior wrote:

> So I had rtmutex-take-the-waiter-lock-with-irqs-off.patch in my queue
> which took the waiter lock with irqs off. This should be the same thing
> you are trying to do here.

(yeah, these are just whacked mole body bags;)

Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2014-01-17 Thread Mike Galbraith
On Fri, 2014-01-17 at 18:14 +0100, Sebastian Andrzej Siewior wrote: 
> * Mike Galbraith | 2013-12-25 18:37:37 [+0100]:
> 
> >On Tue, 2013-12-24 at 23:55 -0800, Paul E. McKenney wrote: 
> >> On Wed, Dec 25, 2013 at 04:07:34AM +0100, Mike Galbraith wrote:
> >
> >Having sufficiently recovered from turkey overdose to be able to slither
> >upstairs (bump bump bump) to check on the box, commenting..
> >
> ># timers-do-not-raise-softirq-unconditionally.patch
> ># rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch
> >
> >..those two out does seem to have stabilized the thing.
> 
> timers-do-not-raise-softirq-unconditionally.patch is on its way out.
> 
> >> rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch confuses me.
> Didn't you report once that your box deadlocks without this patch? Now
> your 64way box on the other hand does not work with it?

If 'do not raise' is applied, 'use a trylock' won't save you.  If 'do
not raise' is not applied, _and_ you wisely do not try to turn on very
expensive nohz_full, things work fine without 'use a trylock'.
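
For reference, the stock tick-side behaviour that 'do not raise' changes looks
roughly like the sketch below (paraphrased, not a verbatim copy of kernel/timer.c):
the hard interrupt only raises TIMER_SOFTIRQ and never touches base->lock itself,
so nothing ever needs the trylock-from-irq path in that configuration.

/* Rough shape of mainline run_local_timers() around 3.12: all timer-base
 * work is deferred to softirq context; the hard irq never takes base->lock. */
void run_local_timers(void)
{
	hrtimer_run_queues();
	raise_softirq(TIMER_SOFTIRQ);
}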

-Mike

Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2014-01-17 Thread Sebastian Andrzej Siewior
* Mike Galbraith | 2013-12-26 11:03:32 [+0100]:

>On Wed, 2013-12-25 at 04:07 +0100, Mike Galbraith wrote:
>> On Tue, 2013-12-24 at 11:36 -0800, Paul E. McKenney wrote: 
>
>> > So which code do you think deserves the big lump of coal?  ;-)
>> 
>> Sebastian's NO_HZ_FULL locking fixes.
>
>Whack-a-mole hasn't yet dug up any new moles.
>
>---
> kernel/timer.c |    4 ++++
> 1 file changed, 4 insertions(+)
>
>Index: linux-2.6/kernel/timer.c
>===================================================================
>--- linux-2.6.orig/kernel/timer.c
>+++ linux-2.6/kernel/timer.c
>@@ -764,7 +764,9 @@ __mod_timer(struct timer_list *timer, un
>   timer_stats_timer_set_start_info(timer);
>   BUG_ON(!timer->function);
> 
>+  local_irq_disable_rt();
>   base = lock_timer_base(timer, &flags);
>+  local_irq_enable_rt();
> 
>   ret = detach_if_pending(timer, base, false);
>   if (!ret && pending_only)
>@@ -1198,7 +1200,9 @@ static inline void __run_timers(struct t
> {
>   struct timer_list *timer;
> 
>+  local_irq_disable_rt();
>   spin_lock_irq(&base->lock);
>+  local_irq_enable_rt();
>   while (time_after_eq(jiffies, base->timer_jiffies)) {
>   struct list_head work_list;
>   struct list_head *head = &work_list;
>---
> kernel/time/tick-sched.c |2 ++
> 1 file changed, 2 insertions(+)

So I had rtmutex-take-the-waiter-lock-with-irqs-off.patch in my queue
which took the waiter lock with irqs off. This should be the same thing
you are trying to do here.

>Index: linux-2.6/kernel/time/tick-sched.c
>===================================================================
>--- linux-2.6.orig/kernel/time/tick-sched.c
>+++ linux-2.6/kernel/time/tick-sched.c
>@@ -216,7 +216,9 @@ void __tick_nohz_full_check(void)
> 
> static void nohz_full_kick_work_func(struct irq_work *work)
> {
>+  local_irq_disable_rt();
>   __tick_nohz_full_check();
>+  local_irq_enable_rt();
> }

and this one should be fixed differently: we come from a thread and
check "is current running", but by current we mean a user task, not a
kernel thread.

> 
> static DEFINE_PER_CPU(struct irq_work, nohz_full_kick_work) = {
>

Sebastian

Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2014-01-17 Thread Sebastian Andrzej Siewior
* Mike Galbraith | 2013-12-25 18:37:37 [+0100]:

>On Tue, 2013-12-24 at 23:55 -0800, Paul E. McKenney wrote: 
>> On Wed, Dec 25, 2013 at 04:07:34AM +0100, Mike Galbraith wrote:
>
>Having sufficiently recovered from turkey overdose to be able to slither
>upstairs (bump bump bump) to check on the box, commenting..
>
># timers-do-not-raise-softirq-unconditionally.patch
># rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch
>
>..those two out does seem to have stabilized the thing.

timers-do-not-raise-softirq-unconditionally.patch is on its way out.

rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch confuses me.
Didn't you report once that your box deadlocks without this patch? Now
your 64way box on the other hand does not work with it?

>Merry Christmasss,
>
>-Mike

Sebastian

Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2013-12-26 Thread Mike Galbraith
On Wed, 2013-12-25 at 04:07 +0100, Mike Galbraith wrote:
> On Tue, 2013-12-24 at 11:36 -0800, Paul E. McKenney wrote: 

> > So which code do you think deserves the big lump of coal?  ;-)
> 
> Sebastian's NO_HZ_FULL locking fixes.

Whack-a-mole hasn't yet dug up any new moles.

---
 kernel/timer.c |    4 ++++
 1 file changed, 4 insertions(+)

Index: linux-2.6/kernel/timer.c
===================================================================
--- linux-2.6.orig/kernel/timer.c
+++ linux-2.6/kernel/timer.c
@@ -764,7 +764,9 @@ __mod_timer(struct timer_list *timer, un
timer_stats_timer_set_start_info(timer);
BUG_ON(!timer->function);
 
+   local_irq_disable_rt();
base = lock_timer_base(timer, &flags);
+   local_irq_enable_rt();
 
ret = detach_if_pending(timer, base, false);
if (!ret && pending_only)
@@ -1198,7 +1200,9 @@ static inline void __run_timers(struct t
 {
struct timer_list *timer;
 
+   local_irq_disable_rt();
spin_lock_irq(&base->lock);
+   local_irq_enable_rt();
while (time_after_eq(jiffies, base->timer_jiffies)) {
struct list_head work_list;
struct list_head *head = &work_list;
---
 kernel/time/tick-sched.c |2 ++
 1 file changed, 2 insertions(+)

Index: linux-2.6/kernel/time/tick-sched.c
===================================================================
--- linux-2.6.orig/kernel/time/tick-sched.c
+++ linux-2.6/kernel/time/tick-sched.c
@@ -216,7 +216,9 @@ void __tick_nohz_full_check(void)
 
 static void nohz_full_kick_work_func(struct irq_work *work)
 {
+   local_irq_disable_rt();
__tick_nohz_full_check();
+   local_irq_enable_rt();
 }
 
 static DEFINE_PER_CPU(struct irq_work, nohz_full_kick_work) = {
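
The helpers used above come from the -rt patch set; a minimal sketch of their
assumed definitions (not copied from the patch) shows why these hunks are
no-ops on a non-RT kernel:

/* Assumed shape of the -rt helpers (a sketch, not the actual patch): they only
 * do anything when the full preempt-rt model turns spinlocks into sleeping
 * locks, so irqs are really disabled across the locking above on -rt only. */
#ifdef CONFIG_PREEMPT_RT_FULL
# define local_irq_disable_rt()		local_irq_disable()
# define local_irq_enable_rt()		local_irq_enable()
#else
# define local_irq_disable_rt()		do { } while (0)
# define local_irq_enable_rt()		do { } while (0)
#endif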



Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2013-12-25 Thread Mike Galbraith
On Tue, 2013-12-24 at 23:55 -0800, Paul E. McKenney wrote: 
> On Wed, Dec 25, 2013 at 04:07:34AM +0100, Mike Galbraith wrote:

> > > So which code do you think deserves the big lump of coal?  ;-)
> > 
> > Sebastian's NO_HZ_FULL locking fixes.  Locking is hard, and rt sure
> > doesn't make it any easier, so lets give him a cookie or three to nibble
> > on while he ponders that trylock stuff again instead :)
> 
> Fair enough.  Does Sebastian prefer milk and cookies or the other
> tradition of beer and a cigar?  ;-)

Having sufficiently recovered from turkey overdose to be able to slither
upstairs (bump bump bump) to check on the box, commenting..

# timers-do-not-raise-softirq-unconditionally.patch
# rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch

..those two out does seem to have stabilized the thing.

Merry Christmasss,

-Mike

Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2013-12-24 Thread Paul E. McKenney
On Wed, Dec 25, 2013 at 04:07:34AM +0100, Mike Galbraith wrote:
> On Tue, 2013-12-24 at 11:36 -0800, Paul E. McKenney wrote: 
> > On Mon, Dec 23, 2013 at 05:38:53AM +0100, Mike Galbraith wrote:
> > > On Sun, 2013-12-22 at 09:57 +0100, Mike Galbraith wrote: 
> > > > I'll let the box give
> > > > RCU something to do for a couple days.  No news is good news.
> > > 
> > > Ho ho hum, merry christmas, gift attached.
> > 
> > Hmmm...  I guess I should take a moment to work out who has been
> > naughty and nice...
> > 
> > > I'll beat on virgin -rt7, see if it survives, then re-apply RCU patch
> > > and retest.  This kernel had nohz_full enabled, along with Sebastian's
> > > pending -rt fix for same, so RCU patch was not only not running solo,
> > > box was running a known somewhat buggy config as well.  Box was doing
> > > endless tbench 64 when it started stalling fwiw.
> > 
> > [72788.040872] NMI backtrace for cpu 31
> > [72788.040874] CPU: 31 PID: 13975 Comm: tbench Tainted: GW
> > 3.12.6-rt7-nohz #192
> > [72788.040874] Hardware name: Hewlett-Packard ProLiant DL980 G7, BIOS P66 
> > 07/07/2010
> > [72788.040875] task: 8802728e3db0 ti: 88026deb2000 task.ti: 
> > 88026deb2000
> > [72788.040877] RIP: 0010:[]  [] 
> > _raw_spin_trylock+0x14/0x80
> > [72788.040878] RSP: 0018:8802769e3e58  EFLAGS: 0002
> > [72788.040879] RAX: 88026deb3fd8 RBX: 880273544000 RCX: 
> > 7bc87bc6
> > [72788.040879] RDX:  RSI: 8802728e3db0 RDI: 
> > 880273544000
> > [72788.040880] RBP: 88026deb39f8 R08: 063c14effd0f R09: 
> > 0119
> > [72788.040881] R10: 0005 R11: 8802769f2260 R12: 
> > 8802728e3db0
> > [72788.040881] R13: 001f R14: 8802769ebcc0 R15: 
> > 810c4730
> > [72788.040883] FS:  7f7cd380a700() GS:8802769e() 
> > knlGS:
> > [72788.040883] CS:  0010 DS:  ES:  CR0: 80050033
> > [72788.040884] CR2: 0068a0e8 CR3: 000267ba4000 CR4: 
> > 07e0
> > [72788.040885] Stack:
> > [72788.040886]  88026deb39f8 815e2aa0  
> > 8106711a
> > [72788.040887]  8802769ec4e0 8802769ec4e0 8802769e3f58 
> > 810c44bd
> > [72788.040888]  88026deb39f8 88026deb39f8 15ed4f5ff89b 
> > 810c476e
> > [72788.040889] Call Trace:
> > [72788.040889]   
> > [72788.040891]  [] ? 
> > rt_spin_lock_slowunlock_hirq+0x10/0x20
> > [72788.040893]  [] ? update_process_times+0x3a/0x60
> > [72788.040895]  [] ? tick_sched_handle+0x2d/0x70
> > [72788.040896]  [] ? tick_sched_timer+0x3e/0x70
> > [72788.040898]  [] ? __run_hrtimer+0x13d/0x260
> > [72788.040900]  [] ? hrtimer_interrupt+0x12c/0x310
> > [72788.040901]  [] ? vtime_account_system+0x4e/0xf0
> > [72788.040903]  [] ? smp_apic_timer_interrupt+0x36/0x50
> > [72788.040904]  [] ? apic_timer_interrupt+0x6d/0x80
> > [72788.040905]   
> > [72788.040906]  [] ? _raw_spin_lock+0x2a/0x40
> > [72788.040908]  [] ? rt_spin_lock_slowlock+0x33/0x2d0
> > [72788.040910]  [] ? migrate_enable+0xc4/0x220
> > [72788.040911]  [] ? ip_finish_output+0x258/0x450
> > [72788.040913]  [] ? lock_timer_base+0x41/0x80
> > [72788.040914]  [] ? mod_timer+0x66/0x290
> > [72788.040916]  [] ? sk_reset_timer+0xf/0x20
> > [72788.040917]  [] ? tcp_write_xmit+0x1cf/0x5d0
> > [72788.040919]  [] ? __tcp_push_pending_frames+0x25/0x60
> > [72788.040921]  [] ? tcp_sendmsg+0x114/0xbb0
> > [72788.040923]  [] ? sock_sendmsg+0xaf/0xf0
> > [72788.040925]  [] ? touch_atime+0x65/0x150
> > [72788.040927]  [] ? SyS_sendto+0x118/0x190
> > [72788.040929]  [] ? vtime_account_user+0x66/0x100
> > [72788.040930]  [] ? syscall_trace_enter+0x2a/0x260
> > [72788.040932]  [] ? tracesys+0xdd/0xe2
> > 
> > The most likely suspect is the rt_spin_lock_slowlock() that is apparently
> > being acquired by migrate_enable().  This could be due to:
> > 
> > 1.  Massive contention on that lock.
> > 
> > 2.  Someone else holding that lock for excessive time periods.
> > Evidence in favor: CPU 0 appears to be running within
> > migrate_enable().  But isn't migrate_enable() really quite
> > lightweight?
> > 
> > 3.  Possible looping in the networking stack -- but this seems
> > unlikely given that we appear to have caught a lock acquisition
> > in the act.  (Not impossible, however, if there are lots of
> > migrate_enable() calls in the networking stack, which there
> > are due to all the per-CPU work.)
> > 
> > So which code do you think deserves the big lump of coal?  ;-)
> 
> Sebastian's NO_HZ_FULL locking fixes.  Locking is hard, and rt sure
> doesn't make it any easier, so let's give him a cookie or three to nibble
> on while he ponders that trylock stuff again instead :)

Fair enough.  Does Sebastian prefer milk and cookies or the other
tradition of beer and a cigar?  ;-)

Thanx, Paul

Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2013-12-24 Thread Mike Galbraith
On Tue, 2013-12-24 at 11:36 -0800, Paul E. McKenney wrote: 
> On Mon, Dec 23, 2013 at 05:38:53AM +0100, Mike Galbraith wrote:
> > On Sun, 2013-12-22 at 09:57 +0100, Mike Galbraith wrote: 
> > > I'll let the box give
> > > RCU something to do for a couple days.  No news is good news.
> > 
> > Ho ho hum, merry christmas, gift attached.
> 
> Hmmm...  I guess I should take a moment to work out who has been
> naughty and nice...
> 
> > I'll beat on virgin -rt7, see if it survives, then re-apply RCU patch
> > and retest.  This kernel had nohz_full enabled, along with Sebastian's
> > pending -rt fix for same, so RCU patch was not only not running solo,
> > box was running a known somewhat buggy config as well.  Box was doing
> > endless tbench 64 when it started stalling fwiw.
> 
> [72788.040872] NMI backtrace for cpu 31
> [72788.040874] CPU: 31 PID: 13975 Comm: tbench Tainted: GW
> 3.12.6-rt7-nohz #192
> [72788.040874] Hardware name: Hewlett-Packard ProLiant DL980 G7, BIOS P66 
> 07/07/2010
> [72788.040875] task: 8802728e3db0 ti: 88026deb2000 task.ti: 
> 88026deb2000
> [72788.040877] RIP: 0010:[]  [] 
> _raw_spin_trylock+0x14/0x80
> [72788.040878] RSP: 0018:8802769e3e58  EFLAGS: 0002
> [72788.040879] RAX: 88026deb3fd8 RBX: 880273544000 RCX: 
> 7bc87bc6
> [72788.040879] RDX:  RSI: 8802728e3db0 RDI: 
> 880273544000
> [72788.040880] RBP: 88026deb39f8 R08: 063c14effd0f R09: 
> 0119
> [72788.040881] R10: 0005 R11: 8802769f2260 R12: 
> 8802728e3db0
> [72788.040881] R13: 001f R14: 8802769ebcc0 R15: 
> 810c4730
> [72788.040883] FS:  7f7cd380a700() GS:8802769e() 
> knlGS:
> [72788.040883] CS:  0010 DS:  ES:  CR0: 80050033
> [72788.040884] CR2: 0068a0e8 CR3: 000267ba4000 CR4: 
> 07e0
> [72788.040885] Stack:
> [72788.040886]  88026deb39f8 815e2aa0  
> 8106711a
> [72788.040887]  8802769ec4e0 8802769ec4e0 8802769e3f58 
> 810c44bd
> [72788.040888]  88026deb39f8 88026deb39f8 15ed4f5ff89b 
> 810c476e
> [72788.040889] Call Trace:
> [72788.040889]   
> [72788.040891]  [] ? rt_spin_lock_slowunlock_hirq+0x10/0x20
> [72788.040893]  [] ? update_process_times+0x3a/0x60
> [72788.040895]  [] ? tick_sched_handle+0x2d/0x70
> [72788.040896]  [] ? tick_sched_timer+0x3e/0x70
> [72788.040898]  [] ? __run_hrtimer+0x13d/0x260
> [72788.040900]  [] ? hrtimer_interrupt+0x12c/0x310
> [72788.040901]  [] ? vtime_account_system+0x4e/0xf0
> [72788.040903]  [] ? smp_apic_timer_interrupt+0x36/0x50
> [72788.040904]  [] ? apic_timer_interrupt+0x6d/0x80
> [72788.040905]   
> [72788.040906]  [] ? _raw_spin_lock+0x2a/0x40
> [72788.040908]  [] ? rt_spin_lock_slowlock+0x33/0x2d0
> [72788.040910]  [] ? migrate_enable+0xc4/0x220
> [72788.040911]  [] ? ip_finish_output+0x258/0x450
> [72788.040913]  [] ? lock_timer_base+0x41/0x80
> [72788.040914]  [] ? mod_timer+0x66/0x290
> [72788.040916]  [] ? sk_reset_timer+0xf/0x20
> [72788.040917]  [] ? tcp_write_xmit+0x1cf/0x5d0
> [72788.040919]  [] ? __tcp_push_pending_frames+0x25/0x60
> [72788.040921]  [] ? tcp_sendmsg+0x114/0xbb0
> [72788.040923]  [] ? sock_sendmsg+0xaf/0xf0
> [72788.040925]  [] ? touch_atime+0x65/0x150
> [72788.040927]  [] ? SyS_sendto+0x118/0x190
> [72788.040929]  [] ? vtime_account_user+0x66/0x100
> [72788.040930]  [] ? syscall_trace_enter+0x2a/0x260
> [72788.040932]  [] ? tracesys+0xdd/0xe2
> 
> The most likely suspect is the rt_spin_lock_slowlock() that is apparently
> being acquired by migrate_enable().  This could be due to:
> 
> 1.  Massive contention on that lock.
> 
> 2.  Someone else holding that lock for excessive time periods.
>   Evidence in favor: CPU 0 appears to be running within
>   migrate_enable().  But isn't migrate_enable() really quite
>   lightweight?
> 
> 3.  Possible looping in the networking stack -- but this seems
>   unlikely given that we appear to have caught a lock acquisition
>   in the act.  (Not impossible, however, if there are lots of
>   migrate_enable() calls in the networking stack, which there
>   are due to all the per-CPU work.)
> 
> So which code do you think deserves the big lump of coal?  ;-)

Sebastian's NO_HZ_FULL locking fixes.  Locking is hard, and rt sure
doesn't make it any easier, so let's give him a cookie or three to nibble
on while he ponders that trylock stuff again instead :)

-Mike


Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2013-12-24 Thread Paul E. McKenney
On Mon, Dec 23, 2013 at 05:38:53AM +0100, Mike Galbraith wrote:
> On Sun, 2013-12-22 at 09:57 +0100, Mike Galbraith wrote: 
> > I'll let the box give
> > RCU something to do for a couple days.  No news is good news.
> 
> Ho ho hum, merry christmas, gift attached.

Hmmm...  I guess I should take a moment to work out who has been
naughty and nice...

> I'll beat on virgin -rt7, see if it survives, then re-apply RCU patch
> and retest.  This kernel had nohz_full enabled, along with Sebastian's
> pending -rt fix for same, so RCU patch was not only not running solo,
> box was running a known somewhat buggy config as well.  Box was doing
> endless tbench 64 when it started stalling fwiw.

[72788.040872] NMI backtrace for cpu 31
[72788.040874] CPU: 31 PID: 13975 Comm: tbench Tainted: GW
3.12.6-rt7-nohz #192
[72788.040874] Hardware name: Hewlett-Packard ProLiant DL980 G7, BIOS P66 
07/07/2010
[72788.040875] task: 8802728e3db0 ti: 88026deb2000 task.ti: 
88026deb2000
[72788.040877] RIP: 0010:[]  [] 
_raw_spin_trylock+0x14/0x80
[72788.040878] RSP: 0018:8802769e3e58  EFLAGS: 0002
[72788.040879] RAX: 88026deb3fd8 RBX: 880273544000 RCX: 7bc87bc6
[72788.040879] RDX:  RSI: 8802728e3db0 RDI: 880273544000
[72788.040880] RBP: 88026deb39f8 R08: 063c14effd0f R09: 0119
[72788.040881] R10: 0005 R11: 8802769f2260 R12: 8802728e3db0
[72788.040881] R13: 001f R14: 8802769ebcc0 R15: 810c4730
[72788.040883] FS:  7f7cd380a700() GS:8802769e() 
knlGS:
[72788.040883] CS:  0010 DS:  ES:  CR0: 80050033
[72788.040884] CR2: 0068a0e8 CR3: 000267ba4000 CR4: 07e0
[72788.040885] Stack:
[72788.040886]  88026deb39f8 815e2aa0  
8106711a
[72788.040887]  8802769ec4e0 8802769ec4e0 8802769e3f58 
810c44bd
[72788.040888]  88026deb39f8 88026deb39f8 15ed4f5ff89b 
810c476e
[72788.040889] Call Trace:
[72788.040889]   
[72788.040891]  [] ? rt_spin_lock_slowunlock_hirq+0x10/0x20
[72788.040893]  [] ? update_process_times+0x3a/0x60
[72788.040895]  [] ? tick_sched_handle+0x2d/0x70
[72788.040896]  [] ? tick_sched_timer+0x3e/0x70
[72788.040898]  [] ? __run_hrtimer+0x13d/0x260
[72788.040900]  [] ? hrtimer_interrupt+0x12c/0x310
[72788.040901]  [] ? vtime_account_system+0x4e/0xf0
[72788.040903]  [] ? smp_apic_timer_interrupt+0x36/0x50
[72788.040904]  [] ? apic_timer_interrupt+0x6d/0x80
[72788.040905]   
[72788.040906]  [] ? _raw_spin_lock+0x2a/0x40
[72788.040908]  [] ? rt_spin_lock_slowlock+0x33/0x2d0
[72788.040910]  [] ? migrate_enable+0xc4/0x220
[72788.040911]  [] ? ip_finish_output+0x258/0x450
[72788.040913]  [] ? lock_timer_base+0x41/0x80
[72788.040914]  [] ? mod_timer+0x66/0x290
[72788.040916]  [] ? sk_reset_timer+0xf/0x20
[72788.040917]  [] ? tcp_write_xmit+0x1cf/0x5d0
[72788.040919]  [] ? __tcp_push_pending_frames+0x25/0x60
[72788.040921]  [] ? tcp_sendmsg+0x114/0xbb0
[72788.040923]  [] ? sock_sendmsg+0xaf/0xf0
[72788.040925]  [] ? touch_atime+0x65/0x150
[72788.040927]  [] ? SyS_sendto+0x118/0x190
[72788.040929]  [] ? vtime_account_user+0x66/0x100
[72788.040930]  [] ? syscall_trace_enter+0x2a/0x260
[72788.040932]  [] ? tracesys+0xdd/0xe2

The most likely suspect is the rt_spin_lock_slowlock() that is apparently
being acquired by migrate_enable().  This could be due to:

1.  Massive contention on that lock.

2.  Someone else holding that lock for excessive time periods.
Evidence in favor: CPU 0 appears to be running within
migrate_enable().  But isn't migrate_enable() really quite
lightweight?

3.  Possible looping in the networking stack -- but this seems
unlikely given that we appear to have caught a lock acquisition
in the act.  (Not impossible, however, if there are lots of
migrate_enable() calls in the networking stack, which there
are due to all the per-CPU work.)

So which code do you think deserves the big lump of coal?  ;-)

Thanx, Paul





Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2013-12-24 Thread Mike Galbraith
On Tue, 2013-12-24 at 11:36 -0800, Paul E. McKenney wrote: 
 On Mon, Dec 23, 2013 at 05:38:53AM +0100, Mike Galbraith wrote:
  On Sun, 2013-12-22 at 09:57 +0100, Mike Galbraith wrote: 
   I'll let the box give
   RCU something to do for a couple days.  No news is good news.
  
  Ho ho hum, merry christmas, gift attached.
 
 Hmmm...  I guess I should take a moment to work out who has been
 naughty and nice...
 
  I'll beat on virgin -rt7, see if it survives, then re-apply RCU patch
  and retest.  This kernel had nohz_full enabled, along with Sebastian's
  pending -rt fix for same, so RCU patch was not only not running solo,
  box was running a known somewhat buggy config as well.  Box was doing
  endless tbench 64 when it started stalling fwiw.
 
 [72788.040872] NMI backtrace for cpu 31
 [72788.040874] CPU: 31 PID: 13975 Comm: tbench Tainted: GW
 3.12.6-rt7-nohz #192
 [72788.040874] Hardware name: Hewlett-Packard ProLiant DL980 G7, BIOS P66 
 07/07/2010
 [72788.040875] task: 8802728e3db0 ti: 88026deb2000 task.ti: 
 88026deb2000
 [72788.040877] RIP: 0010:[815e34e4]  [815e34e4] 
 _raw_spin_trylock+0x14/0x80
 [72788.040878] RSP: 0018:8802769e3e58  EFLAGS: 0002
 [72788.040879] RAX: 88026deb3fd8 RBX: 880273544000 RCX: 
 7bc87bc6
 [72788.040879] RDX:  RSI: 8802728e3db0 RDI: 
 880273544000
 [72788.040880] RBP: 88026deb39f8 R08: 063c14effd0f R09: 
 0119
 [72788.040881] R10: 0005 R11: 8802769f2260 R12: 
 8802728e3db0
 [72788.040881] R13: 001f R14: 8802769ebcc0 R15: 
 810c4730
 [72788.040883] FS:  7f7cd380a700() GS:8802769e() 
 knlGS:
 [72788.040883] CS:  0010 DS:  ES:  CR0: 80050033
 [72788.040884] CR2: 0068a0e8 CR3: 000267ba4000 CR4: 
 07e0
 [72788.040885] Stack:
 [72788.040886]  88026deb39f8 815e2aa0  
 8106711a
 [72788.040887]  8802769ec4e0 8802769ec4e0 8802769e3f58 
 810c44bd
 [72788.040888]  88026deb39f8 88026deb39f8 15ed4f5ff89b 
 810c476e
 [72788.040889] Call Trace:
 [72788.040889]  IRQ 
 [72788.040891]  [815e2aa0] ? rt_spin_lock_slowunlock_hirq+0x10/0x20
 [72788.040893]  [8106711a] ? update_process_times+0x3a/0x60
 [72788.040895]  [810c44bd] ? tick_sched_handle+0x2d/0x70
 [72788.040896]  [810c476e] ? tick_sched_timer+0x3e/0x70
 [72788.040898]  [810839dd] ? __run_hrtimer+0x13d/0x260
 [72788.040900]  [81083c2c] ? hrtimer_interrupt+0x12c/0x310
 [72788.040901]  [8109593e] ? vtime_account_system+0x4e/0xf0
 [72788.040903]  [81035656] ? smp_apic_timer_interrupt+0x36/0x50
 [72788.040904]  [815ebc9d] ? apic_timer_interrupt+0x6d/0x80
 [72788.040905]  EOI 
 [72788.040906]  [815e338a] ? _raw_spin_lock+0x2a/0x40
 [72788.040908]  [815e23b3] ? rt_spin_lock_slowlock+0x33/0x2d0
 [72788.040910]  [8108ee44] ? migrate_enable+0xc4/0x220
 [72788.040911]  [8152f888] ? ip_finish_output+0x258/0x450
 [72788.040913]  [81067011] ? lock_timer_base+0x41/0x80
 [72788.040914]  [81068db6] ? mod_timer+0x66/0x290
 [72788.040916]  [814df02f] ? sk_reset_timer+0xf/0x20
 [72788.040917]  [81547d7f] ? tcp_write_xmit+0x1cf/0x5d0
 [72788.040919]  [815481e5] ? __tcp_push_pending_frames+0x25/0x60
 [72788.040921]  [81539e34] ? tcp_sendmsg+0x114/0xbb0
 [72788.040923]  [814dbc1f] ? sock_sendmsg+0xaf/0xf0
 [72788.040925]  [811bf5e5] ? touch_atime+0x65/0x150
 [72788.040927]  [814dbd78] ? SyS_sendto+0x118/0x190
 [72788.040929]  [81095b66] ? vtime_account_user+0x66/0x100
 [72788.040930]  [8100f36a] ? syscall_trace_enter+0x2a/0x260
 [72788.040932]  [815eb249] ? tracesys+0xdd/0xe2
 
 The most likely suspect is the rt_spin_lock_slowlock() that is apparently
 being acquired by migrate_enable().  This could be due to:
 
 1.  Massive contention on that lock.
 
 2.  Someone else holding that lock for excessive time periods.
     Evidence in favor: CPU 0 appears to be running within
     migrate_enable().  But isn't migrate_enable() really quite
     lightweight?
 
 3.  Possible looping in the networking stack -- but this seems
   unlikely given that we appear to have caught a lock acquisition
   in the act.  (Not impossible, however, if there are lots of
   migrate_enable() calls in the networking stack, which there
   are due to all the per-CPU work.)
 
 So which code do you think deserves the big lump of coal?  ;-)

Sebastian's NO_HZ_FULL locking fixes.  Locking is hard, and rt sure
doesn't make it any easier, so let's give him a cookie or three to nibble
on while he ponders that trylock stuff again instead :)

-Mike


Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2013-12-24 Thread Paul E. McKenney
On Wed, Dec 25, 2013 at 04:07:34AM +0100, Mike Galbraith wrote:
 On Tue, 2013-12-24 at 11:36 -0800, Paul E. McKenney wrote: 
  On Mon, Dec 23, 2013 at 05:38:53AM +0100, Mike Galbraith wrote:
   On Sun, 2013-12-22 at 09:57 +0100, Mike Galbraith wrote: 
I'll let the box give
RCU something to do for a couple days.  No news is good news.
   
   Ho ho hum, merry christmas, gift attached.
  
  Hmmm...  I guess I should take a moment to work out who has been
  naughty and nice...
  
   I'll beat on virgin -rt7, see if it survives, then re-apply RCU patch
   and retest.  This kernel had nohz_full enabled, along with Sebastian's
   pending -rt fix for same, so RCU patch was not only not running solo,
   box was running a known somewhat buggy config as well.  Box was doing
   endless tbench 64 when it started stalling fwiw.
  
  [72788.040872] NMI backtrace for cpu 31
  [72788.040874] CPU: 31 PID: 13975 Comm: tbench Tainted: GW
  3.12.6-rt7-nohz #192
  [72788.040874] Hardware name: Hewlett-Packard ProLiant DL980 G7, BIOS P66 
  07/07/2010
  [72788.040875] task: 8802728e3db0 ti: 88026deb2000 task.ti: 
  88026deb2000
  [72788.040877] RIP: 0010:[815e34e4]  [815e34e4] 
  _raw_spin_trylock+0x14/0x80
  [72788.040878] RSP: 0018:8802769e3e58  EFLAGS: 0002
  [72788.040879] RAX: 88026deb3fd8 RBX: 880273544000 RCX: 
  7bc87bc6
  [72788.040879] RDX:  RSI: 8802728e3db0 RDI: 
  880273544000
  [72788.040880] RBP: 88026deb39f8 R08: 063c14effd0f R09: 
  0119
  [72788.040881] R10: 0005 R11: 8802769f2260 R12: 
  8802728e3db0
  [72788.040881] R13: 001f R14: 8802769ebcc0 R15: 
  810c4730
  [72788.040883] FS:  7f7cd380a700() GS:8802769e() 
  knlGS:
  [72788.040883] CS:  0010 DS:  ES:  CR0: 80050033
  [72788.040884] CR2: 0068a0e8 CR3: 000267ba4000 CR4: 
  07e0
  [72788.040885] Stack:
  [72788.040886]  88026deb39f8 815e2aa0  
  8106711a
  [72788.040887]  8802769ec4e0 8802769ec4e0 8802769e3f58 
  810c44bd
  [72788.040888]  88026deb39f8 88026deb39f8 15ed4f5ff89b 
  810c476e
  [72788.040889] Call Trace:
  [72788.040889]  IRQ 
  [72788.040891]  [815e2aa0] ? 
  rt_spin_lock_slowunlock_hirq+0x10/0x20
  [72788.040893]  [8106711a] ? update_process_times+0x3a/0x60
  [72788.040895]  [810c44bd] ? tick_sched_handle+0x2d/0x70
  [72788.040896]  [810c476e] ? tick_sched_timer+0x3e/0x70
  [72788.040898]  [810839dd] ? __run_hrtimer+0x13d/0x260
  [72788.040900]  [81083c2c] ? hrtimer_interrupt+0x12c/0x310
  [72788.040901]  [8109593e] ? vtime_account_system+0x4e/0xf0
  [72788.040903]  [81035656] ? smp_apic_timer_interrupt+0x36/0x50
  [72788.040904]  [815ebc9d] ? apic_timer_interrupt+0x6d/0x80
  [72788.040905]  EOI 
  [72788.040906]  [815e338a] ? _raw_spin_lock+0x2a/0x40
  [72788.040908]  [815e23b3] ? rt_spin_lock_slowlock+0x33/0x2d0
  [72788.040910]  [8108ee44] ? migrate_enable+0xc4/0x220
  [72788.040911]  [8152f888] ? ip_finish_output+0x258/0x450
  [72788.040913]  [81067011] ? lock_timer_base+0x41/0x80
  [72788.040914]  [81068db6] ? mod_timer+0x66/0x290
  [72788.040916]  [814df02f] ? sk_reset_timer+0xf/0x20
  [72788.040917]  [81547d7f] ? tcp_write_xmit+0x1cf/0x5d0
  [72788.040919]  [815481e5] ? __tcp_push_pending_frames+0x25/0x60
  [72788.040921]  [81539e34] ? tcp_sendmsg+0x114/0xbb0
  [72788.040923]  [814dbc1f] ? sock_sendmsg+0xaf/0xf0
  [72788.040925]  [811bf5e5] ? touch_atime+0x65/0x150
  [72788.040927]  [814dbd78] ? SyS_sendto+0x118/0x190
  [72788.040929]  [81095b66] ? vtime_account_user+0x66/0x100
  [72788.040930]  [8100f36a] ? syscall_trace_enter+0x2a/0x260
  [72788.040932]  [815eb249] ? tracesys+0xdd/0xe2
  
  The most likely suspect is the rt_spin_lock_slowlock() that is apparently
  being acquired by migrate_enable().  This could be due to:
  
  1.  Massive contention on that lock.
  
  2.  Someone else holding that lock for excessive time periods.
  Evidence in favor: CPU 0 appears to be running within
  migrate_enable().  But isn't migrate_enable() really quite
  lightweight?
  
  3.  Possible looping in the networking stack -- but this seems
  unlikely given that we appear to have caught a lock acquisition
  in the act.  (Not impossible, however, if there are lots of
  migrate_enable() calls in the networking stack, which there
  are due to all the per-CPU work.)
  
  So which code do you think deserves the big lump of coal?  ;-)
 
 Sebastian's NO_HZ_FULL locking fixes.  Locking is hard, and rt sure
 doesn't make it any easier, so lets give him a cookie or three to nibble
 on while he ponders that trylock stuff again 

Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2013-12-22 Thread Mike Galbraith
On Mon, 2013-12-23 at 05:38 +0100, Mike Galbraith wrote: 
> On Sun, 2013-12-22 at 09:57 +0100, Mike Galbraith wrote: 
> > I'll let the box give
> > RCU something to do for a couple days.  No news is good news.
> 
> Ho ho hum, merry christmas, gift attached.
> 
> I'll beat on virgin -rt7, see if it survives, then re-apply RCU patch
> and retest.  This kernel had nohz_full enabled, along with Sebastian's
> pending -rt fix for same, so RCU patch was not only not running solo,
> box was running a known somewhat buggy config as well.  Box was doing
> endless tbench 64 when it started stalling fwiw.
> 
> -Mike

P.S.

virgin -rt7 doing tbench 64 + make -j64

[   97.907960] perf samples too long (3138 > 2500), lowering 
kernel.perf_event_max_sample_rate to 50000
[  103.047921] perf samples too long (5544 > 5000), lowering 
kernel.perf_event_max_sample_rate to 25000
[  181.561271] perf samples too long (10318 > 10000), lowering 
kernel.perf_event_max_sample_rate to 13000
[  184.243750] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 
1.084 msecs
[  248.914422] perf samples too long (19719 > 19230), lowering 
kernel.perf_event_max_sample_rate to 7000
[  382.116674] NOHZ: local_softirq_pending 10
[  405.201593] perf samples too long (36824 > 35714), lowering 
kernel.perf_event_max_sample_rate to 4000
[  444.704185] NOHZ: local_softirq_pending 08
[  444.704208] NOHZ: local_softirq_pending 08
[  444.704579] NOHZ: local_softirq_pending 08
[  444.704678] NOHZ: local_softirq_pending 08
[  444.705100] NOHZ: local_softirq_pending 08
[  444.705980] NOHZ: local_softirq_pending 08
[  444.705994] NOHZ: local_softirq_pending 08
[  444.708315] NOHZ: local_softirq_pending 08
[  444.710348] NOHZ: local_softirq_pending 08
[  474.435582] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 
1.096 msecs
[  475.994055] perf samples too long (63124 > 62500), lowering 
kernel.perf_event_max_sample_rate to 2000

Those annoying perf gripes are generic, not -rt.
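
For what it's worth (editorial note, not from the thread), those messages are
the kernel automatically throttling kernel.perf_event_max_sample_rate whenever
perf's NMI handler overruns its budget.  The limit can be inspected, and pinned
back up, through /proc/sys; a minimal sketch, where the value written is only
an example:

#include <stdio.h>
#include <stdlib.h>

#define RATE_FILE "/proc/sys/kernel/perf_event_max_sample_rate"

int main(int argc, char **argv)
{
	FILE *f = fopen(RATE_FILE, "r");
	long rate;

	if (!f || fscanf(f, "%ld", &rate) != 1) {
		perror(RATE_FILE);
		return 1;
	}
	fclose(f);
	printf("current max sample rate: %ld Hz\n", rate);

	if (argc > 1) {			/* e.g. ./a.out 25000 (needs root) */
		f = fopen(RATE_FILE, "w");
		if (!f) {
			perror(RATE_FILE);
			return 1;
		}
		fprintf(f, "%s\n", argv[1]);
		fclose(f);
	}
	return 0;
}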



Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2013-12-22 Thread Mike Galbraith
On Sun, 2013-12-22 at 09:57 +0100, Mike Galbraith wrote: 
> I'll let the box give
> RCU something to do for a couple days.  No news is good news.

Ho ho hum, merry christmas, gift attached.

I'll beat on virgin -rt7, see if it survives, then re-apply RCU patch
and retest.  This kernel had nohz_full enabled, along with Sebastian's
pending -rt fix for same, so RCU patch was not only not running solo,
box was running a known somewhat buggy config as well.  Box was doing
endless tbench 64 when it started stalling fwiw.

-Mike


vogelweide-stall.gz
Description: GNU Zip compressed data


Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2013-12-22 Thread Mike Galbraith
On Sun, 2013-12-22 at 04:07 +0100, Mike Galbraith wrote: 
> On Sat, 2013-12-21 at 20:39 +0100, Sebastian Andrzej Siewior wrote: 
> > From: "Paul E. McKenney" 
> > 
> > Running RCU out of softirq is a problem for some workloads that would
> > like to manage RCU core processing independently of other softirq work,
> > for example, setting kthread priority.  This commit therefore moves the
> > RCU core work from softirq to a per-CPU/per-flavor SCHED_OTHER kthread
> > named rcuc.  The SCHED_OTHER approach avoids the scalability problems
> > that appeared with the earlier attempt to move RCU core processing to
> > from softirq to kthreads.  That said, kernels built with RCU_BOOST=y
> > will run the rcuc kthreads at the RCU-boosting priority.
> 
> I'll take this for a spin on my 64 core test box.
> 
> I'm pretty sure I'll still end up having to split softirq threads again
> though, as big box has been unable to meet jitter requirements without,
> and last upstream rt kernel tested still couldn't.

Still can't fwiw, but whatever, back to $subject.  I'll let the box give
RCU something to do for a couple days.  No news is good news.

-Mike

30 minute isolated core jitter test says tinkering will definitely be
required.  3.0-rt does single digit worst case on same old box.  Darn.

(test is imperfect, but good enough)
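
The test harness itself is not shown; the sketch below is a minimal stand-in
for this kind of periodic wakeup-jitter measurement (frequency and duration are
illustrative -- pin it to an isolated CPU and run it SCHED_FIFO, e.g. via
taskset and chrt, to approximate the setup):

#define _GNU_SOURCE
#include <stdio.h>
#include <time.h>

#define FREQ		960		/* wakeups per second (illustrative) */
#define FRAMES		(FREQ * 30)	/* 30 seconds worth */
#define NSEC_PER_SEC	1000000000L

static void ts_add(struct timespec *t, long ns)
{
	t->tv_nsec += ns;
	while (t->tv_nsec >= NSEC_PER_SEC) {
		t->tv_nsec -= NSEC_PER_SEC;
		t->tv_sec++;
	}
}

int main(void)
{
	struct timespec next, now;
	double max_us = 0.0;
	long i;

	clock_gettime(CLOCK_MONOTONIC, &next);
	for (i = 0; i < FRAMES; i++) {
		ts_add(&next, NSEC_PER_SEC / FREQ);
		clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
		clock_gettime(CLOCK_MONOTONIC, &now);
		/* how late did we wake up relative to the deadline? */
		double late_us = (now.tv_sec - next.tv_sec) * 1e6 +
				 (now.tv_nsec - next.tv_nsec) / 1e3;
		if (late_us > max_us)
			max_us = late_us;
	}
	printf("worst wakeup latency: %.2f usec over %ld frames\n",
	       max_us, (long)FRAMES);
	return 0;
}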

FREQ=960 FRAMES=1728000 LOOP=5 using CPUs 4 - 23
FREQ=1000 FRAMES=180 LOOP=48000 using CPUs 24 - 43
FREQ=300 FRAMES=54 LOOP=16 using CPUs 44 - 63
on your marks... get set... POW!
Cpu  Frames  Min  Max(Frame)  Avg  Sigma  LastTrans  Fliers(Frames)
4   1727979   0.0159  181.66 (1043545)0.4492  0.58760 (0) 16 
(828505,828506,859225,859226,889945,..1043546)
5   1727980   0.0159  181.90 (1013305)0.4560  0.61180 (0) 16 
(798265,798266,828985,828986,859705,..1013306)
6   1727981   0.0159  189.05 (1013785)0.3691  0.62250 (0) 16 
(798745,798746,829465,829466,860185,..1013786)
7   1727982   0.0159  177.88 (983546) 0.2885  0.52690 (0) 16 
(768505,768506,799225,799226,829945,..983546)
8   1727984   0.0159  192.63 (984025) 0.3131  0.63070 (0) 18 
(738265,738266,768985,768986,799705,..984026)
9   1727985   0.0159  16.43 (801406)  0.6562  0.57940 (0) 
10  1727986   0.0159  186.94 (954266) 0.3514  0.62520 (0) 16 
(739225,739226,769945,769946,800665,..954266)
11  1727987   0.0159  194.06 (954745) 0.4341  0.65470 (0) 18 
(708985,708986,739705,739706,770425,..954746)
12  1727989   0.0159  13.61 (67116)   0.3364  0.42940 (0) 
13  1727990   0.0159  186.19 (894265) 0.3955  0.61130 (0) 16 
(679225,679226,709945,709946,740665,..894266)
14  1727991   0.0159  192.18 (894746) 0.4410  0.64490 (0) 18 
(648985,648986,679705,679706,710425,..894746)
15  1727993   0.0159  183.36 (833786) 0.5582  0.66550 (0) 16 
(618745,618746,649465,649466,680185,..833786)
16  1727994   0.0159  193.61 (895706) 0.6073  0.73820 (0) 17 
(649945,680665,680666,711385,711386,..895706)
17  1727995   0.0159  36.94 (739943)  0.7135  0.75430 (0) 6 
(173558,173559,739943,739944,1224751,1224752)
18  1727996   0.0159  167.39 (835226) 0.8385  0.82870 (0) 16 
(620185,620186,650905,650906,681625,..835226)
19  1727997   0.0159  172.84 (804985) 0.5110  0.69590 (0) 17 
(589946,620665,620666,651385,651386,..835706)
20  1727999   0.0159  180.47 (774745) 0.7566  0.75620 (0) 16 
(559705,559706,590425,590426,621145,..774746)
21  1728000   0.0159  169.74 (744505) 0.7719  0.81540 (0) 16 
(560185,560186,590905,590906,621625,..775226)
22  1728000   0.0159  194.80 (836667) 0.6799  0.70630 (0) 16 
(590906,590907,622105,622106,652346,..836667)
23  1728000   0.0159  183.12 (745466) 0.6733  0.70910 (0) 16 
(530425,530426,561145,561146,591865,..745466)
24  180   0.0725  7.46 (132730)   0.5375  0.44620 (0) 
25  180   0.0725  7.23 (132730)   0.5725  0.48160 (0) 
26  180   0.0725  7.23 (132730)   0.5119  0.41940 (0) 
27  180   0.0725  4.93 (132730)   0.4102  0.33790 (0) 
28  180   0.0725  5.08 (444312)   0.4275  0.35100 (0) 
29  180   0.0725  6.75 (132717)   0.5501  0.52320 (0) 
30  180   0.0725  11.61 (12026)   0.3811  0.39340 (0) 
31  180   0.0725  11.61 (12526)   0.4054  0.45510 (0) 
32  180   0.0725  50.95 (13026)   0.6015  0.56170 (0) 31 
(13026,13027,45026,45027,77026,..909027)
33  180   0.0725  62.63 (13526)   0.5643  0.59220 (0) 112 
(13526,13527,45526,45527,77526,..1773527)
34  180   0.0725  70.26 (14026)   0.3698  0.61320 (0) 112 
(14026,14027,46026,46027,78026,..1774027)
35  180   0.0725  84.57 (14526)   0.6490  0.79810 (0) 112 
(14526,14527,46526,46527,78526,..1774527)
36  180   0.0725  81.94 (943026)  0.3917  0.63870 (0) 112 
(15026,15027,47026,47027,79026,..1775027)
37  180   0.0725  93.86 (15526)   0.6346  0.85800 (0) 



Re: [PATCH] rcu: Eliminate softirq processing from rcutree

2013-12-21 Thread Mike Galbraith
On Sat, 2013-12-21 at 20:39 +0100, Sebastian Andrzej Siewior wrote: 
> From: "Paul E. McKenney" 
> 
> Running RCU out of softirq is a problem for some workloads that would
> like to manage RCU core processing independently of other softirq work,
> for example, setting kthread priority.  This commit therefore moves the
> RCU core work from softirq to a per-CPU/per-flavor SCHED_OTHER kthread
> named rcuc.  The SCHED_OTHER approach avoids the scalability problems
> that appeared with the earlier attempt to move RCU core processing to
> from softirq to kthreads.  That said, kernels built with RCU_BOOST=y
> will run the rcuc kthreads at the RCU-boosting priority.

I'll take this for a spin on my 64 core test box.

I'm pretty sure I'll still end up having to split softirq threads again
though, as big box has been unable to meet jitter requirements without,
and last upstream rt kernel tested still couldn't.

-Mike

Hm.  Another thing I'll have to check again is btrfs locking fix, and
generic IO deadlocks if you don't pull your plug upon first rtmutex
block.  In 3.0, both were required for box to survive heavy fs pounding.
Oh yeah, and the pain of rt tasks playing idle balance for SCHED_OTHER
tasks, and nohz balancing crud, and cpupri cost when cores are isolated
and and.. sigh, big boxen _suck_ ;-)



[PATCH] rcu: Eliminate softirq processing from rcutree

2013-12-21 Thread Sebastian Andrzej Siewior
From: "Paul E. McKenney" 

Running RCU out of softirq is a problem for some workloads that would
like to manage RCU core processing independently of other softirq work,
for example, setting kthread priority.  This commit therefore moves the
RCU core work from softirq to a per-CPU/per-flavor SCHED_OTHER kthread
named rcuc.  The SCHED_OTHER approach avoids the scalability problems
that appeared with the earlier attempt to move RCU core processing
from softirq to kthreads.  That said, kernels built with RCU_BOOST=y
will run the rcuc kthreads at the RCU-boosting priority.
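
Not part of the patch, but to make the stated motivation concrete: once RCU
core processing runs in ordinary rcuc/N kthreads, their scheduling policy can
be adjusted from userspace like any other task's.  A minimal sketch -- the
/proc-based name lookup and the priority value are illustrative only, and
chrt -f -p <prio> <pid> does the same job:

#include <dirent.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
	struct sched_param sp = { .sched_priority = 2 };	/* illustrative */
	struct dirent *de;
	DIR *proc = opendir("/proc");

	if (!proc)
		return 1;
	while ((de = readdir(proc)) != NULL) {
		char path[64], comm[32] = "";
		int pid = atoi(de->d_name);
		FILE *f;

		if (pid <= 0)
			continue;
		snprintf(path, sizeof(path), "/proc/%d/comm", pid);
		f = fopen(path, "r");
		if (!f)
			continue;
		/* kthreads created by the patch are named rcuc/<cpu> */
		if (fgets(comm, sizeof(comm), f) &&
		    !strncmp(comm, "rcuc/", 5) &&
		    sched_setscheduler(pid, SCHED_FIFO, &sp) == 0)
			printf("pid %d (%.*s) -> SCHED_FIFO %d\n", pid,
			       (int)strcspn(comm, "\n"), comm,
			       sp.sched_priority);
		fclose(f);
	}
	closedir(proc);
	return 0;
}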

Reported-by: Thomas Gleixner 
Signed-off-by: Paul E. McKenney 
---

I intend to apply this for the next -RT release. My powerpc test box runs
with this for more than 24h without anything bad happening.

 kernel/rcutree.c| 113 +++-
 kernel/rcutree.h|   3 +-
 kernel/rcutree_plugin.h | 134 +---
 3 files changed, 113 insertions(+), 137 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index f4f61bb..507fab1 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -55,6 +55,11 @@
 #include <linux/random.h>
 #include <linux/ftrace_event.h>
 #include <linux/suspend.h>
+#include <linux/delay.h>
+#include <linux/gfp.h>
+#include <linux/oom.h>
+#include <linux/smpboot.h>
+#include "time/tick-internal.h"
 
 #include "rcutree.h"
 #include <trace/events/rcu.h>
@@ -145,8 +150,6 @@ EXPORT_SYMBOL_GPL(rcu_scheduler_active);
  */
 static int rcu_scheduler_fully_active __read_mostly;
 
-#ifdef CONFIG_RCU_BOOST
-
 /*
  * Control variables for per-CPU and per-rcu_node kthreads.  These
  * handle all flavors of RCU.
@@ -156,8 +159,6 @@ DEFINE_PER_CPU(unsigned int, rcu_cpu_kthread_status);
 DEFINE_PER_CPU(unsigned int, rcu_cpu_kthread_loops);
 DEFINE_PER_CPU(char, rcu_cpu_has_work);
 
-#endif /* #ifdef CONFIG_RCU_BOOST */
-
 static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int 
outgoingcpu);
 static void invoke_rcu_core(void);
 static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp);
@@ -2226,16 +2227,14 @@ __rcu_process_callbacks(struct rcu_state *rsp)
 /*
  * Do RCU core processing for the current CPU.
  */
-static void rcu_process_callbacks(struct softirq_action *unused)
+static void rcu_process_callbacks(void)
 {
struct rcu_state *rsp;
 
if (cpu_is_offline(smp_processor_id()))
return;
-   trace_rcu_utilization(TPS("Start RCU core"));
for_each_rcu_flavor(rsp)
__rcu_process_callbacks(rsp);
-   trace_rcu_utilization(TPS("End RCU core"));
 }
 
 /*
@@ -2249,18 +2248,105 @@ static void invoke_rcu_callbacks(struct rcu_state 
*rsp, struct rcu_data *rdp)
 {
if (unlikely(!ACCESS_ONCE(rcu_scheduler_fully_active)))
return;
-   if (likely(!rsp->boost)) {
-   rcu_do_batch(rsp, rdp);
+   rcu_do_batch(rsp, rdp);
+}
+
+static void rcu_wake_cond(struct task_struct *t, int status)
+{
+   /*
+* If the thread is yielding, only wake it when this
+* is invoked from idle
+*/
+   if (t && (status != RCU_KTHREAD_YIELDING || is_idle_task(current)))
+   wake_up_process(t);
+}
+
+/*
+ * Wake up this CPU's rcuc kthread to do RCU core processing.
+ */
+static void invoke_rcu_core(void)
+{
+   unsigned long flags;
+   struct task_struct *t;
+
+   if (!cpu_online(smp_processor_id()))
return;
+   local_irq_save(flags);
+   __this_cpu_write(rcu_cpu_has_work, 1);
+   t = __this_cpu_read(rcu_cpu_kthread_task);
+   if (t != NULL && current != t)
+   rcu_wake_cond(t, __this_cpu_read(rcu_cpu_kthread_status));
+   local_irq_restore(flags);
+}
+
+static void rcu_cpu_kthread_park(unsigned int cpu)
+{
+   per_cpu(rcu_cpu_kthread_status, cpu) = RCU_KTHREAD_OFFCPU;
+}
+
+static int rcu_cpu_kthread_should_run(unsigned int cpu)
+{
+   return __this_cpu_read(rcu_cpu_has_work);
+}
+
+/*
+ * Per-CPU kernel thread that invokes RCU callbacks.  This replaces the
+ * RCU softirq used in flavors and configurations of RCU that do not
+ * support RCU priority boosting.
+ */
+static void rcu_cpu_kthread(unsigned int cpu)
+{
+   unsigned int *statusp = &__get_cpu_var(rcu_cpu_kthread_status);
+   char work, *workp = &__get_cpu_var(rcu_cpu_has_work);
+   int spincnt;
+
+   for (spincnt = 0; spincnt < 10; spincnt++) {
+   trace_rcu_utilization(TPS("Start CPU kthread@rcu_wait"));
+   local_bh_disable();
+   *statusp = RCU_KTHREAD_RUNNING;
+   this_cpu_inc(rcu_cpu_kthread_loops);
+   local_irq_disable();
+   work = *workp;
+   *workp = 0;
+   local_irq_enable();
+   if (work)
+   rcu_process_callbacks();
+   local_bh_enable();
+   if (*workp == 0) {
+   trace_rcu_utilization(TPS("End CPU kthread@rcu_wait"));
+   *statusp = RCU_KTHREAD_WAITING;
+   return;
+ 
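
The diff is truncated here.  For context, the remainder of such a patch
typically registers the hooks above with the smpboot infrastructure, roughly
as sketched below.  The struct and registration call follow <linux/smpboot.h>;
rcu_cpu_kthread_setup() and the init hook are assumptions, since that part of
the posting is not visible:

/* Editorial sketch of the truncated remainder; not copied from the posting. */
static struct smp_hotplug_thread rcu_cpu_thread_spec = {
	.store			= &rcu_cpu_kthread_task,
	.thread_should_run	= rcu_cpu_kthread_should_run,
	.thread_fn		= rcu_cpu_kthread,
	.thread_comm		= "rcuc/%u",
	.setup			= rcu_cpu_kthread_setup,	/* assumed helper */
	.park			= rcu_cpu_kthread_park,
};

static int __init rcu_spawn_core_kthreads(void)
{
	int cpu;

	/* No work pending yet on any CPU's rcuc thread. */
	for_each_possible_cpu(cpu)
		per_cpu(rcu_cpu_has_work, cpu) = 0;
	BUG_ON(smpboot_register_percpu_thread(&rcu_cpu_thread_spec));
	return 0;
}
early_initcall(rcu_spawn_core_kthreads);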
