Re: [PATCH] rcu: Eliminate softirq processing from rcutree
On Mon, 2014-01-27 at 08:54 -0800, Paul E. McKenney wrote:
> On Mon, Jan 27, 2014 at 06:10:44AM +0100, Mike Galbraith wrote:
> > On Sat, 2014-01-25 at 06:12 +0100, Mike Galbraith wrote:
> > > On Fri, 2014-01-24 at 20:50 +0100, Sebastian Andrzej Siewior wrote:
> > > > * Mike Galbraith | 2014-01-18 04:25:14 [+0100]:
> > > >
> > > > >> ># timers-do-not-raise-softirq-unconditionally.patch
> > > > >> ># rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch
> > > > >> >
> > > > >> >..those two out does seem to have stabilized the thing.
> > > > >>
> > > > >> timers-do-not-raise-softirq-unconditionally.patch is on its way out.
> > > > >>
> > > > >> rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch confuses me.
> > > > >> Didn't you report once that your box deadlocks without this patch?
> > > > >> Now your 64way box on the other hand does not work with it?
> > > > >
> > > > >If 'do not raise' is applied, 'use a trylock' won't save you. If 'do
> > > > is this just an observation or you do know why it won't save me?
> > >
> > > It's an observation from beyond the grave from the 64 core box that it
> > > repeatedly did NOT save :) Autopsy photos below.
> > >
> > > I've built 3.12.8-rt9 with Steven's v2 "timer: Raise softirq if there's
> > > irq_work" to see if it'll survive.
> >
> > And it did, configured both as nohz_tick, and nohz_full_all. The irqs
> > are enabled warning in can_stop_full_tick() fired for nohz_full_all, but
> > that's it.
> >
> > For grins, I also applied Paul's v3 timer latency series while testing
> > nohz_full_all config. The box was heavily loaded the vast majority of
> > the time, but it didn't explode or do anything obviously evil.
>
> Cool! May I add your Tested-by?

Certainly.

-Mike

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
On Mon, Jan 27, 2014 at 06:10:44AM +0100, Mike Galbraith wrote:
> On Sat, 2014-01-25 at 06:12 +0100, Mike Galbraith wrote:
> > On Fri, 2014-01-24 at 20:50 +0100, Sebastian Andrzej Siewior wrote:
> > > * Mike Galbraith | 2014-01-18 04:25:14 [+0100]:
> > >
> > > >> ># timers-do-not-raise-softirq-unconditionally.patch
> > > >> ># rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch
> > > >> >
> > > >> >..those two out does seem to have stabilized the thing.
> > > >>
> > > >> timers-do-not-raise-softirq-unconditionally.patch is on its way out.
> > > >>
> > > >> rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch confuses me.
> > > >> Didn't you report once that your box deadlocks without this patch? Now
> > > >> your 64way box on the other hand does not work with it?
> > > >
> > > >If 'do not raise' is applied, 'use a trylock' won't save you. If 'do
> > > is this just an observation or you do know why it won't save me?
> >
> > It's an observation from beyond the grave from the 64 core box that it
> > repeatedly did NOT save :) Autopsy photos below.
> >
> > I've built 3.12.8-rt9 with Steven's v2 "timer: Raise softirq if there's
> > irq_work" to see if it'll survive.
>
> And it did, configured both as nohz_tick, and nohz_full_all. The irqs
> are enabled warning in can_stop_full_tick() fired for nohz_full_all, but
> that's it.
>
> For grins, I also applied Paul's v3 timer latency series while testing
> nohz_full_all config. The box was heavily loaded the vast majority of
> the time, but it didn't explode or do anything obviously evil.

Cool! May I add your Tested-by?

Thanx, Paul
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
On Sat, 2014-01-25 at 06:12 +0100, Mike Galbraith wrote:
> On Fri, 2014-01-24 at 20:50 +0100, Sebastian Andrzej Siewior wrote:
> > * Mike Galbraith | 2014-01-18 04:25:14 [+0100]:
> >
> > >> ># timers-do-not-raise-softirq-unconditionally.patch
> > >> ># rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch
> > >> >
> > >> >..those two out does seem to have stabilized the thing.
> > >>
> > >> timers-do-not-raise-softirq-unconditionally.patch is on its way out.
> > >>
> > >> rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch confuses me.
> > >> Didn't you report once that your box deadlocks without this patch? Now
> > >> your 64way box on the other hand does not work with it?
> > >
> > >If 'do not raise' is applied, 'use a trylock' won't save you. If 'do
> > is this just an observation or you do know why it won't save me?
>
> It's an observation from beyond the grave from the 64 core box that it
> repeatedly did NOT save :) Autopsy photos below.
>
> I've built 3.12.8-rt9 with Steven's v2 "timer: Raise softirq if there's
> irq_work" to see if it'll survive.

And it did, configured both as nohz_tick, and nohz_full_all. The irqs
are enabled warning in can_stop_full_tick() fired for nohz_full_all, but
that's it.

For grins, I also applied Paul's v3 timer latency series while testing
nohz_full_all config. The box was heavily loaded the vast majority of
the time, but it didn't explode or do anything obviously evil.

-Mike
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
On Fri, 2014-01-24 at 20:46 +0100, Sebastian Andrzej Siewior wrote:
> * Mike Galbraith | 2013-12-23 06:12:39 [+0100]:
>
> >P.S.
> >
> >virgin -rt7 doing tbench 64 + make -j64
> >
> >[ 97.907960] perf samples too long (3138 > 2500), lowering kernel.perf_event_max_sample_rate to 5
> >[ 103.047921] perf samples too long (5544 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
> >[ 181.561271] perf samples too long (10318 > 1), lowering kernel.perf_event_max_sample_rate to 13000
> >[ 184.243750] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 1.084 msecs
> >[ 248.914422] perf samples too long (19719 > 19230), lowering kernel.perf_event_max_sample_rate to 7000
> >[ 382.116674] NOHZ: local_softirq_pending 10
>
> This is block
>
> >[ 405.201593] perf samples too long (36824 > 35714), lowering kernel.perf_event_max_sample_rate to 4000
> >[ 444.704185] NOHZ: local_softirq_pending 08
> >[ 444.704208] NOHZ: local_softirq_pending 08
> >[ 444.704579] NOHZ: local_softirq_pending 08
> >[ 444.704678] NOHZ: local_softirq_pending 08
> >[ 444.705100] NOHZ: local_softirq_pending 08
> >[ 444.705980] NOHZ: local_softirq_pending 08
> >[ 444.705994] NOHZ: local_softirq_pending 08
> >[ 444.708315] NOHZ: local_softirq_pending 08
> >[ 444.710348] NOHZ: local_softirq_pending 08
>
> and this is RX. Is your testcase heavy disk-io or heavy disk-io +
> network?

Yeah.

-Mike
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
On Fri, 2014-01-24 at 20:50 +0100, Sebastian Andrzej Siewior wrote:
> * Mike Galbraith | 2014-01-18 04:25:14 [+0100]:
>
> >> ># timers-do-not-raise-softirq-unconditionally.patch
> >> ># rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch
> >> >
> >> >..those two out does seem to have stabilized the thing.
> >>
> >> timers-do-not-raise-softirq-unconditionally.patch is on its way out.
> >>
> >> rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch confuses me.
> >> Didn't you report once that your box deadlocks without this patch? Now
> >> your 64way box on the other hand does not work with it?
> >
> >If 'do not raise' is applied, 'use a trylock' won't save you. If 'do
> is this just an observation or you do know why it won't save me?

It's an observation from beyond the grave from the 64 core box that it
repeatedly did NOT save :) Autopsy photos below.

I've built 3.12.8-rt9 with Steven's v2 "timer: Raise softirq if there's
irq_work" to see if it'll survive.

nohz_full_all:

PID: 508  TASK: 8802739ba340  CPU: 16  COMMAND: "ksoftirqd/16"
 #0 [880276806a40] machine_kexec at 8103bc07
 #1 [880276806aa0] crash_kexec at 810d56b3
 #2 [880276806b70] panic at 815bf8b0
 #3 [880276806bf0] watchdog_overflow_callback at 810fed3d
 #4 [880276806c10] __perf_event_overflow at 81131928
 #5 [880276806ca0] perf_event_overflow at 81132254
 #6 [880276806cb0] intel_pmu_handle_irq at 8102078f
 #7 [880276806de0] perf_event_nmi_handler at 815c5825
 #8 [880276806e10] nmi_handle at 815c4ed3
 #9 [880276806ea0] default_do_nmi at 815c5063
#10 [880276806ed0] do_nmi at 815c5388
#11 [880276806ef0] end_repeat_nmi at 815c4371
    [exception RIP: _raw_spin_trylock+48]
    RIP: 815c3790  RSP: 880276803e28  RFLAGS: 0002
    RAX: 0010  RBX: 0010  RCX: 0002
    RDX: 880276803e28  RSI: 0018  RDI: 0001
    RBP: 815c3790  R8: 815c3790  R9: 0018
    R10: 880276803e28  R11: 0002  R12:
    R13: 880273a0c000  R14: 8802739ba340  R15: 880273a03fd8
    ORIG_RAX: 880273a03fd8  CS: 0010  SS: 0018
--- <RT exception stack> ---
#12 [880276803e28] _raw_spin_trylock at 815c3790
#13 [880276803e30] rt_spin_lock_slowunlock_hirq at 815c2cc8
#14 [880276803e50] rt_spin_unlock_after_trylock_in_irq at 815c3425
#15 [880276803e60] get_next_timer_interrupt at 810684a7
#16 [880276803ed0] tick_nohz_stop_sched_tick at 810c5f2e
#17 [880276803f50] tick_nohz_irq_exit at 810c6333
#18 [880276803f70] irq_exit at 81060065
#19 [880276803f90] smp_apic_timer_interrupt at 810358f5
#20 [880276803fb0] apic_timer_interrupt at 815cbf9d
--- <IRQ stack> ---
#21 [880273a03b28] apic_timer_interrupt at 815cbf9d
    [exception RIP: _raw_spin_lock+50]
    RIP: 815c3642  RSP: 880273a03bd8  RFLAGS: 0202
    RAX: 8b49  RBX: 880272157290  RCX: 8802739ba340
    RDX: 8b4a  RSI: 0010  RDI: 880273a0c000
    RBP: 880273a03bd8  R8: 0001  R9:
    R10:  R11: 0001  R12: 810927b5
    R13: 880273a03b68  R14: 0010  R15: 0010
    ORIG_RAX: ff10  CS: 0010  SS: 0018
#22 [880273a03be0] rt_spin_lock_slowlock at 815c2591
#23 [880273a03cc0] rt_spin_lock at 815c3362
#24 [880273a03cd0] run_timer_softirq at 81069002
#25 [880273a03d70] handle_softirq at 81060d0f
#26 [880273a03db0]
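
Reading the trace bottom-up: ksoftirqd/16 was inside _raw_spin_lock
(called from rt_spin_lock_slowlock) when the APIC timer interrupt
arrived, and the interrupt exit path then sat in _raw_spin_trylock
under rt_spin_lock_slowunlock_hirq until the NMI watchdog panicked the
box. The following is only a user-space sketch of how a trylock retry
loop in interrupt context can fail to make progress. It assumes a
ticket-style spinlock and that both contexts wanted the same lock;
neither assumption is confirmed anywhere in this thread. struct
ticket_lock and all helper names are hypothetical, not kernel API.

```c
#include <stdbool.h>

/* Hypothetical single-threaded model of a ticket lock; not kernel code. */
struct ticket_lock {
	unsigned int head;	/* ticket currently allowed to hold the lock */
	unsigned int tail;	/* next ticket to hand out */
};

/* Join the queue: the caller owns the lock once head equals its ticket. */
static unsigned int ticket_take(struct ticket_lock *l)
{
	return l->tail++;
}

/* Trylock succeeds only if nobody holds the lock or is queued for it. */
static bool ticket_trylock(struct ticket_lock *l)
{
	if (l->head != l->tail)
		return false;
	l->tail++;
	return true;
}

/* Release: hand ownership to the next ticket in line. */
static void ticket_unlock(struct ticket_lock *l)
{
	l->head++;
}

/*
 * Replay the assumed sequence and report whether the interrupt-side
 * trylock succeeds within max_tries attempts.
 */
static bool irq_trylock_can_succeed(unsigned int max_tries)
{
	struct ticket_lock lock = { 0, 0 };
	bool got_it = false;

	(void)ticket_take(&lock);	/* some other CPU holds the lock */
	(void)ticket_take(&lock);	/* the task queues up, then gets interrupted */
	ticket_unlock(&lock);		/* the other CPU releases; the task is next */

	/*
	 * The interrupt handler retries, but the only context that could
	 * consume the pending ticket is the one it interrupted.
	 */
	for (unsigned int i = 0; i < max_tries && !got_it; i++)
		got_it = ticket_trylock(&lock);

	return got_it;
}
```

In this model irq_trylock_can_succeed() returns false for any retry
count, because head never catches up with tail while the interrupted
task stays suspended. Taking the lock with interrupts disabled closes
that window in the model, since the interrupt could not arrive between
taking the ticket and acquiring the lock.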
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
* Mike Galbraith | 2014-01-18 04:25:14 [+0100]:

>> ># timers-do-not-raise-softirq-unconditionally.patch
>> ># rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch
>> >
>> >..those two out does seem to have stabilized the thing.
>>
>> timers-do-not-raise-softirq-unconditionally.patch is on its way out.
>>
>> rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch confuses me.
>> Didn't you report once that your box deadlocks without this patch? Now
>> your 64way box on the other hand does not work with it?
>
>If 'do not raise' is applied, 'use a trylock' won't save you. If 'do

Is this just an observation, or do you know why it won't save me?
Currently I am thinking of going back to the version where the
waiter_lock was taken with irqs off. However, I would prefer to trigger
this myself so that I know what is going on instead of blindly applying
patches.

>-Mike

Sebastian
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
* Mike Galbraith | 2013-12-23 06:12:39 [+0100]:

>P.S.
>
>virgin -rt7 doing tbench 64 + make -j64
>
>[ 97.907960] perf samples too long (3138 > 2500), lowering kernel.perf_event_max_sample_rate to 5
>[ 103.047921] perf samples too long (5544 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
>[ 181.561271] perf samples too long (10318 > 1), lowering kernel.perf_event_max_sample_rate to 13000
>[ 184.243750] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 1.084 msecs
>[ 248.914422] perf samples too long (19719 > 19230), lowering kernel.perf_event_max_sample_rate to 7000
>[ 382.116674] NOHZ: local_softirq_pending 10

This is block

>[ 405.201593] perf samples too long (36824 > 35714), lowering kernel.perf_event_max_sample_rate to 4000
>[ 444.704185] NOHZ: local_softirq_pending 08
>[ 444.704208] NOHZ: local_softirq_pending 08
>[ 444.704579] NOHZ: local_softirq_pending 08
>[ 444.704678] NOHZ: local_softirq_pending 08
>[ 444.705100] NOHZ: local_softirq_pending 08
>[ 444.705980] NOHZ: local_softirq_pending 08
>[ 444.705994] NOHZ: local_softirq_pending 08
>[ 444.708315] NOHZ: local_softirq_pending 08
>[ 444.710348] NOHZ: local_softirq_pending 08

and this is RX. Is your testcase heavy disk-io or heavy disk-io +
network?

Sebastian
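
The "block" and "RX" readings above follow from treating the
local_softirq_pending value as a hexadecimal bitmask with one bit per
softirq vector. A minimal sketch of that decoding, assuming the vector
order of the 3.12-era softirq enum in include/linux/interrupt.h (later
kernels renamed some entries); decode_softirq_pending() is a
hypothetical user-space helper, not a kernel function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Vector names in bit order (bit 0 = HI ... bit 9 = RCU); assumed 3.12 layout. */
static const char *const softirq_names[] = {
	"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK",
	"BLOCK_IOPOLL", "TASKLET", "SCHED", "HRTIMER", "RCU",
};

#define NR_SOFTIRQ_NAMES (sizeof(softirq_names) / sizeof(softirq_names[0]))

/*
 * Hypothetical helper: write the names of all vectors set in "pending"
 * into buf, separated by single spaces, and return buf.
 */
static char *decode_softirq_pending(unsigned int pending, char *buf, size_t len)
{
	size_t used = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (size_t bit = 0; bit < NR_SOFTIRQ_NAMES; bit++) {
		int n;

		if (!(pending & (1u << bit)))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? " " : "", softirq_names[bit]);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* buffer too small: stop with a truncated result */
		used += (size_t)n;
	}
	return buf;
}
```

With this table, 0x10 has only bit 4 set and decodes to "BLOCK", while
0x08 has only bit 3 set and decodes to "NET_RX", which matches the two
comments in the message above.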
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
On Fri, 2014-01-17 at 18:23 +0100, Sebastian Andrzej Siewior wrote:

> So I had rtmutex-take-the-waiter-lock-with-irqs-off.patch in my queue
> which took the waiter lock with irqs off. This should be the same thing
> you try do here.

(yeah, these are just whacked mole body bags;)
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
On Fri, 2014-01-17 at 18:14 +0100, Sebastian Andrzej Siewior wrote:
> * Mike Galbraith | 2013-12-25 18:37:37 [+0100]:
>
> >On Tue, 2013-12-24 at 23:55 -0800, Paul E. McKenney wrote:
> >> On Wed, Dec 25, 2013 at 04:07:34AM +0100, Mike Galbraith wrote:
> >
> >Having sufficiently recovered from turkey overdose to be able to slither
> >upstairs (bump bump bump) to check on the box, commenting..
> >
> ># timers-do-not-raise-softirq-unconditionally.patch
> ># rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch
> >
> >..those two out does seem to have stabilized the thing.
>
> timers-do-not-raise-softirq-unconditionally.patch is on its way out.
>
> rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch confuses me.
> Didn't you report once that your box deadlocks without this patch? Now
> your 64way box on the other hand does not work with it?

If 'do not raise' is applied, 'use a trylock' won't save you. If 'do
not raise' is not applied, _and_ you wisely do not try to turn on very
expensive nohz_full, things work fine without 'use a trylock'.

-Mike
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
* Mike Galbraith | 2013-12-26 11:03:32 [+0100]:

>On Wed, 2013-12-25 at 04:07 +0100, Mike Galbraith wrote:
>> On Tue, 2013-12-24 at 11:36 -0800, Paul E. McKenney wrote:
>
>> > So which code do you think deserves the big lump of coal? ;-)
>>
>> Sebastian's NO_HZ_FULL locking fixes.
>
>Whack-a-mole hasn't yet dug up any new moles.
>
>---
> kernel/timer.c |    4 ++++
> 1 file changed, 4 insertions(+)
>
>Index: linux-2.6/kernel/timer.c
>===================================================================
>--- linux-2.6.orig/kernel/timer.c
>+++ linux-2.6/kernel/timer.c
>@@ -764,7 +764,9 @@ __mod_timer(struct timer_list *timer, un
> 	timer_stats_timer_set_start_info(timer);
> 	BUG_ON(!timer->function);
>
>+	local_irq_disable_rt();
> 	base = lock_timer_base(timer, &flags);
>+	local_irq_enable_rt();
>
> 	ret = detach_if_pending(timer, base, false);
> 	if (!ret && pending_only)
>@@ -1198,7 +1200,9 @@ static inline void __run_timers(struct t
> {
> 	struct timer_list *timer;
>
>+	local_irq_disable_rt();
> 	spin_lock_irq(&base->lock);
>+	local_irq_enable_rt();
> 	while (time_after_eq(jiffies, base->timer_jiffies)) {
> 		struct list_head work_list;
> 		struct list_head *head = &work_list;
>---
> kernel/time/tick-sched.c |    2 ++
> 1 file changed, 2 insertions(+)

So I had rtmutex-take-the-waiter-lock-with-irqs-off.patch in my queue
which took the waiter lock with irqs off. This should be the same thing
you try to do here.

>Index: linux-2.6/kernel/time/tick-sched.c
>===================================================================
>--- linux-2.6.orig/kernel/time/tick-sched.c
>+++ linux-2.6/kernel/time/tick-sched.c
>@@ -216,7 +216,9 @@ void __tick_nohz_full_check(void)
>
> static void nohz_full_kick_work_func(struct irq_work *work)
> {
>+	local_irq_disable_rt();
> 	__tick_nohz_full_check();
>+	local_irq_enable_rt();
> }

and this should be fixed differently. We come from a thread and check
"is current running", but by current we mean a user task and not a
kernel thread.

>
> static DEFINE_PER_CPU(struct irq_work, nohz_full_kick_work) = {
>

Sebastian
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
* Mike Galbraith | 2013-12-25 18:37:37 [+0100]:

>On Tue, 2013-12-24 at 23:55 -0800, Paul E. McKenney wrote:
>> On Wed, Dec 25, 2013 at 04:07:34AM +0100, Mike Galbraith wrote:
>
>Having sufficiently recovered from turkey overdose to be able to slither
>upstairs (bump bump bump) to check on the box, commenting..
>
># timers-do-not-raise-softirq-unconditionally.patch
># rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch
>
>..those two out does seem to have stabilized the thing.

timers-do-not-raise-softirq-unconditionally.patch is on its way out.

rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch confuses me.
Didn't you report once that your box deadlocks without this patch? Now
your 64way box on the other hand does not work with it?

>Merry Christmasss,
>
>-Mike

Sebastian
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
* Mike Galbraith | 2013-12-26 11:03:32 [+0100]:

> On Wed, 2013-12-25 at 04:07 +0100, Mike Galbraith wrote:
> > On Tue, 2013-12-24 at 11:36 -0800, Paul E. McKenney wrote:
> > > So which code do you think deserves the big lump of coal? ;-)
> >
> > Sebastian's NO_HZ_FULL locking fixes.
>
> Whack-a-mole hasn't yet dug up any new moles.
>
> ---
>  kernel/timer.c |    4 ++++
>  1 file changed, 4 insertions(+)
>
> Index: linux-2.6/kernel/timer.c
> ===================================================================
> --- linux-2.6.orig/kernel/timer.c
> +++ linux-2.6/kernel/timer.c
> @@ -764,7 +764,9 @@ __mod_timer(struct timer_list *timer, un
>  	timer_stats_timer_set_start_info(timer);
>  	BUG_ON(!timer->function);
>  
> +	local_irq_disable_rt();
>  	base = lock_timer_base(timer, &flags);
> +	local_irq_enable_rt();
>  
>  	ret = detach_if_pending(timer, base, false);
>  	if (!ret && pending_only)
> @@ -1198,7 +1200,9 @@ static inline void __run_timers(struct t
>  {
>  	struct timer_list *timer;
>  
> +	local_irq_disable_rt();
>  	spin_lock_irq(&base->lock);
> +	local_irq_enable_rt();
>  	while (time_after_eq(jiffies, base->timer_jiffies)) {
>  		struct list_head work_list;
>  		struct list_head *head = &work_list;
> ---
>  kernel/time/tick-sched.c |    2 ++
>  1 file changed, 2 insertions(+)

So I had rtmutex-take-the-waiter-lock-with-irqs-off.patch in my queue
which took the waiter lock with irqs off. This should be the same thing
you try to do here.

> Index: linux-2.6/kernel/time/tick-sched.c
> ===================================================================
> --- linux-2.6.orig/kernel/time/tick-sched.c
> +++ linux-2.6/kernel/time/tick-sched.c
> @@ -216,7 +216,9 @@ void __tick_nohz_full_check(void)
>  
>  static void nohz_full_kick_work_func(struct irq_work *work)
>  {
> +	local_irq_disable_rt();
>  	__tick_nohz_full_check();
> +	local_irq_enable_rt();
>  }

and this should be fixed differently, since we come from a thread and
check whether current is running, but by current we mean a user task and
not a kernel thread.

>  
>  static DEFINE_PER_CPU(struct irq_work, nohz_full_kick_work) = {

Sebastian
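One possible reading of the patch and of the "waiter lock with irqs off" remark, sketched as a toy single-CPU model in plain C. If a timer tick interrupts task context while the lock's internal waiter lock is held, a trylock from the tick handler on that same CPU cannot succeed; keeping interrupts off across the acquisition defers the tick until the lock is free again. Every name below (struct toy_cpu, toy_tick(), toy_irq_enable(), toy_lock_section()) is hypothetical illustration code, not a kernel API, and the sketch assumes that local_irq_disable_rt()/local_irq_enable_rt() really disable and re-enable interrupts on an -rt build.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model; nothing here is a real kernel interface. */
struct toy_cpu {
	bool irqs_enabled;   /* models the CPU interrupt flag */
	bool wait_lock_held; /* models the raw lock guarding a sleeping lock's waiters */
	bool tick_pending;   /* a tick arrived while interrupts were off */
	int  tick_ok;        /* ticks whose trylock would have succeeded */
	int  tick_failed;    /* ticks that found the lock held by the interrupted task */
};

/* The tick runs in hard interrupt context and may only trylock. */
static void toy_tick(struct toy_cpu *cpu)
{
	if (!cpu->irqs_enabled) {
		/* The interrupt stays pending until interrupts are re-enabled. */
		cpu->tick_pending = true;
		return;
	}
	if (cpu->wait_lock_held)
		cpu->tick_failed++; /* the holder is the task we interrupted: no progress */
	else
		cpu->tick_ok++;
}

/* Re-enable interrupts and deliver a tick that was held back. */
static void toy_irq_enable(struct toy_cpu *cpu)
{
	cpu->irqs_enabled = true;
	if (cpu->tick_pending) {
		cpu->tick_pending = false;
		toy_tick(cpu);
	}
}

/* Task context takes the lock; a tick fires in the middle of the section. */
static void toy_lock_section(struct toy_cpu *cpu, bool irqs_off)
{
	if (irqs_off)
		cpu->irqs_enabled = false; /* stands in for local_irq_disable_rt() */
	assert(!cpu->wait_lock_held);
	cpu->wait_lock_held = true;
	toy_tick(cpu);                     /* timer interrupt arrives here */
	cpu->wait_lock_held = false;
	if (irqs_off)
		toy_irq_enable(cpu);       /* stands in for local_irq_enable_rt() */
}
```

Calling toy_lock_section() with irqs_off set to false records a failed tick, while calling it with true records a successful, deferred one. The model says nothing about the real cost of running with interrupts disabled, which is the trade-off the thread is arguing about.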
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
On Fri, 2014-01-17 at 18:14 +0100, Sebastian Andrzej Siewior wrote:
> * Mike Galbraith | 2013-12-25 18:37:37 [+0100]:
>
> > On Tue, 2013-12-24 at 23:55 -0800, Paul E. McKenney wrote:
> > > On Wed, Dec 25, 2013 at 04:07:34AM +0100, Mike Galbraith wrote:
> >
> > Having sufficiently recovered from turkey overdose to be able to slither
> > upstairs (bump bump bump) to check on the box, commenting..
> >
> > # timers-do-not-raise-softirq-unconditionally.patch
> > # rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch
> >
> > ..those two out does seem to have stabilized the thing.
>
> timers-do-not-raise-softirq-unconditionally.patch is on its way out.
>
> rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch confuses me.
> Didn't you report once that your box deadlocks without this patch? Now
> your 64way box on the other hand does not work with it?

If 'do not raise' is applied, 'use a trylock' won't save you. If 'do
not raise' is not applied, _and_ you wisely do not try to turn on very
expensive nohz_full, things work fine without 'use a trylock'.

-Mike
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
On Fri, 2014-01-17 at 18:23 +0100, Sebastian Andrzej Siewior wrote:

> So I had rtmutex-take-the-waiter-lock-with-irqs-off.patch in my queue
> which took the waiter lock with irqs off. This should be the same thing
> you try do here.

(yeah, these are just whacked mole body bags;)
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
On Wed, 2013-12-25 at 04:07 +0100, Mike Galbraith wrote:
> On Tue, 2013-12-24 at 11:36 -0800, Paul E. McKenney wrote:
> > So which code do you think deserves the big lump of coal? ;-)
>
> Sebastian's NO_HZ_FULL locking fixes.

Whack-a-mole hasn't yet dug up any new moles.

---
 kernel/timer.c |    4 ++++
 1 file changed, 4 insertions(+)

Index: linux-2.6/kernel/timer.c
===================================================================
--- linux-2.6.orig/kernel/timer.c
+++ linux-2.6/kernel/timer.c
@@ -764,7 +764,9 @@ __mod_timer(struct timer_list *timer, un
 	timer_stats_timer_set_start_info(timer);
 	BUG_ON(!timer->function);
 
+	local_irq_disable_rt();
 	base = lock_timer_base(timer, &flags);
+	local_irq_enable_rt();
 
 	ret = detach_if_pending(timer, base, false);
 	if (!ret && pending_only)
@@ -1198,7 +1200,9 @@ static inline void __run_timers(struct t
 {
 	struct timer_list *timer;
 
+	local_irq_disable_rt();
 	spin_lock_irq(&base->lock);
+	local_irq_enable_rt();
 	while (time_after_eq(jiffies, base->timer_jiffies)) {
 		struct list_head work_list;
 		struct list_head *head = &work_list;
---
 kernel/time/tick-sched.c |    2 ++
 1 file changed, 2 insertions(+)

Index: linux-2.6/kernel/time/tick-sched.c
===================================================================
--- linux-2.6.orig/kernel/time/tick-sched.c
+++ linux-2.6/kernel/time/tick-sched.c
@@ -216,7 +216,9 @@ void __tick_nohz_full_check(void)
 
 static void nohz_full_kick_work_func(struct irq_work *work)
 {
+	local_irq_disable_rt();
 	__tick_nohz_full_check();
+	local_irq_enable_rt();
 }
 
 static DEFINE_PER_CPU(struct irq_work, nohz_full_kick_work) = {
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
On Tue, 2013-12-24 at 23:55 -0800, Paul E. McKenney wrote:
> On Wed, Dec 25, 2013 at 04:07:34AM +0100, Mike Galbraith wrote:
> > > So which code do you think deserves the big lump of coal? ;-)
> >
> > Sebastian's NO_HZ_FULL locking fixes. Locking is hard, and rt sure
> > doesn't make it any easier, so let's give him a cookie or three to nibble
> > on while he ponders that trylock stuff again instead :)
>
> Fair enough. Does Sebastian prefer milk and cookies or the other
> tradition of beer and a cigar? ;-)

Having sufficiently recovered from turkey overdose to be able to slither
upstairs (bump bump bump) to check on the box, commenting..

# timers-do-not-raise-softirq-unconditionally.patch
# rtmutex-use-a-trylock-for-waiter-lock-in-trylock.patch

..those two out does seem to have stabilized the thing.

Merry Christmasss,

-Mike
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
On Wed, Dec 25, 2013 at 04:07:34AM +0100, Mike Galbraith wrote:
> On Tue, 2013-12-24 at 11:36 -0800, Paul E. McKenney wrote:
> > On Mon, Dec 23, 2013 at 05:38:53AM +0100, Mike Galbraith wrote:
> > > On Sun, 2013-12-22 at 09:57 +0100, Mike Galbraith wrote:
> > > > I'll let the box give
> > > > RCU something to do for a couple days. No news is good news.
> > >
> > > Ho ho hum, merry christmas, gift attached.
> >
> > Hmmm... I guess I should take a moment to work out who has been
> > naughty and nice...
> >
> > > I'll beat on virgin -rt7, see if it survives, then re-apply RCU patch
> > > and retest. This kernel had nohz_full enabled, along with Sebastian's
> > > pending -rt fix for same, so RCU patch was not only not running solo,
> > > box was running a known somewhat buggy config as well. Box was doing
> > > endless tbench 64 when it started stalling fwiw.
> >
> > [72788.040872] NMI backtrace for cpu 31
> > [...]
> >
> > The most likely suspect is the rt_spin_lock_slowlock() that is apparently
> > being acquired by migrate_enable(). This could be due to:
> >
> > 1. Massive contention on that lock.
> >
> > 2. Someone else holding that lock for excessive time periods.
> >    Evidence in favor: CPU 0 appears to be running within
> >    migrate_enable(). But isn't migrate_enable() really quite
> >    lightweight?
> >
> > 3. Possible looping in the networking stack -- but this seems
> >    unlikely given that we appear to have caught a lock acquisition
> >    in the act. (Not impossible, however, if there are lots of
> >    migrate_enable() calls in the networking stack, which there
> >    are due to all the per-CPU work.)
> >
> > So which code do you think deserves the big lump of coal? ;-)
>
> Sebastian's NO_HZ_FULL locking fixes. Locking is hard, and rt sure
> doesn't make it any easier, so let's give him a cookie or three to nibble
> on while he ponders that trylock stuff again instead :)

Fair enough. Does Sebastian prefer milk and cookies or the other
tradition of beer and a cigar? ;-)

Thanx, Paul
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
On Tue, 2013-12-24 at 11:36 -0800, Paul E. McKenney wrote:
> On Mon, Dec 23, 2013 at 05:38:53AM +0100, Mike Galbraith wrote:
> > On Sun, 2013-12-22 at 09:57 +0100, Mike Galbraith wrote:
> > > I'll let the box give
> > > RCU something to do for a couple days. No news is good news.
> >
> > Ho ho hum, merry christmas, gift attached.
>
> Hmmm... I guess I should take a moment to work out who has been
> naughty and nice...
>
> > I'll beat on virgin -rt7, see if it survives, then re-apply RCU patch
> > and retest. This kernel had nohz_full enabled, along with Sebastian's
> > pending -rt fix for same, so RCU patch was not only not running solo,
> > box was running a known somewhat buggy config as well. Box was doing
> > endless tbench 64 when it started stalling fwiw.
>
> [72788.040872] NMI backtrace for cpu 31
> [...]
>
> The most likely suspect is the rt_spin_lock_slowlock() that is apparently
> being acquired by migrate_enable(). This could be due to:
>
> 1. Massive contention on that lock.
>
> 2. Someone else holding that lock for excessive time periods.
>    Evidence in favor: CPU 0 appears to be running within
>    migrate_enable(). But isn't migrate_enable() really quite
>    lightweight?
>
> 3. Possible looping in the networking stack -- but this seems
>    unlikely given that we appear to have caught a lock acquisition
>    in the act. (Not impossible, however, if there are lots of
>    migrate_enable() calls in the networking stack, which there
>    are due to all the per-CPU work.)
>
> So which code do you think deserves the big lump of coal? ;-)

Sebastian's NO_HZ_FULL locking fixes. Locking is hard, and rt sure
doesn't make it any easier, so let's give him a cookie or three to nibble
on while he ponders that trylock stuff again instead :)

-Mike
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
On Mon, Dec 23, 2013 at 05:38:53AM +0100, Mike Galbraith wrote:
> On Sun, 2013-12-22 at 09:57 +0100, Mike Galbraith wrote:
> > I'll let the box give
> > RCU something to do for a couple days. No news is good news.
>
> Ho ho hum, merry christmas, gift attached.

Hmmm... I guess I should take a moment to work out who has been
naughty and nice...

> I'll beat on virgin -rt7, see if it survives, then re-apply RCU patch
> and retest. This kernel had nohz_full enabled, along with Sebastian's
> pending -rt fix for same, so RCU patch was not only not running solo,
> box was running a known somewhat buggy config as well. Box was doing
> endless tbench 64 when it started stalling fwiw.

[72788.040872] NMI backtrace for cpu 31
[72788.040874] CPU: 31 PID: 13975 Comm: tbench Tainted: GW 3.12.6-rt7-nohz #192
[72788.040874] Hardware name: Hewlett-Packard ProLiant DL980 G7, BIOS P66 07/07/2010
[72788.040875] task: 8802728e3db0 ti: 88026deb2000 task.ti: 88026deb2000
[72788.040877] RIP: 0010:[815e34e4] [815e34e4] _raw_spin_trylock+0x14/0x80
[72788.040878] RSP: 0018:8802769e3e58 EFLAGS: 0002
[72788.040879] RAX: 88026deb3fd8 RBX: 880273544000 RCX: 7bc87bc6
[72788.040879] RDX: RSI: 8802728e3db0 RDI: 880273544000
[72788.040880] RBP: 88026deb39f8 R08: 063c14effd0f R09: 0119
[72788.040881] R10: 0005 R11: 8802769f2260 R12: 8802728e3db0
[72788.040881] R13: 001f R14: 8802769ebcc0 R15: 810c4730
[72788.040883] FS: 7f7cd380a700() GS:8802769e() knlGS:
[72788.040883] CS: 0010 DS: ES: CR0: 80050033
[72788.040884] CR2: 0068a0e8 CR3: 000267ba4000 CR4: 07e0
[72788.040885] Stack:
[72788.040886] 88026deb39f8 815e2aa0 8106711a
[72788.040887] 8802769ec4e0 8802769ec4e0 8802769e3f58 810c44bd
[72788.040888] 88026deb39f8 88026deb39f8 15ed4f5ff89b 810c476e
[72788.040889] Call Trace:
[72788.040889] <IRQ>
[72788.040891] [815e2aa0] ? rt_spin_lock_slowunlock_hirq+0x10/0x20
[72788.040893] [8106711a] ? update_process_times+0x3a/0x60
[72788.040895] [810c44bd] ? tick_sched_handle+0x2d/0x70
[72788.040896] [810c476e] ? tick_sched_timer+0x3e/0x70
[72788.040898] [810839dd] ? __run_hrtimer+0x13d/0x260
[72788.040900] [81083c2c] ? hrtimer_interrupt+0x12c/0x310
[72788.040901] [8109593e] ? vtime_account_system+0x4e/0xf0
[72788.040903] [81035656] ? smp_apic_timer_interrupt+0x36/0x50
[72788.040904] [815ebc9d] ? apic_timer_interrupt+0x6d/0x80
[72788.040905] <EOI>
[72788.040906] [815e338a] ? _raw_spin_lock+0x2a/0x40
[72788.040908] [815e23b3] ? rt_spin_lock_slowlock+0x33/0x2d0
[72788.040910] [8108ee44] ? migrate_enable+0xc4/0x220
[72788.040911] [8152f888] ? ip_finish_output+0x258/0x450
[72788.040913] [81067011] ? lock_timer_base+0x41/0x80
[72788.040914] [81068db6] ? mod_timer+0x66/0x290
[72788.040916] [814df02f] ? sk_reset_timer+0xf/0x20
[72788.040917] [81547d7f] ? tcp_write_xmit+0x1cf/0x5d0
[72788.040919] [815481e5] ? __tcp_push_pending_frames+0x25/0x60
[72788.040921] [81539e34] ? tcp_sendmsg+0x114/0xbb0
[72788.040923] [814dbc1f] ? sock_sendmsg+0xaf/0xf0
[72788.040925] [811bf5e5] ? touch_atime+0x65/0x150
[72788.040927] [814dbd78] ? SyS_sendto+0x118/0x190
[72788.040929] [81095b66] ? vtime_account_user+0x66/0x100
[72788.040930] [8100f36a] ? syscall_trace_enter+0x2a/0x260
[72788.040932] [815eb249] ? tracesys+0xdd/0xe2

The most likely suspect is the rt_spin_lock_slowlock() that is apparently
being acquired by migrate_enable(). This could be due to:

1. Massive contention on that lock.

2. Someone else holding that lock for excessive time periods.
   Evidence in favor: CPU 0 appears to be running within
   migrate_enable(). But isn't migrate_enable() really quite
   lightweight?

3. Possible looping in the networking stack -- but this seems
   unlikely given that we appear to have caught a lock acquisition
   in the act. (Not impossible, however, if there are lots of
   migrate_enable() calls in the networking stack, which there
   are due to all the per-CPU work.)

So which code do you think deserves the big lump of coal? ;-)

Thanx, Paul
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
On Mon, 2013-12-23 at 05:38 +0100, Mike Galbraith wrote:
> On Sun, 2013-12-22 at 09:57 +0100, Mike Galbraith wrote:
> > I'll let the box give
> > RCU something to do for a couple days. No news is good news.
>
> Ho ho hum, merry christmas, gift attached.
>
> I'll beat on virgin -rt7, see if it survives, then re-apply RCU patch
> and retest. This kernel had nohz_full enabled, along with Sebastian's
> pending -rt fix for same, so RCU patch was not only not running solo,
> box was running a known somewhat buggy config as well. Box was doing
> endless tbench 64 when it started stalling fwiw.
>
> -Mike

P.S.

virgin -rt7 doing tbench 64 + make -j64

[ 97.907960] perf samples too long (3138 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[ 103.047921] perf samples too long (5544 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[ 181.561271] perf samples too long (10318 > 10000), lowering kernel.perf_event_max_sample_rate to 13000
[ 184.243750] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 1.084 msecs
[ 248.914422] perf samples too long (19719 > 19230), lowering kernel.perf_event_max_sample_rate to 7000
[ 382.116674] NOHZ: local_softirq_pending 10
[ 405.201593] perf samples too long (36824 > 35714), lowering kernel.perf_event_max_sample_rate to 4000
[ 444.704185] NOHZ: local_softirq_pending 08
[ 444.704208] NOHZ: local_softirq_pending 08
[ 444.704579] NOHZ: local_softirq_pending 08
[ 444.704678] NOHZ: local_softirq_pending 08
[ 444.705100] NOHZ: local_softirq_pending 08
[ 444.705980] NOHZ: local_softirq_pending 08
[ 444.705994] NOHZ: local_softirq_pending 08
[ 444.708315] NOHZ: local_softirq_pending 08
[ 444.710348] NOHZ: local_softirq_pending 08
[ 474.435582] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 1.096 msecs
[ 475.994055] perf samples too long (63124 > 62500), lowering kernel.perf_event_max_sample_rate to 2000

Those annoying perf gripes are generic, not -rt.
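For readers wondering where the limits and rates in those perf lines come from, here is a minimal sketch of the arithmetic. It is not the kernel's code; it assumes HZ=1000, a 25% CPU-time budget for perf, and a starting rate of 100000 samples per second, and the helper names allowed_ns() and lower_rate() are made up for illustration. Under those assumptions it reproduces the limit/rate pairs seen in the log.

```c
#define HZ                   1000L        /* assumption: CONFIG_HZ=1000 */
#define NSEC_PER_SEC         1000000000L
#define CPU_TIME_MAX_PERCENT 25L          /* assumption: default perf CPU budget */

/* Allowed time per sample (ns) for a given samples-per-second limit. */
static long allowed_ns(long sample_rate)
{
	return NSEC_PER_SEC / sample_rate * CPU_TIME_MAX_PERCENT / 100;
}

/* Halve the per-tick sample budget, rounding up, and convert back to a rate. */
static long lower_rate(long sample_rate)
{
	long per_tick = sample_rate / HZ;

	per_tick = (per_tick + 1) / 2;
	return per_tick * HZ;
}
```

Starting from 100000, repeated calls walk through the same sequence as the log: the limit goes 2500, 5000, 10000, 19230, 35714, 62500 while the rate drops to 50000, 25000, 13000, 7000, 4000, 2000.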
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
On Sun, 2013-12-22 at 09:57 +0100, Mike Galbraith wrote:
> I'll let the box give
> RCU something to do for a couple days. No news is good news.

Ho ho hum, merry christmas, gift attached.

I'll beat on virgin -rt7, see if it survives, then re-apply RCU patch
and retest. This kernel had nohz_full enabled, along with Sebastian's
pending -rt fix for same, so RCU patch was not only not running solo,
box was running a known somewhat buggy config as well. Box was doing
endless tbench 64 when it started stalling fwiw.

-Mike

vogelweide-stall.gz
Description: GNU Zip compressed data
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
On Sun, 2013-12-22 at 04:07 +0100, Mike Galbraith wrote:
> On Sat, 2013-12-21 at 20:39 +0100, Sebastian Andrzej Siewior wrote:
> > From: "Paul E. McKenney"
> >
> > Running RCU out of softirq is a problem for some workloads that would
> > like to manage RCU core processing independently of other softirq work,
> > for example, setting kthread priority. This commit therefore moves the
> > RCU core work from softirq to a per-CPU/per-flavor SCHED_OTHER kthread
> > named rcuc. The SCHED_OTHER approach avoids the scalability problems
> > that appeared with the earlier attempt to move RCU core processing
> > from softirq to kthreads. That said, kernels built with RCU_BOOST=y
> > will run the rcuc kthreads at the RCU-boosting priority.
>
> I'll take this for a spin on my 64 core test box.
>
> I'm pretty sure I'll still end up having to split softirq threads again
> though, as big box has been unable to meet jitter requirements without,
> and last upstream rt kernel tested still couldn't.

Still can't fwiw, but whatever, back to $subject.

I'll let the box give RCU something to do for a couple days. No news is good news.

-Mike

30 minute isolated core jitter test says tinkering will definitely be required. 3.0-rt does single digit worst case on same old box. Darn.

(test is imperfect, but good enough)

FREQ=960 FRAMES=1728000 LOOP=50000 using CPUs 4 - 23
FREQ=1000 FRAMES=1800000 LOOP=48000 using CPUs 24 - 43
FREQ=300 FRAMES=540000 LOOP=160000 using CPUs 44 - 63
on your marks... get set... POW!
Cpu  Frames   Min     Max(Frame)        Avg     Sigma   LastTrans  Fliers(Frames)
  4  1727979  0.0159  181.66 (1043545)  0.4492  0.5876  0 (0)      16 (828505,828506,859225,859226,889945,..1043546)
  5  1727980  0.0159  181.90 (1013305)  0.4560  0.6118  0 (0)      16 (798265,798266,828985,828986,859705,..1013306)
  6  1727981  0.0159  189.05 (1013785)  0.3691  0.6225  0 (0)      16 (798745,798746,829465,829466,860185,..1013786)
  7  1727982  0.0159  177.88 (983546)   0.2885  0.5269  0 (0)      16 (768505,768506,799225,799226,829945,..983546)
  8  1727984  0.0159  192.63 (984025)   0.3131  0.6307  0 (0)      18 (738265,738266,768985,768986,799705,..984026)
  9  1727985  0.0159   16.43 (801406)   0.6562  0.5794  0 (0)
 10  1727986  0.0159  186.94 (954266)   0.3514  0.6252  0 (0)      16 (739225,739226,769945,769946,800665,..954266)
 11  1727987  0.0159  194.06 (954745)   0.4341  0.6547  0 (0)      18 (708985,708986,739705,739706,770425,..954746)
 12  1727989  0.0159   13.61 (67116)    0.3364  0.4294  0 (0)
 13  1727990  0.0159  186.19 (894265)   0.3955  0.6113  0 (0)      16 (679225,679226,709945,709946,740665,..894266)
 14  1727991  0.0159  192.18 (894746)   0.4410  0.6449  0 (0)      18 (648985,648986,679705,679706,710425,..894746)
 15  1727993  0.0159  183.36 (833786)   0.5582  0.6655  0 (0)      16 (618745,618746,649465,649466,680185,..833786)
 16  1727994  0.0159  193.61 (895706)   0.6073  0.7382  0 (0)      17 (649945,680665,680666,711385,711386,..895706)
 17  1727995  0.0159   36.94 (739943)   0.7135  0.7543  0 (0)      6 (173558,173559,739943,739944,1224751,1224752)
 18  1727996  0.0159  167.39 (835226)   0.8385  0.8287  0 (0)      16 (620185,620186,650905,650906,681625,..835226)
 19  1727997  0.0159  172.84 (804985)   0.5110  0.6959  0 (0)      17 (589946,620665,620666,651385,651386,..835706)
 20  1727999  0.0159  180.47 (774745)   0.7566  0.7562  0 (0)      16 (559705,559706,590425,590426,621145,..774746)
 21  1728000  0.0159  169.74 (744505)   0.7719  0.8154  0 (0)      16 (560185,560186,590905,590906,621625,..775226)
 22  1728000  0.0159  194.80 (836667)   0.6799  0.7063  0 (0)      16 (590906,590907,622105,622106,652346,..836667)
 23  1728000  0.0159  183.12 (745466)   0.6733  0.7091  0 (0)      16 (530425,530426,561145,561146,591865,..745466)
 24  1800000  0.0725    7.46 (132730)   0.5375  0.4462  0 (0)
 25  1800000  0.0725    7.23 (132730)   0.5725  0.4816  0 (0)
 26  1800000  0.0725    7.23 (132730)   0.5119  0.4194  0 (0)
 27  1800000  0.0725    4.93 (132730)   0.4102  0.3379  0 (0)
 28  1800000  0.0725    5.08 (444312)   0.4275  0.3510  0 (0)
 29  1800000  0.0725    6.75 (132717)   0.5501  0.5232  0 (0)
 30  1800000  0.0725   11.61 (12026)    0.3811  0.3934  0 (0)
 31  1800000  0.0725   11.61 (12526)    0.4054  0.4551  0 (0)
 32  1800000  0.0725   50.95 (13026)    0.6015  0.5617  0 (0)      31 (13026,13027,45026,45027,77026,..909027)
 33  1800000  0.0725   62.63 (13526)    0.5643  0.5922  0 (0)      112 (13526,13527,45526,45527,77526,..1773527)
 34  1800000  0.0725   70.26 (14026)    0.3698  0.6132  0 (0)      112 (14026,14027,46026,46027,78026,..1774027)
 35  1800000  0.0725   84.57 (14526)    0.6490  0.7981  0 (0)      112 (14526,14527,46526,46527,78526,..1774527)
 36  1800000  0.0725   81.94 (943026)   0.3917  0.6387  0 (0)      112 (15026,15027,47026,47027,79026,..1775027)
 37  1800000  0.0725   93.86 (15526)    0.6346  0.8580  0 (0)      112
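As a rough illustration of how the per-CPU columns in such a table can be derived, here is a small sketch. It is not the jitter test itself, whose source is not shown in this thread. The names jitter_summarize() and expected_frames() and the flier threshold parameter are hypothetical, and the unit of the samples is left open. The Frames column is simply FREQ times the run length in seconds (960 * 1800 = 1728000 for a 30 minute run).

```c
#include <stddef.h>

struct jitter_stats {
	double min, max, avg;
	double variance;    /* the Sigma column would be sqrt(variance) */
	size_t max_frame;   /* frame index of the worst sample */
	size_t fliers;      /* samples above the (hypothetical) flier threshold */
};

/* Frames expected for a run of the given length at the given frequency. */
static long expected_frames(long freq_hz, long seconds)
{
	return freq_hz * seconds;
}

/* Summarize n per-frame jitter samples; flier_limit is an assumed cutoff. */
static struct jitter_stats jitter_summarize(const double *samples, size_t n,
					    double flier_limit)
{
	struct jitter_stats s = { 0 };
	double sum = 0.0, sq = 0.0;
	size_t i;

	if (n == 0)
		return s;

	s.min = s.max = samples[0];
	for (i = 0; i < n; i++) {
		if (samples[i] < s.min)
			s.min = samples[i];
		if (samples[i] > s.max) {
			s.max = samples[i];
			s.max_frame = i;
		}
		if (samples[i] > flier_limit)
			s.fliers++;
		sum += samples[i];
	}
	s.avg = sum / (double)n;

	/* Second pass: population variance around the mean. */
	for (i = 0; i < n; i++) {
		double d = samples[i] - s.avg;

		sq += d * d;
	}
	s.variance = sq / (double)n;
	return s;
}
```

The variance is returned instead of sigma only to keep the sketch free of libm; a real tool would take the square root before printing.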
Re: [PATCH] rcu: Eliminate softirq processing from rcutree
On Sat, 2013-12-21 at 20:39 +0100, Sebastian Andrzej Siewior wrote:
> From: "Paul E. McKenney"
>
> Running RCU out of softirq is a problem for some workloads that would
> like to manage RCU core processing independently of other softirq work,
> for example, setting kthread priority. This commit therefore moves the
> RCU core work from softirq to a per-CPU/per-flavor SCHED_OTHER kthread
> named rcuc. The SCHED_OTHER approach avoids the scalability problems
> that appeared with the earlier attempt to move RCU core processing
> from softirq to kthreads. That said, kernels built with RCU_BOOST=y
> will run the rcuc kthreads at the RCU-boosting priority.

I'll take this for a spin on my 64 core test box.

I'm pretty sure I'll still end up having to split softirq threads again
though, as big box has been unable to meet jitter requirements without,
and last upstream rt kernel tested still couldn't.

-Mike

Hm. Another thing I'll have to check again is btrfs locking fix, and
generic IO deadlocks if you don't pull your plug upon first rtmutex
block. In 3.0, both were required for box to survive heavy fs pounding.

Oh yeah, and the pain of rt tasks playing idle balance for SCHED_OTHER
tasks, and nohz balancing crud, and cpupri cost when cores are isolated
and and.. sigh, big boxen _suck_ ;-)
[PATCH] rcu: Eliminate softirq processing from rcutree
From: "Paul E. McKenney"

Running RCU out of softirq is a problem for some workloads that would
like to manage RCU core processing independently of other softirq work,
for example, setting kthread priority. This commit therefore moves the
RCU core work from softirq to a per-CPU/per-flavor SCHED_OTHER kthread
named rcuc. The SCHED_OTHER approach avoids the scalability problems
that appeared with the earlier attempt to move RCU core processing
from softirq to kthreads. That said, kernels built with RCU_BOOST=y
will run the rcuc kthreads at the RCU-boosting priority.

Reported-by: Thomas Gleixner
Signed-off-by: Paul E. McKenney
---
I intend to apply this for the next -RT release. My powerpc test box runs
with this for more than 24h without anything bad happening.

 kernel/rcutree.c        | 113 +++-
 kernel/rcutree.h        |   3 +-
 kernel/rcutree_plugin.h | 134 +---
 3 files changed, 113 insertions(+), 137 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index f4f61bb..507fab1 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -55,6 +55,11 @@
 #include <linux/random.h>
 #include <linux/ftrace_event.h>
 #include <linux/suspend.h>
+#include <linux/delay.h>
+#include <linux/gfp.h>
+#include <linux/oom.h>
+#include <linux/smpboot.h>
+#include "time/tick-internal.h"
 
 #include "rcutree.h"
 #include <trace/events/rcu.h>
@@ -145,8 +150,6 @@ EXPORT_SYMBOL_GPL(rcu_scheduler_active);
  */
 static int rcu_scheduler_fully_active __read_mostly;
 
-#ifdef CONFIG_RCU_BOOST
-
 /*
  * Control variables for per-CPU and per-rcu_node kthreads. These
  * handle all flavors of RCU.
@@ -156,8 +159,6 @@ DEFINE_PER_CPU(unsigned int, rcu_cpu_kthread_status);
 DEFINE_PER_CPU(unsigned int, rcu_cpu_kthread_loops);
 DEFINE_PER_CPU(char, rcu_cpu_has_work);
 
-#endif /* #ifdef CONFIG_RCU_BOOST */
-
 static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu);
 static void invoke_rcu_core(void);
 static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp);
@@ -2226,16 +2227,14 @@ __rcu_process_callbacks(struct rcu_state *rsp)
 /*
  * Do RCU core processing for the current CPU.
  */
-static void rcu_process_callbacks(struct softirq_action *unused)
+static void rcu_process_callbacks(void)
 {
 	struct rcu_state *rsp;
 
 	if (cpu_is_offline(smp_processor_id()))
 		return;
-	trace_rcu_utilization(TPS("Start RCU core"));
 	for_each_rcu_flavor(rsp)
 		__rcu_process_callbacks(rsp);
-	trace_rcu_utilization(TPS("End RCU core"));
 }
 
 /*
@@ -2249,18 +2248,105 @@ static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp)
 {
 	if (unlikely(!ACCESS_ONCE(rcu_scheduler_fully_active)))
 		return;
-	if (likely(!rsp->boost)) {
-		rcu_do_batch(rsp, rdp);
+	rcu_do_batch(rsp, rdp);
+}
+
+static void rcu_wake_cond(struct task_struct *t, int status)
+{
+	/*
+	 * If the thread is yielding, only wake it when this
+	 * is invoked from idle
+	 */
+	if (t && (status != RCU_KTHREAD_YIELDING || is_idle_task(current)))
+		wake_up_process(t);
+}
+
+/*
+ * Wake up this CPU's rcuc kthread to do RCU core processing.
+ */
+static void invoke_rcu_core(void)
+{
+	unsigned long flags;
+	struct task_struct *t;
+
+	if (!cpu_online(smp_processor_id()))
 		return;
+	local_irq_save(flags);
+	__this_cpu_write(rcu_cpu_has_work, 1);
+	t = __this_cpu_read(rcu_cpu_kthread_task);
+	if (t != NULL && current != t)
+		rcu_wake_cond(t, __this_cpu_read(rcu_cpu_kthread_status));
+	local_irq_restore(flags);
+}
+
+static void rcu_cpu_kthread_park(unsigned int cpu)
+{
+	per_cpu(rcu_cpu_kthread_status, cpu) = RCU_KTHREAD_OFFCPU;
+}
+
+static int rcu_cpu_kthread_should_run(unsigned int cpu)
+{
+	return __this_cpu_read(rcu_cpu_has_work);
+}
+
+/*
+ * Per-CPU kernel thread that invokes RCU callbacks. This replaces the
+ * RCU softirq used in flavors and configurations of RCU that do not
+ * support RCU priority boosting.
+ */
+static void rcu_cpu_kthread(unsigned int cpu)
+{
+	unsigned int *statusp = &__get_cpu_var(rcu_cpu_kthread_status);
+	char work, *workp = &__get_cpu_var(rcu_cpu_has_work);
+	int spincnt;
+
+	for (spincnt = 0; spincnt < 10; spincnt++) {
+		trace_rcu_utilization(TPS("Start CPU kthread@rcu_wait"));
+		local_bh_disable();
+		*statusp = RCU_KTHREAD_RUNNING;
+		this_cpu_inc(rcu_cpu_kthread_loops);
+		local_irq_disable();
+		work = *workp;
+		*workp = 0;
+		local_irq_enable();
+		if (work)
+			rcu_process_callbacks();
+		local_bh_enable();
+		if (*workp == 0) {
+			trace_rcu_utilization(TPS("End CPU kthread@rcu_wait"));
+			*statusp = RCU_KTHREAD_WAITING;
+			return;
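The handoff between invoke_rcu_core() and rcu_cpu_kthread() in the hunk above can be modelled in plain user-space C. What follows is only a single-threaded sketch of the flag protocol, not kernel code: the struct and function names are invented for illustration, interrupt disabling and the real wakeup are left out, and the behaviour after ten busy passes (marking the worker as yielding) is an assumption, since the quoted patch is cut off before that point.

```c
#include <stdbool.h>
#include <stddef.h>

enum kthread_status { KT_WAITING, KT_RUNNING, KT_YIELDING };

struct cpu_ctx {
	bool has_work;               /* models rcu_cpu_has_work */
	enum kthread_status status;  /* models rcu_cpu_kthread_status */
	unsigned int loops;          /* models rcu_cpu_kthread_loops */
	unsigned int processed;      /* batches handled, for illustration only */
};

/*
 * Producer side: flag work and report whether the worker should be woken.
 * Mirrors the rcu_wake_cond() test: a yielding worker is woken only when
 * the caller is the idle task.
 */
static bool request_core_work(struct cpu_ctx *c, bool caller_is_idle)
{
	c->has_work = true;
	return c->status != KT_YIELDING || caller_is_idle;
}

/* Hypothetical processing hook; may flag more work while it runs. */
typedef void (*process_fn)(struct cpu_ctx *c);

/* Example hook modelling a CPU that keeps producing new work. */
static void refill(struct cpu_ctx *c)
{
	c->has_work = true;
}

/* Worker side: one activation of the per-CPU thread function. */
static void worker_pass(struct cpu_ctx *c, process_fn process)
{
	for (int spin = 0; spin < 10; spin++) {
		bool work;

		c->status = KT_RUNNING;
		c->loops++;
		work = c->has_work;   /* the patch snapshots this with irqs off */
		c->has_work = false;
		if (work) {
			c->processed++;
			if (process != NULL)
				process(c);
		}
		if (!c->has_work) {
			c->status = KT_WAITING;
			return;
		}
	}
	/* Assumption: after ten busy passes the worker backs off. */
	c->status = KT_YIELDING;
}
```

With a quiet hook the worker handles one batch and goes back to waiting. With refill() it spins all ten times and ends up yielding, at which point request_core_work() only asks for a wakeup when called from idle.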