Re: RFC: futex_wait() can DoS the tick
On Thu, 2015-06-11 at 20:52 +0200, Thomas Gleixner wrote: > On Thu, 11 Jun 2015, Mike Galbraith wrote: > > On Thu, 2015-06-11 at 15:13 +0200, Thomas Gleixner wrote: > > > > > The kernel does not care whether the actual interrupt happens or the > > > timer is expired by other means. > > > > Final question: Notabug explanation fits.. but why would the behavior > > delta be so radical and repeatable w/wo deadline timer on some boxen? > > Good question. Cannot answer that w/o looking at traces. I may come back with some, I spent enough time doing circular notabug chasing all over hell and creation to be a tad annoyed ;-) -Mike -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Thu, 11 Jun 2015, Mike Galbraith wrote: > On Thu, 2015-06-11 at 15:13 +0200, Thomas Gleixner wrote: > > > The kernel does not care whether the actual interrupt happens or the > > timer is expired by other means. > > Final question: Notabug explanation fits.. but why would the behavior > delta be so radical and repeatable w/wo deadline timer on some boxen? Good question. Cannot answer that w/o looking at traces. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Thu, 2015-06-11 at 15:13 +0200, Thomas Gleixner wrote: > The kernel does not care whether the actual interrupt happens or the > timer is expired by other means. Final question: Notabug explanation fits.. but why would the behavior delta be so radical and repeatable w/wo deadline timer on some boxen? -Mike -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Thu, 2015-06-11 at 15:13 +0200, Thomas Gleixner wrote: > What you are looking at is the actual timer interrupt vector, but > that's not the the tick. > > # grep 'expire_entry.*tick_sched_timer' trace2 | wc -l > 45 > > 175 / 45 =~ 4 . So I you have CPNFIG_HZ=250 > > The kernel does not care whether the actual interrupt happens or the > timer is expired by other means. Learn something new every day. I (obviously) had no idea that it's perfectly fine for interrupts to go away when the CPU is busy. Oh well. Thanks for the new wrinkle in the mush between my ears. -Mike -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Thu, 11 Jun 2015, Mike Galbraith wrote: > On Thu, 2015-06-11 at 10:34 +0200, Thomas Gleixner wrote: > > > Can you try the following, please? > > > > Enable function tracer and hrtimer events manually. Then watch the irq > > count on cpu3. If it stalls or becomes slow, then stop the trace with > > > > echo 0 >/sys/kernel/debug/tracing/tracing_on > > > > If the overhead of the function tracer hides the problem, then try just > > with hrtimer, sched_switch and irq events. > > Yeah, function tracer hides it. After some fiddling with buffer size to > be able to ever see a tick, I finally have a trace trimmed with a tick > on each end, 175ms gap between them. It's still 1.5 after xz, sending > off list. What you are looking at is the actual timer interrupt vector, but that's not the the tick. # grep 'expire_entry.*tick_sched_timer' trace2 | wc -l 45 175 / 45 =~ 4 . So I you have CPNFIG_HZ=250 The kernel does not care whether the actual interrupt happens or the timer is expired by other means. If Xen relies on the vector, it's none of my problems Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Thu, 2015-06-11 at 10:34 +0200, Thomas Gleixner wrote: > Can you try the following, please? > > Enable function tracer and hrtimer events manually. Then watch the irq > count on cpu3. If it stalls or becomes slow, then stop the trace with > > echo 0 >/sys/kernel/debug/tracing/tracing_on > > If the overhead of the function tracer hides the problem, then try just > with hrtimer, sched_switch and irq events. Yeah, function tracer hides it. After some fiddling with buffer size to be able to ever see a tick, I finally have a trace trimmed with a tick on each end, 175ms gap between them. It's still 1.5 after xz, sending off list. -Mike -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Thu, 11 Jun 2015, Mike Galbraith wrote: > On Wed, 2015-06-10 at 20:59 +0200, Thomas Gleixner wrote: > Well, trying to do that like so... > > trace-cmd record -M 8 -p function -e irq:* -e irq_vectors:local* -e > timer:hrtimer* -- sleep 1 > ..capture takes much more than a second... > > LOC: 248161 226536 42091 38892 Local timer interrupts > LOC: 248381 226783 42092 38901 Local timer interrupts > LOC: 248668 227038 42092 38903 Local timer interrupts > LOC: 248963 227277 42092 38904 Local timer interrupts > LOC: 249214 227515 42092 38905 Local timer interrupts > LOC: 249486 227732 42092 38905 Local timer interrupts > LOC: 249720 227961 42092 38905 Local timer interrupts > > ...but more importantly, when it gets cranked up.. > > homer:~/tmp # trace-cmd report > report > homer:~/tmp # grep local_timer_entry report|wc -l > 252 > > ...it scares the problem away. Can you try the following, please? Enable function tracer and hrtimer events manually. Then watch the irq count on cpu3. If it stalls or becomes slow, then stop the trace with echo 0 >/sys/kernel/debug/tracing/tracing_on If the overhead of the function tracer hides the problem, then try just with hrtimer, sched_switch and irq events. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Thu, 2015-06-11 at 09:35 +0200, Mike Galbraith wrote: > On Wed, 2015-06-10 at 17:12 +0200, Thomas Gleixner wrote: > > > I just booted a SNB with lapic=notscdeadline and ran that test > > program. All what happens is - as expected - that ksoftirqd runs more > > than we would like it to. I cannot observe any anomality vs. local > > timer interrupts at all. > > Ditto on a E5-4620 box this morning, no sign of it. E5-2670 box I had > verified "went away" (dropped off the net on reboot, doesn't respond to remote button poking) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Wed, 2015-06-10 at 17:12 +0200, Thomas Gleixner wrote: > I just booted a SNB with lapic=notscdeadline and ran that test > program. All what happens is - as expected - that ksoftirqd runs more > than we would like it to. I cannot observe any anomality vs. local > timer interrupts at all. Ditto on a E5-4620 box this morning, no sign of it. E5-2670 box I had verified "went away" when I put master on it this morning, but that doesn't matter much, having deadline timer feature alone definitely doesn't mean box can be convinced to reproduce the stall. Hohum. -Mike -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Thu, 2015-06-11 at 10:34 +0200, Thomas Gleixner wrote: Can you try the following, please? Enable function tracer and hrtimer events manually. Then watch the irq count on cpu3. If it stalls or becomes slow, then stop the trace with echo 0 /sys/kernel/debug/tracing/tracing_on If the overhead of the function tracer hides the problem, then try just with hrtimer, sched_switch and irq events. Yeah, function tracer hides it. After some fiddling with buffer size to be able to ever see a tick, I finally have a trace trimmed with a tick on each end, 175ms gap between them. It's still 1.5 after xz, sending off list. -Mike -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Thu, 11 Jun 2015, Mike Galbraith wrote: On Thu, 2015-06-11 at 10:34 +0200, Thomas Gleixner wrote: Can you try the following, please? Enable function tracer and hrtimer events manually. Then watch the irq count on cpu3. If it stalls or becomes slow, then stop the trace with echo 0 /sys/kernel/debug/tracing/tracing_on If the overhead of the function tracer hides the problem, then try just with hrtimer, sched_switch and irq events. Yeah, function tracer hides it. After some fiddling with buffer size to be able to ever see a tick, I finally have a trace trimmed with a tick on each end, 175ms gap between them. It's still 1.5 after xz, sending off list. What you are looking at is the actual timer interrupt vector, but that's not the the tick. # grep 'expire_entry.*tick_sched_timer' trace2 | wc -l 45 175 / 45 =~ 4 . So I you have CPNFIG_HZ=250 The kernel does not care whether the actual interrupt happens or the timer is expired by other means. If Xen relies on the vector, it's none of my problems Thanks, tglx -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Thu, 2015-06-11 at 15:13 +0200, Thomas Gleixner wrote: What you are looking at is the actual timer interrupt vector, but that's not the the tick. # grep 'expire_entry.*tick_sched_timer' trace2 | wc -l 45 175 / 45 =~ 4 . So I you have CPNFIG_HZ=250 The kernel does not care whether the actual interrupt happens or the timer is expired by other means. Learn something new every day. I (obviously) had no idea that it's perfectly fine for interrupts to go away when the CPU is busy. Oh well. Thanks for the new wrinkle in the mush between my ears. -Mike -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Wed, 2015-06-10 at 17:12 +0200, Thomas Gleixner wrote: I just booted a SNB with lapic=notscdeadline and ran that test program. All what happens is - as expected - that ksoftirqd runs more than we would like it to. I cannot observe any anomality vs. local timer interrupts at all. Ditto on a E5-4620 box this morning, no sign of it. E5-2670 box I had verified went away when I put master on it this morning, but that doesn't matter much, having deadline timer feature alone definitely doesn't mean box can be convinced to reproduce the stall. Hohum. -Mike -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Thu, 2015-06-11 at 09:35 +0200, Mike Galbraith wrote: On Wed, 2015-06-10 at 17:12 +0200, Thomas Gleixner wrote: I just booted a SNB with lapic=notscdeadline and ran that test program. All what happens is - as expected - that ksoftirqd runs more than we would like it to. I cannot observe any anomality vs. local timer interrupts at all. Ditto on a E5-4620 box this morning, no sign of it. E5-2670 box I had verified went away (dropped off the net on reboot, doesn't respond to remote button poking) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Thu, 11 Jun 2015, Mike Galbraith wrote: On Wed, 2015-06-10 at 20:59 +0200, Thomas Gleixner wrote: Well, trying to do that like so... trace-cmd record -M 8 -p function -e irq:* -e irq_vectors:local* -e timer:hrtimer* -- sleep 1 ..capture takes much more than a second... LOC: 248161 226536 42091 38892 Local timer interrupts LOC: 248381 226783 42092 38901 Local timer interrupts LOC: 248668 227038 42092 38903 Local timer interrupts LOC: 248963 227277 42092 38904 Local timer interrupts LOC: 249214 227515 42092 38905 Local timer interrupts LOC: 249486 227732 42092 38905 Local timer interrupts LOC: 249720 227961 42092 38905 Local timer interrupts ...but more importantly, when it gets cranked up.. homer:~/tmp # trace-cmd report report homer:~/tmp # grep local_timer_entry report|wc -l 252 ...it scares the problem away. Can you try the following, please? Enable function tracer and hrtimer events manually. Then watch the irq count on cpu3. If it stalls or becomes slow, then stop the trace with echo 0 /sys/kernel/debug/tracing/tracing_on If the overhead of the function tracer hides the problem, then try just with hrtimer, sched_switch and irq events. Thanks, tglx -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Thu, 2015-06-11 at 15:13 +0200, Thomas Gleixner wrote: The kernel does not care whether the actual interrupt happens or the timer is expired by other means. Final question: Notabug explanation fits.. but why would the behavior delta be so radical and repeatable w/wo deadline timer on some boxen? -Mike -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Thu, 11 Jun 2015, Mike Galbraith wrote: On Thu, 2015-06-11 at 15:13 +0200, Thomas Gleixner wrote: The kernel does not care whether the actual interrupt happens or the timer is expired by other means. Final question: Notabug explanation fits.. but why would the behavior delta be so radical and repeatable w/wo deadline timer on some boxen? Good question. Cannot answer that w/o looking at traces. Thanks, tglx -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Thu, 2015-06-11 at 20:52 +0200, Thomas Gleixner wrote: On Thu, 11 Jun 2015, Mike Galbraith wrote: On Thu, 2015-06-11 at 15:13 +0200, Thomas Gleixner wrote: The kernel does not care whether the actual interrupt happens or the timer is expired by other means. Final question: Notabug explanation fits.. but why would the behavior delta be so radical and repeatable w/wo deadline timer on some boxen? Good question. Cannot answer that w/o looking at traces. I may come back with some, I spent enough time doing circular notabug chasing all over hell and creation to be a tad annoyed ;-) -Mike -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Wed, 2015-06-10 at 20:59 +0200, Thomas Gleixner wrote: > On Wed, 10 Jun 2015, Mike Galbraith wrote: > > > On Wed, 2015-06-10 at 17:12 +0200, Thomas Gleixner wrote: > > > > > Some more information about your symptoms in form of configuration, > > > extra patches, kernel traces etc. would be appreciated. > > > > homer:~/tmp # perf record -C 3 -eirq:*,irq_vectors:local* -- sleep 10 > > [ perf record: Woken up 2 times to write data ] > > [ perf record: Captured and wrote 1380.507 MB perf.data (19530595 samples) ] > > homer:~/tmp # perf script | grep irq_vectors:local_timer > > ksoftirqd/319 [003] 5586.564238: irq_vectors:local_timer_entry: > > vector=239 > > ksoftirqd/319 [003] 5586.564239: irq_vectors:local_timer_exit: > > vector=239 > > ksoftirqd/319 [003] 5586.568238: irq_vectors:local_timer_entry: > > vector=239 > > ksoftirqd/319 [003] 5586.568239: irq_vectors:local_timer_exit: > > vector=239 > > ksoftirqd/319 [003] 5586.632218: irq_vectors:local_timer_entry: > > vector=239 > > That gap is odd. Can you fire up the function tracer please? Well, trying to do that like so... trace-cmd record -M 8 -p function -e irq:* -e irq_vectors:local* -e timer:hrtimer* -- sleep 1 ..capture takes much more than a second... LOC: 248161 226536 42091 38892 Local timer interrupts LOC: 248381 226783 42092 38901 Local timer interrupts LOC: 248668 227038 42092 38903 Local timer interrupts LOC: 248963 227277 42092 38904 Local timer interrupts LOC: 249214 227515 42092 38905 Local timer interrupts LOC: 249486 227732 42092 38905 Local timer interrupts LOC: 249720 227961 42092 38905 Local timer interrupts LOC: 249996 228200 42092 38906 Local timer interrupts LOC: 250215 228450 42092 38906 Local timer interrupts LOC: 250489 228704 42092 38906 Local timer interrupts LOC: 250729 228952 42099 38964 Local timer interrupts LOC: 250951 229141 42100 39215 Local timer interrupts LOC: 251168 229374 42119 39468 Local timer interrupts LOC: 251426 229641 42131 39719 Local timer interrupts LOC: 251652 229868 42131 39970 Local timer interrupts LOC: 251881 230141 42132 40220 Local timer interrupts LOC: 252131 230371 42132 40471 Local timer interrupts LOC: 252361 230646 42133 40721 Local timer interrupts LOC: 252611 230875 42133 40972 Local timer interrupts LOC: 252866 231149 42137 41222 Local timer interrupts LOC: 253103 231386 42153 41450 Local timer interrupts ...but more importantly, when it gets cranked up.. homer:~/tmp # trace-cmd report > report homer:~/tmp # grep local_timer_entry report|wc -l 252 ...it scares the problem away. (4 cpus because I fired up cpuset shield, isolates 2-3, takes 4-7 down) -Mike -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Wed, 10 Jun 2015, Steven Rostedt wrote: > On Wed, 10 Jun 2015 20:59:49 +0200 (CEST) > Thomas Gleixner wrote: > > > On Wed, 10 Jun 2015, Mike Galbraith wrote: > > > > > On Wed, 2015-06-10 at 17:12 +0200, Thomas Gleixner wrote: > > > > > > > Some more information about your symptoms in form of configuration, > > > > extra patches, kernel traces etc. would be appreciated. > > > > > > homer:~/tmp # perf record -C 3 -eirq:*,irq_vectors:local* -- sleep 10 > > I would recommend using trace-cmd especially if you add function > tracing, as it's made more for fast tracing. > > -- Steve > > > > [ perf record: Woken up 2 times to write data ] > > > [ perf record: Captured and wrote 1380.507 MB perf.data (19530595 > > > samples) ] > > > homer:~/tmp # perf script | grep irq_vectors:local_timer > > > ksoftirqd/319 [003] 5586.564238: irq_vectors:local_timer_entry: > > > vector=239 > > > ksoftirqd/319 [003] 5586.564239: irq_vectors:local_timer_exit: > > > vector=239 > > > ksoftirqd/319 [003] 5586.568238: irq_vectors:local_timer_entry: > > > vector=239 > > > ksoftirqd/319 [003] 5586.568239: irq_vectors:local_timer_exit: > > > vector=239 > > > ksoftirqd/319 [003] 5586.632218: irq_vectors:local_timer_entry: > > > vector=239 > > > > That gap is odd. Can you fire up the function tracer please? And please enable hrtimer events as well. Output of that cpu is good enough. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Wed, 10 Jun 2015 20:59:49 +0200 (CEST) Thomas Gleixner wrote: > On Wed, 10 Jun 2015, Mike Galbraith wrote: > > > On Wed, 2015-06-10 at 17:12 +0200, Thomas Gleixner wrote: > > > > > Some more information about your symptoms in form of configuration, > > > extra patches, kernel traces etc. would be appreciated. > > > > homer:~/tmp # perf record -C 3 -eirq:*,irq_vectors:local* -- sleep 10 I would recommend using trace-cmd especially if you add function tracing, as it's made more for fast tracing. -- Steve > > [ perf record: Woken up 2 times to write data ] > > [ perf record: Captured and wrote 1380.507 MB perf.data (19530595 samples) ] > > homer:~/tmp # perf script | grep irq_vectors:local_timer > > ksoftirqd/319 [003] 5586.564238: irq_vectors:local_timer_entry: > > vector=239 > > ksoftirqd/319 [003] 5586.564239: irq_vectors:local_timer_exit: > > vector=239 > > ksoftirqd/319 [003] 5586.568238: irq_vectors:local_timer_entry: > > vector=239 > > ksoftirqd/319 [003] 5586.568239: irq_vectors:local_timer_exit: > > vector=239 > > ksoftirqd/319 [003] 5586.632218: irq_vectors:local_timer_entry: > > vector=239 > > That gap is odd. Can you fire up the function tracer please? > > Thanks, > > tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Wed, 10 Jun 2015, Mike Galbraith wrote: > On Wed, 2015-06-10 at 17:12 +0200, Thomas Gleixner wrote: > > > Some more information about your symptoms in form of configuration, > > extra patches, kernel traces etc. would be appreciated. > > homer:~/tmp # perf record -C 3 -eirq:*,irq_vectors:local* -- sleep 10 > [ perf record: Woken up 2 times to write data ] > [ perf record: Captured and wrote 1380.507 MB perf.data (19530595 samples) ] > homer:~/tmp # perf script | grep irq_vectors:local_timer > ksoftirqd/319 [003] 5586.564238: irq_vectors:local_timer_entry: > vector=239 > ksoftirqd/319 [003] 5586.564239: irq_vectors:local_timer_exit: > vector=239 > ksoftirqd/319 [003] 5586.568238: irq_vectors:local_timer_entry: > vector=239 > ksoftirqd/319 [003] 5586.568239: irq_vectors:local_timer_exit: > vector=239 > ksoftirqd/319 [003] 5586.632218: irq_vectors:local_timer_entry: > vector=239 That gap is odd. Can you fire up the function tracer please? Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Wed, 10 Jun 2015, Mike Galbraith wrote: > Virgin source or kernels with zillion+ patches, doesn't matter. To test > virgin source earlier than EFI_STUB I had to pollute the source with > EFI backports, but nothing else. > > Just a sec while I check yet again that absolutely virgin master really > really does stall Yup. I pinned the tescase to CPU3.. > > LOC: 7432 5803 5418 6387 3371 3039 > 5757 3080 Local timer interrupts <== here > LOC: 7560 6028 5632 6394 3538 3195 > 5937 3084 Local timer interrupts > LOC: 7747 6135 5720 6394 3543 3262 > 6087 3086 Local timer interrupts > LOC: 7930 6206 5785 6394 3571 3288 > 6303 3087 Local timer interrupts So now it would be helpful to have a trace to see what CPU3 is actually doing. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Wed, 2015-06-10 at 17:12 +0200, Thomas Gleixner wrote: > Some more information about your symptoms in form of configuration, > extra patches, kernel traces etc. would be appreciated. homer:~/tmp # perf record -C 3 -eirq:*,irq_vectors:local* -- sleep 10 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 1380.507 MB perf.data (19530595 samples) ] homer:~/tmp # perf script | grep irq_vectors:local_timer ksoftirqd/319 [003] 5586.564238: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5586.564239: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5586.568238: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5586.568239: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5586.632218: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5586.632219: irq_vectors:local_timer_exit: vector=239 bnc886785_livel 9817 [003] 5587.056069: irq_vectors:local_timer_entry: vector=239 bnc886785_livel 9817 [003] 5587.056071: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5587.307978: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5587.307978: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5588.255633: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5588.255635: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5589.627132: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5589.627133: irq_vectors:local_timer_exit: vector=239 bnc886785_livel 9817 [003] 5590.334877: irq_vectors:local_timer_entry: vector=239 bnc886785_livel 9817 [003] 5590.334883: irq_vectors:local_timer_exit: vector=239 bnc886785_livel 9817 [003] 5590.510815: irq_vectors:local_timer_entry: vector=239 bnc886785_livel 9817 [003] 5590.510816: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5590.558794: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5590.558796: irq_vectors:local_timer_exit: vector=239 bnc886785_livel 9817 [003] 5590.822707: irq_vectors:local_timer_entry: vector=239 bnc886785_livel 9817 [003] 5590.822709: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5592.346152: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5592.346153: irq_vectors:local_timer_exit: vector=239 bnc886785_livel 9817 [003] 5592.558074: irq_vectors:local_timer_entry: vector=239 bnc886785_livel 9817 [003] 5592.558076: irq_vectors:local_timer_exit: vector=239 bnc886785_livel 9817 [003] 5593.741647: irq_vectors:local_timer_entry: vector=239 bnc886785_livel 9817 [003] 5593.741649: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5594.953208: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5594.953209: irq_vectors:local_timer_exit: vector=239 bnc886785_livel 9817 [003] 5595.009197: irq_vectors:local_timer_entry: vector=239 bnc886785_livel 9817 [003] 5595.009198: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5595.397049: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5595.397050: irq_vectors:local_timer_exit: vector=239 bnc886785_livel 9817 [003] 5595.473025: irq_vectors:local_timer_entry: vector=239 bnc886785_livel 9817 [003] 5595.473027: irq_vectors:local_timer_exit: vector=239 homer:~/tmp # -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Wed, 2015-06-10 at 17:12 +0200, Thomas Gleixner wrote: > On Wed, 10 Jun 2015, Mike Galbraith wrote: > > The above was handed to me by a colleague working on a Xen guest that > > livelocked. I at first though Xen arch must have a weird problem, but > > when I tried proggy on my desktop box, while it didn't stop the tick > > completely as it did the Xen box, it slowed it to a crawl. I noticed > > that this did not happen with newer kernels, so a bisecting I did go, > > and found that... > > > > 279f14614 x86: apic: Use tsc deadline for oneshot when available > > > > ..is what fixed it up. Trouble is, while it fixes up my Haswell box, a > > This does not make any sense at all. It does not matter whether the > box uses tscdeadline or local apic timer. We do not even program the > hardware because we see that the event is in the past already. Yup. > So we raise the hrtimer softirqd, which then expires the timer. So all > what happens is that ksoftirqd accumulates runtime, but I cannot at > all see how that amounts to a DoS and brings the machine to a grinding > halt. The tick certainly appears to crawl here, and Dom0 boxen gripe if you let them not tick at all for a while. > I just booted a SNB with lapic=notscdeadline and ran that test > program. All what happens is - as expected - that ksoftirqd runs more > than we would like it to. I cannot observe any anomality vs. local > timer interrupts at all. If I run this pinned on an otherwise idle > core, then I get ~ CONFIG_HZ interrupts per second, which is what you > expect when the cpu never reaches idle. Hm. In order to successfully bisect the thing 3.7->3.8 I ran 2xCPUS copies because the first bisect went gaga. I'm not having any trouble reproducing on master with a single pinned copy though, nor did I have any on any of the kernels either stable or enterprise I tested, and that's quite a few. Whatever, that first bisect did go bad. > > The below targets the symptom, consider it hrtimer cluebat attractant. > > By now I know to take your patches with a grain of salt :) Sodium being bad for blood pressure is a medical myth. > Some more information about your symptoms in form of configuration, > extra patches, kernel traces etc. would be appreciated. Virgin source or kernels with zillion+ patches, doesn't matter. To test virgin source earlier than EFI_STUB I had to pollute the source with EFI backports, but nothing else. Just a sec while I check yet again that absolutely virgin master really really does stall Yup. I pinned the tescase to CPU3.. while sleep 1; do grep LOC /proc/interrupts; done LOC: 6706 5367 5053 6217 3031 2866 5477 3022 Local timer interrupts LOC: 6753 5391 5074 6238 3058 2894 5576 3034 Local timer interrupts LOC: 6791 5422 5104 6265 3066 2903 5582 3039 Local timer interrupts LOC: 6846 5472 5154 6293 3096 2909 5595 3042 Local timer interrupts LOC: 6855 5518 5177 6325 3199 2920 5613 3046 Local timer interrupts LOC: 6892 5552 5217 6338 3234 2935 5637 3053 Local timer interrupts LOC: 6983 5568 5236 6347 3244 2944 5660 3065 Local timer interrupts LOC: 7028 5583 5251 6363 3262 2963 5673 3071 Local timer interrupts LOC: 7217 5676 5343 6383 3305 2976 5682 3078 Local timer interrupts LOC: 7432 5803 5418 6387 3371 3039 5757 3080 Local timer interrupts <== here LOC: 7560 6028 5632 6394 3538 3195 5937 3084 Local timer interrupts LOC: 7747 6135 5720 6394 3543 3262 6087 3086 Local timer interrupts LOC: 7930 6206 5785 6394 3571 3288 6303 3087 Local timer interrupts LOC: 8057 6299 5842 6394 3606 3346 6415 3088 Local timer interrupts LOC: 8236 6361 5921 6394 3632 3409 6630 3090 Local timer interrupts LOC: 8382 6448 6004 6394 3664 3478 6754 3090 Local timer interrupts LOC: 8460 6571 6124 6394 3690 3542 6951 3092 Local timer interrupts LOC: 8605 6670 6224 6394 3723 3614 7078 3093 Local timer interrupts LOC: 8710 6842 6323 6394 3776 3702 7295 3123 Local timer interrupts LOC: 8868 6947 6402 6394 3828 3784 7422 3149 Local timer
Re: RFC: futex_wait() can DoS the tick
On Wed, 10 Jun 2015, Mike Galbraith wrote: > The above was handed to me by a colleague working on a Xen guest that > livelocked. I at first though Xen arch must have a weird problem, but > when I tried proggy on my desktop box, while it didn't stop the tick > completely as it did the Xen box, it slowed it to a crawl. I noticed > that this did not happen with newer kernels, so a bisecting I did go, > and found that... > > 279f14614 x86: apic: Use tsc deadline for oneshot when available > > ..is what fixed it up. Trouble is, while it fixes up my Haswell box, a This does not make any sense at all. It does not matter whether the box uses tscdeadline or local apic timer. We do not even program the hardware because we see that the event is in the past already. So we raise the hrtimer softirqd, which then expires the timer. So all what happens is that ksoftirqd accumulates runtime, but I cannot at all see how that amounts to a DoS and brings the machine to a grinding halt. > Xen dom0 remains busted by that testcase whether that patch is applied > to the host or not, even though the hypervisor supports deadline timer, > and seemingly regardless of CPU type all together. > > Of all the x86_64 bare metal boxen I've tested, only those with the TSC > deadline timer have shown the issue, and there it goes away as of v3.8 > unless you boot lapic=notscdeadline. I just booted a SNB with lapic=notscdeadline and ran that test program. All what happens is - as expected - that ksoftirqd runs more than we would like it to. I cannot observe any anomality vs. local timer interrupts at all. If I run this pinned on an otherwise idle core, then I get ~ CONFIG_HZ interrupts per second, which is what you expect when the cpu never reaches idle. With the changes pending in tip/timers/core we get more timer interrupts instead of offloading crap to ksoftirqd, but they cannot lead to a DoS either and we do not care whether the user spends its cycles looping in user space or firing timer interrupts. It can only do as long as it is on the cpu. These timers (futex, nanosleep, poll, ...) are oneshot and all timers which are self rearming are rate limited by the fact that we only rearm when the previous event has been consumed by the task which scheduled it. So the scheduler controls how many of these events can be created from user space. > However, given any x86_64 Intel box with TSC deadline timer (ivy, sandy, > hasbeen) can be made to exhibit the symptom, there may be other arches > that get seriously dinged up or maybe even as thoroughly b0rked as Xen > does when hrtimer_interrupt() is pounded into the ground by userspace. > > Alternatively, should someone out there know that all bare metal is in > fact fine post 279f14614, that person will likely also know what the Xen > folks need to do to fix up their busted arch. > > The below targets the symptom, consider it hrtimer cluebat attractant. By now I know to take your patches with a grain of salt :) Some more information about your symptoms in form of configuration, extra patches, kernel traces etc. would be appreciated. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RFC: futex_wait() can DoS the tick
Greetings, Like so... #include #include #include #include int sys_futex(void *addr1, int op, int val1, struct timespec *timeout, void *addr2, int val3) { return syscall(SYS_futex, addr1, op, val1, timeout, addr2, val3); } int main() { struct timespec t; int f = 1; clock_gettime(CLOCK_REALTIME, ); t.tv_sec -= 10; while (1) { sys_futex(, FUTEX_WAIT_BITSET | FUTEX_CLOCK_REALTIME, 1, , NULL, FUTEX_BITSET_MATCH_ANY); } } The above was handed to me by a colleague working on a Xen guest that livelocked. I at first though Xen arch must have a weird problem, but when I tried proggy on my desktop box, while it didn't stop the tick completely as it did the Xen box, it slowed it to a crawl. I noticed that this did not happen with newer kernels, so a bisecting I did go, and found that... 279f14614 x86: apic: Use tsc deadline for oneshot when available ..is what fixed it up. Trouble is, while it fixes up my Haswell box, a Xen dom0 remains busted by that testcase whether that patch is applied to the host or not, even though the hypervisor supports deadline timer, and seemingly regardless of CPU type all together. Of all the x86_64 bare metal boxen I've tested, only those with the TSC deadline timer have shown the issue, and there it goes away as of v3.8 unless you boot lapic=notscdeadline. However, given any x86_64 Intel box with TSC deadline timer (ivy, sandy, hasbeen) can be made to exhibit the symptom, there may be other arches that get seriously dinged up or maybe even as thoroughly b0rked as Xen does when hrtimer_interrupt() is pounded into the ground by userspace. Alternatively, should someone out there know that all bare metal is in fact fine post 279f14614, that person will likely also know what the Xen folks need to do to fix up their busted arch. The below targets the symptom, consider it hrtimer cluebat attractant. --- kernel/time/hrtimer.c | 31 --- 1 file changed, 28 insertions(+), 3 deletions(-) --- a/kernel/time/hrtimer.c +++ b/kernel/time/hrtimer.c @@ -933,6 +933,8 @@ remove_hrtimer(struct hrtimer *timer, st return 0; } +static enum hrtimer_restart hrtimer_wakeup(struct hrtimer *timer); + int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim, unsigned long delta_ns, const enum hrtimer_mode mode, int wakeup) @@ -980,8 +982,27 @@ int __hrtimer_start_range_ns(struct hrti * on dynticks target. */ wake_up_nohz_cpu(new_base->cpu_base->cpu); - } else if (new_base->cpu_base == this_cpu_ptr(_bases) && - hrtimer_reprogram(timer, new_base)) { + } else if (new_base->cpu_base == this_cpu_ptr(_bases)) { + int res = hrtimer_reprogram(timer, new_base); + + if (!res) + goto out; + + /* +* If a buggy app tries forever to be awakened in the past, +* banging on hrtimer_interrupt() at high speed can stall +* the tick, and on a Xen box, forever. On haswell with +* tsc_deadline_timer disabled you can see it, though it +* only slows the tick way down. Other bare metal boxes +* may also be terminally affected. +*/ + if (unlikely(wakeup && !ret && IS_ERR_VALUE(res) && + timer->function == hrtimer_wakeup)) { + debug_deactivate(timer); + __remove_hrtimer(timer, new_base, 0, 0); + ret = -ETIMEDOUT; + } + /* * Only allow reprogramming if the new base is on this CPU. * (it might still be on another CPU if the timer was pending) @@ -994,7 +1015,10 @@ int __hrtimer_start_range_ns(struct hrti * lock ordering issue vs. rq->lock. */ raw_spin_unlock(_base->cpu_base->lock); - raise_softirq_irqoff(HRTIMER_SOFTIRQ); + if (!IS_ERR_VALUE(ret)) + raise_softirq_irqoff(HRTIMER_SOFTIRQ); + else + hrtimer_wakeup(timer); local_irq_restore(flags); return ret; } else { @@ -1002,6 +1026,7 @@ int __hrtimer_start_range_ns(struct hrti } } +out: unlock_hrtimer_base(timer, ); return ret; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RFC: futex_wait() can DoS the tick
Greetings, Like so... #include time.h #include sys/time.h #include linux/futex.h #include sys/syscall.h int sys_futex(void *addr1, int op, int val1, struct timespec *timeout, void *addr2, int val3) { return syscall(SYS_futex, addr1, op, val1, timeout, addr2, val3); } int main() { struct timespec t; int f = 1; clock_gettime(CLOCK_REALTIME, t); t.tv_sec -= 10; while (1) { sys_futex(f, FUTEX_WAIT_BITSET | FUTEX_CLOCK_REALTIME, 1, t, NULL, FUTEX_BITSET_MATCH_ANY); } } The above was handed to me by a colleague working on a Xen guest that livelocked. I at first though Xen arch must have a weird problem, but when I tried proggy on my desktop box, while it didn't stop the tick completely as it did the Xen box, it slowed it to a crawl. I noticed that this did not happen with newer kernels, so a bisecting I did go, and found that... 279f14614 x86: apic: Use tsc deadline for oneshot when available ..is what fixed it up. Trouble is, while it fixes up my Haswell box, a Xen dom0 remains busted by that testcase whether that patch is applied to the host or not, even though the hypervisor supports deadline timer, and seemingly regardless of CPU type all together. Of all the x86_64 bare metal boxen I've tested, only those with the TSC deadline timer have shown the issue, and there it goes away as of v3.8 unless you boot lapic=notscdeadline. However, given any x86_64 Intel box with TSC deadline timer (ivy, sandy, hasbeen) can be made to exhibit the symptom, there may be other arches that get seriously dinged up or maybe even as thoroughly b0rked as Xen does when hrtimer_interrupt() is pounded into the ground by userspace. Alternatively, should someone out there know that all bare metal is in fact fine post 279f14614, that person will likely also know what the Xen folks need to do to fix up their busted arch. The below targets the symptom, consider it hrtimer cluebat attractant. --- kernel/time/hrtimer.c | 31 --- 1 file changed, 28 insertions(+), 3 deletions(-) --- a/kernel/time/hrtimer.c +++ b/kernel/time/hrtimer.c @@ -933,6 +933,8 @@ remove_hrtimer(struct hrtimer *timer, st return 0; } +static enum hrtimer_restart hrtimer_wakeup(struct hrtimer *timer); + int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim, unsigned long delta_ns, const enum hrtimer_mode mode, int wakeup) @@ -980,8 +982,27 @@ int __hrtimer_start_range_ns(struct hrti * on dynticks target. */ wake_up_nohz_cpu(new_base-cpu_base-cpu); - } else if (new_base-cpu_base == this_cpu_ptr(hrtimer_bases) - hrtimer_reprogram(timer, new_base)) { + } else if (new_base-cpu_base == this_cpu_ptr(hrtimer_bases)) { + int res = hrtimer_reprogram(timer, new_base); + + if (!res) + goto out; + + /* +* If a buggy app tries forever to be awakened in the past, +* banging on hrtimer_interrupt() at high speed can stall +* the tick, and on a Xen box, forever. On haswell with +* tsc_deadline_timer disabled you can see it, though it +* only slows the tick way down. Other bare metal boxes +* may also be terminally affected. +*/ + if (unlikely(wakeup !ret IS_ERR_VALUE(res) + timer-function == hrtimer_wakeup)) { + debug_deactivate(timer); + __remove_hrtimer(timer, new_base, 0, 0); + ret = -ETIMEDOUT; + } + /* * Only allow reprogramming if the new base is on this CPU. * (it might still be on another CPU if the timer was pending) @@ -994,7 +1015,10 @@ int __hrtimer_start_range_ns(struct hrti * lock ordering issue vs. rq-lock. */ raw_spin_unlock(new_base-cpu_base-lock); - raise_softirq_irqoff(HRTIMER_SOFTIRQ); + if (!IS_ERR_VALUE(ret)) + raise_softirq_irqoff(HRTIMER_SOFTIRQ); + else + hrtimer_wakeup(timer); local_irq_restore(flags); return ret; } else { @@ -1002,6 +1026,7 @@ int __hrtimer_start_range_ns(struct hrti } } +out: unlock_hrtimer_base(timer, flags); return ret; -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Wed, 2015-06-10 at 17:12 +0200, Thomas Gleixner wrote: On Wed, 10 Jun 2015, Mike Galbraith wrote: The above was handed to me by a colleague working on a Xen guest that livelocked. I at first though Xen arch must have a weird problem, but when I tried proggy on my desktop box, while it didn't stop the tick completely as it did the Xen box, it slowed it to a crawl. I noticed that this did not happen with newer kernels, so a bisecting I did go, and found that... 279f14614 x86: apic: Use tsc deadline for oneshot when available ..is what fixed it up. Trouble is, while it fixes up my Haswell box, a This does not make any sense at all. It does not matter whether the box uses tscdeadline or local apic timer. We do not even program the hardware because we see that the event is in the past already. Yup. So we raise the hrtimer softirqd, which then expires the timer. So all what happens is that ksoftirqd accumulates runtime, but I cannot at all see how that amounts to a DoS and brings the machine to a grinding halt. The tick certainly appears to crawl here, and Dom0 boxen gripe if you let them not tick at all for a while. I just booted a SNB with lapic=notscdeadline and ran that test program. All what happens is - as expected - that ksoftirqd runs more than we would like it to. I cannot observe any anomality vs. local timer interrupts at all. If I run this pinned on an otherwise idle core, then I get ~ CONFIG_HZ interrupts per second, which is what you expect when the cpu never reaches idle. Hm. In order to successfully bisect the thing 3.7-3.8 I ran 2xCPUS copies because the first bisect went gaga. I'm not having any trouble reproducing on master with a single pinned copy though, nor did I have any on any of the kernels either stable or enterprise I tested, and that's quite a few. Whatever, that first bisect did go bad. The below targets the symptom, consider it hrtimer cluebat attractant. By now I know to take your patches with a grain of salt :) Sodium being bad for blood pressure is a medical myth. Some more information about your symptoms in form of configuration, extra patches, kernel traces etc. would be appreciated. Virgin source or kernels with zillion+ patches, doesn't matter. To test virgin source earlier than EFI_STUB I had to pollute the source with EFI backports, but nothing else. Just a sec while I check yet again that absolutely virgin master really really does stall Yup. I pinned the tescase to CPU3.. while sleep 1; do grep LOC /proc/interrupts; done LOC: 6706 5367 5053 6217 3031 2866 5477 3022 Local timer interrupts LOC: 6753 5391 5074 6238 3058 2894 5576 3034 Local timer interrupts LOC: 6791 5422 5104 6265 3066 2903 5582 3039 Local timer interrupts LOC: 6846 5472 5154 6293 3096 2909 5595 3042 Local timer interrupts LOC: 6855 5518 5177 6325 3199 2920 5613 3046 Local timer interrupts LOC: 6892 5552 5217 6338 3234 2935 5637 3053 Local timer interrupts LOC: 6983 5568 5236 6347 3244 2944 5660 3065 Local timer interrupts LOC: 7028 5583 5251 6363 3262 2963 5673 3071 Local timer interrupts LOC: 7217 5676 5343 6383 3305 2976 5682 3078 Local timer interrupts LOC: 7432 5803 5418 6387 3371 3039 5757 3080 Local timer interrupts == here LOC: 7560 6028 5632 6394 3538 3195 5937 3084 Local timer interrupts LOC: 7747 6135 5720 6394 3543 3262 6087 3086 Local timer interrupts LOC: 7930 6206 5785 6394 3571 3288 6303 3087 Local timer interrupts LOC: 8057 6299 5842 6394 3606 3346 6415 3088 Local timer interrupts LOC: 8236 6361 5921 6394 3632 3409 6630 3090 Local timer interrupts LOC: 8382 6448 6004 6394 3664 3478 6754 3090 Local timer interrupts LOC: 8460 6571 6124 6394 3690 3542 6951 3092 Local timer interrupts LOC: 8605 6670 6224 6394 3723 3614 7078 3093 Local timer interrupts LOC: 8710 6842 6323 6394 3776 3702 7295 3123 Local timer interrupts LOC: 8868 6947 6402 6394 3828 3784 7422 3149 Local timer interrupts LOC: 9077 7124 6523
Re: RFC: futex_wait() can DoS the tick
On Wed, 10 Jun 2015, Steven Rostedt wrote: On Wed, 10 Jun 2015 20:59:49 +0200 (CEST) Thomas Gleixner t...@linutronix.de wrote: On Wed, 10 Jun 2015, Mike Galbraith wrote: On Wed, 2015-06-10 at 17:12 +0200, Thomas Gleixner wrote: Some more information about your symptoms in form of configuration, extra patches, kernel traces etc. would be appreciated. homer:~/tmp # perf record -C 3 -eirq:*,irq_vectors:local* -- sleep 10 I would recommend using trace-cmd especially if you add function tracing, as it's made more for fast tracing. -- Steve [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 1380.507 MB perf.data (19530595 samples) ] homer:~/tmp # perf script | grep irq_vectors:local_timer ksoftirqd/319 [003] 5586.564238: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5586.564239: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5586.568238: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5586.568239: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5586.632218: irq_vectors:local_timer_entry: vector=239 That gap is odd. Can you fire up the function tracer please? And please enable hrtimer events as well. Output of that cpu is good enough. Thanks, tglx -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Wed, 10 Jun 2015 20:59:49 +0200 (CEST) Thomas Gleixner t...@linutronix.de wrote: On Wed, 10 Jun 2015, Mike Galbraith wrote: On Wed, 2015-06-10 at 17:12 +0200, Thomas Gleixner wrote: Some more information about your symptoms in form of configuration, extra patches, kernel traces etc. would be appreciated. homer:~/tmp # perf record -C 3 -eirq:*,irq_vectors:local* -- sleep 10 I would recommend using trace-cmd especially if you add function tracing, as it's made more for fast tracing. -- Steve [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 1380.507 MB perf.data (19530595 samples) ] homer:~/tmp # perf script | grep irq_vectors:local_timer ksoftirqd/319 [003] 5586.564238: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5586.564239: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5586.568238: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5586.568239: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5586.632218: irq_vectors:local_timer_entry: vector=239 That gap is odd. Can you fire up the function tracer please? Thanks, tglx -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Wed, 10 Jun 2015, Mike Galbraith wrote: Virgin source or kernels with zillion+ patches, doesn't matter. To test virgin source earlier than EFI_STUB I had to pollute the source with EFI backports, but nothing else. Just a sec while I check yet again that absolutely virgin master really really does stall Yup. I pinned the tescase to CPU3.. LOC: 7432 5803 5418 6387 3371 3039 5757 3080 Local timer interrupts == here LOC: 7560 6028 5632 6394 3538 3195 5937 3084 Local timer interrupts LOC: 7747 6135 5720 6394 3543 3262 6087 3086 Local timer interrupts LOC: 7930 6206 5785 6394 3571 3288 6303 3087 Local timer interrupts So now it would be helpful to have a trace to see what CPU3 is actually doing. Thanks, tglx -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Wed, 10 Jun 2015, Mike Galbraith wrote: On Wed, 2015-06-10 at 17:12 +0200, Thomas Gleixner wrote: Some more information about your symptoms in form of configuration, extra patches, kernel traces etc. would be appreciated. homer:~/tmp # perf record -C 3 -eirq:*,irq_vectors:local* -- sleep 10 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 1380.507 MB perf.data (19530595 samples) ] homer:~/tmp # perf script | grep irq_vectors:local_timer ksoftirqd/319 [003] 5586.564238: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5586.564239: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5586.568238: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5586.568239: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5586.632218: irq_vectors:local_timer_entry: vector=239 That gap is odd. Can you fire up the function tracer please? Thanks, tglx -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Wed, 2015-06-10 at 17:12 +0200, Thomas Gleixner wrote: Some more information about your symptoms in form of configuration, extra patches, kernel traces etc. would be appreciated. homer:~/tmp # perf record -C 3 -eirq:*,irq_vectors:local* -- sleep 10 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 1380.507 MB perf.data (19530595 samples) ] homer:~/tmp # perf script | grep irq_vectors:local_timer ksoftirqd/319 [003] 5586.564238: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5586.564239: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5586.568238: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5586.568239: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5586.632218: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5586.632219: irq_vectors:local_timer_exit: vector=239 bnc886785_livel 9817 [003] 5587.056069: irq_vectors:local_timer_entry: vector=239 bnc886785_livel 9817 [003] 5587.056071: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5587.307978: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5587.307978: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5588.255633: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5588.255635: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5589.627132: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5589.627133: irq_vectors:local_timer_exit: vector=239 bnc886785_livel 9817 [003] 5590.334877: irq_vectors:local_timer_entry: vector=239 bnc886785_livel 9817 [003] 5590.334883: irq_vectors:local_timer_exit: vector=239 bnc886785_livel 9817 [003] 5590.510815: irq_vectors:local_timer_entry: vector=239 bnc886785_livel 9817 [003] 5590.510816: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5590.558794: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5590.558796: irq_vectors:local_timer_exit: vector=239 bnc886785_livel 9817 [003] 5590.822707: irq_vectors:local_timer_entry: vector=239 bnc886785_livel 9817 [003] 5590.822709: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5592.346152: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5592.346153: irq_vectors:local_timer_exit: vector=239 bnc886785_livel 9817 [003] 5592.558074: irq_vectors:local_timer_entry: vector=239 bnc886785_livel 9817 [003] 5592.558076: irq_vectors:local_timer_exit: vector=239 bnc886785_livel 9817 [003] 5593.741647: irq_vectors:local_timer_entry: vector=239 bnc886785_livel 9817 [003] 5593.741649: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5594.953208: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5594.953209: irq_vectors:local_timer_exit: vector=239 bnc886785_livel 9817 [003] 5595.009197: irq_vectors:local_timer_entry: vector=239 bnc886785_livel 9817 [003] 5595.009198: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5595.397049: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5595.397050: irq_vectors:local_timer_exit: vector=239 bnc886785_livel 9817 [003] 5595.473025: irq_vectors:local_timer_entry: vector=239 bnc886785_livel 9817 [003] 5595.473027: irq_vectors:local_timer_exit: vector=239 homer:~/tmp # -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Wed, 2015-06-10 at 20:59 +0200, Thomas Gleixner wrote: On Wed, 10 Jun 2015, Mike Galbraith wrote: On Wed, 2015-06-10 at 17:12 +0200, Thomas Gleixner wrote: Some more information about your symptoms in form of configuration, extra patches, kernel traces etc. would be appreciated. homer:~/tmp # perf record -C 3 -eirq:*,irq_vectors:local* -- sleep 10 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 1380.507 MB perf.data (19530595 samples) ] homer:~/tmp # perf script | grep irq_vectors:local_timer ksoftirqd/319 [003] 5586.564238: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5586.564239: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5586.568238: irq_vectors:local_timer_entry: vector=239 ksoftirqd/319 [003] 5586.568239: irq_vectors:local_timer_exit: vector=239 ksoftirqd/319 [003] 5586.632218: irq_vectors:local_timer_entry: vector=239 That gap is odd. Can you fire up the function tracer please? Well, trying to do that like so... trace-cmd record -M 8 -p function -e irq:* -e irq_vectors:local* -e timer:hrtimer* -- sleep 1 ..capture takes much more than a second... LOC: 248161 226536 42091 38892 Local timer interrupts LOC: 248381 226783 42092 38901 Local timer interrupts LOC: 248668 227038 42092 38903 Local timer interrupts LOC: 248963 227277 42092 38904 Local timer interrupts LOC: 249214 227515 42092 38905 Local timer interrupts LOC: 249486 227732 42092 38905 Local timer interrupts LOC: 249720 227961 42092 38905 Local timer interrupts LOC: 249996 228200 42092 38906 Local timer interrupts LOC: 250215 228450 42092 38906 Local timer interrupts LOC: 250489 228704 42092 38906 Local timer interrupts LOC: 250729 228952 42099 38964 Local timer interrupts LOC: 250951 229141 42100 39215 Local timer interrupts LOC: 251168 229374 42119 39468 Local timer interrupts LOC: 251426 229641 42131 39719 Local timer interrupts LOC: 251652 229868 42131 39970 Local timer interrupts LOC: 251881 230141 42132 40220 Local timer interrupts LOC: 252131 230371 42132 40471 Local timer interrupts LOC: 252361 230646 42133 40721 Local timer interrupts LOC: 252611 230875 42133 40972 Local timer interrupts LOC: 252866 231149 42137 41222 Local timer interrupts LOC: 253103 231386 42153 41450 Local timer interrupts ...but more importantly, when it gets cranked up.. homer:~/tmp # trace-cmd report report homer:~/tmp # grep local_timer_entry report|wc -l 252 ...it scares the problem away. (4 cpus because I fired up cpuset shield, isolates 2-3, takes 4-7 down) -Mike -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: futex_wait() can DoS the tick
On Wed, 10 Jun 2015, Mike Galbraith wrote: The above was handed to me by a colleague working on a Xen guest that livelocked. I at first though Xen arch must have a weird problem, but when I tried proggy on my desktop box, while it didn't stop the tick completely as it did the Xen box, it slowed it to a crawl. I noticed that this did not happen with newer kernels, so a bisecting I did go, and found that... 279f14614 x86: apic: Use tsc deadline for oneshot when available ..is what fixed it up. Trouble is, while it fixes up my Haswell box, a This does not make any sense at all. It does not matter whether the box uses tscdeadline or local apic timer. We do not even program the hardware because we see that the event is in the past already. So we raise the hrtimer softirqd, which then expires the timer. So all what happens is that ksoftirqd accumulates runtime, but I cannot at all see how that amounts to a DoS and brings the machine to a grinding halt. Xen dom0 remains busted by that testcase whether that patch is applied to the host or not, even though the hypervisor supports deadline timer, and seemingly regardless of CPU type all together. Of all the x86_64 bare metal boxen I've tested, only those with the TSC deadline timer have shown the issue, and there it goes away as of v3.8 unless you boot lapic=notscdeadline. I just booted a SNB with lapic=notscdeadline and ran that test program. All what happens is - as expected - that ksoftirqd runs more than we would like it to. I cannot observe any anomality vs. local timer interrupts at all. If I run this pinned on an otherwise idle core, then I get ~ CONFIG_HZ interrupts per second, which is what you expect when the cpu never reaches idle. With the changes pending in tip/timers/core we get more timer interrupts instead of offloading crap to ksoftirqd, but they cannot lead to a DoS either and we do not care whether the user spends its cycles looping in user space or firing timer interrupts. It can only do as long as it is on the cpu. These timers (futex, nanosleep, poll, ...) are oneshot and all timers which are self rearming are rate limited by the fact that we only rearm when the previous event has been consumed by the task which scheduled it. So the scheduler controls how many of these events can be created from user space. However, given any x86_64 Intel box with TSC deadline timer (ivy, sandy, hasbeen) can be made to exhibit the symptom, there may be other arches that get seriously dinged up or maybe even as thoroughly b0rked as Xen does when hrtimer_interrupt() is pounded into the ground by userspace. Alternatively, should someone out there know that all bare metal is in fact fine post 279f14614, that person will likely also know what the Xen folks need to do to fix up their busted arch. The below targets the symptom, consider it hrtimer cluebat attractant. By now I know to take your patches with a grain of salt :) Some more information about your symptoms in form of configuration, extra patches, kernel traces etc. would be appreciated. Thanks, tglx -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/