On Wed, 2010-08-18 at 22:26 +0200, Stephane Eranian wrote:
> Hi,
>
> I ran into an issue with the NMI watchdog not firing in a deadlock
> situation. After some debugging I found the source of the problem.
>
> The NMI watchdog is currently subject, like any other event, to interrupt
> throttling. The heart of the problem is that if you are deadlocked on a CPU
> with interrupts masked, the timer interrupt won't fire, so the
> hwc->interrupts field won't be reset. Then, depending on the max sampling
> rate, you could eventually fail the max interrupt rate test in
> __pfm_overflow_handler() and perf_events would throttle, i.e., stop, the
> NMI watchdog event before the 5s delay to panic. Thus, you would never get
> the panic. I ran into this problem myself.
>
> This is a serious issue because perf_events must ensure the watchdog can
> always fire, regardless of the interrupt masking situation.
>
> It looks like one way of solving the problem would be to mark the NMI
> watchdog event as immune to throttling. Since the event is internal to the
> kernel, we could trust the event setup from
> perf_event_create_kernel_counter().
Something like so?
---
kernel/watchdog.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 613bc1f..e0fe6e4 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -206,6 +206,9 @@ void watchdog_overflow_callback(struct perf_event *event, int nmi,
 		 struct perf_sample_data *data,
 		 struct pt_regs *regs)
 {
+	/* Ensure the watchdog never gets throttled. */
+	event->hw.interrupts = 0;
+
 	if (__get_cpu_var(watchdog_nmi_touch) == true) {
 		__get_cpu_var(watchdog_nmi_touch) = false;
 		return;
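
To illustrate why clearing the counter in the callback should help, here is a
small user-space model of the throttling behaviour described above. It is only
a sketch under simplifying assumptions, not the kernel code: struct
model_event, model_tick(), model_nmi(), model_run() and the max_per_tick
parameter are all hypothetical names, and the real rate test is more involved
than a plain counter comparison.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-event state; not the kernel's struct. */
struct model_event {
	unsigned int interrupts;	/* NMIs counted since the last timer tick */
	bool throttled;			/* set once the rate limit is exceeded */
};

/* Timer tick: resets the count. It cannot run while interrupts are masked. */
static void model_tick(struct model_event *ev)
{
	ev->interrupts = 0;
	ev->throttled = false;
}

/* One NMI. Returns true if the watchdog callback got to run. */
static bool model_nmi(struct model_event *ev, unsigned int max_per_tick,
		      bool reset_in_callback)
{
	if (ev->throttled)
		return false;

	if (ev->interrupts >= max_per_tick) {
		/* Rate test failed: the event is stopped. */
		ev->throttled = true;
		return false;
	}

	ev->interrupts++;

	/* Models the proposed change: the callback clears the counter. */
	if (reset_in_callback)
		ev->interrupts = 0;

	return true;
}

/*
 * Deliver nr_nmis NMIs and count how many reach the callback.
 * timer_runs == false models a CPU deadlocked with interrupts masked;
 * otherwise the model assumes one timer tick between consecutive NMIs.
 */
unsigned int model_run(unsigned int nr_nmis, unsigned int max_per_tick,
		       bool reset_in_callback, bool timer_runs)
{
	struct model_event ev = { 0, false };
	unsigned int delivered = 0;
	unsigned int i;

	for (i = 0; i < nr_nmis; i++) {
		if (model_nmi(&ev, max_per_tick, reset_in_callback))
			delivered++;
		if (timer_runs)
			model_tick(&ev);
	}

	return delivered;
}
```

In this model, with the timer stalled and no reset in the callback, the
callback stops running once max_per_tick NMIs have been counted, which is the
failure Stephane describes. With the reset, every NMI reaches the callback
whether or not the timer tick runs.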
_______________________________________________
perfmon2-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel