Hi,

I ran into an issue with the NMI watchdog not firing in a deadlock
situation. After some debugging, I found the source of the problem.

The NMI watchdog is currently subject, like any other event, to interrupt
throttling. The heart of the problem is that if you are deadlocked on a CPU
with interrupts masked, the timer interrupt won't fire, and therefore the
hwc->interrupts field won't be reset. Then, depending on the max sampling
rate, you could eventually fail the max interrupt rate test in
__pfm_overflow_handler(), and perf_events would throttle, i.e., stop, the
NMI watchdog event before the 5s delay to panic elapses. Thus, you would
never get the panic. I ran into this problem myself.
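
To make the failure mode concrete, here is a minimal user-space sketch of
the interaction described above. It is not the kernel code: the struct, the
sim_* function names, and the simplified "more than max_per_tick overflows
since the last tick" test are all assumptions made for illustration. The
only thing taken from the description is that the timer tick resets the
per-event interrupt count and that the overflow path throttles the event
once the count gets too high.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-event counter state. */
struct sim_hw_event {
    unsigned int interrupts;  /* overflows seen since the last timer tick */
    bool throttled;           /* true once the event has been stopped */
};

/* Timer-tick path: only reachable while interrupts are not masked. */
static void sim_timer_tick(struct sim_hw_event *hwc)
{
    hwc->interrupts = 0;
    hwc->throttled = false;
}

/* NMI overflow path: returns true if this overflow is delivered. */
static bool sim_overflow(struct sim_hw_event *hwc, unsigned int max_per_tick)
{
    assert(max_per_tick > 0);
    if (hwc->throttled)
        return false;
    if (++hwc->interrupts > max_per_tick) {
        hwc->throttled = true;   /* rate test failed: stop the event */
        return false;
    }
    return true;
}

/*
 * Generate `nmis` overflows and count how many are delivered. When
 * irqs_masked is true the timer tick never runs, so the interrupt
 * count is never reset.
 */
static unsigned int sim_run(unsigned int nmis, unsigned int max_per_tick,
                            bool irqs_masked)
{
    struct sim_hw_event hwc = { 0, false };
    unsigned int delivered = 0;

    for (unsigned int i = 0; i < nmis; i++) {
        if (!irqs_masked)
            sim_timer_tick(&hwc);
        if (sim_overflow(&hwc, max_per_tick))
            delivered++;
    }
    return delivered;
}
```

With interrupts masked, sim_run() stops delivering overflows as soon as the
limit is exceeded, which is the simulated equivalent of the watchdog going
silent before it can trigger the panic.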

This is a serious issue because perf_events must ensure the watchdog can
always fire, regardless of the interrupt masking situation.

It looks like one way of solving the problem would be to mark the NMI
watchdog event as immune to throttling. Since the event is internal to the
kernel, we could trust the event setup from
perf_event_create_kernel_counter().
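
A rough user-space sketch of the "immune to throttling" idea follows. The
no_throttle flag, the struct, and the sketch_* functions are hypothetical
names invented for this illustration; they are not existing perf_events
fields or APIs, and the rate test is deliberately simplified. The sketch
only shows the shape of the change: the overflow path skips the rate test
for events flagged as exempt.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event state; no_throttle is an assumed, illustrative flag. */
struct sketch_event {
    unsigned int interrupts;  /* overflows since the last (missing) tick */
    bool throttled;           /* true once the event has been stopped */
    bool no_throttle;         /* would be set only for kernel-internal events */
};

/* Overflow path: returns true if this overflow is delivered. */
static bool sketch_overflow(struct sketch_event *ev, unsigned int max_per_tick)
{
    assert(max_per_tick > 0);
    if (ev->no_throttle)
        return true;             /* exempt: never subject to the rate test */
    if (ev->throttled)
        return false;
    if (++ev->interrupts > max_per_tick) {
        ev->throttled = true;
        return false;
    }
    return true;
}

/*
 * Count delivered overflows when no timer tick ever resets the count,
 * i.e. the deadlocked-with-interrupts-masked case.
 */
static unsigned int sketch_deliveries_without_tick(bool no_throttle,
                                                   unsigned int nmis,
                                                   unsigned int max_per_tick)
{
    struct sketch_event ev = { 0, false, no_throttle };
    unsigned int delivered = 0;

    for (unsigned int i = 0; i < nmis; i++)
        if (sketch_overflow(&ev, max_per_tick))
            delivered++;
    return delivered;
}
```

In this model an exempt event keeps receiving every overflow even though
the count is never reset, while a normal event is cut off at the limit.
Whether such a flag should be exposed only through the in-kernel creation
path is a design question the sketch does not try to answer.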

