On Monday 01 May 2006 21:56, Vivek Goyal wrote:
> On Fri, Apr 28, 2006 at 06:19:24PM -0400, Don Zickus wrote:
> > When kexec goes to issue an nmi it uses set_nmi_callback() to have the
> > other cpus execute the proper shutdown code. Unfortunately, under certain
> > situations set_nmi_callback will fail (ie oprofile has it reserved
> > already). This will cause kexec/kdump to hang and do nothing. :(
> >
>
> Looking at the set_nmi_callback(), there does not seem to be anything
> which will make it fail. I think enabling profiling support will only
> disable any regular NMI generation from LAPIC for watchdog purposes because
> performance registers being used for NMI generation are claimed back.
>
> So even if profiling is enabled, kexec/kdump should not fail.

Profiling just registers a lower priority callback. Also, with Don's changes
profiling will only trigger when there are profile events anyway, so all
the interactions will be much cleaner.

> > After talking to Andi, he mentioned that subsystems should be using the
> > notifier callback on the die chain instead. The included patch
> > incorporates that. The priority is set to 0, hopefully causing the
> > notifier to be the first one called.
> >
>
> Ok if the goal is to force the subsystems to rely on die notifier chain
> instead of nmi_callback and getting rid of set_nmi_callback() interfaces,
> then it spells some problems for kdump, as kdump is different from other
> subsystems. You rightly pointed out that what if chain is corrupted
> or if some die notifier function hangs.

All NMI handlers think they are different and more special than everybody
else. Otherwise they wouldn't be NMI. kdump is really in no way special.

> Looks like that notifiers are called in increasing priority order. Looking
> at the code, it looks like notifier with priority 0x7fffffff will be called
> first. But still there is no guarantee. People registering first with
> this priority will be called first. Kdump registers in the end hence
> will be called last, so liable to fail.

Sorry, but that's just a dumb argument. All kernel code needs to cooperate
with others - if there is a problem it's just fixed. But having multiple
callbacks just because you don't trust someone else doesn't make sense.

-Andi
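
For reference, the ordering question discussed above can be modelled in
userspace. This is only a sketch, not kernel code: it assumes a chain kept
sorted by descending priority, with equal priorities kept in registration
order, which is the behaviour Vivek describes. All names here (struct
notifier, chain_register, chain_call, demo, the three handlers) are
hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MODEL_DONE 0 /* keep walking the chain */
#define MODEL_STOP 1 /* handler claimed the event; stop walking */

/* Hypothetical stand-in for a notifier chain entry. */
struct notifier {
    int (*call)(struct notifier *self, void *data);
    struct notifier *next;
    int priority;
};

/* Sorted insert: skip every entry whose priority is >= ours, so higher
 * priorities come first and ties keep their registration order. */
static void chain_register(struct notifier **head, struct notifier *n)
{
    assert(n != NULL);
    while (*head != NULL && n->priority <= (*head)->priority)
        head = &(*head)->next;
    n->next = *head;
    *head = n;
}

/* Walk from the head; stop as soon as a handler returns MODEL_STOP. */
static int chain_call(struct notifier *head, void *data)
{
    int ret = MODEL_DONE;

    while (head != NULL) {
        ret = head->call(head, data);
        if (ret == MODEL_STOP)
            break;
        head = head->next;
    }
    return ret;
}

/* Records which handlers ran, one character per handler. */
struct trace {
    char buf[8];
    size_t len;
};

static void trace_add(struct trace *t, char c)
{
    if (t->len + 1 < sizeof t->buf) {
        t->buf[t->len++] = c;
        t->buf[t->len] = '\0';
    }
}

static int other_handler(struct notifier *self, void *data)
{
    (void)self;
    trace_add(data, 'o');
    return MODEL_DONE;
}

static int profiler_handler(struct notifier *self, void *data)
{
    (void)self;
    trace_add(data, 'p');
    return MODEL_DONE;
}

static int kdump_handler(struct notifier *self, void *data)
{
    (void)self;
    trace_add(data, 'k');
    return MODEL_STOP; /* the crash handler does not pass the event on */
}

/* Another subsystem registers first at INT_MAX, a profiler at 0, and the
 * kdump-like handler last at the given priority. Returns the call trace. */
static struct trace demo(int kdump_priority)
{
    struct notifier other = { other_handler, NULL, INT_MAX };
    struct notifier prof = { profiler_handler, NULL, 0 };
    struct notifier kdump = { kdump_handler, NULL, kdump_priority };
    struct notifier *chain = NULL;
    struct trace t = { "", 0 };

    chain_register(&chain, &other);
    chain_register(&chain, &prof);
    chain_register(&chain, &kdump);
    chain_call(chain, &t);
    return t;
}
```

In this model, demo(0) runs the handlers in the order o, p, k, while
demo(INT_MAX) runs o, k and then stops. So a late registrant at the highest
priority still runs after anyone who registered earlier at that same
priority, and every handler ahead of it has to return for it to run at all.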
