Gary, I had a 2nd thought on this. You should not get into a situation where pfm_pmu_conf is NULL.
The function I have looks as follows. What protects against NULL is
pfm_pmu_conf_get(): if pfm_pmu_conf is NULL, it returns an error and
pfm_cpu_notify() simply exits before touching the pointer. Do you have the
same code?

static int pfm_cpu_notify(struct notifier_block *nfb,
			  unsigned long action, void *hcpu)
{
	unsigned int cpu = (unsigned long)hcpu;

	/* no PMU description loaded */
	if (pfm_pmu_conf_get(0))
		return NOTIFY_OK;

	switch (action) {
	case CPU_ONLINE:
		pfm_debugfs_add_cpu(cpu);
		PFM_INFO("CPU%d is online", cpu);
		break;
	case CPU_UP_PREPARE:
		PFM_INFO("CPU%d prepare online", cpu);
		break;
	case CPU_UP_CANCELED:
		pfm_debugfs_del_cpu(cpu);
		PFM_INFO("CPU%d is up canceled", cpu);
		break;
	case CPU_DOWN_PREPARE:
		PFM_INFO("CPU%d prepare offline", cpu);
		break;
	case CPU_DOWN_FAILED:
		PFM_INFO("CPU%d is down failed", cpu);
		break;
	case CPU_DEAD:
		pfm_debugfs_del_cpu(cpu);
		PFM_INFO("CPU%d is offline", cpu);
		break;
	}

	/*
	 * call PMU module handler if any
	 */
	if (pfm_pmu_conf->hotplug_handler)
		pfm_pmu_conf->hotplug_handler(action, cpu);

	pfm_pmu_conf_put();

	return NOTIFY_OK;
}

On Fri, Aug 28, 2009 at 12:08 AM, stephane eranian<eran...@googlemail.com> wrote:
> On Thu, Aug 27, 2009 at 11:40 PM, <gary.m...@bull.com> wrote:
>>
>> Hi Stephane
>>
>> One of our customers has reported that they get a kernel oops any time
>> they halt or shut down nodes in their cluster. They tracked the problem
>> back to the routine pfm_cpu_notify (in source file perfmon_hotplug.c).
>>
>> They believe that the code:
>>
>>	/* call PMU module handler if any */
>>	if (pfm_pmu_conf->hotplug_handler)
>>		pfm_pmu_conf->hotplug_handler(action, cpu);
>>
>> is executed when pfm_pmu_conf is null. They have made the following
>> change:
>>
>>	/* call PMU module handler if any */
>>	if (pfm_pmu_conf != NULL && pfm_pmu_conf->hotplug_handler)
>>		pfm_pmu_conf->hotplug_handler(action, cpu);
>>
>
> I will add this test in 2.6.30 then.
>
> Thanks.
>
>> and retested and the kernel oops no longer occurs.
>>
>> We are running with the 2.6.28 perfmon patch set but I checked and the
>> code is the same in the 2.6.29 patch set. The 2.6.29 perfmon patches were
>> not available when we built the kernel they are running. I now have the
>> 2.6.29 perfmon patches ready to be included in Bull's next kernel build.
>>
>> This change looks like a safer way to do this code to me, what do you
>> think?
>>
>> Thanks
>> Gary
>>
>> ------------------------------------------------------------------------------
>> Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day
>> trial. Simplify your report design, integration and deployment - and focus on
>> what you do best, core application coding. Discover what's new with
>> Crystal Reports now. http://p.sf.net/sfu/bobj-july
>> _______________________________________________
>> perfmon2-devel mailing list
>> perfmon2-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
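
To make the point about pfm_pmu_conf_get() concrete, here is a minimal user-space sketch of the same guard pattern. It is not the perfmon2 code: struct pmu_conf, conf_get(), conf_put(), cpu_notify() and demo_handler() are hypothetical stand-ins, there is no locking, and the real pfm_pmu_conf_get() may do more than this. The sketch only shows why an early return on a failed "get" means the handler test never runs against a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a PMU description; not the perfmon2 struct. */
struct pmu_conf {
	int refcnt;
	void (*hotplug_handler)(unsigned long action, unsigned int cpu);
};

/* NULL until a PMU description is loaded. */
static struct pmu_conf *pmu_conf;

/* Counts handler invocations so the behavior can be observed. */
static int handler_calls;

/* Take a reference; return nonzero if no description is loaded. */
static int conf_get(void)
{
	if (pmu_conf == NULL)
		return -1;
	pmu_conf->refcnt++;
	return 0;
}

/* Drop the reference taken by a successful conf_get(). */
static void conf_put(void)
{
	assert(pmu_conf != NULL && pmu_conf->refcnt > 0);
	pmu_conf->refcnt--;
}

/* Example handler (assumption: real handlers do per-PMU work here). */
static void demo_handler(unsigned long action, unsigned int cpu)
{
	(void)action;
	(void)cpu;
	handler_calls++;
}

/* Same shape as the notifier: bail out early if the "get" fails. */
static int cpu_notify(unsigned long action, unsigned int cpu)
{
	/* no PMU description loaded: pmu_conf is never dereferenced */
	if (conf_get())
		return 0;

	if (pmu_conf->hotplug_handler)
		pmu_conf->hotplug_handler(action, cpu);

	conf_put();
	return 0;
}
```

With pmu_conf left at NULL, cpu_notify() returns without calling any handler. Once pmu_conf points at a description whose hotplug_handler is demo_handler, each call invokes the handler once and leaves refcnt balanced. If the oops still occurs in a tree that has this early return, the explicit NULL test from the customer's change is a cheap extra safeguard.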