Gary,
On second thought, you should never get into a situation where
pfm_pmu_conf is NULL. The function I have looks as follows, and
what protects against NULL is pfm_pmu_conf_get(). If no PMU
description is loaded, it returns an error and pfm_cpu_notify()
simply exits before it ever dereferences pfm_pmu_conf.

Do you have the same code?

static int pfm_cpu_notify(struct notifier_block *nfb,
			  unsigned long action, void *hcpu)
{
	unsigned int cpu = (unsigned long)hcpu;

	/* no PMU description loaded */
	if (pfm_pmu_conf_get(0))
		return NOTIFY_OK;

	switch (action) {
	case CPU_ONLINE:
		pfm_debugfs_add_cpu(cpu);
		PFM_INFO("CPU%d is online", cpu);
		break;
	case CPU_UP_PREPARE:
		PFM_INFO("CPU%d prepare online", cpu);
		break;
	case CPU_UP_CANCELED:
		pfm_debugfs_del_cpu(cpu);
		PFM_INFO("CPU%d is up canceled", cpu);
		break;
	case CPU_DOWN_PREPARE:
		PFM_INFO("CPU%d prepare offline", cpu);
		break;
	case CPU_DOWN_FAILED:
		PFM_INFO("CPU%d is down failed", cpu);
		break;
	case CPU_DEAD:
		pfm_debugfs_del_cpu(cpu);
		PFM_INFO("CPU%d is offline", cpu);
		break;
	}

	/*
	 * call PMU module handler if any
	 */
	if (pfm_pmu_conf->hotplug_handler)
		pfm_pmu_conf->hotplug_handler(action, cpu);

	pfm_pmu_conf_put();

	return NOTIFY_OK;
}
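
To make the get/put guard concrete, here is a minimal user-space sketch of the pattern. It is not the perfmon code: struct pmu_conf, conf_get(), conf_put(), notify() and the counters are hypothetical stand-ins, and the reference count is only an assumption about what the real get/put pair is for. The one thing taken from the function above is the contract that a non-zero return from the get call means no description is loaded, so the caller returns before dereferencing the pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the PMU description (not the perfmon struct). */
struct pmu_conf {
	void (*hotplug_handler)(unsigned long action, unsigned int cpu);
};

static struct pmu_conf *conf;	/* NULL until a description is loaded */
static int conf_refs;		/* references currently held (assumed model) */
static int handler_calls;	/* how many times the handler ran */

/* Assumed contract: 0 and a reference taken on success, -1 if nothing loaded. */
static int conf_get(void)
{
	if (conf == NULL)
		return -1;
	conf_refs++;
	return 0;
}

/* Drop the reference taken by conf_get(). */
static void conf_put(void)
{
	conf_refs--;
}

static void count_handler(unsigned long action, unsigned int cpu)
{
	(void)action;
	(void)cpu;
	handler_calls++;
}

/* Returns 1 if the handler path was reached, 0 on the early exit. */
static int notify(unsigned long action, unsigned int cpu)
{
	if (conf_get())
		return 0;	/* nothing loaded: conf is never dereferenced */

	if (conf->hotplug_handler)
		conf->hotplug_handler(action, cpu);

	conf_put();
	return 1;
}

/* Exercises the three cases: nothing loaded, handler set, handler unset. */
static void self_check(void)
{
	static struct pmu_conf with_handler = { count_handler };
	static struct pmu_conf without_handler = { NULL };

	conf = NULL;
	assert(notify(0, 1) == 0 && handler_calls == 0);

	conf = &with_handler;
	assert(notify(0, 1) == 1 && handler_calls == 1);

	conf = &without_handler;
	assert(notify(0, 1) == 1 && handler_calls == 1);

	assert(conf_refs == 0);	/* every get was paired with a put */
}
```

With a guard like this at the top of the notifier, an extra "pfm_pmu_conf != NULL" test in front of the hotplug_handler call should be redundant. If the oops still occurs, the likely explanation is that the code being run does not have the early return.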
On Fri, Aug 28, 2009 at 12:08 AM, stephane eranian <[email protected]> wrote:
> On Thu, Aug 27, 2009 at 11:40 PM, <[email protected]> wrote:
>>
>> Hi Stephane
>>
>> One of our customers has reported that they get a kernel oops any time they
>> halt or shut down nodes in their cluster. They tracked the problem back to
>> the routine pfm_cpu_notify (in source file perfmon_hotplug.c).
>>
>> They believe that the code:
>>
>> /* call PMU module handler if any */
>> if (pfm_pmu_conf->hotplug_handler)
>>         pfm_pmu_conf->hotplug_handler(action, cpu);
>>
>> is executed when pfm_pmu_conf is null. They have made the following
>> change:
>>
>> /* call PMU module handler if any */
>> if (pfm_pmu_conf != NULL && pfm_pmu_conf->hotplug_handler)
>>         pfm_pmu_conf->hotplug_handler(action, cpu);
>>
>
> I will add this test in 2.6.30 then.
>
> Thanks.
>
>> and retested and the kernel oops no longer occurs.
>>
>> We are running with the 2.6.28 perfmon patch set, but I checked and the code
>> is the same in the 2.6.29 patch set. The 2.6.29 perfmon patches were not
>> available when we built the kernel they are running. I now have the 2.6.29
>> perfmon patches ready to be included in Bull's next kernel build.
>>
>> This change looks like a safer way to write this code to me. What do you
>> think?
>>
>> Thanks
>> Gary
>>
>>
>> ------------------------------------------------------------------------------
>> Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day
>> trial. Simplify your report design, integration and deployment - and focus on
>> what you do best, core application coding. Discover what's new with
>> Crystal Reports now. http://p.sf.net/sfu/bobj-july
>> _______________________________________________
>> perfmon2-devel mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
>>
>