Hopefully the attached patch will fix the problem and clean up the error
handling in this failure case.

-Corey

Carol Hebert wrote:
> Hi Corey,
>
> I believe I may have found a problem with the ipmi driver v39 in the
> 2.6.18 kernel when loaded on multi-node systems (in my particular case,
> a dual-node x460 with two BMCs).  At first glance, it appears the
> problem may be in the sysfs code added last January -- it looks like it
> may not be handling the multiple BMCs correctly.   The result is that
> the ipmi_si module won't load and the ipmi device nodes don't get
> created.
>
> I'm only starting to debug the issue but wanted to let you know what
> I've seen asap in case someone's already spotted this problem but I
> missed seeing a patch and also because I'm not a sysfs expert and I
> don't know what the original intent was for how to present multiple BMCs
> (from multi-node systems) in the sysfs.
>
> I'm pasting the stack backtrace below.  Please let me know if you have
> any suggestions or questions.
>
> Thanks much,
>
> Carol Hebert
>
>
> ipmi message handler version 39.0
> IPMI System Interface driver.
> ipmi_si: Trying SMBIOS-specified KCS state machine at I/O address
> 0x90a8, slave address 0x20, irq 0
> PM: Adding info for platform:ipmi_si.0
> PM: Adding info for platform:ipmi_bmc.32
> ipmi: Found new BMC (man_id: 0x000002,  prod_id: 0x0007, dev_id: 0x20)
>  IPMI KCS interface initialized
> ipmi_si: Trying SMBIOS-specified KCS state machine at I/O address 0xca8,
> slave address 0x20, irq 0
> PM: Adding info for platform:ipmi_si.1
> kobject_add failed for ipmi_bmc.32 with -EEXIST, don't try to register
> things with the same name in the same directory.
>  [<c04051e3>] show_trace_log_lvl+0x58/0x16a
>  [<c04057f0>] show_trace+0xd/0x10
>  [<c0405900>] dump_stack+0x19/0x1b
>  [<c04e7529>] kobject_add+0x14b/0x171
>  [<c0550ced>] device_add+0x7a/0x2de
>  [<c0553b5f>] platform_device_add+0xde/0x10e
>  [<c0553ba4>] platform_device_register+0x15/0x18
>  [<f8b09bf2>] ipmi_register_smi+0x538/0x94a [ipmi_msghandler]
>  [<f980be5e>] try_smi_init+0x3ff/0x5a7 [ipmi_si]
>  [<f980c99e>] init_ipmi_si+0x40f/0x6db [ipmi_si]
>  [<c04427ee>] sys_init_module+0x16ad/0x1856
>  [<c0403fb7>] syscall_call+0x7/0xb
> DWARF2 unwinder stuck at syscall_call+0x7/0xb
> Leftover inexact backtrace:
>  [<c04057f0>] show_trace+0xd/0x10
>  [<c0405900>] dump_stack+0x19/0x1b
>  [<c04e7529>] kobject_add+0x14b/0x171
>  [<c0550ced>] device_add+0x7a/0x2de
>  [<c0553b5f>] platform_device_add+0xde/0x10e
>  [<c0553ba4>] platform_device_register+0x15/0x18
>  [<f8b09bf2>] ipmi_register_smi+0x538/0x94a [ipmi_msghandler]
>  [<f980be5e>] try_smi_init+0x3ff/0x5a7 [ipmi_si]
>  [<f980c99e>] init_ipmi_si+0x40f/0x6db [ipmi_si]
>  [<c04427ee>] sys_init_module+0x16ad/0x1856
>  [<c0403fb7>] syscall_call+0x7/0xb
> ipmi_msghandler: Unable to register bmc device: -17
> ipmi_si: Unable to register device: error -17
> BUG: unable to handle kernel paging request at virtual address 6b6b6c73
>  printing eip:
> c04aa1d4
> *pde = 6b6b6b6b
> Oops: 0000 [#1]
> SMP
> last sysfs file: /class/drm/card0/dev
> Modules linked in: ipmi_si ipmi_msghandler radeon drm autofs4 hidp
> rfcomm l2cap bluetooth sunrpc ipv6 acpi_cpufreq video sbs i2c_ec button
> battery asus_acpi ac parport_pc lp parport joydev sg pcspkr tg3 aacraid
> i2c_piix4 i2c_core ide_cd cdrom serio_raw dm_snapshot dm_zero dm_mirror
> dm_mod aic94xx libsas scsi_transport_sas sd_mod scsi_mod ext3 jbd
> ehci_hcd ohci_hcd uhci_hcd
> CPU:    8
> EIP:    0060:[<c04aa1d4>]    Not tainted VLI
> EFLAGS: 00010212   (2.6.18-1.2702.el5PAE #1)
> EIP is at sysfs_remove_link+0x1/0xd
> eax: 6b6b6c43   ebx: e722ad78   ecx: c042dc05   edx: f8b0aad8
> esi: 6b6b6b6b   edi: e722ad78   ebp: e7152e58   esp: e7152e48
> ds: 007b   es: 007b   ss: 0068
> Process modprobe (pid: 20599, ti=e7152000 task=f72b0030
> task.ti=e7152000)
> Stack: e7152e58 f8b08ebf ffffffef 00000000 e7152e6c f8b09559 ffffffef
> eeb70248
>        ffffffef e7152e84 f980bf34 0118c8be 00000ca8 00000004 00000000
> e7152eac
>        f980c99e 00000000 00000004 d1c2d700 010020ac 00000ca8 f9814480
> f9814480
> Call Trace:
>  [<f8b08ebf>] ipmi_bmc_unregister+0x1c/0x63 [ipmi_msghandler]
>  [<f8b09559>] ipmi_unregister_smi+0xf/0xc3 [ipmi_msghandler]
>  [<f980bf34>] try_smi_init+0x4d5/0x5a7 [ipmi_si]
>  [<f980c99e>] init_ipmi_si+0x40f/0x6db [ipmi_si]
>  [<c04427ee>] sys_init_module+0x16ad/0x1856
>  [<c0403fb7>] syscall_call+0x7/0xb
> DWARF2 unwinder stuck at syscall_call+0x7/0xb
> Leftover inexact backtrace:
>  [<c040537f>] show_stack_log_lvl+0x8a/0x95
>  [<c04054b7>] show_registers+0x12d/0x19a
>  [<c04056b4>] die+0x190/0x293
>  [<c0613331>] do_page_fault+0x4e8/0x5ba
>  [<c0404be9>] error_code+0x39/0x40
>  [<f8b09559>] ipmi_unregister_smi+0xf/0xc3 [ipmi_msghandler]
>  [<f980bf34>] try_smi_init+0x4d5/0x5a7 [ipmi_si]
>  [<f980c99e>] init_ipmi_si+0x40f/0x6db [ipmi_si]
>  [<c04427ee>] sys_init_module+0x16ad/0x1856
>  [<c0403fb7>] syscall_call+0x7/0xb
> Code: f1 f8 ff 8b 45 f0 e8 06 d0 03 00 8b 45 ec e8 fe cf 03 00 8b 55 e4
> 8b 4d e0 8b 41 1c 89 54 81 20 83 c4 14 31 c0 5b 5e 5f 5d c3 55 <8b> 40
> 30 89 e5 e8 d0 e4 ff ff 5d c3 55 89 e5 57 56 89 ce 53 83
> EIP: [<c04aa1d4>] sysfs_remove_link+0x1/0xd SS:ESP 0068:e7152e48
>
>   

This patch adds the product id to the driver model platform device
name, in addition to the device id.  The IPMI spec does not require
that individual BMCs in a system have unique device IDs, but it
does require that the product id/device id combination be unique.

This also removes a redundant check and cleans up error handling
when the sysfs registration fails.

Index: linux-2.6.18/drivers/char/ipmi/ipmi_msghandler.c
===================================================================
--- linux-2.6.18.orig/drivers/char/ipmi/ipmi_msghandler.c
+++ linux-2.6.18/drivers/char/ipmi/ipmi_msghandler.c
@@ -1894,7 +1894,6 @@ static int __find_bmc_prod_dev_id(struct
 	struct bmc_device *bmc = dev_get_drvdata(dev);
 
 	return (bmc->id.product_id == id->product_id
-		&& bmc->id.product_id == id->product_id
 		&& bmc->id.device_id == id->device_id);
 }
 
@@ -2052,6 +2051,9 @@ static void ipmi_bmc_unregister(ipmi_smi
 {
 	struct bmc_device *bmc = intf->bmc;
 
+	if (!bmc)
+		return;
+
 	sysfs_remove_link(&intf->si_dev->kobj, "bmc");
 	if (intf->my_dev_name) {
 		sysfs_remove_link(&bmc->dev->dev.kobj, intf->my_dev_name);
@@ -2061,6 +2063,7 @@ static void ipmi_bmc_unregister(ipmi_smi
 
 	mutex_lock(&ipmidriver_mutex);
 	kref_put(&bmc->refcount, cleanup_bmc_device);
+	intf->bmc = NULL;
 	mutex_unlock(&ipmidriver_mutex);
 }
 
@@ -2104,9 +2107,12 @@ static int ipmi_bmc_register(ipmi_smi_t 
 		       bmc->id.product_id,
 		       bmc->id.device_id);
 	} else {
-		bmc->dev = platform_device_alloc("ipmi_bmc",
-						 bmc->id.device_id);
+		char name[14];
+		snprintf(name, sizeof(name),
+			 "ipmi_bmc.%4.4x", bmc->id.product_id);
+		bmc->dev = platform_device_alloc(name, bmc->id.device_id);
 		if (!bmc->dev) {
+			mutex_unlock(&ipmidriver_mutex);
 			printk(KERN_ERR
 			       "ipmi_msghandler:"
 			       " Unable to allocate platform device\n");
@@ -2119,6 +2125,8 @@ static int ipmi_bmc_register(ipmi_smi_t 
 		rv = platform_device_register(bmc->dev);
 		mutex_unlock(&ipmidriver_mutex);
 		if (rv) {
+			platform_device_put(bmc->dev);
+			bmc->dev = NULL;
 			printk(KERN_ERR
 			       "ipmi_msghandler:"
 			       " Unable to register bmc device: %d\n",