My guess would be that there is some miscommunication between driver levels, perhaps a result of running a new nut with an old kernel (the latest is 2.6.37)?
That the presence of the "-D" flag appears to suppress the error:

> * I have hacked the startup script to invoke usbhid-ups directly with
>   the proper device parameter and -D, and the problem goes away
> * I have used the hacked startup script to invoke usbhid-ups directly
>   with the proper device parameter but *no* -D, and we have a crash!

suggests a place to start, possibly looking for a device/interface timing issue.

Dave

On Wed, Jan 12, 2011 at 10:08 AM, Alfred Ganz <[email protected]> wrote:
> Ladies and Gentlemen,
>
> I have been trying to sort out a nasty kernel oops for which I would
> like to ask for some advice. I don't actually think that this is a nut
> problem, although it is triggered by upsdrvctl or usbhid-ups. I rather
> suspect the USB library or the associated kernel code.
> Here is the configuration information:
> * system: Centos-5.5
> * kernel: 2.6.18-194.32.1.el5, latest vanilla for Centos-5.5
> * nut: nut-2.4.3-1, a build provided by Arnaud last fall
> * libusb: libusb-0.1.12-5.1, the latest for this system
> * the UPS:
>     device.mfr: APC
>     device.model: Back-UPS ES 650
>     device.serial: QB0514232934
>     device.type: ups
>     ups.firmware: 818.w1.D
>     ups.mfr.date: 2005/08/10
>     ups.productid: 0002
>   Sorry Arnaud!
> * the driver:
>     driver.name: usbhid-ups
>     driver.parameter.pollfreq: 30
>     driver.parameter.pollinterval: 1
>     driver.parameter.port: auto
>     driver.version: 2.4.3
>     driver.version.data: APC HID 0.95
>     driver.version.internal: 0.34
> The problem:
> * under certain circumstances system boot fails with a kernel oops
>   during the upsdrvctl/usbhid-ups phase of the startup script.
>   Note that we have *not* reached the upsd part of the startup script,
>   and the startup script of the printer package (cups) would be
>   reached much later.
> * the boot process works fine if another USB device, the USB printer,
>   is powered up. Unfortunately, I would like to keep it powered off
>   most of the time if I could do so.
>   Note, the printer is not the only other USB device on the system,
>   but the number of active USB devices is being changed.
> * Once the system is completely booted, there is no problem starting
>   and stopping the ups using the startup script.
> * the problem appeared late in the lifetime of Centos-5, the system
>   has shut down due to power loss and restarted with the printer off
>   before the arrival of the problem.
> * the problem has probably appeared with the transition from Centos-5.4
>   (kernel-2.6.18-164.15.1.el5) to Centos-5.5 (kernel-2.6.18-194.3.1.el5),
>   although I noticed it much later
> * libusb has been the same since 2008/04/22, I don't believe the problem
>   appeared that long ago.
> * the problem happens with the locally rebuilt (with no changes)
>   nut-2.2.0-6.1 from Fedora 8, with the nut-2.4.3-1 provided by
>   Arnaud, and with a locally rebuilt version of nut-2.4.3-1 using
>   a spec file provided by Arnaud.
> Here is what I have tried:
> * I have started the system for a crash dump, and below is an extract
>   from the crash log.
> * I have tried to recreate the problem on a virtual machine, but it
>   doesn't happen there
> * I have introduced a delay before the start of the ups startup script
>   to make sure that the suspected timing problem doesn't happen earlier.
> * I have hacked the startup script to invoke usbhid-ups directly with
>   the proper device parameter and -D, and the problem goes away
> * I have used the hacked startup script to invoke usbhid-ups directly
>   with the proper device parameter but *no* -D, and we have a crash!
> So I am now left with the observation that we have a timing problem
> somewhere among usbhid-ups, libusb, and the kernel, and I have no
> idea how to further narrow things down.
>
> I would be glad to do some more tests, but I would need your help with
> this.
>
> Thanks for any advice, AG
>
>
> ----------------------------------------------------------------------------
> BUG: unable to handle kernel paging request at virtual address 0000190c
>  printing eip:
> c05954c6
> *pde = 74ecb067
> Oops: 0000 [#1]
> SMP
> ....
> CPU: 0
> EIP: 0060:[<c05954c6>] Not tainted VLI
> EFLAGS: 00010206 (2.6.18-194.26.1.el5 #1)
> EIP is at hid_close+0x2/0x1f
> eax: 00000000 ebx: d216bd20 ecx: f7fff080 edx: 00000000
> esi: d21ebc00 edi: c06b3470 ebp: 00000000 esp: f768ad34
> ds: 007b es: 007b ss: 0068
> Process usbhid-ups (pid: 2437, ti=f768a000 task=f7682000 task.ti=f768a000)
> Stack: c05976fd d20fe000 c0595668 d21ebc00 c06b3440 c058e1c5 d21ebc8c d21ebc14
>        c055e859 d21ebc14 d21ebc14 d21ebc00 c055ea91 d21ebc00 c0588351 ffffffc3
>        00000000 c0590cc0 f75ee340 00000000 00005516 00000000 c00c5512 d2173400
> Call Trace:
>  [<c05976fd>] hiddev_disconnect+0x3f/0x5e
>  [<c0595668>] hid_disconnect+0x81/0xbf
>  [<c058e1c5>] usb_unbind_interface+0x34/0x6a
>  [<c055e859>] __device_release_driver+0x7d/0xbb
>  [<c055ea91>] device_release_driver+0x1c/0x2b
>  [<c0588351>] usb_driver_release_interface+0x38/0x60
>  [<c0590cc0>] proc_ioctl_default+0x10d/0x1d0
>  [<c0591ffd>] usbdev_ioctl+0x1027/0x10de
>  [<c048b06b>] __d_lookup+0x98/0xdb
>  [<c04c7eff>] inode_has_perm+0x54/0x5c
>  [<c04c78ab>] avc_has_perm+0x3c/0x46
>  [<c04c7eff>] inode_has_perm+0x54/0x5c
>  [<c0464c32>] __handle_mm_fault+0x463/0xaac
>  [<c0486290>] do_ioctl+0x47/0x5d
>  [<c04867f9>] vfs_ioctl+0x47b/0x4d3
>  [<c0486899>] sys_ioctl+0x48/0x5f
>  [<c0404f17>] syscall_call+0x7/0xb
> =======================
> Code: 00 05 b0 01 86 83 b4 0c 00 00 fb 8d 83 54 0c 00 00 e8 4c 87 e9 ff 8b
> 83 a8 0c 00 00 e8 7d 74 ff ff 31 c0 5b c3 31 d2 eb be 89 c2 <8b> 80 0c 19 00
> 00 48 85 c0 89 82 0c 19 00 00 75 0b 8b 82 a8 0c
> EIP: [<c05954c6>] hid_close+0x2/0x1f SS:ESP 0068:f768ad34
>
> ----------------------------------------------------------------------------
>
> --
> ----------------------------------------------------------------------
> Alfred Ganz                      alfred-ganz:at:agci.com
> AG Consulting, Inc.              (203) 624-9667
> 440 Prospect Street # 11
> New Haven, CT 06511
> ----------------------------------------------------------------------
>

--
"Ridicule is man's most potent weapon. It’s hard to counterattack ridicule,
and it infuriates the opposition, which then reacts to your advantage."
Saul Alinsky, Marxist, Obama mentor
_______________________________________________
Nut-upsdev mailing list
[email protected]
http://lists.alioth.debian.org/mailman/listinfo/nut-upsdev
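
For anyone comparing several crash dumps while narrowing this down, here is a minimal sketch (not part of nut or the kernel tooling) that pulls the symbol names out of an oops call trace. The names `FRAME_RE` and `parse_frames` are made up for illustration, and `SAMPLE` is an abridged copy of the trace quoted in this thread. Read bottom to top, the frames show an ioctl on the USB device file leading to the interface being released from its kernel driver, with the fault in hid_close; that seems consistent with the timing suspicion, but it is only a reading of the trace, not a diagnosis.

```python
import re

# Matches call-trace entries such as "[<c05976fd>] hiddev_disconnect+0x3f/0x5e".
# Lines without a bracketed address (e.g. "EIP is at ...") are ignored.
FRAME_RE = re.compile(r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_frames(oops_text):
    """Return (address, symbol, offset, size) tuples in order of appearance."""
    frames = []
    for addr, sym, off, size in FRAME_RE.findall(oops_text):
        frames.append((int(addr, 16), sym, int(off, 16), int(size, 16)))
    return frames


# Abridged from the oops quoted in this thread.
SAMPLE = """
EIP is at hid_close+0x2/0x1f
Call Trace:
 [<c05976fd>] hiddev_disconnect+0x3f/0x5e
 [<c0595668>] hid_disconnect+0x81/0xbf
 [<c058e1c5>] usb_unbind_interface+0x34/0x6a
 [<c0588351>] usb_driver_release_interface+0x38/0x60
 [<c0590cc0>] proc_ioctl_default+0x10d/0x1d0
 [<c0591ffd>] usbdev_ioctl+0x1027/0x10de
 [<c0486899>] sys_ioctl+0x48/0x5f
"""

if __name__ == "__main__":
    # Print innermost frame first, as in the kernel log.
    for addr, sym, off, size in parse_frames(SAMPLE):
        print(f"{addr:08x}  {sym} (+{off:#x} of {size:#x})")
```

Running the same function over the logs from a crashing boot and from a boot with -D (if that ever oopses) would make it easy to diff the two call chains.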
