Re: USB printing panic
On 2/10/2011 1:33 AM, Martin Husemann wrote:
> Can you try something like this?
>
> Martin

Hi Martin,

Apologies for the long-delayed response. I'm using 5.1; after making the change in your patch I still see the panic.

For now, I've added a test to ulpt_tick for sc->sc_in_xfer == NULL, which simply returns if so. This seems naive, but it solves the panic issue. It also seems that communication with the printer is occasionally terminated early (meaning I have to send jobs multiple times). That could well be another issue with my CUPS setup, though.

Bill
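For readers following along, the workaround described above can be sketched outside the kernel. This is only an illustration under stated assumptions: mock_xfer, mock_softc, mock_setup_xfer, mock_close and mock_tick are hypothetical stand-ins, not the real NetBSD types or functions, and the assert marks the place where the real kernel would take the data fault.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real NetBSD definitions. */
struct mock_xfer {
	int setups;			/* how many times the transfer was set up */
};

struct mock_softc {
	struct mock_xfer *sc_in_xfer;	/* cleared when the device is closed */
};

/*
 * Stand-in for usbd_setup_xfer(): uses its argument without a NULL check.
 * The assert marks the spot where the real kernel would fault.
 */
static void
mock_setup_xfer(struct mock_xfer *xfer)
{
	assert(xfer != NULL);
	xfer->setups++;
}

/* Stand-in for ulptclose(): drops the transfer but leaves sc itself valid. */
static void
mock_close(struct mock_softc *sc)
{
	sc->sc_in_xfer = NULL;
}

/*
 * Stand-in for ulpt_tick() with the added test.
 * Returns 1 if a transfer was set up, 0 if the tick was skipped.
 */
static int
mock_tick(void *arg)
{
	struct mock_softc *sc = arg;

	if (sc == NULL)
		return 0;
	if (sc->sc_in_xfer == NULL)	/* the added test: already closed */
		return 0;
	mock_setup_xfer(sc->sc_in_xfer);
	return 1;
}
```

Calling mock_tick() before mock_close() sets up a transfer; calling it afterwards returns 0 instead of reaching mock_setup_xfer() with a NULL pointer. Note that a guard like this only makes a late tick harmless. It does not stop the callout from firing, and by itself it does not cover the case where close and the tick run at the same time.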
Re: USB printing panic
Can you try something like this?

Martin

Index: ulpt.c
===
RCS file: /cvsroot/src/sys/dev/usb/ulpt.c,v
retrieving revision 1.85
diff -u -p -r1.85 ulpt.c
--- ulpt.c	3 Nov 2010 22:34:24 -	1.85
+++ ulpt.c	10 Feb 2011 09:33:25 -
@@ -656,7 +656,7 @@ ulptclose(dev_t dev, int flag, int mode,
 	if (sc->sc_has_callout) {
 		DPRINTFN(2, ("ulptclose: stopping read callout\n"));
-		callout_stop(&sc->sc_read_callout);
+		callout_halt(&sc->sc_read_callout, NULL);
 		sc->sc_has_callout = 0;
 	}
Re: USB printing panic
Hello Eduardo,

All of the below should be taken with the caveat that I'm an amateur at best and I've never looked at the NetBSD kernel (or that of any other OS, for that matter) before. So it's likely to be completely wrong, or worse.

On 2/9/2011 4:08 PM, Eduardo Horvath wrote:
> On Wed, 9 Feb 2011, Bill Green wrote:
>> cpu0: data fault: pc=14becc8 addr=0
>> kernel trap 30: data access exception
>> Stopped in pid 0.5 (system) at netbsd:usbd_setup_xfer+0x8: ldub [%o0 + 0x70], %g3
>
> This one is definitely a NULL pointer dereference in the kernel, probably
> in usbd_setup_xfer.

In usbdi.c there are several functions (usbd_setup_xfer, usbd_transfer, others) which take pointers to structures and don't check whether they are NULL before using them. In ulpt.c, ulpt_tick calls usbd_setup_xfer and usbd_transfer, passing them a usb_xfer_handle contained in the struct ulpt_softc that it receives a pointer to as its argument.

The following appears to be happening in my case: after rastertoqpdl crashes, the USB transfer is never finished (from the perspective of the printer, which will eventually print a sheet with a timeout error). ulptclose is called, which sets sc->sc_out_xfer (which eventually gets passed to usbd_setup_xfer and friends) to NULL, but leaves sc itself (the struct ulpt_softc that ulpt_tick uses) set. ulpt_tick sometimes (I haven't found where) gets called after ulptclose, and only checks whether sc is NULL, and NOT sc->sc_out_xfer.

I've added a test to ulpt_tick to check whether sc->sc_out_xfer is NULL, and haven't been able to panic the system since. But I'm not sure whether anything else is making calls to the usbd_* functions with similar possible problems, or what the best way to fix this would be. Perhaps one could set the struct ulpt_softc itself to NULL in ulptclose, if other functions in ulpt.c follow the same assumptions?
But, as I mentioned, there seem to be a lot of functions in usbdi.c that assume they are getting usable pointers, and these functions get used in a lot of other drivers besides the ulpt code.

>> panic: kernel fault
>> Stopped in pid 0.5 (system) at netbsd:cpu_Debugger+0x4:nop
>> db> bt
>> data_access_fault(b5cbaa0, 30, 1476388, 0, 70, 400) at
>
> Definitely a kernel problem but don't know the specifics. You need to dump
> the trapframe.

I think this is the same bug detailed above. I'm not exactly sure what you mean by needing to dump the trapframe unless it is what I've provided below.

#0 dumpsys () at ../../../../arch/sparc64/sparc64/machdep.c:755
#1 0x014abeb8 in cpu_reboot (howto=256, user_boot_string=0x0)
    at ../../../../arch/sparc64/sparc64/machdep.c:623
#2 0x010c7a28 in db_sync_cmd (addr=190633464, have_addr=false, count=-1, modif=0xb5cd4d8 "")
    at ../../../../ddb/db_command.c:1304
#3 0x010c821c in db_command (last_cmdp=0x180f678) at ../../../../ddb/db_command.c:926
#4 0x010c8514 in db_command_loop () at ../../../../ddb/db_command.c:583
#5 0x010cbc90 in db_trap (type=, code=0) at ../../../../ddb/db_trap.c:101
#6 0x014bbc6c in kdb_trap (type=48, tf=0xb5cd9e0)
    at ../../../../arch/sparc64/sparc64/db_interface.c:498
#7 0x014b8604 in data_access_fault (tf=0xb5cd9e0, type=48, pc=21757420, addr=0, sfva=0, sfsr=8390665)
    at ../../../../arch/sparc64/sparc64/trap.c:1200
#8 0x01008b24 in Ldatafault_internal ()
#9 0x01008b24 in Ldatafault_internal ()
Previous frame identical to this frame (corrupt stack?)

(gdb) bt full
[...]
#7 0x014b8604 in data_access_fault (tf=0xb5cd9e0, type=48, pc=21757420, addr=0, sfva=0, sfsr=8390665)
    at ../../../../arch/sparc64/sparc64/trap.c:1200
        l = (struct lwp *) 0xb30ef80
        p = (struct proc *) 0x1823c98
        vm = (struct vmspace *) 0xe0018000
        va = 0
        rv = 0
        access_type = 1
        onfault = 0
        sticks = 128
        ksi = {ksi_flags = 1, ksi_list = {cqe_next = 0xb5cd131, cqe_prev = 0x108360c},
          ksi_info = {_signo = 0, _code = 16352, _errno = 510, _pad = 33555456,
            _reason = {_rt = {_pid = 0, _uid = 16360, _value = {sival_int = 0, sival_ptr = 0x0}},
              _child = {_pid = 0, _uid = 16360, _status = 0, _utime = 0, _stime = 0},
              _fault = {_addr = 0x3fe8, _trap = 0}, _poll = {_band = 16360, _fd = 0}}}, ksi_lid = 0}
        lastdouble = 0
[...]

(gdb) frame 7
#7 0x014b8604 in data_access_fault (tf=0xb5cd9e0, type=48, pc=21757420, addr=0, sfva=0, sfsr=8390665)
    at ../../../../arch/sparc64/sparc64/trap.c:1200
1200	DEBUGGER(type, tf);
(gdb) print *tf
$1 = {tf_tstate = 17666409988, tf_pc = 21757420, tf_npc = 21757424, tf_fault = 0, tf_kstack = 0, tf_y = 0,
  tf_tt = 48, tf_pil = 0 '\0', tf_oldpil = 0 '\0',
  tf_global = {0, 4294967296, 29442048, 0, 1, 29442048, 2504691800080896, 387520},
  tf_out = {25980824, 5, 0, 25980824, 190635096, 0, 190632721, 20291036}, tf_loc
Re: USB printing panic
On Wed, 9 Feb 2011, Bill Green wrote:

> I am running NetBSD 5.1 sparc64 on a Sun Ultra 5. A Samsung USB printer is
> connected to the system via an NEC-chipset PCI USB host. Printing via CUPS
> using the SPLIX drivers (http://splix.sourceforge.net/) causes a kernel
> panic. Printing via CUPS across the network (from hosts with
> their own drivers) works without problems.
>
> As far as I can tell, a component of SPLIX (rastertoqpdl)
> crashes with SIGBUS, and this sometimes panics the kernel. I don't know what
> the bug in SPLIX is, either.
>
> I'm attaching below a dmesg, and ddb backtraces from two panics; the second,

[...]

> panic: kernel fault
> Stopped in pid 0.5 (system) at netbsd:cpu_Debugger+0x4:nop
> db> bt
> data_access_fault(b5cbaa0, 30, 1476388, 0, 70, 400) at

Definitely a kernel problem but don't know the specifics. You need to dump the trapframe.

> cpu0: data fault: pc=14becc8 addr=0
> kernel trap 30: data access exception
> Stopped in pid 0.5 (system) at netbsd:usbd_setup_xfer+0x8: ldub [%o0 + 0x70], %g3

This one is definitely a NULL pointer dereference in the kernel, probably in usbd_setup_xfer.

Eduardo
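The reasoning behind that diagnosis can be illustrated with a little pointer arithmetic. ldub is the SPARC load-unsigned-byte instruction, so the faulting instruction reads one byte at %o0 + 0x70. Assuming usbd_setup_xfer is a leaf routine whose first argument is still in %o0 at that point, a NULL first argument makes the load touch address 0x70. The sketch below uses a hypothetical layout (mock_xfer and flag_load_address are invented for illustration and are not the real struct usbd_xfer) to show the arithmetic only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, NOT the real struct usbd_xfer: padding places a
 * one-byte field at offset 0x70, matching the offset in the ldub above.
 */
struct mock_xfer {
	unsigned char pad[0x70];
	unsigned char flag;		/* the byte that ldub would load */
};

/* Address that a byte load of `flag` touches for a given base pointer value. */
static uintptr_t
flag_load_address(uintptr_t base)
{
	return base + offsetof(struct mock_xfer, flag);
}
```

With a base of 0 (a NULL pointer) the computed address is simply the field offset, 0x70, which lies in the unmapped first page and so produces a data access fault rather than a stray read.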
USB printing panic
Hello,

I am running NetBSD 5.1 sparc64 on a Sun Ultra 5. A Samsung USB printer is connected to the system via an NEC-chipset PCI USB host. Printing via CUPS using the SPLIX drivers (http://splix.sourceforge.net/) causes a kernel panic. Printing via CUPS across the network (from hosts with their own drivers) works without problems.

As far as I can tell, a component of SPLIX (rastertoqpdl) crashes with SIGBUS, and this sometimes panics the kernel. I don't know what the bug in SPLIX is, either.

I'm attaching below a dmesg, and ddb backtraces from two panics; the second, occurring when a DEBUG kernel was running, is preceded by kernel debugging messages (and I also enabled the debug code in dev/usb/ulpt.c). I also executed a few "show" commands in ddb, although I don't entirely understand them, in the hope that they might be useful.

I don't know how to track this any further. I do have the core dump from the last crash and would be happy to help however possible.

Boot messages:

Sun Ultra 5/10 UPA/PCI (UltraSPARC-IIi 270MHz), No Keyboard
OpenBoot 3.11, 256 MB memory installed, Serial #10467420.
Ethernet address 8:0:20:9f:b8:5c, Host ID: 809fb85c.

Initializing Memory |
ok boot debug
Boot device: /pci@1f,0/pci@1,1/ide@3/disk@0,0 File and args: debug
NetBSD IEEE 1275 Bootblock
NetBSD/sparc64 OpenFirmware Boot, Revision 1.13
=0x859bd8
Loading debug: 7579168+367512+483888 [519120+340115]=0x9a2130
Loaded initial symtab at 0x18cfdc8, strtab at 0x194f098, # entries 21604
consinit()
stdin node = f0061840
stdout package = f0061840
buffer @ 0x1c05ca0
console is /pci@1f,0/pci@1,1/ebus@1/se@14,40:a
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007, 2008, 2009, 2010
    The NetBSD Foundation, Inc. All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California. All rights reserved.
NetBSD 5.1 (GENERIC-DEBUG) #4: Thu Feb 3 16:59:01 PST 2011
    b...@puddle.supposedly.org:/home/bill/netbsd-5-1-source/usr/src/sys/arch/sparc64/compile/GENERIC-DEBUG
total memory = 256 MB
avail memory = 238 MB
mainbus0 (root): SUNW,Ultra-5_10 (Sun Ultra 5/10 UPA/PCI): hostid 809fb85c
cpu0 at mainbus0: SUNW,UltraSPARC-IIi @ 270 MHz, UPA id 0
cpu0: 16K instruction (32 b/l), 16K data (32 b/l), 256K external (64 b/l)
psycho0 at mainbus0 addr 0xfffc4000
psycho0: SUNW,sabre: impl 0, version 0: ign 7c0 bus range 0 to 2; PCI bus 0
extent `psycho mem' (0x0 - 0x), flags = 0x0
     0x0 - 0x807f
extent `psycho io' (0x0 - 0xff), flags = 0x0
     0x0 - 0x47f
DVMA map: c000 to e000
IOTSB: 1045a000 to 104da000
memory range: 01ff
pci0 at psycho0
ppb0 at pci0 dev 1 function 1: Sun Microsystems Simba PCI bridge (rev. 0x11)
pci1 at ppb0 bus 1
ebus0 at pci1 dev 1 function 0
ebus0: Sun Microsystems PCIO Ebus2, revision 0x01
auxio0 at ebus0 addr 726000-726003, 728000-728003, 72a000-72a003, 72c000-72c003, 72f000-72f003
power at ebus0 addr 724000-724003 ipl 37 not configured
SUNW,pll at ebus0 addr 504000-504002 not configured
sab0 at ebus0 addr 40-40007f ipl 43: rev 3.2
sabtty0 at sab0 port 0: console i/o
sabtty1 at sab0 port 1
com0 at ebus0 addr 3083f8-3083ff ipl 41: ns16550a, working fifo
kbd0 at com0
com1 at ebus0 addr 3062f8-3062ff ipl 42: ns16550a, working fifo
ms0 at com1
wsmouse0 at ms0 mux 0
lpt0 at ebus0 addr 3043bc-3043cb, 30015c-30015d, 70-7f ipl 34
fdthree at ebus0 addr 3023f0-3023f7, 706000-70600f, 72-720003 ipl 39 not configured
clock0 at ebus0 addr 0-1fff: mk48t59
flashprom at ebus0 addr 0-f not configured
audiocs0 at ebus0 addr 20-2000ff, 702000-70200f, 704000-70400f, 722000-722003 ipl 35 ipl 36: CS4231A
audio0 at audiocs0: full duplex, playback, capture
hme0 at pci1 dev 1 function 1: Sun Happy Meal Ethernet, rev. 1
hme0: interrupting at ivec 3021
hme0: Ethernet address 08:00:20:9f:b8:5c
nsphy0 at hme0 phy 1: DP83840 10/100 media interface, rev. 1
nsphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
machfb0 at pci1 dev 2 function 0: ATI Technologies 3D Rage I/II (rev. 0x9a)
machfb0: 16 MB aperture at 0xe100, 4 KB registers at 0x
machfb0: memctl 003210b3
machfb0: 2048 KB SGRAM 62.999 MHz, maximum RAMDAC clock 170 MHz
gen_cntl: 01000210
mach64_get_mode: 1152 5304 5432 1528 900 902 938 937
machfb0: initial resolution 1152x864 at 8 bpp
machfb0: attached to /dev/fb0
machfb0: initializing the DSP
wsdisplay1 at machfb0 kbdmux 1
cmdide0 at pci1 dev 3 function 0
cmdide0: CMD Technology PCI0646 (rev. 0x03)
cmdide0: primary channel configured to native-PCI mode
cmdide0: using ivec 1820 for native-PCI interrupt
atabus0 at cmdide0 channel 0
cmdide0: secondary channel configured to native-PCI mode
atabus1 at cmdide0 channel 1
ppb1 at pci0 dev 1 function 0: Sun Microsystems Simba PCI bridge (rev. 0x11)
pci2 at ppb1 bus 2
ohci0 at pci2 dev 2 function 0: NEC USB Host Controller (rev. 0x43)
ohci0: interrupting at ivec 14
ohci0: OHCI version 1.0
usb0 at