Re: Yet another crash in FreeBSD 5.1
--On 7. august 2003 10:33 +0930 Greg 'groggy' Lehey [EMAIL PROTECTED] wrote:

>> Q: If you have a crash, please supply a backtrace from the dump
>>    analysis as discussed below under Kernel Panics. Please don't
>>    delete the crash dump; it may be needed for further analysis.
>> A: Sorry, I don't have a crash dump. I tried creating one when the
>>    computer had crashed by giving the commands panic and then
>>    continue but that didn't help.
>>
>> Was this of any help?
>
> Not much, unfortunately. I think that these problems occur as the
> result of some hardware failure, but there's nothing in what you've
> supplied to indicate that. If you can't repeat it, I fear that it's
> yet another of the ones that got away.

I have now managed to produce a crash dump but I'm not sure if it's any
good or not. For some reason I tried to give ddb the panic command twice
in a row and then it at least produced a crash dump but I'm not sure if
it contains any information.

Here is a backtrace at least. Keep in mind that I'm not a C programmer
and have no experience with gdb so I must be told what to do to produce
more information.

[EMAIL PROTECTED]:~/tmp/debug gdb -k kernel.debug vmcore.0
GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-undermydesk-freebsd"...
panic: from debugger
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x14
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xc02e8139
stack pointer           = 0x10:0xcac43a00
frame pointer           = 0x10:0xcac43a34
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 5 (pagedaemon)
panic: from debugger

Fatal trap 3: breakpoint instruction fault while in kernel mode
instruction pointer     = 0x8:0xc048cd34
stack pointer           = 0x10:0xcac43780
frame pointer           = 0x10:0xcac4378c
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = IOPL = 0
current process         = 5 (pagedaemon)
panic: from debugger
Uptime: 1d13h38m55s
Dumping 191 MB
ata0: resetting devices .. done
 16 32 48 64 80 96 112 128 144 160 176
---
Reading symbols from /usr/obj/usr/src/sys/VIMES/modules/usr/src/sys/modules/vinum/vinum.ko.debug...done.
Loaded symbols for /usr/obj/usr/src/sys/VIMES/modules/usr/src/sys/modules/vinum/vinum.ko.debug
Reading symbols from /usr/obj/usr/src/sys/VIMES/modules/usr/src/sys/modules/ipfw/ipfw.ko.debug...done.
Loaded symbols for /usr/obj/usr/src/sys/VIMES/modules/usr/src/sys/modules/ipfw/ipfw.ko.debug
Reading symbols from /boot/kernel/dragon_saver.ko...done.
Loaded symbols for /boot/kernel/dragon_saver.ko
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:238
238             dumping++;
(kgdb) bt
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:238
#1  0xc031a8f9 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:370
#2  0xc031abeb in panic () at /usr/src/sys/kern/kern_shutdown.c:543
#3  0xc0173e92 in db_panic () at /usr/src/sys/ddb/db_command.c:448
#4  0xc0173e12 in db_command (last_cmdp=0xc0527740, cmd_table=0x0,
    aux_cmd_tablep=0xc051da0c, aux_cmd_tablep_end=0xc051da24)
    at /usr/src/sys/ddb/db_command.c:346
#5  0xc0173f26 in db_command_loop () at /usr/src/sys/ddb/db_command.c:470
#6  0xc0176caa in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_trap.c:72
#7  0xc048ca95 in kdb_trap (type=12, code=0, regs=0xcac439c0)
    at /usr/src/sys/i386/i386/db_interface.c:170
#8  0xc049e772 in trap_fatal (frame=0xcac439c0, eva=0)
    at /usr/src/sys/i386/i386/trap.c:829
#9  0xc049e482 in trap_pfault (frame=0xcac439c0, usermode=0, eva=20)
    at /usr/src/sys/i386/i386/trap.c:748
#10 0xc049e05d in trap (frame=
      {tf_fs = 24, tf_es = 16, tf_ds = 16, tf_edi = -1039907200,
       tf_esi = -978486016, tf_ebp = -893109708, tf_isp = -893109780,
       tf_ebx = 0, tf_edx = 0, tf_ecx = 0, tf_eax = 23179264,
       tf_trapno = 12, tf_err = 2, tf_eip = -1070694087, tf_cs = 8,
       tf_eflags = 66054, tf_esp = -978486016, tf_ss = -893109736})
    at /usr/src/sys/i386/i386/trap.c:433
#11 0xc048e3e8 in calltrap () at {standard input}:96
#12 0xc02e5bc6 in spec_xstrategy (vp=0xc2044680, bp=0xc5ad7d00)
    at /usr/src/sys/fs/specfs/spec_vnops.c:513
#13 0xc02e5c4b in spec_specstrategy (ap=0x0)
    at /usr/src/sys/fs/specfs/spec_vnops.c:550
#14 0xc02e4f18 in spec_vnoperate (ap=0x0)
    at /usr/src/sys/fs/specfs/spec_vnops.c:123
#15 0xc0465c4d in swapdev_strategy (ap=0x0) at vnode_if.h:1114
#16 0xc0452809 in swap_pager_putpages (object=0x0, m=0xcac43bd0, count=1,
    sync=0, rtvals=0xcac43b40) at
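
[Editorial sketch, not part of the original message.] gdb prints the
trap frame in frame #10 as signed decimal integers, which makes it hard
to compare with the hexadecimal values in the panic message. A minimal
Python sketch of the conversion follows; the helper name to_u32_hex is
made up for illustration, and the input values are copied from the
trap frame above.

```python
def to_u32_hex(value: int) -> str:
    """Render a signed 32-bit register value as unsigned hexadecimal."""
    return hex(value & 0xFFFFFFFF)


# Values copied from the trap frame in frame #10 of the backtrace.
trap_frame = {
    "tf_eip": -1070694087,
    "tf_edi": -1039907200,
    "tf_esi": -978486016,
    "tf_ebx": 0,
}

for name, value in trap_frame.items():
    print(f"{name} = {to_u32_hex(value)}")
```

Read this way, tf_eip is 0xc02e8139, the same instruction pointer the
panic message reports, and tf_edi and tf_esi match the vp and bp
arguments gdb shows for spec_xstrategy in frame #12. tf_ebx is zero.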
Re: Yet another crash in FreeBSD 5.1
On Sunday, 3 August 2003 at 0:31:45 -0400, John Baldwin wrote:
> On 03-Aug-2003 Greg 'groggy' Lehey wrote:
>> On Saturday, 2 August 2003 at 16:47:13 +0200, Eivind Olsen wrote:
>>> [EMAIL PROTECTED]:~/tmp/debug gdb -k kernel.debug
>>> (kgdb) list *(g_dev_strategy+29)
>>
>> This is almost certainly the wrong function. At the very least you
>> should look at the arguments passed to it.
>
> Actually, this line can be very instructive. Since 'bp' is valid it
> is probably the bp2 from g_clone_bio() that is NULL. You might want
> to ask phk about that one.

I think you'll find that there's a null dev pointer in there. As I say,
I've seen this scenario before (without GEOM), and I'd be surprised if
this were phk's problem.

>>> (kgdb) list *(launch_requests+448)
>>> No symbol launch_requests in current context.
>>> (kgdb) list *(vinumstart+2b2)
>>> No symbol vinumstart in current context.
>>> (kgdb)
>>
>> Read the links I just sent you. You haven't loaded the Vinum symbols.
>
> Bah, this isn't hard for you to do either:

... once you've loaded the symbols. That's why I pointed to the links.

As I said to Terry, the real issue here is probably what was happening
at the time, not the contents of the dump.

Greg
--
See complete headers for address and phone numbers
Re: Yet another crash in FreeBSD 5.1
--On 3. august 2003 00:31 -0400 John Baldwin [EMAIL PROTECTED] wrote:

> But you knew that. Also, Eivind, you need to use hex, not decimal
> offsets from the functions. You might want to redo the
> g_dev_strategy() line with 0x29 instead of 29.

I already thought about that so I tested the commands both with and
without 0x in front of those numbers and I get exactly the same output,
so it looks like gdb interprets them as hex anyway:

(kgdb) list *(g_dev_strategy+0x29)
0xc02e8139 is in g_dev_strategy (/usr/src/sys/geom/geom_dev.c:415).
410             KASSERT(cp->acr || cp->acw,
411                 ("Consumer with zero access count in g_dev_strategy"));
412
413             bp2 = g_clone_bio(bp);
414             KASSERT(bp2 != NULL, ("XXX: ENOMEM in a bad place"));
415             bp2->bio_offset = (off_t)bp->bio_blkno << DEV_BSHIFT;
416             KASSERT(bp2->bio_offset >= 0,
417                 ("Negative bio_offset (%jd) on bio %p",
418                 (intmax_t)bp2->bio_offset, bp));
419             bp2->bio_length = (off_t)bp->bio_bcount;

--
Regards / Hilsen
Eivind Olsen [EMAIL PROTECTED]
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to [EMAIL PROTECTED]
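
[Editorial sketch, not part of the original message.] The addresses
printed elsewhere in this thread suggest the two commands were not
quite equivalent: the +0x29 listing above reports 0xc02e8139, while the
+29 listing posted on 2 August reports 0xc02e812d. A small Python
sketch of the arithmetic follows; it assumes only that the faulting
instruction pointer from the panic message corresponds to
g_dev_strategy+0x29, as DDB printed, and the variable names are
illustrative.

```python
FAULT_EIP = 0xC02E8139   # "instruction pointer" from the panic message
DDB_OFFSET = 0x29        # DDB stopped at g_dev_strategy+0x29

# Start of g_dev_strategy, derived from the two values above.
func_start = FAULT_EIP - DDB_OFFSET

addr_hex_offset = func_start + 0x29   # what "+0x29" resolves to
addr_dec_offset = func_start + 29     # what a bare "+29" resolves to

print(hex(func_start))
print(hex(addr_hex_offset))
print(hex(addr_dec_offset))
```

The two results differ by 12 bytes, so gdb appears to have read the
bare 29 as decimal. Both addresses simply happen to fall within the
code generated for line 415, which would explain why the source listing
looked identical.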
Re: Yet another crash in FreeBSD 5.1
--On 3. august 2003 09:35 +0930 Greg 'groggy' Lehey [EMAIL PROTECTED] wrote:

> This is the real issue. Until you supply the information I ask for in
> the man page or at http://www.vinumvm.org/vinum/how-to-debug.html,
> only Terry can help you.

Ok, I'll try to supply that information:

Q: What problems are you having?
A: FreeBSD RELENG_5_1 crashes with the following text shown on screen:

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x14
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xc02e8139
stack pointer           = 0x10:0xcfb5284c
frame pointer           = 0x10:0xcfb52880
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 10785 (ctl_cyrusdb)
kernel: type 12 trap, code=0
Stopped at      g_dev_strategy+0x29:    movl    %eax,0x14(%ebx)

Q: Which version of FreeBSD are you running?
A: FreeBSD 5.1, tracking RELENG_5_1, cvsupped in the morning of the
   27th of July if I'm not mistaken.

Q: Have you made any changes to the system sources, including Vinum?
A: No, it's all taken from the cvsup. I do have a custom kernel since I
   need to use ipfilter but that's really the only change. I've done
   the following changes:

makeoptions     DEBUG=-g
options         DDB
options         IPFILTER
options         IPFILTER_LOG
options         IPFILTER_DEFAULT_BLOCK

Q: Supply the output of the vinum list command. If you can't start
   Vinum, supply the on-disk configuration, as described below. If you
   can't start Vinum, then (and only then) send a copy of the
   configuration file.
A: Here it is:

vimes# vinum
vinum -> list
2 drives:
D WHITE            State: up   /dev/ad2s1e   A: 0/113046 MB (0%)
D BLACK            State: up   /dev/ad0s1d   A: 0/113046 MB (0%)

6 volumes:
V var              State: up   Plexes: 2   Size: 6144 MB
V usrlocal         State: up   Plexes: 2   Size: 6144 MB
V tmp              State: up   Plexes: 1   Size:  255 MB
V usr              State: up   Plexes: 2   Size: 6144 MB
V home             State: up   Plexes: 2   Size: 8192 MB
V storage          State: up   Plexes: 1   Size:  168 GB

10 plexes:
P var.p0         C State: up   Subdisks: 1   Size: 6144 MB
P var.p1         C State: up   Subdisks: 1   Size: 6144 MB
P usrlocal.p0    C State: up   Subdisks: 1   Size: 6144 MB
P usrlocal.p1    C State: up   Subdisks: 1   Size: 6144 MB
P tmp.p0         S State: up   Subdisks: 2   Size:  255 MB
P usr.p0         C State: up   Subdisks: 1   Size: 6144 MB
P usr.p1         C State: up   Subdisks: 1   Size: 6144 MB
P home.p0        C State: up   Subdisks: 1   Size: 8192 MB
P home.p1        C State: up   Subdisks: 1   Size: 8192 MB
P storage.p0     S State: up   Subdisks: 2   Size:  168 GB

12 subdisks:
S var.p0.s0        State: up   D: BLACK   Size: 6144 MB
S var.p1.s0        State: up   D: WHITE   Size: 6144 MB
S usrlocal.p0.s0   State: up   D: BLACK   Size: 6144 MB
S usrlocal.p1.s0   State: up   D: WHITE   Size: 6144 MB
S tmp.p0.s0        State: up   D: BLACK   Size:  127 MB
S tmp.p0.s1        State: up   D: WHITE   Size:  127 MB
S usr.p0.s0        State: up   D: BLACK   Size: 6144 MB
S usr.p1.s0        State: up   D: WHITE   Size: 6144 MB
S home.p0.s0       State: up   D: BLACK   Size: 8192 MB
S home.p1.s0       State: up   D: WHITE   Size: 8192 MB
S storage.p0.s0    State: up   D: BLACK   Size:   84 GB
S storage.p0.s1    State: up   D: WHITE   Size:   84 GB
vinum ->

Q: Supply an extract of the Vinum history file. Unless you have
   explicitly renamed it, it will be /var/log/vinum_history. This file
   can get very big; please limit it to the time around when you have
   the problems. Each line contains a timestamp at the beginning, so
   you will have no difficulty in establishing which data is of
   relevance.
A: It's so small, I'll give the complete vinum_history log:

vimes# cat vinum_history
26 Jul 2003 18:43:38.056211 *** vinum started ***
26 Jul 2003 18:43:39.456133 list
26 Jul 2003 18:43:41.631830 list
26 Jul 2003 18:43:42.598409 list
26 Jul 2003 18:43:46.885029 quit
26 Jul 2003 18:43:48.450706 *** vinum started ***
26 Jul 2003 18:43:51.745079 help
26 Jul 2003 18:47:54.213327 *** vinum started ***
26 Jul 2003 18:47:54.216030 create install-vinum.conf
drive BLACK device /dev/ad0s1d
drive WHITE device /dev/ad2s1e
volume var setupstate
plex org concat
sd length
Re: Yet another crash in FreeBSD 5.1
--On 3. august 2003 09:37 +0930 Greg 'groggy' Lehey [EMAIL PROTECTED] wrote:

> Read the links I just sent you. You haven't loaded the Vinum symbols.

I'm not sure exactly what to do here. I have absolutely no previous
experience with kernel debugging, using gdb etc. so I'm lost without
specific instructions on what to do, what to try etc.

The vinum.ko file is not stripped:

[EMAIL PROTECTED]:~/tmp/debug file /boot/kernel/vinum.ko
/boot/kernel/vinum.ko: ELF 32-bit LSB shared object, Intel 80386, version 1 (FreeBSD), not stripped
[EMAIL PROTECTED]:~/tmp/debug

The web page mentions that I should either use the crash dump (which
isn't created...) or use remote serial gdb to analyze the problem. I
guess I'll have to find a nullmodem cable, install FreeBSD on another
computer here (I couldn't find a Windows version of gdb) and try to
figure out exactly how to do remote GDB debugging (I've looked around
in the developers handbook, specifically
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-online-gdb.html)

--
Regards / Hilsen
Eivind Olsen [EMAIL PROTECTED]
Re: Yet another crash in FreeBSD 5.1
On Sunday, 3 August 2003 at 11:17:49 +0200, Eivind Olsen wrote:
> --On 3. august 2003 09:37 +0930 Greg 'groggy' Lehey [EMAIL PROTECTED] wrote:
>> Read the links I just sent you. You haven't loaded the Vinum symbols.
>
> I'm not sure exactly what to do here. I have absolutely no previous
> experience with kernel debugging, using gdb etc. so I'm lost without
> specific instructions on what to do, what to try etc.

Don't worry too much about that at the moment. Let me analyze the info
you've sent me, and I'll ask some more questions.

Greg
--
See complete headers for address and phone numbers
Yet another crash in FreeBSD 5.1
I've now had yet another crash under FreeBSD 5.1 (RELENG_5_1, cvsupped
5-6 days ago) and it looks almost the same as the crash I posted about
yesterday (or was it the day before?). Here's some output from DDB:

Crash 2.7.2003:

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x14
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xc02e8139
stack pointer           = 0x10:0xcfb5284c
frame pointer           = 0x10:0xcfb52880
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 10785 (ctl_cyrusdb)
kernel: type 12 trap, code=0
Stopped at      g_dev_strategy+0x29:    movl    %eax,0x14(%ebx)
db> show reg
cs          0x8
ds          0x10
es          0x10
fs          0x18
ss          0x10
eax         0xfd235200
ecx         0
edx         0
ebx         0
esp         0xcfb5284c
ebp         0xcfb52880
esi         0xc2156024  _end+0x5ae4
edi         0xc2044900
eip         0xc02e8139  g_dev_strategy+0x29
efl         0x10286
dr0         0
dr1         0
dr2         0
dr3         0
dr4         0x0ff0
dr5         0x400
dr6         0x0ff0
dr7         0x400
g_dev_strategy+0x29:    movl    %eax,0x14(%ebx)
db> trace
g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
vinumstrategy(c5ada2d0,0,c09719b0,40,0) at vinumstrategy+0xa6
spec_xstrategy(c215c5b4,c5ada2d0,cfb52968,c02e4f18,cfb52994) at spec_xstrategy+0x306
spec_specstrategy(cfb52994,cfb529b0,c044f7ad,cfb52994,0) at spec_specstrategy+0x1b
spec_vnoperate(cfb52994,0,c09719b0,f,c5ada2d0) at spec_vnoperate+0x18
ufs_strategy(cfb529d8,cfb52a0c,c0359a87,cfb529d8,1) at ufs_strategy+0xdd
ufs_vnoperate(cfb529d8,1,c0504f45,35e,cfb529f8) at ufs_vnoperate+0x18
bwrite(c5ada2d0,cfb52a5c,c0361aca,c5ada2d0,c5ada400) at bwrite+0x3a7
bawrite(c5ada2d0,c5ada400,10,3c6,20020080) at bawrite+0x1c
cluster_wbuild(c30c7124,4000,50,0,4) at cluster_wbuild+0x6ba
cluster_write(c5b735c0,9c7c64,0,55,c252b880) at cluster_write+0x571
ffs_write(cfb52be0,c21c2528,c22ab000,227,c2025e00) at ffs_write+0x5ff
vn_write(c21c2528,cfb52c7c,c252b880,0,c22ab000) at vn_write+0x192
dofilewrite(c22ab000,c21c2528,8,807e000,4000) at dofilewrite+0xe8
write(c22ab000,cfb52d10,c0518514,3fb,3) at write+0x69
syscall(2f,807002f,bfbf002f,0,807e000) at syscall+0x24e
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (4, FreeBSD ELF32, write), eip = 0x282e08b3, esp = 0xbfbfec1c, ebp = 0xbfbfec38 ---
db>

I tried creating a crash dump by issuing the commands panic and then
continue but everything seemingly stopped then and nothing was dumped
to disk.

Can anyone suggest what I do next to find out about this crash?

--
Regards
Eivind Olsen [EMAIL PROTECTED]
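
[Editorial sketch, not part of the original message.] The register dump
and the faulting instruction fit together. In AT&T syntax,
movl %eax,0x14(%ebx) stores eax at the address ebx + 0x14. A short
Python sketch of that address calculation follows; the function name
effective_address is made up for illustration, and the values come from
the DDB output above.

```python
def effective_address(base: int, displacement: int) -> int:
    """Address used by a base+displacement operand on 32-bit x86."""
    return (base + displacement) & 0xFFFFFFFF


ebx = 0x0             # from "show reg"
displacement = 0x14   # from "movl %eax,0x14(%ebx)"

fault_address = effective_address(ebx, displacement)
print(hex(fault_address))
```

The result equals the reported fault virtual address, and the fault
code says "supervisor write", so the store went through a base pointer
in ebx that was zero. Which C variable lived in ebx at that point is
not something this arithmetic can show.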
vinum bug? (Re: Yet another crash in FreeBSD 5.1)
On Sat, Aug 02, 2003 at 10:11:24AM +0200, Eivind Olsen wrote:
> db> trace
> g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
> launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
> vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
> vinumstrategy(c5ada2d0,0,c09719b0,40,0) at vinumstrategy+0xa6

Looks like a problem in vinum. The other backtrace was the same, right?

Kris
Re: Yet another crash in FreeBSD 5.1
Eivind Olsen wrote:
> Can anyone suggest what I do next to find out about this crash?
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address = 0x14

Dereference of NULL pointer; reference is for element at offset 0x14 in
some structure; this is the equivalent of 5 32 bit ints or pointers
into the structure.

> db> trace
> g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
> launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
> vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2

gdb -k kernel.debug
(gdb) list *(g_dev_strategy+29)
[ ... ]
(gdb) list *(launch_requests+448)
[ ... ]
(gdb) list *(vinumstart+2b2)
[ ... ]

Will give you the exact source lines involved, assuming you built a
debug kernel.

You don't actually need a crash dump to debug a stack traceback.

-- Terry
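
[Editorial sketch, not part of the original message.] The "5 32 bit
ints" remark can be checked with a stand-in structure. The Python
sketch below uses ctypes to lay out eight 32-bit members; the structure
and its field names are hypothetical and are not the real layout of any
kernel structure.

```python
import ctypes


class Example(ctypes.Structure):
    """Hypothetical structure made of eight 32-bit members."""
    _fields_ = [(f"field{i}", ctypes.c_uint32) for i in range(8)]


word_size = ctypes.sizeof(ctypes.c_uint32)

# Five 32-bit members precede the member at offset 0x14.
members_before = 0x14 // word_size
print(members_before)
print(hex(Example.field5.offset))
```

With a NULL structure pointer, touching the member that follows five
32-bit words therefore faults at address 0x14. On i386 a pointer is
also 32 bits wide, which is why ints and pointers are interchangeable
in the estimate.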
Re: vinum bug? (Re: Yet another crash in FreeBSD 5.1)
On Sat, Aug 02, 2003 at 02:00:52AM -0700, Kris Kennaway wrote:
> On Sat, Aug 02, 2003 at 10:11:24AM +0200, Eivind Olsen wrote:
>> db> trace
>> g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
>> launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
>> vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
>> vinumstrategy(c5ada2d0,0,c09719b0,40,0) at vinumstrategy+0xa6
>
> Looks like a problem in vinum. The other backtrace was the same,
> right?

Please take a look at an older thread named (IIRC) "vinum or geom bug?"
Greg asked for special debug output, but it never happened again for
me. A real murphy bug - it happened on three machines once a day and
after Greg's response nothing happened over weeks.

--
B.Walter BWCT http://www.bwct.de
[EMAIL PROTECTED] [EMAIL PROTECTED]
Re: vinum bug? (Re: Yet another crash in FreeBSD 5.1)
[Sending to [EMAIL PROTECTED], and Kris copied in Greg so I'll also do that]

--On 2. august 2003 02:00 -0700 Kris Kennaway [EMAIL PROTECTED] wrote:

>> db> trace
>> g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
>> launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
>> vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
>> vinumstrategy(c5ada2d0,0,c09719b0,40,0) at vinumstrategy+0xa6
>
> Looks like a problem in vinum. The other backtrace was the same,
> right?

Basically the same, yes. Some differences (and many similarities) in
the addresses that were referenced. And also almost the same output
from the trace command. I see that my first example is missing the
dofilewrite() between vn_write() and write(), but that might just be
because I forgot to write down that line (I wrote all this down by
hand).

So, it looks like it's the same crash again (well, it does look like
that to my untrained eye).

--
Regards / Hilsen
Eivind Olsen [EMAIL PROTECTED]
Re: Yet another crash in FreeBSD 5.1
--On 2. august 2003 02:11 -0700 Terry Lambert [EMAIL PROTECTED] wrote:

>> db> trace
>> g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
>> launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
>> vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
>
> gdb -k kernel.debug
> (gdb) list *(g_dev_strategy+29)
> [ ... ]
> (gdb) list *(launch_requests+448)
> [ ... ]
> (gdb) list *(vinumstart+2b2)
> [ ... ]
>
> Will give you the exact source lines involved, assuming you built a
> debug kernel.

I did. At least I've tried to. :) (I have a kernel.debug which was
compiled at the same time as the real kernel I'm using, and it's
approx. 30MB in size).

> You don't actually need a crash dump to debug a stack traceback.

This is what I found by using those commands you mentioned:

[EMAIL PROTECTED]:~/tmp/debug gdb -k kernel.debug
GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-undermydesk-freebsd"...
(kgdb) list *(g_dev_strategy+29)
0xc02e812d is in g_dev_strategy (/usr/src/sys/geom/geom_dev.c:415).
410             KASSERT(cp->acr || cp->acw,
411                 ("Consumer with zero access count in g_dev_strategy"));
412
413             bp2 = g_clone_bio(bp);
414             KASSERT(bp2 != NULL, ("XXX: ENOMEM in a bad place"));
415             bp2->bio_offset = (off_t)bp->bio_blkno << DEV_BSHIFT;
416             KASSERT(bp2->bio_offset >= 0,
417                 ("Negative bio_offset (%jd) on bio %p",
418                 (intmax_t)bp2->bio_offset, bp));
419             bp2->bio_length = (off_t)bp->bio_bcount;
(kgdb) list *(launch_requests+448)
No symbol launch_requests in current context.
(kgdb) list *(vinumstart+2b2)
No symbol vinumstart in current context.
(kgdb)

If anyone wants to take a look at this themselves, I've made the
compressed (gzip) debug kernel available at
http://eivind.aminor.no/debug/kernel.debug.gz
NOTE! It's approx. 13MB compressed!

--
Regards / Hilsen
Eivind Olsen [EMAIL PROTECTED]
Re: vinum bug? (Re: Yet another crash in FreeBSD 5.1)
--On 2. august 2003 11:16 +0200 Bernd Walter [EMAIL PROTECTED] wrote:

>> Looks like a problem in vinum. The other backtrace was the same,
>> right?
>
> Please take a look at an older thread named (IIRC) "vinum or geom
> bug?" Greg asked for special debug output, but it never happened
> again for me. A real murphy bug - it happened on three machines once
> a day and after Greg's response nothing happened over weeks.

Are you thinking of the thread "vinum and/or geom panic on alpha" from
10th of June? I forgot to mention this but my system is i386
uniprocessor (Pentium2 at 450MHz).

In case it's relevant, yes I do run vinum:

vinum -> l
2 drives:
D WHITE            State: up   /dev/ad2s1e   A: 0/113046 MB (0%)
D BLACK            State: up   /dev/ad0s1d   A: 0/113046 MB (0%)

6 volumes:
V var              State: up   Plexes: 2   Size: 6144 MB
V usrlocal         State: up   Plexes: 2   Size: 6144 MB
V tmp              State: up   Plexes: 1   Size:  255 MB
V usr              State: up   Plexes: 2   Size: 6144 MB
V home             State: up   Plexes: 2   Size: 8192 MB
V storage          State: up   Plexes: 1   Size:  168 GB

10 plexes:
P var.p0         C State: up   Subdisks: 1   Size: 6144 MB
P var.p1         C State: up   Subdisks: 1   Size: 6144 MB
P usrlocal.p0    C State: up   Subdisks: 1   Size: 6144 MB
P usrlocal.p1    C State: up   Subdisks: 1   Size: 6144 MB
P tmp.p0         S State: up   Subdisks: 2   Size:  255 MB
P usr.p0         C State: up   Subdisks: 1   Size: 6144 MB
P usr.p1         C State: up   Subdisks: 1   Size: 6144 MB
P home.p0        C State: up   Subdisks: 1   Size: 8192 MB
P home.p1        C State: up   Subdisks: 1   Size: 8192 MB
P storage.p0     S State: up   Subdisks: 2   Size:  168 GB

12 subdisks:
S var.p0.s0        State: up   D: BLACK   Size: 6144 MB
S var.p1.s0        State: up   D: WHITE   Size: 6144 MB
S usrlocal.p0.s0   State: up   D: BLACK   Size: 6144 MB
S usrlocal.p1.s0   State: up   D: WHITE   Size: 6144 MB
S tmp.p0.s0        State: up   D: BLACK   Size:  127 MB
S tmp.p0.s1        State: up   D: WHITE   Size:  127 MB
S usr.p0.s0        State: up   D: BLACK   Size: 6144 MB
S usr.p1.s0        State: up   D: WHITE   Size: 6144 MB
S home.p0.s0       State: up   D: BLACK   Size: 8192 MB
S home.p1.s0       State: up   D: WHITE   Size: 8192 MB
S storage.p0.s0    State: up   D: BLACK   Size:   84 GB
S storage.p0.s1    State: up   D: WHITE   Size:   84 GB
vinum ->

--
Regards / Hilsen
Eivind Olsen [EMAIL PROTECTED]
Re: Yet another crash in FreeBSD 5.1
On Saturday, 2 August 2003 at 2:11:24 -0700, Terry Lambert wrote:
> Eivind Olsen wrote:
>> Can anyone suggest what I do next to find out about this crash?
>>
>> Fatal trap 12: page fault while in kernel mode
>> fault virtual address = 0x14
>
> Dereference of NULL pointer; reference is for element at offset 0x14
> in some structure; this is the equivalent of 5 32 bit ints or
> pointers into the structure.
>
>> db> trace
>> g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
>> launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
>> vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
>
> gdb -k kernel.debug
> (gdb) list *(g_dev_strategy+29)
> [ ... ]
> (gdb) list *(launch_requests+448)
> [ ... ]
> (gdb) list *(vinumstart+2b2)
> [ ... ]
>
> Will give you the exact source lines involved, assuming you built a
> debug kernel.
>
> You don't actually need a crash dump to debug a stack traceback.

Great! So you know the answer? Please submit a patch.

Seriously, this is nonsense. Yes, it's a null pointer dereference.
What? Why? How do you fix it? Finding the first step doesn't solve the
problem.

Greg
--
See complete headers for address and phone numbers
Re: Yet another crash in FreeBSD 5.1
On Saturday, 2 August 2003 at 17:00:59 +0200, Eivind Olsen wrote:
> --On 2. august 2003 11:16 +0200 Bernd Walter [EMAIL PROTECTED] wrote:
>>> Looks like a problem in vinum. The other backtrace was the same,
>>> right?
>>
>> Please take a look at an older thread named (IIRC) "vinum or geom
>> bug?" Greg asked for special debug output, but it never happened
>> again for me. A real murphy bug - it happened on three machines
>> once a day and after Greg's response nothing happened over weeks.
>
> Are you thinking of the thread "vinum and/or geom panic on alpha"
> from 10th of June? I forgot to mention this but my system is i386
> uniprocessor (Pentium2 at 450MHz).
>
> In case it's relevant, yes I do run vinum:

Yes, of course you do. That's what the stack trace says, and that's why
people mentioned Vinum in the first place:

On Saturday, 2 August 2003 at 10:11:24 +0200, Eivind Olsen wrote:
> Here's some output from DDB:
>
> db> trace
> g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
> launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
> vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
> vinumstrategy(c5ada2d0,0,c09719b0,40,0) at vinumstrategy+0xa6

On Saturday, 2 August 2003 at 11:16:21 +0200, Bernd Walter wrote:
> On Sat, Aug 02, 2003 at 02:00:52AM -0700, Kris Kennaway wrote:
>> On Sat, Aug 02, 2003 at 10:11:24AM +0200, Eivind Olsen wrote:
>>> db> trace
>>> g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
>>> launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
>>> vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
>>> vinumstrategy(c5ada2d0,0,c09719b0,40,0) at vinumstrategy+0xa6
>>
>> Looks like a problem in vinum. The other backtrace was the same,
>> right?
>
> Please take a look at an older thread named (IIRC) "vinum or geom
> bug?" Greg asked for special debug output, but it never happened
> again for me. A real murphy bug - it happened on three machines once
> a day and after Greg's response nothing happened over weeks.

This is the real issue. Until you supply the information I ask for in
the man page or at http://www.vinumvm.org/vinum/how-to-debug.html, only
Terry can help you.

Greg
--
See complete headers for address and phone numbers
Re: Yet another crash in FreeBSD 5.1
On Saturday, 2 August 2003 at 16:47:13 +0200, Eivind Olsen wrote:
> --On 2. august 2003 02:11 -0700 Terry Lambert [EMAIL PROTECTED] wrote:
>>> db> trace
>>> g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
>>> launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
>>> vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
>>
>> gdb -k kernel.debug
>> (gdb) list *(g_dev_strategy+29)
>> [ ... ]
>> (gdb) list *(launch_requests+448)
>> [ ... ]
>> (gdb) list *(vinumstart+2b2)
>> [ ... ]
>>
>> Will give you the exact source lines involved, assuming you built a
>> debug kernel.
>
> I did. At least I've tried to. :) (I have a kernel.debug which was
> compiled at the same time as the real kernel I'm using, and it's
> approx. 30MB in size).
>
>> You don't actually need a crash dump to debug a stack traceback.
>
> This is what I found by using those commands you mentioned:
>
> [EMAIL PROTECTED]:~/tmp/debug gdb -k kernel.debug
> GNU gdb 5.2.1 (FreeBSD)
> Copyright 2002 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i386-undermydesk-freebsd"...
> (kgdb) list *(g_dev_strategy+29)

This is almost certainly the wrong function. At the very least you
should look at the arguments passed to it.

> (kgdb) list *(launch_requests+448)
> No symbol launch_requests in current context.
> (kgdb) list *(vinumstart+2b2)
> No symbol vinumstart in current context.
> (kgdb)

Read the links I just sent you. You haven't loaded the Vinum symbols.

> If anyone wants to take a look at this themselves I've put the
> compressed (gzip) debug-kernel available on
> http://eivind.aminor.no/debug/kernel.debug.gz
> NOTE! It's approx. 13MB compressed!

The kernel's not much use by itself.

Greg
--
See complete headers for address and phone numbers
Re: Yet another crash in FreeBSD 5.1
Eivind Olsen wrote:
> (kgdb) list *(launch_requests+448)
> No symbol launch_requests in current context.
> (kgdb) list *(vinumstart+2b2)
> No symbol vinumstart in current context.
> (kgdb)
>
> If anyone wants to take a look at this themselves I've put the
> compressed (gzip) debug-kernel available on
> http://eivind.aminor.no/debug/kernel.debug.gz
> NOTE! It's approx. 13MB compressed!

If this is repeatable for you, it's recommended that you compile Vinum
statically into your kernel, so that you can look at the other symbols
in the traceback and obtain source lines for them, as well.

It may be that this will be debuggable without that information, but in
my experience with similar problems, without a list of arguments to the
functions from a live remote debug session and/or a crashdump, the
problem is going to have to be found by an engineer eyeballing the call
graph and seeing how that particular line could end up with a NULL in
bp2 or bp.

-- Terry
Re: Yet another crash in FreeBSD 5.1
Greg 'groggy' Lehey wrote:
> > You don't actually need a crash dump to debug a stack traceback.
>
> Great! So you know the answer? Please submit a patch.
>
> Seriously, this is nonsense. Yes, it's a null pointer dereference.

What? That is precisely what doing what I suggested discovers, Greg.

If you haven't seen his response posting:

    (kgdb) list *(g_dev_strategy+29)
    0xc02e812d is in g_dev_strategy (/usr/src/sys/geom/geom_dev.c:415).
    410     KASSERT(cp->acr || cp->acw,
    411         ("Consumer with zero access count in g_dev_strategy"));
    412
    413     bp2 = g_clone_bio(bp);
    414     KASSERT(bp2 != NULL, ("XXX: ENOMEM in a bad place"));
    415     bp2->bio_offset = (off_t)bp->bio_blkno << DEV_BSHIFT;
    416     KASSERT(bp2->bio_offset >= 0,
    417         ("Negative bio_offset (%jd) on bio %p",
    418         (intmax_t)bp2->bio_offset, bp));
    419     bp2->bio_length = (off_t)bp->bio_bcount;

Clearly, bp2 or bp is NULL at the time of the dereference.

> Why?

Programmer error. Either bp2 or bp is a NULL pointer.

> How do you fix it?

It depends on the root cause. If the root cause is that the bp is NULL, then I'd hope that it would have been caught higher up; if it wasn't, then I'd hope that g_clone_bio(bp) would have returned NULL.

Is the KASSERT() active at the time of the problem? I don't know; if it isn't, it probably should be converted to an if()...panic(). If it is, then I'd have to expect that the validity fell out from under it as a result of an interrupt, preemption, reentrancy (if the locking didn't prevent it) or SMP races (if the locking didn't prevent it).

I really can't answer it for the same reason that I couldn't locate the line in the source code that was failing for him from his posting of hex offsets into functions compiled from unknown source code: I don't have his object set for the problem in question, nor his debug kernel.

> Finding the first step doesn't solve the problem.

No. Finding the first step is *necessary* to solving the problem, but you are entirely correct in pointing out that it's not in itself *sufficient*.

But it's one step farther along than he was. I didn't see anyone else helping him take that first step, so I did.

-- Terry
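The question of whether the KASSERT() is active matters because FreeBSD compiles KASSERT checks in only when the kernel is built with the INVARIANTS option; without it the check disappears and the bad pointer is first noticed at the dereference. The following is a minimal sketch of that difference, written in Python as an analogy rather than as kernel code. The names `Panic`, `kassert`, `offset_with_kassert`, `offset_with_explicit_check` and `outcome` are hypothetical, and a dictionary stands in for the cloned bio.

```python
class Panic(Exception):
    """Stand-in for a kernel panic (hypothetical, for illustration only)."""


def kassert(condition, message, invariants):
    """Debug-only check: a no-op unless `invariants` is true."""
    if invariants and not condition:
        raise Panic(message)


def offset_with_kassert(bp2, invariants):
    kassert(bp2 is not None, "XXX: ENOMEM in a bad place", invariants)
    # With invariants off, a None bp2 gets this far and fails on use,
    # which is the analogue of the page fault at the dereference.
    return bp2["bio_offset"]


def offset_with_explicit_check(bp2):
    # The if()...panic() form: the check is always present.
    if bp2 is None:
        raise Panic("XXX: ENOMEM in a bad place")
    return bp2["bio_offset"]


def outcome(func, *args):
    """Run func and report 'ok' or the name of the exception it raised."""
    try:
        func(*args)
    except Exception as exc:
        return type(exc).__name__
    return "ok"


if __name__ == "__main__":
    print(outcome(offset_with_kassert, None, True))    # check enabled
    print(outcome(offset_with_kassert, None, False))   # check disabled
    print(outcome(offset_with_explicit_check, None))   # always checked
```

With the check disabled, the failure surfaces later and with a less specific error, which is why an unconditional check can be worth its cost on a path that is known to misbehave.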
Re: Yet another crash in FreeBSD 5.1
Greg 'groggy' Lehey wrote:
> > Please take a look at an older thread named (IIRC) "vinum or geom bug?" Greg asked for special debug output, but it never happened again for me. A real Murphy bug - it happened on three machines once a day, and after Greg's response nothing happened for weeks.
>
> This is the real issue. Until you supply the information I ask for in the man page or at http://www.vinumvm.org/vinum/how-to-debug.html, only Terry can help you.

This is BS, Greg. I deal with about a traceback every other day, and sometimes as high as 5 in a single day, if it's a busy day for it.

The information I gave him gets him to lines of source code, instead of just function names with strange hexadecimal numbers that resolve to instruction offsets that may be specific to his compile flags, date of checkout of the sources from CVS, etc.

I don't know about you, but I can't easily write assembly instructions to tape, run the tape through my teeth, and read the bits using my dental fillings.

If it's a NULL pointer dereference, the place to find it is by turning on what debugging there is, and, if that fails, which it probably will, by eyeballing the lines of source code in question and understanding the code around it well enough that you can tell *how* a pointer there could be NULL. My instructions *get* him those lines of source.

If you'll notice from his followup posting of the source in question, Vinum is loaded as a module, and it's the FreeBSD code that Vinum calls, not Vinum, that's causing the crash.

There's no reason to be paranoid about your baby with me; unlike some people, personally I like Vinum, so relax and realize that I'm not trying to blame your code by trying to help him squeeze more information out of the data he *is* able to gather.

-- Terry
Re: Yet another crash in FreeBSD 5.1
On Saturday, 2 August 2003 at 17:54:03 -0700, Terry Lambert wrote:
> Eivind Olsen wrote:
> > (kgdb) list *(launch_requests+448)
> > No symbol "launch_requests" in current context.
> > (kgdb) list *(vinumstart+2b2)
> > No symbol "vinumstart" in current context.
> > (kgdb)
> >
> > If anyone wants to take a look at this themselves I've put the compressed (gzip) debug-kernel available on http://eivind.aminor.no/debug/kernel.debug.gz
> > NOTE! It's approx. 13MB compressed!
>
> If this is repeatable for you, it's recommended that you compile Vinum statically into your kernel, so that you can look at the other symbols in the traceback and obtain source lines for them, as well.

No. It is explicitly discouraged.

> It may be that this will be debuggable without that information, but in my experience with similar problems, without a list of arguments to the functions from a live remote debug session and/or a crashdump, the problem is going to have to be found by an engineer eyeballing the call graph and seeing how that particular line could end up with a NULL in bp2 or bp.

Terry hasn't read the debug instructions. You can load symbols from klds. See the links I pointed to.

Greg
--
See complete headers for address and phone numbers
Re: Yet another crash in FreeBSD 5.1
Terry Lambert wrote:
> There's no reason to be paranoid about your baby with me; unlike some people, personally I like Vinum, so relax and realize that I'm not trying to blame your code by trying to help him squeeze more information out of the data he *is* able to gather.

To follow this up:

Sometimes you have to work with the information you have available, rather than the information you wish you had available. In an earlier post, he said that he was having problems collecting system crash dumps. So what he has is pretty much what we get to work with.

If you think that's fun, try translating a traceback that's a set of hexadecimal instruction addresses for a released product (at least you get the symbol'ed kernel to look at in gdb) from a blurry digital photograph of a computer monitor...

-- Terry
Re: Yet another crash in FreeBSD 5.1
On Saturday, 2 August 2003 at 17:56:49 -0700, Terry Lambert wrote:
> Greg 'groggy' Lehey wrote:
> > > You don't actually need a crash dump to debug a stack traceback.
> >
> > Great! So you know the answer? Please submit a patch.
> >
> > Seriously, this is nonsense. Yes, it's a null pointer dereference.
>
> What? That is precisely what doing what I suggested discovers, Greg.

Yes, that's what you said already.

> If you haven't seen his response posting:

I saw it and explained why it didn't help.

> Clearly, bp2 or bp is NULL at the time of the dereference.
>
> > Why?
>
> Programmer error. Either bp2 or bp is a NULL pointer.

You're repeating yourself.

> > How do you fix it?
>
> It depends on the root cause.

*bingo*

Here you are having found the first (obvious) step and acting as if the problem has been solved.

> I really can't answer it

OK, why don't you either:

1. Find a way to answer it, or
2. Keep quiet.

You're just confusing the issue here.

> > Finding the first step doesn't solve the problem.
>
> No. Finding the first step is *necessary* to solving the problem, but you are entirely correct in pointing out that it's not in itself *sufficient*.
>
> But it's one step farther along than he was. I didn't see anyone else helping him take that first step, so I did.

Sorry, I don't hack in the middle of the night. If you had read the documentation at your disposal, you'd have discovered a lot of help, and also that this is a known problem that crops up sporadically, and that so far we can't find out why.

Greg
--
See complete headers for address and phone numbers
Re: Yet another crash in FreeBSD 5.1
On Saturday, 2 August 2003 at 18:06:36 -0700, Terry Lambert wrote:
> Greg 'groggy' Lehey wrote:
> > > Please take a look at an older thread named (IIRC) "vinum or geom bug?" Greg asked for special debug output, but it never happened again for me. A real Murphy bug - it happened on three machines once a day, and after Greg's response nothing happened for weeks.
> >
> > This is the real issue. Until you supply the information I ask for in the man page or at http://www.vinumvm.org/vinum/how-to-debug.html, only Terry can help you.
>
> This is BS, Greg. I deal with about a traceback every other day, and sometimes as high as 5 in a single day, if it's a busy day for it.

Stack traces are pretty common stuff. Your point?

> The information I gave him gets him to lines of source code, instead of just function names with strange hexadecimal numbers that resolve to instruction offsets that may be specific to his compile flags, date of checkout of the sources from CVS, etc.

The first step of the link above does the same thing. But it's only the first step.

> I don't know about you, but I can't easily write assembly instructions to tape, run the tape through my teeth, and read the bits using my dental fillings.

Terry, why don't you come to my debug tutorial at the BSDCon next month? I'll show you how to do this properly. I'm not asking for people to interpret hex. I'm asking for people, you included, to find out what debugging help is available.

> If it's a NULL pointer dereference, the place to find it is by turning on what debugging there is, and, if that fails, which it probably will,

No, that will find the null pointer dereference pretty quickly.

> by eyeballing the lines of source code in question and understanding the code around it well enough that you can tell *how* a pointer there could be NULL. My instructions *get* him those lines of source.

You obviously still haven't read the reference. Do that first, and come back when you have either understood things or are having difficulty understanding. But don't shoot off your mouth without knowing what's going on.

> If you'll notice from his followup posting of the source in question, Vinum is loaded as a module, and it's the FreeBSD code that Vinum calls, not Vinum, that's causing the crash.

The bug is almost certainly in Vinum.

> There's no reason to be paranoid about your baby with me; unlike some people, personally I like Vinum, so relax and realize that I'm not trying to blame your code by trying to help him squeeze more information out of the data he *is* able to gather.

This has nothing to do with being paranoid about babies. This has to do with people shooting off their mouths in a public forum without bothering to check details first.

Greg
--
See complete headers for address and phone numbers
Re: Yet another crash in FreeBSD 5.1
Greg 'groggy' Lehey wrote:
> > If this is repeatable for you, it's recommended that you compile Vinum statically into your kernel, so that you can look at the other symbols in the traceback and obtain source lines for them, as well.
>
> No. It is explicitly discouraged.

It saves the dicking around with the .ko files.

> > It may be that this will be debuggable without that information, but in my experience with similar problems, without a list of arguments to the functions from a live remote debug session and/or a crashdump, the problem is going to have to be found by an engineer eyeballing the call graph and seeing how that particular line could end up with a NULL in bp2 or bp.
>
> Terry hasn't read the debug instructions. You can load symbols from klds. See the links I pointed to.

I read them. You didn't provide examples for a non-crashdump debug session. Rather than give him incorrect information, I gave him a workaround that would guarantee that what information he did obtain would, in fact, be correct.

If you would care to take over, without insisting that he be able to produce a crash dump (which he has already stated that he has had trouble doing), be my guest.

The best information I can get him, without finding some way to fix his obtaining a crashdump issue (I myself have been unable to obtain one off and on during long stretches, due to the changes in that area by PHK), is to translate his ddb traceback into source code line numbers.

-- Terry
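Translating a ddb traceback into source line numbers is mechanical enough to script. The sketch below is one possible way to do it, not a tool anyone in the thread used: it reads the `function+0xNN` tail of each ddb frame and emits the matching gdb `list *(...)` command, keeping the `0x` prefix intact. The names `FRAME_RE`, `gdb_list_commands` and `TRACE` are made up for the example; the three frames are the ones quoted in this thread. Frames that live in a kld such as vinum.ko still need that module's symbols loaded in gdb before the commands resolve.

```python
import re

# Matches the tail of a ddb "trace" line, e.g. "... at g_dev_strategy+0x29".
FRAME_RE = re.compile(r"\bat\s+([A-Za-z_]\w*)\+(0x[0-9a-fA-F]+)\s*$")


def gdb_list_commands(trace_text):
    """Turn ddb trace lines into gdb 'list' commands, keeping the 0x prefix."""
    commands = []
    for line in trace_text.splitlines():
        match = FRAME_RE.search(line.strip())
        if match:
            function, offset = match.groups()
            commands.append(f"list *({function}+{offset})")
    return commands


TRACE = """\
g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
"""

if __name__ == "__main__":
    for command in gdb_list_commands(TRACE):
        print(command)
```

The printed lines can be pasted into a `gdb -k kernel.debug` session as they are.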
Re: Yet another crash in FreeBSD 5.1
Greg 'groggy' Lehey wrote:
> > The information I gave him gets him to lines of source code, instead of just function names with strange hexadecimal numbers that resolve to instruction offsets that may be specific to his compile flags, date of checkout of the sources from CVS, etc.
>
> The first step of the link above does the same thing. But it's only the first step.

No, it does not. The first step of your debugging link does not deal with anything but having a vmcore lying around *which he does not have*.

> Terry, why don't you come to my debug tutorial at the BSDCon next month? I'll show you how to do this properly. I'm not asking for people to interpret hex. I'm asking for people, you included, to find out what debugging help is available.

I might do this; it depends on whether things die down at work by then, or not. Currently, though, I'm really busy fixing bugs exactly like this one. In the past 3 weeks, I've fixed 61 of them, which average out to 4 a day.

> > If it's a NULL pointer dereference, the place to find it is by turning on what debugging there is, and, if that fails, which it probably will,
>
> No, that will find the null pointer dereference pretty quickly.

You'd hope the entirety of the kernel were that well instrumented...

> > by eyeballing the lines of source code in question and understanding the code around it well enough that you can tell *how* a pointer there could be NULL. My instructions *get* him those lines of source.
>
> You obviously still haven't read the reference. Do that first, and come back when you have either understood things or are having difficulty understanding. But don't shoot off your mouth without knowing what's going on.

I read the reference. How does it apply in cases like this one, where you don't have a vmcore file?

> > If you'll notice from his followup posting of the source in question, Vinum is loaded as a module, and it's the FreeBSD code that Vinum calls, not Vinum, that's causing the crash.
>
> The bug is almost certainly in Vinum.

Most likely; I think that it's passing a bad argument to the inferior function. The way I would approach finding this, with only:

1) The line of code where the failure occurred
2) The stack traceback, with no arguments
3) The sources for the code in the stack traceback

would be to eyeball the code in #1, and try to figure out how I could get to that point with that pointer having a NULL value, given my a priori knowledge of the forward call graph. I would examine every intermediate conditional and function call that could affect the value of the pointer and cause it to be NULL at the point in question.

> This has nothing to do with being paranoid about babies. This has to do with people shooting off their mouths in a public forum without bothering to check details first.

It's really hard to talk to you about Vinum. One of the details I wish you would check is whether or not he has a vmcore file, or the ability to get one...

-- Terry
Re: Yet another crash in FreeBSD 5.1
On Saturday, 2 August 2003 at 18:36:24 -0700, Terry Lambert wrote:
> Greg 'groggy' Lehey wrote:
> > > The information I gave him gets him to lines of source code, instead of just function names with strange hexadecimal numbers that resolve to instruction offsets that may be specific to his compile flags, date of checkout of the sources from CVS, etc.
> >
> > The first step of the link above does the same thing. But it's only the first step.
>
> > > by eyeballing the lines of source code in question and understanding the code around it well enough that you can tell *how* a pointer there could be NULL. My instructions *get* him those lines of source.
> >
> > You obviously still haven't read the reference. Do that first, and come back when you have either understood things or are having difficulty understanding. But don't shoot off your mouth without knowing what's going on.
>
> I read the reference. How does it apply in cases like this one, where you don't have a vmcore file?

You don't seem to have read the reference very well. It also asks for other supporting information. That's the most important thing at the moment. I know that because I've been there before, and I've looked at a number of these dumps: it's almost certainly related to something he's doing which is not normal. You don't know that, and that's excusable, but it's not excusable that after four or five requests, you still haven't RTFM'd.

> The way I would approach finding this, with only:
>
> 1) The line of code where the failure occurred
> 2) The stack traceback, with no arguments
> 3) The sources for the code in the stack traceback
>
> would be to eyeball the code in #1, and try to figure out how I could get to that point with that pointer having a NULL value, given my a priori knowledge of the forward call graph.

You have that?

> I would examine every intermediate conditional and function call that could affect the value of the pointer and cause it to be NULL at the point in question.

Go for it. Once I get the log files, I'll start there.

> One of the details I wish you would check is whether or not he has a vmcore file, or the ability to get one...

We'll address that issue when it becomes necessary.

Greg
--
See complete headers for address and phone numbers
Re: Yet another crash in FreeBSD 5.1
I fear we may have gotten a bit off-topic.

E

> From: Greg 'groggy' Lehey [EMAIL PROTECTED]
> To: Terry Lambert [EMAIL PROTECTED]
> CC: [EMAIL PROTECTED], [EMAIL PROTECTED]
> Subject: Re: Yet another crash in FreeBSD 5.1
> Date: Sun, 3 Aug 2003 11:21:41 +0930
>
> [ ... ]
Re: Yet another crash in FreeBSD 5.1
On 03-Aug-2003 Greg 'groggy' Lehey wrote:
> On Saturday, 2 August 2003 at 16:47:13 +0200, Eivind Olsen wrote:
> > --On 2. august 2003 02:11 -0700 Terry Lambert [EMAIL PROTECTED] wrote:
> > > > db> trace
> > > > g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
> > > > launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
> > > > vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
> > >
> > > gdb -k kernel.debug
> > > (gdb) list *(g_dev_strategy+29)
> > > [ ... ]
> > > (gdb) list *(launch_requests+448)
> > > [ ... ]
> > > (gdb) list *(vinumstart+2b2)
> > > [ ... ]
> > >
> > > Will give you the exact source lines involved, assuming you built a debug kernel.
> >
> > I did. At least I've tried to. :) (I have a kernel.debug which was compiled at the same time as the real kernel I'm using, and it's approx. 30MB in size).
> >
> > > You don't actually need a crash dump to debug a stack traceback.
> >
> > This is what I found by using those commands you mentioned:
> >
> > [EMAIL PROTECTED]:~/tmp/debug gdb -k kernel.debug
> > GNU gdb 5.2.1 (FreeBSD)
> > Copyright 2002 Free Software Foundation, Inc.
> > GDB is free software, covered by the GNU General Public License, and you are
> > welcome to change it and/or distribute copies of it under certain conditions.
> > Type "show copying" to see the conditions.
> > There is absolutely no warranty for GDB. Type "show warranty" for details.
> > This GDB was configured as "i386-undermydesk-freebsd"...
> > (kgdb) list *(g_dev_strategy+29)
>
> This is almost certainly the wrong function. At the very least you should look at the arguments passed to it.

Actually, this line can be very instructive. Since 'bp' is valid it is probably the bp2 from g_clone_bio() that is NULL. You might want to ask phk about that one.

> > (kgdb) list *(launch_requests+448)
> > No symbol "launch_requests" in current context.
> > (kgdb) list *(vinumstart+2b2)
> > No symbol "vinumstart" in current context.
> > (kgdb)
>
> Read the links I just sent you. You haven't loaded the Vinum symbols.

Bah, this isn't hard for you to do either:

    (gdb) l *(launch_requests+0x448)
    0xad58 is in launch_requests (/usr/src/sys/dev/vinum/vinumrequest.c:448).
    443         microtime(&rqe->launchtime);    /* time we launched this request */
    444         logrq(loginfo_rqe, (union rqinfou) rqe, rq->bp);
    445     }
    446     #endif
    447     /* fire off the request */
    448     DEV_STRATEGY(&rqe->b);
    449     }
    450     }
    451     }
    452     return 0;

But you knew that. Also, Eivind, you need to use hex, not decimal offsets from the functions. You might want to redo the g_dev_strategy() line with 0x29 instead of 29.

--
John Baldwin [EMAIL PROTECTED]  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
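The hex-versus-decimal point can be checked with a little arithmetic. gdb reads a bare `29` as decimal, while ddb prints its offsets in hex, so `g_dev_strategy+29` and `g_dev_strategy+0x29` name different addresses. The sketch below works that out from the one address gdb printed for the decimal form earlier in the thread (0xc02e812d); the helper `resolve` and the constant names are invented for the example.

```python
def resolve(base, offset_text, assume_hex):
    """Return base + offset, reading offset_text as hex or as decimal."""
    return base + int(offset_text, 16 if assume_hex else 10)


# gdb reported 0xc02e812d for g_dev_strategy+29 (decimal 29), so the
# start of the function follows by subtraction.
ADDRESS_FOR_DECIMAL_29 = 0xC02E812D
BASE = ADDRESS_FOR_DECIMAL_29 - 29

listed = resolve(BASE, "29", assume_hex=False)    # what was actually listed
intended = resolve(BASE, "29", assume_hex=True)   # what ddb meant by +0x29

if __name__ == "__main__":
    print(f"function start:  {BASE:#x}")
    print(f"+29   (decimal): {listed:#x}")
    print(f"+0x29 (hex):     {intended:#x}")
    print(f"bytes apart:     {intended - listed}")
```

The two addresses are 12 bytes apart (0xc02e812d versus 0xc02e8139), which is enough to land on a neighbouring source line, so the listing around line 415 is close to the faulting statement but not guaranteed to be it.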