Re: Debugging -current SMPNG HANG on heavy disk-io
Hi,

On Tue, 19 Sep 2000 [EMAIL PROTECTED] wrote:
...
> This looks like you've hit the limit for the FFS node memory type.

BINGO!

>FFS node 262144 65536K 65536K 65536K 20244600 6 256

So the symptom is clear. But what is the cause?
Before SMPng I had the default kmem size (which is 12MB, I think).
Now I have bumped kern.vm.kmem.size 4 times to 4096 (which leads to 20k
max-mem for FFS node) and I still can't tar /usr/ports to /dev/null!
Where does the increased memory consumption come from?

BTW: What exactly is KMEM: kernel real memory or kernel virtual memory?

Thanks anyway for your answer and your efforts!

Bye!

Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message
Re: Debugging -current SMPNG HANG on heavy disk-io
On Tue, 19 Sep 2000 [EMAIL PROTECTED] wrote:
...
> This looks like you've hit the limit for the FFS node memory type.
>
> vmstat -m will indicate if this is correct.
> ...
> Increasing the kmem_map size (by setting a loader variable
> (kern.vm.kmem.size) or defining VM_KMEM_SIZE and VM_KMEM_SIZE_MAX in
> the kernel config file) should help.

I'll try. Thanks for the hint!

BTW: Is it possible to dynamically adjust the limit of the node mem?
The system shouldn't freeze anyway.

Bye!

Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS
Re: Debugging -current SMPNG HANG on heavy disk-io
> (kgdb) ps
> pid proc addr uid pri ppid pgrp flag stat comm wchan
>37 c7874a00 c96650000 32 636 004086 3 tar piperd c9663f20
>36 c7874bc0 c960a0000 32 636 004006 3 tar FFS node c02f4220

This looks like you've hit the limit for the FFS node memory type.

vmstat -m will indicate if this is correct. If you see something like

Memory statistics by type                       Type  Kern
    Type  InUse MemUse HighUse  Limit Requests Limit Limit Size(s)
[]
FFS node 262144 65536K  65536K 65536K 20244600 6 256
[]

Memory Totals:  In Use    Free    Requests
                93897K    608K     9482590

(i.e. MemUse == Limit), then you've hit the limit. The process
allocating an FFS node normally holds a vnode lock, resulting in a
cascade of vnode locks and a frozen system.

Increasing the kmem_map size (by setting a loader variable
(kern.vm.kmem.size) or defining VM_KMEM_SIZE and VM_KMEM_SIZE_MAX in
the kernel config file) should help.

- Tor Egge
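The "MemUse == Limit" test above can be checked by hand with a little arithmetic. The following is only a sketch, not part of vmstat: the helper names parse_kib and at_limit are made up for illustration, and it assumes the FFS node row reads as InUse 262144, MemUse 65536K, Limit 65536K and an allocation size of 256 bytes.

```python
def parse_kib(field: str) -> int:
    """Convert a vmstat-style size such as '65536K' to bytes."""
    if not field.endswith("K"):
        raise ValueError(f"expected a K-suffixed size, got {field!r}")
    return int(field[:-1]) * 1024


def at_limit(mem_use: str, limit: str) -> bool:
    """True when a malloc type has used up its whole limit."""
    return parse_kib(mem_use) >= parse_kib(limit)


# Values read off the "FFS node" row quoted in the message (assumed reading).
in_use = 262144      # allocations currently in use
size = 256           # bytes per allocation, from the Size(s) column
mem_use = "65536K"
limit = "65536K"

# 262144 allocations of 256 bytes each account for the whole MemUse figure.
footprint = in_use * size
print(footprint == parse_kib(mem_use))

# MemUse has reached Limit, which is the condition described in the message.
print(at_limit(mem_use, limit))
```

Both checks print True for this row: 262144 * 256 bytes is exactly 65536K, so every byte of the limit is held by in-use FFS nodes.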
Re: Debugging -current SMPNG HANG on heavy disk-io
On Sun, 17 Sep 2000, John Baldwin wrote:
...
> Hmm, could it be lockmgr() related?

How can I prove that?

Bye!

Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS
Re: Debugging -current SMPNG HANG on heavy disk-io
Greg Lehey wrote:
> On Monday, 18 September 2000 at 1:29:34 +0200, Michael Reifenberger wrote:
> > On Mon, 18 Sep 2000, Greg Lehey wrote:
> > ...
> >> Oops, that's what comes of typing hurriedly early in the morning.
> >>
> >> p ((struct proc *)gd_curproc)->p_comm
> >> p ((struct proc *)gd_curproc)->p_pid
> > Works better:
> > (kgdb) p ((struct proc *)gd_curproc)->p_comm
> > $6 = "irq1: atkbd0\000\000\000\000"
> > (kgdb) p ((struct proc *)gd_curproc)->p_pid
> > $7 = 0x10
>
> Hmm. I suppose that's reasonable, since you've just pressed a key.
>
> We obviously have a problem here, but I'm not going to be able to look
> at it myself until Friday or Saturday. Anybody else want to take a
> look? There's also the possibility that a problem I had seen and not
> investigated could in fact be the same problem: I got it tarring and
> untarring across an NFS connection.

Hmm, could it be lockmgr() related?

--
John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
Re: Debugging -current SMPNG HANG on heavy disk-io
On Monday, 18 September 2000 at 1:29:34 +0200, Michael Reifenberger wrote:
> On Mon, 18 Sep 2000, Greg Lehey wrote:
> ...
>> Oops, that's what comes of typing hurriedly early in the morning.
>>
>> p ((struct proc *)gd_curproc)->p_comm
>> p ((struct proc *)gd_curproc)->p_pid
> Works better:
> (kgdb) p ((struct proc *)gd_curproc)->p_comm
> $6 = "irq1: atkbd0\000\000\000\000"
> (kgdb) p ((struct proc *)gd_curproc)->p_pid
> $7 = 0x10

Hmm. I suppose that's reasonable, since you've just pressed a key.

We obviously have a problem here, but I'm not going to be able to look
at it myself until Friday or Saturday. Anybody else want to take a
look? There's also the possibility that a problem I had seen and not
investigated could in fact be the same problem: I got it tarring and
untarring across an NFS connection.

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers
Re: Debugging -current SMPNG HANG on heavy disk-io
On Mon, 18 Sep 2000, Greg Lehey wrote:
...
> Oops, that's what comes of typing hurriedly early in the morning.
>
> p ((struct proc *)gd_curproc)->p_comm
> p ((struct proc *)gd_curproc)->p_pid

Works better:

(kgdb) p ((struct proc *)gd_curproc)->p_comm
$6 = "irq1: atkbd0\000\000\000\000"
(kgdb) p ((struct proc *)gd_curproc)->p_pid
$7 = 0x10

Bye!

Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS
Re: Debugging -current SMPNG HANG on heavy disk-io
On Monday, 18 September 2000 at 1:23:30 +0200, Michael Reifenberger wrote:
> On Mon, 18 Sep 2000, Greg Lehey wrote:
> ...
>> You could also show the content of p->p_pid. If you don't have a p
>> pointer in the frame you're looking at, use ((struct
>> *proc)gd_curproc)->p_pid and ((struct *proc)gd_curproc)->p_comm. We
>> need to know what is hanging.
> Sorry doesn't seem to work:
> (kgdb) p p->p_comm
> No symbol "p" in current context.
> (kgdb) p ((struct*proc)gd_curproc)->p_pid
> A syntax error in expression, near `proc)gd_curproc)->p_pid'.
> (kgdb) p ((struct *proc)gd_curproc)->p_comm
> A syntax error in expression, near `proc)gd_curproc)->p_comm'.
> (kgdb) p gd_curproc
> $1 = 0xc78760c0

Oops, that's what comes of typing hurriedly early in the morning.

p ((struct proc *)gd_curproc)->p_comm
p ((struct proc *)gd_curproc)->p_pid

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers
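For readers unfamiliar with the corrected expression: it takes the untyped address held in gd_curproc, treats it as a pointer to a struct proc, and reads two fields through it. The same idea can be sketched outside the debugger. Everything here is an assumption made for illustration: the Proc class is a hypothetical two-field stand-in and not the real kernel layout, and MAXCOMLEN = 16 is simply chosen so that the name buffer has room for the string shown in the thread.

```python
import ctypes

MAXCOMLEN = 16  # assumption: length of the command-name buffer, illustration only


class Proc(ctypes.Structure):
    """Hypothetical, heavily simplified stand-in for the kernel's struct proc."""
    _fields_ = [
        ("p_pid", ctypes.c_int),
        ("p_comm", ctypes.c_char * (MAXCOMLEN + 1)),
    ]


# Build one record and keep only its raw address, much as gd_curproc is
# just a bare address in the kgdb session.
record = Proc(p_pid=0x10, p_comm=b"irq1: atkbd0")
raw_address = ctypes.addressof(record)

# Same idea as: ((struct proc *)gd_curproc)->p_comm and ->p_pid
curproc = ctypes.cast(raw_address, ctypes.POINTER(Proc))
comm = curproc.contents.p_comm.decode()
pid = curproc.contents.p_pid

print(comm, hex(pid))
```

The type name has to come before the `*` in the C cast because `struct proc *` is the pointer type; `struct *proc` is not a type at all, which is why kgdb reported a syntax error for the first attempt.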
Re: Debugging -current SMPNG HANG on heavy disk-io
On Mon, 18 Sep 2000, Greg Lehey wrote:
...
> You could also show the content of p->p_pid. If you don't have a p
> pointer in the frame you're looking at, use ((struct
> *proc)gd_curproc)->p_pid and ((struct *proc)gd_curproc)->p_comm. We
> need to know what is hanging.

Sorry doesn't seem to work:

(kgdb) p p->p_comm
No symbol "p" in current context.
(kgdb) p ((struct*proc)gd_curproc)->p_pid
A syntax error in expression, near `proc)gd_curproc)->p_pid'.
(kgdb) p ((struct *proc)gd_curproc)->p_comm
A syntax error in expression, near `proc)gd_curproc)->p_comm'.
(kgdb) p gd_curproc
$1 = 0xc78760c0

Bye!

Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS
Re: Debugging -current SMPNG HANG on heavy disk-io
On Sunday, 17 September 2000 at 16:29:41 +0200, Michael Reifenberger wrote:
> Hi,
> ...
>> The frames above are what the system went to as the result of your
>> debugger request. I'd also be interested to see the output of the
>> 'icnt' macro (if this is a UP machine) or 'icnt1' (if it's SMP), and
>> 'ps' (the macro I promised above).
> (kgdb) icnt
> 1215544*566*0 0* 0 0 1 0
> 1555964*0* 0* 0* 0 0* 22636* 11
> 1 0 0 0 0 0 441031
> imen: 6f0b
> (kgdb) ps
> pid proc addr uid pri ppid pgrp flag stat comm wchan
>37 c7874a00 c96650000 32 636 004086 3 tar piperd c9663f20
>36 c7874bc0 c960a0000 32 636 004006 3 tar FFS node c02f4220
>35 c7874d80 c96070000 32 635 004006 3 tar inode c1d2fa00
> 6 c7874f40 c96040000 32 1 6 004086 3 sh wait c7874f40
> 5 c7875100 c82950000 4 0 0 000204 3 syncer syncer c03236e8
> 4 c78752c0 c82930000 4 0 0 100204 3 bufdaemon psleep c03072f0
> 3 c7875480 c82910000 4 0 0 000204 3 vmdaemon psleep c0317a00
> 2 c7875640 c828f0000 4 0 0 100204 3 pagedaemon psleep c02f5938
>21 c7875800 c78d40000 1*0 0 000204 2 irq8: rtc
>20 c78759c0 c78d20000 1*0 0 000204 2 irq0: clk
>19 c7875b80 c78b0 7*0 0 000204 6 irq5: pcm0
>18 c7875d40 c788e0000 7*0 0 000204 6 irq7: ppc0
>17 c7875f00 c788c0000 7*0 0 000204 6 irq12: psm0
>16 c78760c0 c788a0000 7*0 0 000204 2 irq1: atkbd0
>15 c7876280 c78870000 6*0 0 000204 6 irq6: fdc0
>14 c7876440 c78850000 6*0 0 000204 6 irq15: ata1
>13 c7876600 c78830000 6*0 0 000204 2 irq14: ata0
>12 c78767c0 c78810000 4 0 0 000204 3 random rndslp c0322934
>11 c7876980 c787f0000 15*0 0 008204 6 softinterrupt
>10 c7876b40 c787d0000 4 0 0 008204 2 idle
> 1 c7876d00 c787b0000 4 0 1 004284 3 init wait c7876d00
> 0 c0322960 c03c0 4 0 0 000204 3 swapper sched c0322960
> ...
>> handler. At this point, it would be very interesting to see the value
>> of p->p_comm, which is the process name at the end of the ps listing.
>>
>>> (kgdb) proc 35
>>
>> Why are you interested in this process?
> It was one of the tars which I grabbed by hand (without your ps macro)
> ...
>
> What's next to show? :-)

To quote:

>> At this point, it would be very interesting to see the value of
>> p->p_comm, which is the process name at the end of the ps listing.

You could also show the content of p->p_pid. If you don't have a p
pointer in the frame you're looking at, use ((struct
*proc)gd_curproc)->p_pid and ((struct *proc)gd_curproc)->p_comm. We
need to know what is hanging.

I'm probably going on holiday for the rest of the week; somebody else
should pick this one up.

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers
Re: Debugging -current SMPNG HANG on heavy disk-io
Hi,

if the order of the ps macro is correct, here are the backtraces of
procs 35, 36 and 37:

Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #0: Sat Sep 16 19:32:53 CEST 2000
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/nihil
Timecounter "i8254" frequency 1193182 Hz
Timecounter "TSC" frequency 266615847 Hz
CPU: Pentium II/Pentium II Xeon/Celeron (266.62-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x652 Stepping = 2
Features=0x183f9ff
real memory = 268369920 (262080K bytes)
config> #flags wdc0 0xa0ffa0ff
Invalid command or syntax. Type `?' for help.
config> #flags wdc1 0xa0ffa0ff
Invalid command or syntax. Type `?' for help.
config> #iosiz npx0 196608
Invalid command or syntax. Type `?' for help.
config> #irq pcic0 11
Invalid command or syntax. Type `?' for help.
config> quit
avail memory = 257589248 (251552K bytes)
Preloaded elf kernel "kernel.ko" at 0xc03ad000.
Preloaded userconfig_script "/boot/kernel.conf" at 0xc03ad0ac.
Preloaded elf module "linux.ko" at 0xc03ad0fc.
Preloaded elf module "linprocfs.ko" at 0xc03ad19c.
Pentium Pro MTRR support enabled
VESA: v2.0, 2496k memory, flags:0x0, mode table:0xc031ee42 (122)
VESA: MagicGraph 256 AV 44K PRELIMINARY
npx0: on motherboard
npx0: INT 16 interface
pcib0: on motherboard
pci0: on pcib0
pci0: at 4.0 irq 11
isab0: at device 5.0 on pci0
isa0: on isab0
atapci0: port 0xfe60-0xfe6f at device 5.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0: at 5.2 irq 11
pci0: at 5.3
pci0: at 9.0 irq 11
pcic-pci0: at device 11.0 on pci0
pcic-pci1: at device 11.1 on pci0
fdc0: at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
atkbdc0: at port 0x60,0x64 on isa0
atkbd0: irq 1 on atkbdc0
psm0: irq 12 on atkbdc0
psm0: model GlidePoint, device ID 0
vga0: at port 0x3c0-0x3df iomem 0xa-0xb on isa0
sc0: on isa0
sc0: VGA <16 virtual consoles, flags=0x200>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1: configured irq 3 not in bitmap of probed irqs 0
ppc0: at port 0x378-0x37f irq 7 flags 0x40 on isa0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
lpt0: on ppbus0
lpt0: Interrupt-driven port
ppi0: on ppbus0
pps0: on ppbus0
pcic0: at port 0x3e0-0x3e1 on isa0
pcic0: Polling mode
pccard0: on pcic0
pccard1: on pcic0
unknown: can't assign resources
unknown: can't assign resources
unknown: can't assign resources
unknown: can't assign resources
unknown: can't assign resources
unknown: can't assign resources
pcm0: at port 0x220-0x233,0x530-0x537,0x388-0x38f,0x330-0x333,0x538-0x539 irq 5 drq 1,0 on isa0
IP packet filtering initialized, divert enabled, rule-based forwarding disabled, default to deny, logging limited to 100 packets/entry by default
IPsec: Initialized Security Association Processing.
ad0: 24207MB [49184/16/63] at ata0-master using UDMA33
ad1: 6194MB [13424/15/63] at ata1-master using UDMA33
Mounting root from ufs:/dev/ad0s1a
pccard: card inserted, slot 0
panic: from debugger
syncing disks... done
Uptime: 3h22m40s
dumping to dev #ad/0x20001, offset 2547840
dump ata0: resetting devices .. done
255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
---
#0 dumpsys () at /usr/src/sys/kern/kern_shutdown.c:475
475 dumppcb.pcb_cr3 = rcr3();
(kgdb) proc 37
(kgdb) bt
#0 mi_switch () at /usr/src/sys/kern/kern_synch.c:953
#1 0xc017e2f0 in msleep (ident=0xc9663f20, mtx=0x0, priority=0x110,
   wmesg=0xc028b101 "piperd", timo=0x0) at /usr/src/sys/kern/kern_synch.c:506
#2 0xc018e5bc in pipe_read (fp=0xc1258cc0, uio=0xc9666ec4, cred=0xc0ea9f80,
   flags=0x0, p=0xc7874a00) at /usr/src/sys/kern/sys_pipe.c:445
#3 0xc018d01e in dofileread (p=0xc7874a00, fp=0xc1258cc0, fd=0x0,
   buf=0x80ac000, nbyte=0x2800, offset=0x, flags=0x0)
   at /usr/src/sys/sys/file.h:141
#4 0xc018cf47 in read (p=0xc7874a00, uap=0xc9666f80)
   at /usr/src/sys/kern/sys_generic.c:110
#5 0xc0261aec in syscall2 (frame={
Re: Debugging -current SMPNG HANG on heavy disk-io
Hi,

...
> The frames above are what the system went to as the result of your
> debugger request. I'd also be interested to see the output of the
> 'icnt' macro (if this is a UP machine) or 'icnt1' (if it's SMP), and
> 'ps' (the macro I promised above).

(kgdb) icnt
1215544*566*0 0* 0 0 1 0
1555964*0* 0* 0* 0 0* 22636* 11
1 0 0 0 0 0 441031
imen: 6f0b
(kgdb) ps
pid proc addr uid pri ppid pgrp flag stat comm wchan
37 c7874a00 c96650000 32 636 004086 3 tar piperd c9663f20
36 c7874bc0 c960a0000 32 636 004006 3 tar FFS node c02f4220
35 c7874d80 c96070000 32 635 004006 3 tar inode c1d2fa00
6 c7874f40 c96040000 32 1 6 004086 3 sh wait c7874f40
5 c7875100 c82950000 4 0 0 000204 3 syncer syncer c03236e8
4 c78752c0 c82930000 4 0 0 100204 3 bufdaemon psleep c03072f0
3 c7875480 c82910000 4 0 0 000204 3 vmdaemon psleep c0317a00
2 c7875640 c828f0000 4 0 0 100204 3 pagedaemon psleep c02f5938
21 c7875800 c78d40000 1*0 0 000204 2 irq8: rtc
20 c78759c0 c78d20000 1*0 0 000204 2 irq0: clk
19 c7875b80 c78b0 7*0 0 000204 6 irq5: pcm0
18 c7875d40 c788e0000 7*0 0 000204 6 irq7: ppc0
17 c7875f00 c788c0000 7*0 0 000204 6 irq12: psm0
16 c78760c0 c788a0000 7*0 0 000204 2 irq1: atkbd0
15 c7876280 c78870000 6*0 0 000204 6 irq6: fdc0
14 c7876440 c78850000 6*0 0 000204 6 irq15: ata1
13 c7876600 c78830000 6*0 0 000204 2 irq14: ata0
12 c78767c0 c78810000 4 0 0 000204 3 random rndslp c0322934
11 c7876980 c787f0000 15*0 0 008204 6 softinterrupt
10 c7876b40 c787d0000 4 0 0 008204 2 idle
1 c7876d00 c787b0000 4 0 1 004284 3 init wait c7876d00
0 c0322960 c03c0 4 0 0 000204 3 swapper sched c0322960

...
> handler. At this point, it would be very interesting to see the value
> of p->p_comm, which is the process name at the end of the ps listing.
>
> > (kgdb) proc 35
>
> Why are you interested in this process?

It was one of the tars which I grabbed by hand (without your ps macro)
...

What's next to show? :-)

Bye!

Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS
Re: Debugging -current SMPNG HANG on heavy disk-io
On Sunday, 17 September 2000 at 0:32:01 +0200, Michael Reifenberger wrote:
> Hi,
> -current hangs reliably (as described in another mail) for me.
> In short:
> "tar cf /dev/null /usr/ports&; tar cf - /usr/ports | tar tf -"
> locks the system solid after a few minutes.
> The first tar by itself seems to need somewhat longer before it hangs.
> This is verified to occur with 2 different disks (IDE).

I've seen this on NFS a few weeks back, but I haven't followed
through. In my case, I couldn't even get to the debugger.

> Now the questions on how to debug this:
> - How do I get a backtrace of a specific process from within DDB?
> - How do I determine where the system hangs/loops from within DDB?

I can't give you answers for ddb.

> - How do I get the process-list (like ps) from within gdb (postmortem)

I have a macro which does this. I should commit some of them, but
they're in terrible shape. I'm attaching them to this message.
You'll have to modify at least .gdbinit.paths.

> Below is a first try to debug postmortem with gdb
> Does this look reasonable? Anything else to look at?
>
> #0 dumpsys () at /usr/src/sys/kern/kern_shutdown.c:475
> 475 dumppcb.pcb_cr3 = rcr3();
> (kgdb) bt
> #0 dumpsys () at /usr/src/sys/kern/kern_shutdown.c:475
> #1 0xc017aeb3 in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:316
> #2 0xc017b255 in panic (fmt=0xc02802b4 "from debugger")
> at /usr/src/sys/kern/kern_shutdown.c:568
> #3 0xc013b2c9 in db_panic (addr=-1071295444, have_addr=0, count=-1,
> modif=0xc788bd88 "") at /usr/src/sys/ddb/db_command.c:433
> #4 0xc013b269 in db_command (last_cmdp=0xc02bf5b4, cmd_table=0xc02bf414,
> aux_cmd_tablep=0xc03002fc) at /usr/src/sys/ddb/db_command.c:333
> #5 0xc013b32e in db_command_loop () at /usr/src/sys/ddb/db_command.c:455
> #6 0xc013d4eb in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_trap.c:71
> #7 0xc02551ca in kdb_trap (type=3, code=0, regs=0xc788beac)
> at /usr/src/sys/i386/i386/db_interface.c:163
> #8 0xc0260fdc in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = -1070530544,
> tf_edi = -1070468928, tf_esi = -1070491744, tf_ebp = -947339528,
> tf_isp = -947339560, tf_ebx = 582, tf_edx = -1072984320, tf_ecx = 32,
> tf_eax = 38, tf_trapno = 3, tf_err = 0, tf_eip = -1071295444, tf_cs = 8,
> tf_eflags = 70, tf_esp = -1070930945, tf_ss = -1070944311})
> at /usr/src/sys/i386/i386/trap.c:584
> #9 0xc025542c in Debugger (msg=0xc02aafc9 "manual escape to debugger")
> at machine/cpufunc.h:64

The frames above are what the system went to as the result of your
debugger request. I'd also be interested to see the output of the
'icnt' macro (if this is a UP machine) or 'icnt1' (if it's SMP), and
'ps' (the macro I promised above).

> #10 0xc0251f36 in scgetc (sc=0xc031f0c0, flags=2)
> at /usr/src/sys/dev/syscons/syscons.c:3133
> #11 0xc024ef59 in sckbdevent (thiskbd=0xc0317f60, event=0, arg=0xc031f0c0)
> at /usr/src/sys/dev/syscons/syscons.c:634
> #12 0xc0246eae in atkbd_intr (kbd=0xc0317f60, arg=0x0)
> at /usr/src/sys/dev/kbd/atkbd.c:462
> #13 0xc026c45c in atkbd_isa_intr (arg=0xc0317f60)
> at /usr/src/sys/isa/atkbd_isa.c:125

The ones above are the keyboard interrupt handler.

> #14 0xc02681a4 in ithd_loop (dummy=0x0) at /usr/src/sys/i386/isa/ithread.c:239

This is the interesting one. We appear to be looping in an interrupt
handler. At this point, it would be very interesting to see the value
of p->p_comm, which is the process name at the end of the ps listing.

> (kgdb) proc 35

Why are you interested in this process?

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers

source .gdbinit.kernel
source .gdbinit.paths
tr

# GRRR
# set remotebaud 115200
set remotebaud 9600
set remotetimeout 1
set complaints 1
set print pretty
# dir /src/ZAPHOD/src/sys/modules/vinum
# dir /src/ZAPHOD/src/sys/i386/conf
# dir /src/ZAPHOD/src/sys
dir src/sys/i386/conf
dir src/sys
file src/sys/compile/ZAPHODng/kernel.ko.debug
define asf
  set $file = linker_files.tqh_first
  set $found = 0
  while ($found == 0)
    if (*$file->filename == 'v')
      set $found = 1
    else
      set $file = $file->link.tqe_next
    end
  end
  shell /usr/bin/objdump --section-headers sys/modules/vinum/vinum.ko | grep ' .text' | awk '{print "add-symbol-file sys/modules/vinum/vinum.ko \$file->address+0x" $4}' > .asf
  source .asf
end
document asf
Find the load address of Vinum in the kernel and add the symbols at this address
end
define xi
x/10i $eip
end
define xs
x/12x $esp
end
define xb
x/12x $ebp
end
define z
ni
x/1i $eip
end
define zs
si
x/1i $eip
end
define xp
printf " esp: "
output/x $esp
echo (
output (((int)$ebp)-(int)$esp)/4-4
printf " words on stack)\n ebp: "
output/x $ebp
printf "\n eip: "
x/1i $eip
printf "Saved ebp: "
output/x *(int*)$ebp
printf " (maximum of "
output ((*(int*)$ebp)-(int)$ebp)/4-4
printf " parameters possible)\nSaved eip:
Debugging -current SMPNG HANG on heavy disk-io
Hi,

-current hangs reliably (as described in another mail) for me.
In short:
"tar cf /dev/null /usr/ports&; tar cf - /usr/ports | tar tf -"
locks the system solid after a few minutes.
The first tar by itself seems to need somewhat longer before it hangs.
This is verified to occur with 2 different disks (IDE).

Now the questions on how to debug this:
- How do I get a backtrace of a specific process from within DDB?
- How do I determine where the system hangs/loops from within DDB?
- How do I get the process-list (like ps) from within gdb (postmortem)

Below is a first try to debug postmortem with gdb
Does this look reasonable? Anything else to look at?

...
Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #0: Sat Sep 16 19:32:53 CEST 2000
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/nihil
Timecounter "i8254" frequency 1193182 Hz
Timecounter "TSC" frequency 266615847 Hz
CPU: Pentium II/Pentium II Xeon/Celeron (266.62-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x652 Stepping = 2
Features=0x183f9ff
real memory = 268369920 (262080K bytes)
config> #flags wdc0 0xa0ffa0ff
Invalid command or syntax. Type `?' for help.
config> #flags wdc1 0xa0ffa0ff
Invalid command or syntax. Type `?' for help.
config> #iosiz npx0 196608
Invalid command or syntax. Type `?' for help.
config> #irq pcic0 11
Invalid command or syntax. Type `?' for help.
config> quit
avail memory = 257589248 (251552K bytes)
Preloaded elf kernel "kernel.ko" at 0xc03ad000.
Preloaded userconfig_script "/boot/kernel.conf" at 0xc03ad0ac.
Preloaded elf module "linux.ko" at 0xc03ad0fc.
Preloaded elf module "linprocfs.ko" at 0xc03ad19c.
Pentium Pro MTRR support enabled
VESA: v2.0, 2496k memory, flags:0x0, mode table:0xc031ee42 (122)
VESA: MagicGraph 256 AV 44K PRELIMINARY
npx0: on motherboard
npx0: INT 16 interface
pcib0: on motherboard
pci0: on pcib0
pci0: at 4.0 irq 11
isab0: at device 5.0 on pci0
isa0: on isab0
atapci0: port 0xfe60-0xfe6f at device 5.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0: at 5.2 irq 11
pci0: at 5.3
pci0: at 9.0 irq 11
pcic-pci0: at device 11.0 on pci0
pcic-pci1: at device 11.1 on pci0
fdc0: at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
atkbdc0: at port 0x60,0x64 on isa0
atkbd0: irq 1 on atkbdc0
psm0: irq 12 on atkbdc0
psm0: model GlidePoint, device ID 0
vga0: at port 0x3c0-0x3df iomem 0xa-0xb on isa0
sc0: on isa0
sc0: VGA <16 virtual consoles, flags=0x200>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1: configured irq 3 not in bitmap of probed irqs 0
ppc0: at port 0x378-0x37f irq 7 flags 0x40 on isa0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
lpt0: on ppbus0
lpt0: Interrupt-driven port
ppi0: on ppbus0
pps0: on ppbus0
pcic0: at port 0x3e0-0x3e1 on isa0
pcic0: Polling mode
pccard0: on pcic0
pccard1: on pcic0
unknown: can't assign resources
unknown: can't assign resources
unknown: can't assign resources
unknown: can't assign resources
unknown: can't assign resources
unknown: can't assign resources
pcm0: at port 0x220-0x233,0x530-0x537,0x388-0x38f,0x330-0x333,0x538-0x539 irq 5 drq 1,0 on isa0
IP packet filtering initialized, divert enabled, rule-based forwarding disabled, default to deny, logging limited to 100 packets/entry by default
IPsec: Initialized Security Association Processing.
ad0: 24207MB [49184/16/63] at ata0-master using UDMA33
ad1: 6194MB [13424/15/63] at ata1-master using UDMA33
Mounting root from ufs:/dev/ad0s1a
pccard: card inserted, slot 0
panic: from debugger
syncing disks... done
Uptime: 3h22m40s
dumping to dev #ad/0x20001, offset 2547840
dump ata0: resetting devices .. done
255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
---
#0 dumpsys () at /usr/src/sys/kern/kern_shutdown.c:475
475 dumppcb.pcb_cr3 = rcr3();
(kgdb) bt
#0 dumpsys () at /usr/src/sys/kern/kern_shutdown.c:475
#1 0xc017aeb3 in boot (howto