Re: Debugging -current SMPNG HANG on heavy disk-io

2000-09-19 Thread Michael Reifenberger

Hi,
On Tue, 19 Sep 2000 [EMAIL PROTECTED] wrote:
...
> This looks like you've hit the limit for the FFS node memory type.

BINGO!

> FFS node 262144 65536K  65536K 65536K  20244600 6  256

So the symptom is clear. But the cause?
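
A quick arithmetic check on that line, as a small Python sketch. It assumes that the last column (Size(s)) is the allocation size in bytes and that the K columns are units of 1024 bytes; the variable names are only for illustration.

```python
# Numbers taken from the "FFS node" line of vmstat -m quoted above.
in_use = 262144      # InUse column: number of live allocations
size_bytes = 256     # Size(s) column, assumed to be bytes per allocation
limit_kb = 65536     # Limit column, in K

# Total memory held by this malloc type, in K.
mem_use_kb = in_use * size_bytes // 1024

print(mem_use_kb, "K in use,", limit_kb, "K allowed")
print("at limit:", mem_use_kb >= limit_kb)
```

Under those assumptions the product matches the MemUse column (65536K) and is equal to the limit, which is consistent with the diagnosis.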

With pre-SMPng kernels I had the default kmem size (which is 12MB, I think).
Now I have bumped kern.vm.kmem.size 4 times to 4096 (which leads to 20k max-mem
for FFS node) and I still can't tar /usr/ports to /dev/null!

Where does the increased memory consumption come from?!

BTW: What exactly is KMEM:
Kernel real memory?
Kernel virtual memory?

Thanks anyway for your answer and your efforts!

Bye!

Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS



To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: Debugging -current SMPNG HANG on heavy disk-io

2000-09-19 Thread Michael Reifenberger

On Tue, 19 Sep 2000 [EMAIL PROTECTED] wrote:
...
> This looks like you've hit the limit for the FFS node memory type.
> 
> vmstat -m will indicate if this is correct.
> 
...
> Increasing the kmem_map size (by setting a loader variable
> (kern.vm.kmem.size) or defining VM_KMEM_SIZE and VM_KMEM_SIZE_MAX in
> the kernel config file) should help.
I'll try. Thanks for the hint!
BTW: Is it possible to dynamically adjust the limit of the node mem?
The system shouldn't freeze anyway.

Bye!

Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS






Re: Debugging -current SMPNG HANG on heavy disk-io

2000-09-19 Thread Tor . Egge

> (kgdb) ps
>   pid     proc     addr uid pri ppid pgrp   flag stat comm         wchan
>    37 c7874a00 c9665000   0  32    6   36 004086    3 tar          piperd c9663f20
>    36 c7874bc0 c960a000   0  32    6   36 004006    3 tar          FFS node c02f4220

This looks like you've hit the limit for the FFS node memory type.

vmstat -m will indicate if this is correct.

If you see something like

  Memory statistics by type                          Type  Kern
          Type  InUse MemUse HighUse  Limit Requests Limit Limit Size(s)
[]
      FFS node 262144 65536K  65536K 65536K  20244600 6  256
[]
Memory Totals:  In Use    Free    Requests
                93897K    608K     9482590

(i.e. MemUse == Limit), then you've hit the limit.  The process
allocating a FFS node normally holds a vnode lock, resulting in 
a cascade of vnode locks and a frozen system.
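
The cascade can be pictured as a chain of "is waiting for" edges. The Python sketch below is only a toy model: the wait channels come from the ps listing quoted in this thread, but the mapping of which process would have to make progress before each channel is signalled (`released_by`) is an assumption made for illustration, not something established by the dump.

```python
# Wait channels taken from the ps listing quoted in this thread.
waiting_on = {35: "inode", 36: "FFS node", 37: "piperd"}

# ASSUMPTION (hypothetical, not established in the thread): the process that
# would have to make progress before each wait channel is signalled.
# None means no process can help, e.g. the malloc type is at its limit.
released_by = {"inode": 36, "piperd": 36, "FFS node": None}


def blocked_chain(pid):
    """Follow 'is waiting for' edges from pid until nobody is left to wake the sleeper."""
    chain = []
    while pid is not None and pid not in chain:
        chain.append(pid)
        channel = waiting_on.get(pid)
        if channel is None:
            break  # this process is not sleeping in the model
        pid = released_by.get(channel)
    return chain


for pid in sorted(waiting_on):
    print(pid, "->", blocked_chain(pid))
```

In this model every chain ends at the process sleeping on "FFS node", so nothing else can run until memory of that type is freed or the limit is raised.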

Increasing the kmem_map size (by setting a loader variable
(kern.vm.kmem.size) or defining VM_KMEM_SIZE and VM_KMEM_SIZE_MAX in
the kernel config file) should help.
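
The MemUse == Limit check could be automated roughly as follows. This is a minimal Python sketch, not part of FreeBSD: the function name `types_at_limit` and the sample text are made up for illustration, and the regular expression assumes the column layout shown above (type name, InUse, then MemUse, HighUse and Limit printed with a K suffix).

```python
import re

# Type name (may contain spaces), InUse, MemUse, HighUse, Limit.
# ASSUMPTION: the three size columns always carry a "K" suffix.
LINE_RE = re.compile(
    r"^\s*(?P<type>.+?)\s*(?P<inuse>\d+)\s+(?P<memuse>\d+)K"
    r"\s+(?P<highuse>\d+)K\s+(?P<limit>\d+)K\b"
)


def types_at_limit(text):
    """Return the malloc types whose MemUse has reached their Limit."""
    hits = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m and int(m["memuse"]) >= int(m["limit"]):
            hits.append(m["type"])
    return hits


# Sample input: the FFS node line is from this thread, the devbuf line is invented.
sample = """\
        Type  InUse MemUse HighUse  Limit Requests Limit Limit Size(s)
    FFS node 262144 65536K  65536K 65536K  20244600 6  256
      devbuf    100    50K     60K 65536K      200 0  0  16,32
"""

print(types_at_limit(sample))
```

On a live system the text would come from the output of vmstat -m instead of the hard-coded sample.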

- Tor Egge





Re: Debugging -current SMPNG HANG on heavy disk-io

2000-09-18 Thread Michael Reifenberger

On Sun, 17 Sep 2000, John Baldwin wrote:
...
> Hmm, could it be lockmgr() related?
How can I prove that?

Bye!

Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS






Re: Debugging -current SMPNG HANG on heavy disk-io

2000-09-17 Thread John Baldwin

Greg Lehey wrote:
> On Monday, 18 September 2000 at  1:29:34 +0200, Michael Reifenberger wrote:
> > On Mon, 18 Sep 2000, Greg Lehey wrote:
> > ...
> >> Oops, that's what comes of typing hurriedly early in the morning.
> >>
> >>   p ((struct proc *)gd_curproc)->p_comm
> >>   p ((struct proc *)gd_curproc)->p_pid
> > Works better:
> > (kgdb) p ((struct proc *)gd_curproc)->p_comm
> > $6 = "irq1: atkbd0\000\000\000\000"
> > (kgdb) p ((struct proc *)gd_curproc)->p_pid
> > $7 = 0x10
> 
> Hmm.  I suppose that's reasonable, since you've just pressed a key.
> 
> We obviously have a problem here, but I'm not going to be able to look
> at it myself until Friday or Saturday.  Anybody else want to take a
> look?  There's also the possibility that a problem I had seen and not
> investigated could in fact be the same problem: I got it tarring and
> untarring across an NFS connection.

Hmm, could it be lockmgr() related?

-- 

John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/





Re: Debugging -current SMPNG HANG on heavy disk-io

2000-09-17 Thread Greg Lehey

On Monday, 18 September 2000 at  1:29:34 +0200, Michael Reifenberger wrote:
> On Mon, 18 Sep 2000, Greg Lehey wrote:
> ...
>> Oops, that's what comes of typing hurriedly early in the morning.
>>
>>   p ((struct proc *)gd_curproc)->p_comm
>>   p ((struct proc *)gd_curproc)->p_pid
> Works better:
> (kgdb) p ((struct proc *)gd_curproc)->p_comm
> $6 = "irq1: atkbd0\000\000\000\000"
> (kgdb) p ((struct proc *)gd_curproc)->p_pid
> $7 = 0x10

Hmm.  I suppose that's reasonable, since you've just pressed a key.
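
The two values can be cross-checked against the ps listing posted in this thread. A small Python sketch, with the kgdb output copied in as string literals (the variable names are only for illustration):

```python
# Values as printed by kgdb above.
raw_comm = "irq1: atkbd0\000\000\000\000"   # p_comm, padded with NUL bytes
raw_pid = "0x10"                            # p_pid, printed in hex

comm = raw_comm.rstrip("\0")   # drop the NUL padding
pid = int(raw_pid, 16)         # hex to decimal

print(pid, comm)
```

This gives pid 16 with the name "irq1: atkbd0", which agrees with the row for pid 16 in the ps output.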

We obviously have a problem here, but I'm not going to be able to look
at it myself until Friday or Saturday.  Anybody else want to take a
look?  There's also the possibility that a problem I had seen and not
investigated could in fact be the same problem: I got it tarring and
untarring across an NFS connection.

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: Debugging -current SMPNG HANG on heavy disk-io

2000-09-17 Thread Michael Reifenberger

On Mon, 18 Sep 2000, Greg Lehey wrote:
...
> Oops, that's what comes of typing hurriedly early in the morning.
> 
>   p ((struct proc *)gd_curproc)->p_comm
>   p ((struct proc *)gd_curproc)->p_pid
Works better:
(kgdb) p ((struct proc *)gd_curproc)->p_comm
$6 = "irq1: atkbd0\000\000\000\000"
(kgdb) p ((struct proc *)gd_curproc)->p_pid
$7 = 0x10

Bye!

Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS






Re: Debugging -current SMPNG HANG on heavy disk-io

2000-09-17 Thread Greg Lehey

On Monday, 18 September 2000 at  1:23:30 +0200, Michael Reifenberger wrote:
> On Mon, 18 Sep 2000, Greg Lehey wrote:
> ...
>> You could also show the content of p->p_pid.  If you don't have a p
>> pointer in the frame you're looking at, use ((struct
>> *proc)gd_curproc)->p_pid and ((struct *proc)gd_curproc)->p_comm.  We
>> need to know what is hanging.
> Sorry doesn't seem to work:
> (kgdb) p p->p_comm
> No symbol "p" in current context.
> (kgdb) p ((struct*proc)gd_curproc)->p_pid
> A syntax error in expression, near `proc)gd_curproc)->p_pid'.
> (kgdb) p ((struct *proc)gd_curproc)->p_comm
> A syntax error in expression, near `proc)gd_curproc)->p_comm'.
> (kgdb) p gd_curproc
> $1 = 0xc78760c0

Oops, that's what comes of typing hurriedly early in the morning.

  p ((struct proc *)gd_curproc)->p_comm
  p ((struct proc *)gd_curproc)->p_pid

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: Debugging -current SMPNG HANG on heavy disk-io

2000-09-17 Thread Michael Reifenberger

On Mon, 18 Sep 2000, Greg Lehey wrote:
...
> You could also show the content of p->p_pid.  If you don't have a p
> pointer in the frame you're looking at, use ((struct
> *proc)gd_curproc)->p_pid and ((struct *proc)gd_curproc)->p_comm.  We
> need to know what is hanging.
Sorry, that doesn't seem to work:
(kgdb) p p->p_comm
No symbol "p" in current context.
(kgdb) p ((struct*proc)gd_curproc)->p_pid
A syntax error in expression, near `proc)gd_curproc)->p_pid'.
(kgdb) p ((struct *proc)gd_curproc)->p_comm
A syntax error in expression, near `proc)gd_curproc)->p_comm'.
(kgdb) p gd_curproc
$1 = 0xc78760c0



Bye!

Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS






Re: Debugging -current SMPNG HANG on heavy disk-io

2000-09-17 Thread Greg Lehey

On Sunday, 17 September 2000 at 16:29:41 +0200, Michael Reifenberger wrote:
> Hi,
> ...
>> The frames above are what the system went to as the result of your
>> debugger request.  I'd also be interested to see the output of the
>> 'icnt' macro (if this is UP machine) or 'icnt1' (if it's SMP), and
>> 'ps' (the macro I promised above).
> (kgdb) icnt
> 1215544*566*0   0*  0   0   1   0
> 1555964*0*  0*  0*  0   0*  22636*  11
> 1   0   0   0   0   0   441031
> imen: 6f0b
> (kgdb) ps
>   pid     proc     addr uid pri ppid pgrp   flag stat comm         wchan
>    37 c7874a00 c9665000   0  32    6   36 004086    3 tar          piperd c9663f20
>    36 c7874bc0 c960a000   0  32    6   36 004006    3 tar          FFS node c02f4220
>    35 c7874d80 c9607000   0  32    6   35 004006    3 tar          inode c1d2fa00
>     6 c7874f40 c9604000   0  32    1    6 004086    3 sh           wait c7874f40
>     5 c7875100 c8295000   0   4    0    0 000204    3 syncer       syncer c03236e8
>     4 c78752c0 c8293000   0   4    0    0 100204    3 bufdaemon    psleep c03072f0
>     3 c7875480 c8291000   0   4    0    0 000204    3 vmdaemon     psleep c0317a00
>     2 c7875640 c828f000   0   4    0    0 100204    3 pagedaemon   psleep c02f5938
>    21 c7875800 c78d4000   0  1*    0    0 000204    2 irq8: rtc
>    20 c78759c0 c78d2000   0  1*    0    0 000204    2 irq0: clk
>    19 c7875b80 c78b0         7*    0    0 000204    6 irq5: pcm0
>    18 c7875d40 c788e000   0  7*    0    0 000204    6 irq7: ppc0
>    17 c7875f00 c788c000   0  7*    0    0 000204    6 irq12: psm0
>    16 c78760c0 c788a000   0  7*    0    0 000204    2 irq1: atkbd0
>    15 c7876280 c7887000   0  6*    0    0 000204    6 irq6: fdc0
>    14 c7876440 c7885000   0  6*    0    0 000204    6 irq15: ata1
>    13 c7876600 c7883000   0  6*    0    0 000204    2 irq14: ata0
>    12 c78767c0 c7881000   0   4    0    0 000204    3 random       rndslp c0322934
>    11 c7876980 c787f000   0 15*    0    0 008204    6 softinterrupt
>    10 c7876b40 c787d000   0   4    0    0 008204    2 idle
>     1 c7876d00 c787b000   0   4    0    1 004284    3 init         wait c7876d00
>     0 c0322960 c03c0          4    0    0 000204    3 swapper      sched c0322960
> ...
>> handler.  At this point, it would be very interesting to see the value
>> of p->p_comm, which is the process name at the end of the ps listing.
>>
>>> (kgdb) proc 35
>>
>> Why are you interested in this process?
> It was one of the tar's which I grabbed by hand (without your ps macro)
> ...
>
> Whats next to show :-)

To quote:

>> At this point, it would be very interesting to see the value of
>> p->p_comm, which is the process name at the end of the ps listing.

You could also show the content of p->p_pid.  If you don't have a p
pointer in the frame you're looking at, use ((struct
*proc)gd_curproc)->p_pid and ((struct *proc)gd_curproc)->p_comm.  We
need to know what is hanging.

I'm probably going on holiday for the rest of the week; somebody else
should pick this one up.

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers





Re: Debugging -current SMPNG HANG on heavy disk-io

2000-09-17 Thread Michael Reifenberger

Hi,
If the order of the ps macro is correct, here are the backtraces of procs 35, 36 and 37:

Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #0: Sat Sep 16 19:32:53 CEST 2000
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/nihil
Timecounter "i8254"  frequency 1193182 Hz
Timecounter "TSC"  frequency 266615847 Hz
CPU: Pentium II/Pentium II Xeon/Celeron (266.62-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x652  Stepping = 2
  Features=0x183f9ff
real memory  = 268369920 (262080K bytes)
config> #flags wdc0 0xa0ffa0ff
Invalid command or syntax.  Type `?' for help.
config> #flags wdc1 0xa0ffa0ff
Invalid command or syntax.  Type `?' for help.
config> #iosiz npx0 196608
Invalid command or syntax.  Type `?' for help.
config> #irq pcic0 11
Invalid command or syntax.  Type `?' for help.
config> quit
avail memory = 257589248 (251552K bytes)
Preloaded elf kernel "kernel.ko" at 0xc03ad000.
Preloaded userconfig_script "/boot/kernel.conf" at 0xc03ad0ac.
Preloaded elf module "linux.ko" at 0xc03ad0fc.
Preloaded elf module "linprocfs.ko" at 0xc03ad19c.
Pentium Pro MTRR support enabled
VESA: v2.0, 2496k memory, flags:0x0, mode table:0xc031ee42 (122)
VESA: MagicGraph 256 AV 44K PRELIMINARY
npx0:  on motherboard
npx0: INT 16 interface
pcib0:  on motherboard
pci0:  on pcib0
pci0:  at 4.0 irq 11
isab0:  at device 5.0 on pci0
isa0:  on isab0
atapci0:  port 0xfe60-0xfe6f at device 5.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0:  at 5.2 irq 11
pci0:  at 5.3
pci0:  at 9.0 irq 11
pcic-pci0:  at device 11.0 on pci0
pcic-pci1:  at device 11.1 on pci0
fdc0:  at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
atkbdc0:  at port 0x60,0x64 on isa0
atkbd0:  irq 1 on atkbdc0
psm0:  irq 12 on atkbdc0
psm0: model GlidePoint, device ID 0
vga0:  at port 0x3c0-0x3df iomem 0xa-0xb on isa0
sc0:  on isa0
sc0: VGA <16 virtual consoles, flags=0x200>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1: configured irq 3 not in bitmap of probed irqs 0
ppc0:  at port 0x378-0x37f irq 7 flags 0x40 on isa0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
lpt0:  on ppbus0
lpt0: Interrupt-driven port
ppi0:  on ppbus0
pps0:  on ppbus0
pcic0:  at port 0x3e0-0x3e1 on isa0
pcic0: Polling mode
pccard0:  on pcic0
pccard1:  on pcic0
unknown:  can't assign resources
unknown:  can't assign resources
unknown:  can't assign resources
unknown:  can't assign resources
unknown:  can't assign resources
unknown:  can't assign resources
pcm0:  at port 0x220-0x233,0x530-0x537,0x388-0x38f,0x330-0x333,0x538-0x539 irq 5 drq 1,0 on isa0
IP packet filtering initialized, divert enabled, rule-based forwarding disabled, default to deny, logging limited to 100 packets/entry by default
IPsec: Initialized Security Association Processing.
ad0: 24207MB  [49184/16/63] at ata0-master using UDMA33
ad1: 6194MB  [13424/15/63] at ata1-master using UDMA33
Mounting root from ufs:/dev/ad0s1a
pccard: card inserted, slot 0
panic: from debugger

syncing disks... 
done
Uptime: 3h22m40s

dumping to dev #ad/0x20001, offset 2547840
dump ata0: resetting devices .. done
255 254 253 ... 2 1 0
---
#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:475
475 dumppcb.pcb_cr3 = rcr3();
(kgdb) proc 37
(kgdb) bt
#0  mi_switch () at /usr/src/sys/kern/kern_synch.c:953
#1  0xc017e2f0 in msleep (ident=0xc9663f20, mtx=0x0, priority=0x110, wmesg=0xc028b101 "piperd", timo=0x0)
    at /usr/src/sys/kern/kern_synch.c:506
#2  0xc018e5bc in pipe_read (fp=0xc1258cc0, uio=0xc9666ec4, cred=0xc0ea9f80, flags=0x0, p=0xc7874a00)
    at /usr/src/sys/kern/sys_pipe.c:445
#3  0xc018d01e in dofileread (p=0xc7874a00, fp=0xc1258cc0, fd=0x0, buf=0x80ac000, nbyte=0x2800,
    offset=0x, flags=0x0) at /usr/src/sys/sys/file.h:141
#4  0xc018cf47 in read (p=0xc7874a00, uap=0xc9666f80) at /usr/src/sys/kern/sys_generic.c:110
#5  0xc0261aec in syscall2 (frame={

Re: Debugging -current SMPNG HANG on heavy disk-io

2000-09-17 Thread Michael Reifenberger

Hi,
...
> The frames above are what the system went to as the result of your
> debugger request.  I'd also be interested to see the output of the
> 'icnt' macro (if this is UP machine) or 'icnt1' (if it's SMP), and
> 'ps' (the macro I promised above).
(kgdb) icnt
1215544*566*0   0*  0   0   1   0
1555964*0*  0*  0*  0   0*  22636*  11
1   0   0   0   0   0   441031
imen: 6f0b
(kgdb) ps
  pid     proc     addr uid pri ppid pgrp   flag stat comm         wchan
   37 c7874a00 c9665000   0  32    6   36 004086    3 tar          piperd c9663f20
   36 c7874bc0 c960a000   0  32    6   36 004006    3 tar          FFS node c02f4220
   35 c7874d80 c9607000   0  32    6   35 004006    3 tar          inode c1d2fa00
    6 c7874f40 c9604000   0  32    1    6 004086    3 sh           wait c7874f40
    5 c7875100 c8295000   0   4    0    0 000204    3 syncer       syncer c03236e8
    4 c78752c0 c8293000   0   4    0    0 100204    3 bufdaemon    psleep c03072f0
    3 c7875480 c8291000   0   4    0    0 000204    3 vmdaemon     psleep c0317a00
    2 c7875640 c828f000   0   4    0    0 100204    3 pagedaemon   psleep c02f5938
   21 c7875800 c78d4000   0  1*    0    0 000204    2 irq8: rtc
   20 c78759c0 c78d2000   0  1*    0    0 000204    2 irq0: clk
   19 c7875b80 c78b0         7*    0    0 000204    6 irq5: pcm0
   18 c7875d40 c788e000   0  7*    0    0 000204    6 irq7: ppc0
   17 c7875f00 c788c000   0  7*    0    0 000204    6 irq12: psm0
   16 c78760c0 c788a000   0  7*    0    0 000204    2 irq1: atkbd0
   15 c7876280 c7887000   0  6*    0    0 000204    6 irq6: fdc0
   14 c7876440 c7885000   0  6*    0    0 000204    6 irq15: ata1
   13 c7876600 c7883000   0  6*    0    0 000204    2 irq14: ata0
   12 c78767c0 c7881000   0   4    0    0 000204    3 random       rndslp c0322934
   11 c7876980 c787f000   0 15*    0    0 008204    6 softinterrupt
   10 c7876b40 c787d000   0   4    0    0 008204    2 idle
    1 c7876d00 c787b000   0   4    0    1 004284    3 init         wait c7876d00
    0 c0322960 c03c0          4    0    0 000204    3 swapper      sched c0322960
...
> handler.  At this point, it would be very interesting to see the value
> of p->p_comm, which is the process name at the end of the ps listing.
> 
> > (kgdb) proc 35
> 
> Why are you interested in this process?
It was one of the tar's which I grabbed by hand (without your ps macro)
...

What should I show next? :-)

Bye!

Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS






Re: Debugging -current SMPNG HANG on heavy disk-io

2000-09-16 Thread Greg Lehey

On Sunday, 17 September 2000 at  0:32:01 +0200, Michael Reifenberger wrote:
> Hi,
> -current hangs reliably (as described in another mail) for me.
> In short:
>   "tar cf /dev/null /usr/ports&; tar cf - /usr/ports | tar tf -"
>   locks the system solid after a few minutes.
>   The first tar on its own seems to need somewhat longer before it hangs.
>   This is verified to occur with 2 different disks (IDE).

I've seen this on NFS a few weeks back, but I haven't followed
through.  In my case, I couldn't even get to the debugger.

> Now the questions on how to debug this:
> - How do I get a backtrace of a specific process from within DDB?
> - How do I determine where the system hangs/loops from within DDB?

I can't give you answers for ddb.

> - How do I get the process-list (like ps) from within gdb (postmortem)

I have a macro which does this.  I should commit some of them, but
they're in terrible shape.  I'm attaching them to this message.
You'll have to modify at least .gdbinit.paths.

> Below is a first try at postmortem debugging with gdb.
> Does this look reasonable? Is there something else to look at?
> 
> #0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:475
> 475   dumppcb.pcb_cr3 = rcr3();
> (kgdb) bt
> #0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:475
> #1  0xc017aeb3 in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:316
> #2  0xc017b255 in panic (fmt=0xc02802b4 "from debugger")
> at /usr/src/sys/kern/kern_shutdown.c:568
> #3  0xc013b2c9 in db_panic (addr=-1071295444, have_addr=0, count=-1,
> modif=0xc788bd88 "") at /usr/src/sys/ddb/db_command.c:433
> #4  0xc013b269 in db_command (last_cmdp=0xc02bf5b4, cmd_table=0xc02bf414,
> aux_cmd_tablep=0xc03002fc) at /usr/src/sys/ddb/db_command.c:333
> #5  0xc013b32e in db_command_loop () at /usr/src/sys/ddb/db_command.c:455
> #6  0xc013d4eb in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_trap.c:71
> #7  0xc02551ca in kdb_trap (type=3, code=0, regs=0xc788beac)
> at /usr/src/sys/i386/i386/db_interface.c:163
> #8  0xc0260fdc in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = -1070530544,
>   tf_edi = -1070468928, tf_esi = -1070491744, tf_ebp = -947339528,
>   tf_isp = -947339560, tf_ebx = 582, tf_edx = -1072984320, tf_ecx = 32,
>   tf_eax = 38, tf_trapno = 3, tf_err = 0, tf_eip = -1071295444, tf_cs = 8,
>   tf_eflags = 70, tf_esp = -1070930945, tf_ss = -1070944311})
> at /usr/src/sys/i386/i386/trap.c:584
> #9  0xc025542c in Debugger (msg=0xc02aafc9 "manual escape to debugger")
> at machine/cpufunc.h:64

The frames above are what the system went to as the result of your
debugger request.  I'd also be interested to see the output of the
'icnt' macro (if this is a UP machine) or 'icnt1' (if it's SMP), and
'ps' (the macro I promised above).

> #10 0xc0251f36 in scgetc (sc=0xc031f0c0, flags=2)
> at /usr/src/sys/dev/syscons/syscons.c:3133
> #11 0xc024ef59 in sckbdevent (thiskbd=0xc0317f60, event=0, arg=0xc031f0c0)
> at /usr/src/sys/dev/syscons/syscons.c:634
> #12 0xc0246eae in atkbd_intr (kbd=0xc0317f60, arg=0x0)
> at /usr/src/sys/dev/kbd/atkbd.c:462
> #13 0xc026c45c in atkbd_isa_intr (arg=0xc0317f60)
> at /usr/src/sys/isa/atkbd_isa.c:125

The ones above are the keyboard interrupt handler.

> #14 0xc02681a4 in ithd_loop (dummy=0x0) at /usr/src/sys/i386/isa/ithread.c:239

This is the interesting one.  We appear to be looping in an interrupt
handler.  At this point, it would be very interesting to see the value
of p->p_comm, which is the process name at the end of the ps listing.

> (kgdb) proc 35

Why are you interested in this process?

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers


source .gdbinit.kernel
source .gdbinit.paths
tr


# GRRR
# set remotebaud 115200
set remotebaud 9600
set remotetimeout 1
set complaints 1
set print pretty
# dir /src/ZAPHOD/src/sys/modules/vinum
# dir /src/ZAPHOD/src/sys/i386/conf
# dir /src/ZAPHOD/src/sys
dir src/sys/i386/conf
dir src/sys
file src/sys/compile/ZAPHODng/kernel.ko.debug
define asf
   set $file = linker_files.tqh_first
   set $found = 0
   while ($found == 0)
     if (*$file->filename == 'v')
       set $found = 1
     else
       set $file = $file->link.tqe_next
     end
   end
   shell /usr/bin/objdump --section-headers sys/modules/vinum/vinum.ko | grep ' .text' | awk '{print "add-symbol-file sys/modules/vinum/vinum.ko \$file->address+0x" $4}' > .asf
   source .asf
end
document asf
Find the load address of Vinum in the kernel and add the symbols at this address
end


define xi
x/10i $eip
end
define xs
x/12x $esp
end
define xb
x/12x $ebp
end
define z
ni
x/1i $eip
end
define zs
si
x/1i $eip
end
define xp
printf "  esp: " 
output/x $esp
echo  (
output (((int)$ebp)-(int)$esp)/4-4
printf " words on stack)\n  ebp: " 
output/x $ebp
printf "\n  eip: " 
x/1i $eip
printf "Saved ebp: " 
output/x *(int*)$ebp
printf " (maximum of "  
output ((*(int*)$ebp)-(int)$ebp)/4-4
printf " parameters possible)\nSaved eip:

Debugging -current SMPNG HANG on heavy disk-io

2000-09-16 Thread Michael Reifenberger

Hi,
-current hangs reliably (as described in another mail) for me.
In short:
  "tar cf /dev/null /usr/ports&; tar cf - /usr/ports | tar tf -"
  locks the system solid after a few minutes.
  The first tar on its own seems to need somewhat longer before it hangs.
  This is verified to occur with 2 different disks (IDE).

Now the questions on how to debug this:
- How do I get a backtrace of a specific process from within DDB?
- How do I determine where the system hangs/loops from within DDB?
- How do I get the process list (like ps) from within gdb (postmortem)?

Below is a first try at postmortem debugging with gdb.
Does this look reasonable? Is there something else to look at?
...
Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #0: Sat Sep 16 19:32:53 CEST 2000
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/nihil
Timecounter "i8254"  frequency 1193182 Hz
Timecounter "TSC"  frequency 266615847 Hz
CPU: Pentium II/Pentium II Xeon/Celeron (266.62-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x652  Stepping = 2
  Features=0x183f9ff
real memory  = 268369920 (262080K bytes)
config> #flags wdc0 0xa0ffa0ff
Invalid command or syntax.  Type `?' for help.
config> #flags wdc1 0xa0ffa0ff
Invalid command or syntax.  Type `?' for help.
config> #iosiz npx0 196608
Invalid command or syntax.  Type `?' for help.
config> #irq pcic0 11
Invalid command or syntax.  Type `?' for help.
config> quit
avail memory = 257589248 (251552K bytes)
Preloaded elf kernel "kernel.ko" at 0xc03ad000.
Preloaded userconfig_script "/boot/kernel.conf" at 0xc03ad0ac.
Preloaded elf module "linux.ko" at 0xc03ad0fc.
Preloaded elf module "linprocfs.ko" at 0xc03ad19c.
Pentium Pro MTRR support enabled
VESA: v2.0, 2496k memory, flags:0x0, mode table:0xc031ee42 (122)
VESA: MagicGraph 256 AV 44K PRELIMINARY
npx0:  on motherboard
npx0: INT 16 interface
pcib0:  on motherboard
pci0:  on pcib0
pci0:  at 4.0 irq 11
isab0:  at device 5.0 on pci0
isa0:  on isab0
atapci0:  port 0xfe60-0xfe6f at device 5.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0:  at 5.2 irq 11
pci0:  at 5.3
pci0:  at 9.0 irq 11
pcic-pci0:  at device 11.0 on pci0
pcic-pci1:  at device 11.1 on pci0
fdc0:  at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
atkbdc0:  at port 0x60,0x64 on isa0
atkbd0:  irq 1 on atkbdc0
psm0:  irq 12 on atkbdc0
psm0: model GlidePoint, device ID 0
vga0:  at port 0x3c0-0x3df iomem 0xa-0xb on isa0
sc0:  on isa0
sc0: VGA <16 virtual consoles, flags=0x200>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1: configured irq 3 not in bitmap of probed irqs 0
ppc0:  at port 0x378-0x37f irq 7 flags 0x40 on isa0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
lpt0:  on ppbus0
lpt0: Interrupt-driven port
ppi0:  on ppbus0
pps0:  on ppbus0
pcic0:  at port 0x3e0-0x3e1 on isa0
pcic0: Polling mode
pccard0:  on pcic0
pccard1:  on pcic0
unknown:  can't assign resources
unknown:  can't assign resources
unknown:  can't assign resources
unknown:  can't assign resources
unknown:  can't assign resources
unknown:  can't assign resources
pcm0:  at port 0x220-0x233,0x530-0x537,0x388-0x38f,0x330-0x333,0x538-0x539 irq 5 drq 1,0 on isa0
IP packet filtering initialized, divert enabled, rule-based forwarding disabled, default to deny, logging limited to 100 packets/entry by default
IPsec: Initialized Security Association Processing.
ad0: 24207MB  [49184/16/63] at ata0-master using UDMA33
ad1: 6194MB  [13424/15/63] at ata1-master using UDMA33
Mounting root from ufs:/dev/ad0s1a
pccard: card inserted, slot 0
panic: from debugger

syncing disks... 
done
Uptime: 3h22m40s

dumping to dev #ad/0x20001, offset 2547840
dump ata0: resetting devices .. done
255 254 253 ... 2 1 0
---
#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:475
475 dumppcb.pcb_cr3 = rcr3();
(kgdb) bt
#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:475
#1  0xc017aeb3 in boot (howto