Re: repost of procfs crashes in -CURRENT (no html)..

2000-02-18 Thread Luoqi Chen

 Kernel: 
 ===
 FreeBSD karma.afterthought.org 4.0-CURRENT FreeBSD 4.0-CURRENT #0: Mon Feb
 14 23:00:42 GMT 2000
 [EMAIL PROTECTED]:/usr/src/sys/compile/KARMA  i386
 
 Background:
  
 3 users: one running X (me), and two running the breakwidgets
 binary testing script, which makes use of a minimized version of the
 "killall" perl script, which reads procfs. 
 
 This crash appears to be the old one where, when two processes read procfs
 simultaneously, ugly things can happen. mdillon described this in more
 depth to me once, but I've since lost the e-mail. I posted similar crash
 reports in late November / early December. He suggested having my
 programs "lock" procfs reads so only one could do its killall function at
 a time. Unfortunately, the binary testing script is very time sensitive and
 this would slow things down: my current run-through is about 48 hours,
 run in parallel on 4 machines.
 
I don't believe that's the cause.

 The kernel is a GENERIC one with ipv6, softupdates, and pcm added to it. 
 
 Crash #1:
 =
 (kgdb) bt
 #0  boot (howto=256) at ../../kern/kern_shutdown.c:304
 #1  0xc014e194 in poweroff_wait (junk=0xc02b9480, howto=-871862272) at
 ../../kern/kern_shutdown.c:554
 #2  0xc022d064 in vm_fault (map=0xc031ee28, vaddr=3423105024, fault_type=1
 '\001', fault_flags=0) at ../../vm/vm_fault.c:240
 #3  0xc02810d2 in trap_pfault (frame=0xcc136cc4, usermode=0,
 eva=3423108180) at ../../i386/i386/trap.c:788
 #4  0xc0280d37 in trap (frame={tf_fs = -871170032, tf_es = -871170032,
 tf_ds = 16, tf_edi = -871142055, tf_esi = -871142025,
   tf_ebp = -871141804, tf_isp = -871142160, tf_ebx = -872323392,
 tf_edx = 0, tf_ecx = -872323392, tf_eax = -871859336,
   tf_trapno = 12, tf_err = 0, tf_eip = -1072160861, tf_cs = 8,
 tf_eflags = 66118, tf_esp = 0, tf_ss = 0})
 at ../../i386/i386/trap.c:423
 #5  0xc0181fa3 in procfs_dostatus (curp=0xcc145e00, p=0xcc0166c0,
 pfs=0xc14abf60, uio=0xcc136eec)
 at ../../miscfs/procfs/procfs_status.c:115

The fault is taken when trying to access the target process's p_stats, which
resides in the u area. What's interesting here is that the code checks the
P_INMEM flag prior to accessing p_stats, so there shouldn't be a fault. My
guess is that this is an embryonic process whose p_stats field is inherited
from the corpse of another process and points to nowhere. Would you print out
p->p_stat (not p_stats) and check if it is 1 (SIDL)? That would confirm my
theory.
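
To make the suspected state concrete, here is a small userland sketch in C.
It is not the kernel code: struct mock_proc, struct mock_pstats and both
helper functions are hypothetical stand-ins, and the numeric values of SRUN
and P_INMEM are illustrative. Only the names p_stat, p_flag, p_stats,
P_INMEM and SIDL (value 1) come from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel definitions. Only SIDL == 1 is
 * taken from the thread; the other values are illustrative. */
#define SIDL     1        /* embryonic: process is still being created */
#define SRUN     2        /* illustrative "runnable" state */
#define P_INMEM  0x0004   /* illustrative bit: u area claimed resident */

struct mock_pstats { int placeholder; };

struct mock_proc {
    int p_stat;                   /* process state (SIDL, SRUN, ...) */
    int p_flag;                   /* P_* flag bits */
    struct mock_pstats *p_stats;  /* would live in the u area */
};

/* The check described above: P_INMEM alone decides whether p_stats is read. */
int inmem_check_only(const struct mock_proc *p)
{
    assert(p != NULL);
    return (p->p_flag & P_INMEM) != 0;
}

/* A stricter check: additionally refuse embryonic (SIDL) processes. */
int stats_safe_to_read(const struct mock_proc *p)
{
    assert(p != NULL);
    return p->p_stat != SIDL && (p->p_flag & P_INMEM) != 0;
}
```

Under these assumptions, a process in state SIDL with P_INMEM already set
passes the first check but is rejected by the second, which is the situation
the theory predicts.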

If this is indeed the case, the fix should be to delay setting the P_INMEM
flag in fork() until after the u area is allocated. It may also be a good
idea to skip embryonic processes in procfs altogether.
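
The ordering proposed above can be sketched in userland C as follows. This
is an assumption-laden mock, not a patch: mock_fork_setup() and
mock_procfs_visible() are hypothetical names, calloc() stands in for the u
area allocation, and the SRUN and P_INMEM values are illustrative.

```c
#include <assert.h>
#include <stdlib.h>

#define SIDL     1        /* embryonic state, value as given in the thread */
#define SRUN     2        /* illustrative "runnable" state */
#define P_INMEM  0x0004   /* illustrative flag bit */

struct mock_pstats { long start_sec; };

struct mock_proc {
    int p_stat;
    int p_flag;
    struct mock_pstats *p_stats;
};

/* Hypothetical fork path: P_INMEM is set only after p_stats is valid. */
int mock_fork_setup(struct mock_proc *child)
{
    assert(child != NULL);
    child->p_stat = SIDL;
    child->p_flag = 0;        /* do not advertise P_INMEM yet */
    child->p_stats = NULL;    /* drop any pointer left by a previous owner */

    /* Stand-in for allocating the u area. */
    struct mock_pstats *st = calloc(1, sizeof(*st));
    if (st == NULL)
        return -1;

    child->p_stats = st;
    child->p_flag |= P_INMEM; /* safe to advertise now */
    child->p_stat = SRUN;     /* no longer embryonic */
    return 0;
}

/* Hypothetical procfs-side filter: hide embryonic processes entirely. */
int mock_procfs_visible(const struct mock_proc *p)
{
    return p != NULL && p->p_stat != SIDL;
}
```

The two measures are independent: the first closes the window in which
P_INMEM is set while p_stats is stale, and the second keeps procfs away from
a half-built process even if some other field is not yet initialised.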

 #6  0xc0182590 in procfs_rw (ap=0xcc136ea0) at
 ../../miscfs/procfs/procfs_subr.c:277
 #7  0xc017dc0a in vn_read (fp=0xc14431c0, uio=0xcc136eec, cred=0xc1450700,
 flags=0, p=0xcc145e00) at vnode_if.h:334
 #8  0xc015ac50 in dofileread (p=0xcc145e00, fp=0xc14431c0, fd=6,
 buf=0x8235000, nbyte=4096, offset=-1, flags=0)
 at ../../sys/file.h:140
 #9  0xc015ab57 in read (p=0xcc145e00, uap=0xcc136f80) at
 ../../kern/sys_generic.c:111
 #10 0xc028167e in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
 tf_edi = -1077946820, tf_esi = 672915688,
   tf_ebp = -1077946996, tf_isp = -871141420, tf_ebx = 672858084,
 tf_edx = 672809512, tf_ecx = 136531968, tf_eax = 3,
   tf_trapno = 0, tf_err = 2, tf_eip = 672818732, tf_cs = 31, tf_eflags
 = 659, tf_esp = -1077947040, tf_ss = 47})
 at ../../i386/i386/trap.c:1055
 
 
 
 Crash #2:
 =
 #0  boot (howto=256) at ../../kern/kern_shutdown.c:304
 #1  0xc014e194 in poweroff_wait (junk=0xc02b9480, howto=-873472000) at
 ../../kern/kern_shutdown.c:554
 #2  0xc022d064 in vm_fault (map=0xc031ee28, vaddr=3421495296, fault_type=1
 '\001', fault_flags=0) at ../../vm/vm_fault.c:240
 #3  0xc02810d2 in trap_pfault (frame=0xcbe0ccc4, usermode=0,
 eva=3421498452) at ../../i386/i386/trap.c:788
 #4  0xc0280d37 in trap (frame={tf_fs = -874512368, tf_es = -874512368,
 tf_ds = 16, tf_edi = -874459817, tf_esi = -874459788,
   tf_ebp = -874459564, tf_isp = -874459920, tf_ebx = -873997056,
 tf_edx = 0, tf_ecx = -873997056, tf_eax = -873469064,
   tf_trapno = 12, tf_err = 0, tf_eip = -1072160861, tf_cs = 8,
 tf_eflags = 66118, tf_esp = 0, tf_ss = 0})
 at ../../i386/i386/trap.c:423
 #5  0xc0181fa3 in procfs_dostatus (curp=0xcbd7df20, p=0xcbe7dd00,
 pfs=0xc154ac20, uio=0xcbe0ceec)
 at ../../miscfs/procfs/procfs_status.c:115
 #6  0xc0182590 in procfs_rw (ap=0xcbe0cea0) at
 ../../miscfs/procfs/procfs_subr.c:277
 #7  0xc017dc0a in vn_read (fp=0xc1469200, uio=0xcbe0ceec, cred=0xc153d180,
 flags=0, p=0xcbd7df20) at vnode_if.h:334
 #8  0xc015ac50 in dofileread (p=0xcbd7df20, fp=0xc1469200, fd=5,
 buf=0x8253000, nbyte=4096, offset=-1, flags=0)
 at ../../sys/file.h:140
 #9  0xc015ab57 in read (p=0xcbd7df20, uap=0xcbe0cf80) at
 ../../kern/sys_generic.c:111
 #10 0xc028167e in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
 tf_edi = -1077945828, tf_esi = 

repost of procfs crashes in -CURRENT (no html)..

2000-02-18 Thread Thomas Stromberg

Sorry about the html posting, it seems that Mozilla M13 decided to rape my
message. I hate html postings just as much as you do (thank god for
procmail filters), and will send this one using pine so Mozilla doesn't
try to rethink my e-mail for me. 

Kernel: 
===
FreeBSD karma.afterthought.org 4.0-CURRENT FreeBSD 4.0-CURRENT #0: Mon Feb
14 23:00:42 GMT 2000
[EMAIL PROTECTED]:/usr/src/sys/compile/KARMA  i386

Background:
 
3 users: one running X (me), and two running the breakwidgets
binary testing script, which makes use of a minimized version of the
"killall" perl script, which reads procfs. 

This crash appears to be the old one where, when two processes read procfs
simultaneously, ugly things can happen. mdillon described this in more
depth to me once, but I've since lost the e-mail. I posted similar crash
reports in late November / early December. He suggested having my
programs "lock" procfs reads so only one could do its killall function at
a time. Unfortunately, the binary testing script is very time sensitive and
this would slow things down: my current run-through is about 48 hours,
run in parallel on 4 machines.
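
For reference, the kind of "lock" suggested above could look roughly like
the following C sketch, which serialises a scan behind an advisory flock(2)
lock on a file. The names with_procfs_lock() and count_calls(), and the
choice of lock file, are hypothetical; the real script is Perl, whose
built-in flock offers the same primitive.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Run fn(arg) while holding an exclusive advisory lock on lockpath, so that
 * cooperating processes scan one at a time. Returns fn's result, or -1 if
 * the lock file could not be opened or locked.
 */
int with_procfs_lock(const char *lockpath, int (*fn)(void *), void *arg)
{
    assert(lockpath != NULL && fn != NULL);

    int fd = open(lockpath, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    if (flock(fd, LOCK_EX) != 0) {   /* blocks until the lock is free */
        close(fd);
        return -1;
    }

    int rc = fn(arg);                /* the procfs scan would go here */

    flock(fd, LOCK_UN);
    close(fd);
    return rc;
}

/* Example callback standing in for the killall scan: counts invocations. */
int count_calls(void *arg)
{
    ++*(int *)arg;
    return 0;
}
```

Every scan then waits for the previous one to finish, which is exactly the
slowdown described above.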

The kernel is a GENERIC one with ipv6, softupdates, and pcm added to it. 

Crash #1:
=
(kgdb) bt
#0  boot (howto=256) at ../../kern/kern_shutdown.c:304
#1  0xc014e194 in poweroff_wait (junk=0xc02b9480, howto=-871862272) at
../../kern/kern_shutdown.c:554
#2  0xc022d064 in vm_fault (map=0xc031ee28, vaddr=3423105024, fault_type=1
'\001', fault_flags=0) at ../../vm/vm_fault.c:240
#3  0xc02810d2 in trap_pfault (frame=0xcc136cc4, usermode=0,
eva=3423108180) at ../../i386/i386/trap.c:788
#4  0xc0280d37 in trap (frame={tf_fs = -871170032, tf_es = -871170032,
tf_ds = 16, tf_edi = -871142055, tf_esi = -871142025,
  tf_ebp = -871141804, tf_isp = -871142160, tf_ebx = -872323392,
tf_edx = 0, tf_ecx = -872323392, tf_eax = -871859336,
  tf_trapno = 12, tf_err = 0, tf_eip = -1072160861, tf_cs = 8,
tf_eflags = 66118, tf_esp = 0, tf_ss = 0})
at ../../i386/i386/trap.c:423
#5  0xc0181fa3 in procfs_dostatus (curp=0xcc145e00, p=0xcc0166c0,
pfs=0xc14abf60, uio=0xcc136eec)
at ../../miscfs/procfs/procfs_status.c:115
#6  0xc0182590 in procfs_rw (ap=0xcc136ea0) at
../../miscfs/procfs/procfs_subr.c:277
#7  0xc017dc0a in vn_read (fp=0xc14431c0, uio=0xcc136eec, cred=0xc1450700,
flags=0, p=0xcc145e00) at vnode_if.h:334
#8  0xc015ac50 in dofileread (p=0xcc145e00, fp=0xc14431c0, fd=6,
buf=0x8235000, nbyte=4096, offset=-1, flags=0)
at ../../sys/file.h:140
#9  0xc015ab57 in read (p=0xcc145e00, uap=0xcc136f80) at
../../kern/sys_generic.c:111
#10 0xc028167e in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
tf_edi = -1077946820, tf_esi = 672915688,
  tf_ebp = -1077946996, tf_isp = -871141420, tf_ebx = 672858084,
tf_edx = 672809512, tf_ecx = 136531968, tf_eax = 3,
  tf_trapno = 0, tf_err = 2, tf_eip = 672818732, tf_cs = 31, tf_eflags
= 659, tf_esp = -1077947040, tf_ss = 47})
at ../../i386/i386/trap.c:1055



Crash #2:
=
#0  boot (howto=256) at ../../kern/kern_shutdown.c:304
#1  0xc014e194 in poweroff_wait (junk=0xc02b9480, howto=-873472000) at
../../kern/kern_shutdown.c:554
#2  0xc022d064 in vm_fault (map=0xc031ee28, vaddr=3421495296, fault_type=1
'\001', fault_flags=0) at ../../vm/vm_fault.c:240
#3  0xc02810d2 in trap_pfault (frame=0xcbe0ccc4, usermode=0,
eva=3421498452) at ../../i386/i386/trap.c:788
#4  0xc0280d37 in trap (frame={tf_fs = -874512368, tf_es = -874512368,
tf_ds = 16, tf_edi = -874459817, tf_esi = -874459788,
  tf_ebp = -874459564, tf_isp = -874459920, tf_ebx = -873997056,
tf_edx = 0, tf_ecx = -873997056, tf_eax = -873469064,
  tf_trapno = 12, tf_err = 0, tf_eip = -1072160861, tf_cs = 8,
tf_eflags = 66118, tf_esp = 0, tf_ss = 0})
at ../../i386/i386/trap.c:423
#5  0xc0181fa3 in procfs_dostatus (curp=0xcbd7df20, p=0xcbe7dd00,
pfs=0xc154ac20, uio=0xcbe0ceec)
at ../../miscfs/procfs/procfs_status.c:115
#6  0xc0182590 in procfs_rw (ap=0xcbe0cea0) at
../../miscfs/procfs/procfs_subr.c:277
#7  0xc017dc0a in vn_read (fp=0xc1469200, uio=0xcbe0ceec, cred=0xc153d180,
flags=0, p=0xcbd7df20) at vnode_if.h:334
#8  0xc015ac50 in dofileread (p=0xcbd7df20, fp=0xc1469200, fd=5,
buf=0x8253000, nbyte=4096, offset=-1, flags=0)
at ../../sys/file.h:140
#9  0xc015ab57 in read (p=0xcbd7df20, uap=0xcbe0cf80) at
../../kern/sys_generic.c:111
#10 0xc028167e in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
tf_edi = -1077945828, tf_esi = 136638564,
  tf_ebp = -1077946004, tf_isp = -874459180, tf_ebx = 672858084,
tf_edx = 672809512, tf_ecx = 136654848, tf_eax = 3,
  tf_trapno = 0, tf_err = 2, tf_eip = 672818732, tf_cs = 31, tf_eflags
= 663, tf_esp = -1077946048, tf_ss = 47})
at ../../i386/i386/trap.c:1055
#11 0xc0276646 in Xint0x80_syscall ()




To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: repost of procfs crashes in -CURRENT (no html)..

2000-02-18 Thread Poul-Henning Kamp


The real solution is to make killall(1)'s functionality part of kill(1)
and avoid reading /proc, so that we don't even have to mount /proc.
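
As an aside, a test harness can already avoid /proc today by a different
route than the kill(1) change proposed above: put the children it starts
into their own process group and signal the group. The C sketch below
illustrates that swapped-in technique; spawn_in_new_group() and
kill_group_and_reap() are hypothetical helper names.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that leads its own process group. Returns the group id
 * (equal to the child's pid) in the parent, or -1 on failure. */
pid_t spawn_in_new_group(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        setpgid(0, 0);          /* child: become group leader */
        for (;;)
            pause();            /* stand-in for the real workload */
    }
    setpgid(pid, pid);          /* parent: same call, closes the race */
    return pid;
}

/* Signal every process in the group, then reap the leader. Returns the
 * signal that terminated the leader, or -1 on error. */
int kill_group_and_reap(pid_t pgid, int sig)
{
    int status;

    assert(pgid > 0);
    if (kill(-pgid, sig) != 0)  /* negative pid addresses the whole group */
        return -1;
    if (waitpid(pgid, &status, 0) != pgid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

This only covers processes the harness started itself; killing by name, as
killall does, still needs some way to enumerate processes.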

Poul-Henning

In message [EMAIL PROTECTED], 
Thomas Stromberg writes:

3 users: one running X (me), and two running the breakwidgets
binary testing script, which makes use of a minimized version of the
"killall" perl script, which reads procfs. 

This crash appears to be the old one where, when two processes read procfs
simultaneously, ugly things can happen.

--
Poul-Henning Kamp FreeBSD coreteam member
[EMAIL PROTECTED]   "Real hackers run -current on their laptop."
FreeBSD -- It will take a long time before progress goes too far!

