Re: HEADS UP/STATUS: network locking

2003-09-17 Thread Sam Leffler
> On Tue, Sep 16, 2003 at 09:29:07AM -0700, Sam Leffler wrote:
>> 
>> Please send me your kernel config and tell me again exactly what fails.
>> I will try to reproduce your problem.
>> 
>>  Sam
> After your yesterday/today commits, I got a panic while doing netstat -an.
> On the kernel from about two days ago, with manually added patches, the
> netstat command renders the system unusable (with the netstat process in LOCK
> state, or, in other cases, the (swi8: tty:sio clock) process in LOCK state).

You cannot mix+match the commits and the patches.  You also, if I recall,
were only applying some of the patches and not all of them.  I'm not sure
this can work.  I haven't looked at the config you sent me; will try today.
I think that unless you can track my changes through p4 it may be
problematic using the patches.  I'll see about updating the patches based
on the current CVS.

Sam

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: HEADS UP/STATUS: network locking

2003-09-17 Thread Wiktor Niesiobedzki
On Tue, Sep 16, 2003 at 09:29:07AM -0700, Sam Leffler wrote:
> 
> Please send me your kernel config and tell me again exactly what fails.  I
> will try to reproduce your problem.
> 
>   Sam
After your yesterday/today commits, I got a panic while doing netstat -an. On
the kernel from about two days ago, with manually added patches, the netstat
command renders the system unusable (with the netstat process in LOCK state, or,
in other cases, the (swi8: tty:sio clock) process in LOCK state). The system has:
dc0: <3Com OfficeConnect 10/100B> port 0xe400-0xe4ff mem 0xe900-0xe90003ff
irq 10 at device 18.0 on pci0
rl0:  port 0xe800-0xe8ff mem 0xe9001000-0xe90010ff
irq 12 at device 19.0 on pci0

It acts as a home router to my DSL line (over PPPoE).

If there's any other information I may provide, please let me know.

Kernel config attached

Cheers,

Wiktor Niesiobędzki

panic: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x24
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc018a11b
stack pointer           = 0x10:0xcebaeae4
frame pointer           = 0x10:0xcebaeaf8
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 2914 (sshd)
trap number             = 12
panic: page fault

syncing disks, buffers remaining... 2236 2236

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x24
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc018a11b
stack pointer           = 0x10:0xcd751c88
frame pointer           = 0x10:0xcd751c9c
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 23 (irq12: rl0)
trap number             = 12
panic: page fault
Uptime: 1h59m32s
Dumping 256 MB
 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240

(kgdb) bt
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
#1  0xc0194ef0 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:372
#2  0xc01952d8 in panic () at /usr/src/sys/kern/kern_shutdown.c:550
#3  0xc02a9e56 in trap_fatal (frame=0xcd751c48, eva=0) at /usr/src/sys/i386/i386/trap.c:818
#4  0xc02a9493 in trap (frame=
  {tf_fs = -1072037864, tf_es = 16, tf_ds = -847970288, tf_edi = 4, tf_esi = 16,
   tf_ebp = -847962980, tf_isp = -847963020, tf_ebx = 0, tf_edx = -1070828335,
   tf_ecx = -1030343792, tf_eax = 16, tf_trapno = 12, tf_err = 0, tf_eip = -1072127717,
   tf_cs = 8, tf_eflags = 66195, tf_esp = 1242790725, tf_ss = 66572650})
  at /usr/src/sys/i386/i386/trap.c:251
#5  0xc02997a8 in calltrap () at {standard input}:102
#6  0xc018a559 in _mtx_lock_sleep (m=0x10, opts=0, file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:635
#7  0xc017f014 in ithread_loop (arg=0xc0eac600) at /usr/src/sys/kern/kern_intr.c:533
#8  0xc017dcc1 in fork_exit (callout=0xc017ee50 , arg=0x0, frame=0x0) at /usr/src/sys/kern/kern_fork.c:796

(kgdb) fr 6
#6  0xc018a559 in _mtx_lock_sleep (m=0x10, opts=0, file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:635
635 propagate_priority(td);
(kgdb) l 635
630  * Save who we're blocked on.
631  */
632 td->td_blocked = m;
633 td->td_lockname = m->mtx_object.lo_name;
634 TD_SET_LOCK(td);
635 propagate_priority(td);
636
637 if (LOCK_LOG_TEST(&m->mtx_object, opts))
638 CTR3(KTR_LOCK,
639 "_mtx_lock_sleep: p %p blocked on [%p] %s", td, m,
(kgdb) fr 4
#4  0xc02a9493 in trap (frame=
  {tf_fs = -1072037864, tf_es = 16, tf_ds = -847970288, tf_edi = 4, tf_esi = 16,
   tf_ebp = -847962980, tf_isp = -847963020, tf_ebx = 0, tf_edx = -1070828335,
   tf_ecx = -1030343792, tf_eax = 16, tf_trapno = 12, tf_err = 0, tf_eip = -1072127717,
   tf_cs = 8, tf_eflags = 66195, tf_esp = 1242790725, tf_ss = 66572650})
  at /usr/src/sys/i386/i386/trap.c:251
251 trap_fatal(&frame, eva);
(kgdb) p/x frame.tf_eip
$1 = 0xc018a11b
(kgdb) disass 0xc018a11b
Dump of assembler code for function propagate_priority:
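0xc018a090 <propagate_priority+0>:      push   %ebp
0xc018a091 <propagate_priority+1>:      mov    %esp,%ebp
0xc018a093 <propagate_priority+3>:      push   %edi
0xc018a094 <propagate_priority+4>:      push   %esi
0xc018a095 <propagate_priority+5>:      push   %ebx
0xc018a096 <propagate_priority+6>:      sub    $0x8,%esp
0xc018a099 <propagate_priority+9>:      mov    0x8(%ebp),%ecx
0xc018a09c <propagate_priority+12>:     movzbl 0xdd(%ecx),%esi
0xc018a0a3 <propagate_priority+19>:     mov    0x5c(%ecx),%ebx
0xc018a0a6 <propagate_priority+22>:     lea    0x0(%esi),%esi
0xc018a0a9 <propagate_priority+25>:     lea    0x0(%edi,1),%edi
0xc018a0b0 <propagate_priority+32>:     mov    0x1c(%ebx),%eax
0xc018a0b3 <propagate_priority+35>:     mov    $0x0,%ecx
0xc018a0b8 <propagate_priority+40>:     cmp    $0x4,%eax
0xc018a0bb <propagate_priority+43>:     je     0xc018a0c5 <propagate_priority+53>
0xc018a0bd <propagate_priority+45>:     mov    0x1c(%ebx),%eax
0xc018a0c0 <propagate_priority+48>:     mov    %eax,%ecx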
0xc
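
The trap frame prints its registers as signed decimal integers, while the
panic message shows the same values as hexadecimal addresses. A minimal
sketch of the conversion in plain user-space C follows. The helper names
tf_to_addr and func_offset are made up for this illustration; they are not
part of kgdb or the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reinterpret a 32-bit trap frame register, printed by kgdb as a signed
 * int, as an unsigned 32-bit address.  The conversion is well defined in
 * C: the value is reduced modulo 2^32.
 */
static uint32_t
tf_to_addr(int32_t reg)
{
	return (uint32_t)reg;
}

/*
 * Byte offset of an address from the start of the function that
 * contains it.  func_start is the first address in the disassembly.
 */
static uint32_t
func_offset(uint32_t addr, uint32_t func_start)
{
	assert(addr >= func_start);
	return addr - func_start;
}
```

With the values from the trace above, tf_to_addr(-1072127717) yields
0xc018a11b, matching the p/x output and the instruction pointer in the panic
message, and tf_to_addr(-847962980) yields 0xcd751c9c, the reported frame
pointer. func_offset(0xc018a11b, 0xc018a090) is 0x8b (139), so the faulting
instruction lies 139 bytes into propagate_priority, past the point where the
listing above is cut off.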

HEADS UP/STATUS: network locking

2003-09-05 Thread Sam Leffler
I've committed a number of changes to lock the "middleware" parts of the
network subsystem.  There's still more to come; I'm moving slowly to ensure
each batch gets exposure.  All the pending changes can be found at:

http://www.freebsd.org/~sam

The major changes that will go in next week are: bridge, dummynet, ipfw,
multicast routing (mroute), and the routing table (rtentry).  I've been
running with all these mods on a variety of machines (desktop, NFS server,
laptop, firewall) for weeks, but testing everything is difficult, so don't be
surprised if you encounter issues like lock order reversals.  Each patch is
pretty much independent, so if you regularly use, say, dummynet, then it would
be useful to try the patch and send me feedback.
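
For readers who have not met the term: a lock order reversal is reported
when two locks are taken in one order on one code path and in the opposite
order on another, which can deadlock if both paths run at once. The sketch
below is a toy, single-threaded order checker in user-space C, loosely in
the spirit of what the kernel's WITNESS checks report. It is not WITNESS
code; MAXLOCKS, order_acquire, and order_release are names invented for
this illustration, and locks are identified by small integers.

```c
#include <assert.h>
#include <stdbool.h>

#define MAXLOCKS 8	/* hypothetical limit for this sketch */

/* seen_before[a][b] is true once lock a was held while acquiring b. */
static bool seen_before[MAXLOCKS][MAXLOCKS];
static int held[MAXLOCKS];	/* ids of locks held, in acquisition order */
static int nheld;

/*
 * Record the acquisition of lock id.  Returns true if this acquisition
 * reverses an order that was recorded earlier.
 */
static bool
order_acquire(int id)
{
	bool reversal = false;

	assert(id >= 0 && id < MAXLOCKS && nheld < MAXLOCKS);
	for (int i = 0; i < nheld; i++) {
		/* Earlier, id was held while held[i] was acquired. */
		if (seen_before[id][held[i]])
			reversal = true;
		seen_before[held[i]][id] = true;
	}
	held[nheld++] = id;
	return reversal;
}

/* Release the most recently acquired lock (LIFO release is assumed). */
static void
order_release(void)
{
	assert(nheld > 0);
	nheld--;
}
```

If one path acquires lock 0 and then lock 1, releases both, and a later path
acquires lock 1 and then lock 0, the final order_acquire(0) returns true.
Neither path has deadlocked, but the two orders are inconsistent, which is
the kind of report to send along with the patch name.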

In the above directory you'll also find the first tangible benefit of this
work: netisr.patch contains changes to "push Giant up" one level.  Note
however that unless the network drivers mark their interrupt handlers
MPSAFE you're not really going to exercise the locking.  I've been running
em, sis, fxp, wi, and ath drivers this way for several months with no ill
effects (except for a problem running Atheros hardware in HostAP mode).
Many other drivers are locked and appear ready to run MPSAFE.
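
The role of the MPSAFE marking can be illustrated with a small user-space
model. This is a sketch under stated assumptions, not kernel code: H_MPSAFE,
struct handler, dispatch, struct softc, and rx_intr are hypothetical names,
and pthread mutexes stand in for kernel mutexes. It assumes the dispatcher
takes the global Giant lock around any handler that has not declared itself
MPSAFE, while an MPSAFE handler runs without Giant and relies on its own
lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define H_MPSAFE 0x1	/* hypothetical flag, modeled on INTR_MPSAFE */

struct handler {
	void (*fn)(void *);	/* interrupt handler routine */
	void *arg;		/* per-device state passed to fn */
	int flags;		/* H_MPSAFE if fn does its own locking */
};

/* Per-device state with its own lock, as a locked driver would have. */
struct softc {
	pthread_mutex_t mtx;
	int packets;
};

static pthread_mutex_t giant = PTHREAD_MUTEX_INITIALIZER;
static int giant_acquisitions;	/* counts how often Giant was needed */

/* Run one handler, wrapping it in Giant unless it is marked MPSAFE. */
static void
dispatch(const struct handler *h)
{
	bool need_giant;

	assert(h != NULL && h->fn != NULL);
	need_giant = (h->flags & H_MPSAFE) == 0;
	if (need_giant) {
		pthread_mutex_lock(&giant);
		giant_acquisitions++;	/* protected by giant */
	}
	h->fn(h->arg);
	if (need_giant)
		pthread_mutex_unlock(&giant);
}

/* Example handler: protects its own state, so it may be marked MPSAFE. */
static void
rx_intr(void *arg)
{
	struct softc *sc = arg;

	pthread_mutex_lock(&sc->mtx);
	sc->packets++;
	pthread_mutex_unlock(&sc->mtx);
}
```

Dispatching rx_intr once with flags 0 and twice with H_MPSAFE counts three
packets but acquires giant only once. In this model, a handler that is not
marked MPSAFE still serializes on the global lock, so the finer-grained
locking underneath is never really exercised.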

Shortly I'll have Giant pushed all the way up through the INET protocols.
When that happens it'll be time to remove Giant from the socket layer and
lock IPv6 and UNIX domain sockets.  At some point we'll need to switch over
to a non-Giant top half; at that point drivers and protocols that are not
properly locked will need help or be left behind.

Note that the current plan is to NOT commit any changes to remove Giant
from the socket layer until after 5.2.  Folks interested in trying this
stuff will need to track the work in perforce or apply patches that I'll
make available at "stable points".

If folks want to talk about this work at BSDCon I'll be around Wed-Fri.
I'll also be at the developers summit on Saturday.

Sam
