SunOS nat04 5.9 Generic_117172-12 i86pc i386 i86pc
pfil-2.1.4
ip_fil4.3next
First panic:
Dec 22 16:07:21 nat04 ^Mpanic[cpu1]/thread=f1082e40:
Dec 22 16:07:21 nat04 unix: [ID 683369 kern.notice] BAD TRAP: type=e (Page fault (#pf)) rp=f1082d6c addr=ddaae21a
Dec 22 16:07:21 nat04 unix: [ID 100000 kern.notice]
Dec 22 16:07:21 nat04 unix: [ID 839527 kern.notice] sched:
Dec 22 16:07:21 nat04 unix: [ID 702911 kern.notice] Page fault (#pf)
Dec 22 16:07:21 nat04 unix: [ID 532287 kern.notice] Bad kernel fault at addr=0xddaae21a
Dec 22 16:07:21 nat04 unix: [ID 764221 kern.notice] pid=0, pc=0xfe9dc212, sp=0xf1082e40, eflags=0x10282
Dec 22 16:07:21 nat04 unix: [ID 827398 kern.notice]
Dec 22 16:07:21 nat04 eip(fe9dc212), eflags(10282), ebp(f1082db8), uesp(f1082e40), esp(f1082d9c)
Dec 22 16:07:21 nat04 unix: [ID 358493 kern.notice] eax(ddaae21a), ebx(0), ecx(fe9df326), edx(f1082e40), esi(fed89c5c), edi(ddaae21a)
Dec 22 16:07:21 nat04 unix: [ID 788268 kern.notice] cr0(8005003b<pg,wp,ne,et,ts,mp,pe>)
Dec 22 16:07:21 nat04 unix: [ID 874142 kern.notice] cr2(ddaae21a), cr3(48f4000)
Dec 22 16:07:21 nat04 unix: [ID 804652 kern.notice] cr4(6d0<xmme,fxsr,pge,mce,pse>)
Dec 22 16:07:21 nat04 unix: [ID 720544 kern.notice] cs(158) ds(160) ss(a5e0) es(160) fs(1a8) gs(1b0)
Dec 22 16:07:21 nat04 unix: [ID 100000 kern.notice]
Dec 22 16:07:21 nat04 genunix: [ID 353471 kern.notice] f1082ce0 unix:die+ac (e, f1082d6c, ddaae21a, 1, ddaae21a)
Dec 22 16:07:21 nat04 genunix: [ID 353471 kern.notice] f1082d58 unix:trap+c36 (f1082d6c, ddaae21a, 1, 1b0)
Dec 22 16:07:21 nat04 genunix: [ID 353471 kern.notice] f1082d6c unix:_cmntrap+94 (160, f1080160, ddaae21a)
Dec 22 16:07:21 nat04 genunix: [ID 353471 kern.notice] f1082db8 arp:ar_ce_walk+1a ()
Dec 22 16:07:21 nat04 genunix: [ID 353471 kern.notice] f1082dd0 arp:ar_wsrv+118 (d2862c0c, f1082de8)
Dec 22 16:07:21 nat04 genunix: [ID 353471 kern.notice] f1082ddc genunix:runservice+34 (d2862c0c)
Dec 22 16:07:21 nat04 genunix: [ID 353471 kern.notice] f1082de8 genunix:queue_service+3a (d2862c0c)
Dec 22 16:07:21 nat04 genunix: [ID 353471 kern.notice] f1082df8 genunix:stream_service+78 (d20e0008)
Dec 22 16:07:21 nat04 genunix: [ID 353471 kern.notice] f1082e30 genunix:taskq_d_thread+7b (d3ca8b00)
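For anyone comparing this trace against later panics, the stack frames in the log above can be pulled out mechanically. The following is only a rough sketch: the regular expression and the `parse_frames` helper are my own illustration (not part of any Solaris tool), and they assume the syslog lines have already been unwrapped to one message per line.

```python
import re

# Matches frames such as "f1082db8 arp:ar_ce_walk+1a ()".
# Groups: frame pointer, module, function, hex offset, raw argument list.
FRAME_RE = re.compile(
    r"\b([0-9a-f]{8}) (\w+):(\w+)\+([0-9a-f]+) \(([^)]*)\)"
)


def parse_frames(log_text):
    """Return a list of (frame_pointer, module, function, offset, args)."""
    frames = []
    for fp, module, func, off, args in FRAME_RE.findall(log_text):
        arg_list = [a.strip() for a in args.split(",") if a.strip()]
        frames.append((fp, module, func, int(off, 16), arg_list))
    return frames


# Three lines copied from the panic message (timestamp prefix dropped).
sample = (
    "genunix: [ID 353471 kern.notice] f1082d6c unix:_cmntrap+94 (160, f1080160, ddaae21a)\n"
    "genunix: [ID 353471 kern.notice] f1082db8 arp:ar_ce_walk+1a ()\n"
    "genunix: [ID 353471 kern.notice] f1082dd0 arp:ar_wsrv+118 (d2862c0c, f1082de8)\n"
)

frames = parse_frames(sample)
for fp, module, func, off, args in frames:
    print(f"{fp} {module}:{func}+0x{off:x} args={args}")
```

The first frame below the trap handler is `arp:ar_ce_walk+1a`, which is where the fault was taken.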
nat04:/var/crash/nat01# adb -k unix.2 vmcore.2
physmem 7f7e5
$c
ar_ce_walk+0x1a(fe9df326, 0)
ar_wsrv+0x118(d2862c0c)
runservice+0x34(d2862c0c)
queue_service+0x3a(d2862c0c)
stream_service+0x78(d20e0008)
taskq_d_thread+0x7b()
0xd35f49b0: BAD TRAP: type=e (Page fault (#pf)) rp=f1082d6c addr=ddaae21a
0xd35f0830:
0xd3cb89b0: sched:
0xd35f1070: Page fault (#pf)
0xd2097a30: Bad kernel fault at addr=0xddaae21a
0xd20d50b0: pid=0, pc=0xfe9dc212, sp=0xf1082e40, eflags=0x10282
0xd20d6370: eip(fe9dc212), eflags(10282), ebp(f1082db8), uesp(f1082e40), esp(f1082d9c)
0xd1cfdb70: eax(ddaae21a), ebx(0), ecx(fe9df326), edx(f1082e40), esi(fed89c5c), edi(ddaae21a)
0xd2090230: cr0(8005003b<pg,wp,ne,et,ts,mp,pe>)
0xd2cbf970: cr2(ddaae21a), cr3(48f4000)
0xd18f1cb0: cr4(6d0<xmme,fxsr,pge,mce,pse>)
0xd20d9a30: cs(158) ds(160) ss(a5e0) es(160) fs(1a8) gs(1b0)
0xd1cf9430:
0xd3ca4233: f1082ce0 unix:die+ac (e, f1082d6c, ddaae21a, 1, ddaae21a)
0xd35f2eb3: f1082d58 unix:trap+c36 (f1082d6c, ddaae21a, 1, 1b0)
0xd3cba673: f1082d6c unix:_cmntrap+94 (160, f1080160, ddaae21a)
0xd20a4233: f1082db8 arp:ar_ce_walk+1a ()
0xd3cbc573: f1082dd0 arp:ar_wsrv+118 (d2862c0c, f1082de8)
0xd3cbb133: f1082ddc genunix:runservice+34 (d2862c0c)
0xd35f1373: f1082de8 genunix:queue_service+3a (d2862c0c)
0xd35f1673: f1082df8 genunix:stream_service+78 (d20e0008)
0xd3cba1f3: f1082e30 genunix:taskq_d_thread+7b (d3ca8b00)
0xd2094af0:
0xd1d00c73: syncing file systems...
0xd35f11f3: done
0xd2cbf673: dumping to /dev/dsk/c0t0d0s1, offset 0, content: kernel
panic_thread/X
panic_thread:
panic_thread: f1082e40
f1082e40$<thread
0xf1082e40: link stk startpc
0 f1082e40 fe8fbf97
0xf1082e4c: bound_cpu affinitycnt bind_cpu
d1d5a198 0 -1
0xf1082e54: flag proc_flag schedflag
808 0 13
0xf1082e5a: preempt preempt_lk state
1 0 4
0xf1082e60: pri epri
60 0
0xf1082e64: pc sp
e fe848790
0xf1082e7c: wchan0 wchan sobj_ops
0 0 0
0xf1082e88: cid clfuncs cldata
0 fec35f44 0
0xf1082e94: ctx lofault onfault
0 0 0
0xf1082ea0: ontrap swap lock
fec28b24 f1081000 1
0xf1082eaa: pil pi_lock cpu
0 0 d1d5a198
0xf1082eb0: lpl intr did
fec1ab6c 0 5103
0xf1082ec8: tnf_tpdp tid waitfor
d3cb2c80 0 0
0xf1082ed4: sigqueue sig hold
0 0 0
0xf1082ee8: forw back thlink
0 0 0
0xf1082ef4: lwp procp audit_data
0 fec28d30 0
0xf1082f00: next prev trace
f1098e40 f106be40 0
0xf1082f0c: whystop whatstop dslot
0 0 0
0xf1082f14: pollstate pollcache cred
0 0 d1d0ff68
0xf1082f20: start lbolt stoptime
41c91ad1 0 0
0xf1082f30: pctcpu sysnum delay_cv
0 0 0
0xf1082f38: delay_lock
0xf1082f38: owner
0 waiters
0
0xf1082f40: lockp oldspl pre_sys
d1d5a208 a 0
0xf1082f48: disp_queue disp_time kpri_req
d1d5a1e4 1211017 0
0xf1082f54: astflag sig_check post_sys
0 0 0
0xf1082f57: trapret waitrq mstate
0 0 0
0xf1082f64: rprof prioinv ts
0 0 d1d0baf8
0xf1082f70: mmuctx tsd stime
0 0 11e36d
0xf1082f7c: door plockp handoff
0 fec2a48c 0
0xf1082f88: schedctl cpupart bind_pset
0 fec1ab10 -1
0xf1082f94: copyops stkbase red_pp
default_copyops f1081000 0
0xf1082fa0: a_fd a_nfd a_stale
0 0 0
0xf1082fac: priforw priback sleepq
0 0 0
0xf1082fb8: panic_trap lgrp_affinity upimutex
f1082cd4 0 0
0xf1082fc4: nupinest proj unpark
0 d1c82528 0
0xf1082fd0: taskq joincv anttime
d1d0aa28 0 0
fec28d30$< proc2u
auxv
fec28f70
p0+0x2e8: start.tv_sec start.tv_nsec
41c8ed12 1474ef69
p0+0x23c: execsw ticks
0 0
p0+0x305: psargs
^@^@^@...
p0+0x2f4: comm
^@^@^@...
p0+0x358: argc argv envp
0 0 0
p0+0x364: cdir rdir mem
d20e8e08 0 0
p0+0x370: cmask acflag systrap
022 02 0
entrymask
fec290a8
exitmask
fec290cc
p0+0x3c0: signodefer sigonstack sigresethand
0 0 0
p0+0x3d8: sigrestart
0
p0+0x3e0: sigmask
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
p0+0x550: signal
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
p0+0x608: saved_lf_rlimit
0 0
p0+0x618: lock
p0+0x618: owner
0
waiters
0
p0+0x628: nfiles list rlist
3 d209fd90 d1d04a90
0xd209fd90: lock
0xd209fd90: owner
0 waiters
0
0xd209fd98: file fpollinfo refcnt
0 0 0
0xd209fda4: alloc flag busy
0 0 0
0xd209fdac: wanted_cv closing_cv
0 0
0xd209fdb0: lock
0xd209fdb0: owner
0 waiters
0
0xd209fdb8: file fpollinfo refcnt
0 0 0
0xd209fdc4: alloc flag busy
0 0 0
0xd209fdcc: wanted_cv closing_cv
0 0
0xd209fdd0: lock
0xd209fdd0: owner
0 waiters
0
0xd209fdd8: file fpollinfo refcnt
0 0 0
0xd209fde4: alloc flag busy
0 0 0
0xd209fdec: wanted_cv closing_cv
0 0

Jorgen Lundman wrote:
Joining the two discussions...
If you watch "snoop -v" output, does it suggest that the TCP checksum might be wrong?
I did do this, looking for exactly that checksum issue, but the guide where I saw checksums mentioned seemed to be about the eri NIC specifically. I should have tried it either way, of course...
set ip:dohwcksum=0
... however, this made no difference.
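For reference, this is roughly what a TCP checksum check involves when inspecting a captured segment by hand. It is a minimal sketch of the standard Internet checksum (RFC 1071) over the IPv4 pseudo-header plus the TCP segment. The helper names are hypothetical, and the example header is a stripped-down 20-byte SYN without options, so it is not a byte-for-byte copy of any packet in this thread.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071: one's-complement of the one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src_ip: str, dst_ip: str, tcp_len: int) -> bytes:
    """IPv4 pseudo-header that is prepended for the TCP checksum."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, socket.IPPROTO_TCP, tcp_len))


def fill_checksum(src_ip: str, dst_ip: str, segment: bytes) -> bytes:
    """Return the segment with its checksum field (bytes 16-17) recomputed."""
    zeroed = segment[:16] + b"\x00\x00" + segment[18:]
    csum = internet_checksum(pseudo_header(src_ip, dst_ip, len(zeroed)) + zeroed)
    return zeroed[:16] + struct.pack("!H", csum) + zeroed[18:]


def tcp_checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """A segment verifies when the checksum over everything comes out as 0."""
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return internet_checksum(data) == 0


# Illustrative header: ports, seq, ack, offset, flags (SYN), window, csum, urg.
SRC, DST = "210.172.128.225", "204.152.190.12"
header = struct.pack("!HHIIBBHHH", 55198, 22, 3164144531, 0, 0x50, 0x02, 65535, 0, 0)
good = fill_checksum(SRC, DST, header)
bad = good[:-1] + b"\x01"                    # corrupt the last byte

print(tcp_checksum_ok(SRC, DST, good), tcp_checksum_ok(SRC, DST, bad))
```

Note that with hardware checksum offload enabled, a capture taken on the sending host can show checksums that fail this test even though the packets on the wire are fine, which is why turning offload off is a common first experiment.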
Meanwhile I also tried ip_fil4.1 - that was not a good idea. Instant panic, reboot and instant panic, reb... It took me a while to get a key and a boot CD to fix that. Luckily it doesn't call savecore until after ipf has panicked, so /var/crash wasn't filling up.
However:
> I would be very interested to know if some changes I've made to the locking
> fix this problem.
>
> If you can do stress testing, please download:
>
> http://coombs.anu.edu.au/~avalon/ip_fil4.1next.tar.gz
This works like a charm.
210.172.128.225 -> 204.152.190.12 TCP D=22 S=55198 Syn Seq=3164144531 Len=0 Win=65535 Options=<mss 1460,nop,wscale 6,nop,nop,tstamp 0 0>
204.152.190.12 -> 210.172.128.225 TCP D=55198 S=22 Syn Ack=3164144532 Seq=2314966196 Len=0 Win=32768 Options=<mss 1460,nop,wscale 0,nop,nop,tstamp 0 0>
So NAT is working well. Now I can stress test it to see how it behaves. It could be that the panic I reported earlier with 4.1.3 happened because NAT wasn't actually working, and with 12,000 rules hitting it "something" was bound to fill up.
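The check being done by eye on the two snoop lines above can be sketched as follows: the reply must come back to the translated address and port, and its Ack must equal the SYN's Seq plus one. The regular expression and helper names here are hypothetical and only cover the summary-line layout shown in this message.

```python
import re
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    src: str
    dst: str
    sport: int
    dport: int
    seq: int
    ack: Optional[int]


# Matches snoop summary lines for SYN and SYN/ACK segments as shown above.
SNOOP_RE = re.compile(
    r"(?P<src>\S+) -> (?P<dst>\S+) TCP D=(?P<dport>\d+) S=(?P<sport>\d+) "
    r"Syn(?: Ack=(?P<ack>\d+))? Seq=(?P<seq>\d+)"
)


def parse_snoop(line: str) -> Segment:
    m = SNOOP_RE.search(line)
    if m is None:
        raise ValueError(f"not a SYN summary line: {line!r}")
    ack = int(m["ack"]) if m["ack"] is not None else None
    return Segment(m["src"], m["dst"], int(m["sport"]), int(m["dport"]),
                   int(m["seq"]), ack)


def is_reply_to(syn: Segment, synack: Segment) -> bool:
    """True if synack mirrors syn's endpoints and acknowledges its sequence."""
    return (synack.src == syn.dst and synack.dst == syn.src
            and synack.sport == syn.dport and synack.dport == syn.sport
            and synack.ack == syn.seq + 1)


syn = parse_snoop(
    "210.172.128.225 -> 204.152.190.12 TCP D=22 S=55198 Syn Seq=3164144531 Len=0 Win=65535"
)
synack = parse_snoop(
    "204.152.190.12 -> 210.172.128.225 TCP D=55198 S=22 Syn Ack=3164144532 "
    "Seq=2314966196 Len=0 Win=32768"
)
print(is_reply_to(syn, synack))
```

This only confirms that a single handshake made it through the translation in both directions; it says nothing about behaviour under load.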
This is a completely vanilla box, for testing. If shell access would in any way help with your work, that is no problem. But I suspect you already knew what was wrong, since 4.1next seems to work. I will report how the stress test pans out.
Sincerely,
Lundy
--
Jorgen Lundman     | <[EMAIL PROTECTED]>
Unix Administrator | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo  | +81 (0)90-5578-8500 (cell)
Japan              | +81 (0)3 -3375-1767 (home)
