Re: Speaking of ship blockers for 9....
Gleb Smirnoff wrote:
> I> Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT, if=
> I> tun0, stored af=2, a0: 10.0.2.220:60985, a1: 192.41.162.30:53, proto=17,
> I> found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.
>
> Let me give you link to my branch of pf:
>
> http://lists.freebsd.org/pipermail/freebsd-pf/2012-June/006643.html
> http://lists.freebsd.org/pipermail/freebsd-pf/2012-June/006662.html
>
> In that branch the code that puts the "reverse" pointer on state keys,
> as well as the m_addr_changed() function and the pf_compare_state_keys()
> had been cut away.
>
> So, this exact bug definitely can't be reproduced there. However, others
> may hide in :)
>
> Let me encourage you to try and test my branch (instructions in URLs
> above).

I do see much better performance. However, I'm seeing this panic after
about 23 minutes (the slightly higher uptime was a result of a manual
fsck). This system is not particularly loaded. It's a UP Pentium-M which
is our office gateway. I can give you access to inspect if you like.

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x0
fault code              = supervisor write, page not present
instruction pointer     = 0x20:0xc046f8f4
stack pointer           = 0x28:0xeb7b7bd8
frame pointer           = 0x28:0xeb7b7bec
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 4 (pf purge)
trap number             = 12
panic: page fault
KDB: stack backtrace:
db_trace_self_wrapper(c0819c2b,eb7b7a78,c05d5829,c0816ff2,c08acca0,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c0816ff2,c08acca0,c07f2736,eb7b7a84,eb7b7a84,...) at kdb_backtrace+0x29
panic(c07f2736,c0845a85,c559fd68,1,1,...) at panic+0xc9
trap_fatal(0,c60c826c,c610b31c,c610ac44,8,...) at trap_fatal+0x353
trap_pfault(eb7b7b18,c05c0a2d,c0ecc500,c0ecc608,c54ec000,...) at trap_pfault+0xd9
trap(eb7b7b98) at trap+0x418
calltrap() at calltrap+0x6
--- trap 0xc, eip = 0xc046f8f4, esp = 0xeb7b7bd8, ebp = 0xeb7b7bec ---
pf_state_key_detach(eb7b7c18,c046af2a,502a6f69,0,8000,...) at pf_state_key_detach+0x74
pf_detach_state(c64d5d00,0,8000,0,c559fbc0,...) at pf_detach_state+0x1c6
pf_unlink_state(c64d5d00,1,0,0,c0870398,...) at pf_unlink_state+0x1c5
pf_purge_expired_states(c08947c0,0,0,c07eadbf,64,...) at pf_purge_expired_states+0xe6
pf_purge_thread(0,eb7b7d08,0,c54ec000,0,...) at pf_purge_thread+0x14f
fork_exit(c0471b60,0,eb7b7d08) at fork_exit+0xa2
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xeb7b7d40, ebp = 0 ---
Uptime: 57m29s
Physical memory: 2038 MB
Dumping 189 MB: 174 158 142 126 110 94 78 62 46 30 14

(kgdb) bt
#0  doadump (textdump=1) at pcpu.h:249
#1  0xc05d563a in kern_reboot (howto=260)
    at /usr/src.pflock/sys/kern/kern_shutdown.c:449
#2  0xc05d5888 in panic (fmt=Variable "fmt" is not available.)
    at /usr/src.pflock/sys/kern/kern_shutdown.c:637
#3  0xc07b8b23 in trap_fatal (frame=0xeb7b7b98, eva=0)
    at /usr/src.pflock/sys/i386/i386/trap.c:1028
#4  0xc07b8c09 in trap_pfault (frame=0xeb7b7b98, usermode=0, eva=0)
    at /usr/src.pflock/sys/i386/i386/trap.c:881
#5  0xc07b9a58 in trap (frame=dwarf2_read_address: Corrupted DWARF expression.)
    at /usr/src.pflock/sys/i386/i386/trap.c:552
#6  0xc07a579c in calltrap ()
    at /usr/src.pflock/sys/i386/i386/exception.s:169
#7  0xc046f8f4 in pf_state_key_detach (s=0xc64d5d00, idx=1)
    at /usr/src.pflock/sys/contrib/pf/net/pf.c:1040
#8  0xc04713f6 in pf_detach_state (s=0xc64d5d00)
    at /usr/src.pflock/sys/contrib/pf/net/pf.c:1006
#9  0xc0471975 in pf_unlink_state (s=0xc64d5d00, flags=Variable "flags" is not available.)
    at /usr/src.pflock/sys/contrib/pf/net/pf.c:1520
#10 0xc0471a96 in pf_purge_expired_states (maxcheck=148)
    at /usr/src.pflock/sys/contrib/pf/net/pf.c:1573
#11 0xc0471caf in pf_purge_thread (v=0x0)
    at /usr/src.pflock/sys/contrib/pf/net/pf.c:1371
#12 0xc05a5af2 in fork_exit (callout=0xc0471b60, arg=0x0, frame=0xeb7b7d08)
    at /usr/src.pflock/sys/kern/kern_fork.c:995
#13 0xc07a5814 in fork_trampoline ()
    at /usr/src.pflock/sys/i386/i386/exception.s:276

Ian

--
Ian Freislich
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Speaking of ship blockers for 9....
Gleb Smirnoff wrote:
> Let me give you link to my branch of pf:
>
> http://lists.freebsd.org/pipermail/freebsd-pf/2012-June/006643.html
> http://lists.freebsd.org/pipermail/freebsd-pf/2012-June/006662.html
>
> In that branch the code that puts the "reverse" pointer on state keys,
> as well as the m_addr_changed() function and the pf_compare_state_keys()
> had been cut away.
>
> So, this exact bug definitely can't be reproduced there. However, others
> may hide in :)

Thanks. I'll be able to work on this next week. My system is pretty
similar to yours: 16 cores, full BGP RIB, 20+ VLANs + CARP on 4*bce(4),
PF+Sync, 400k+ states, NAT, tables, anchors etc.

The complication is that the production system is on 8, and its pfsync is
incompatible with 9 and CURRENT. And 9/CURRENT is unusable for me as a
backup without this fix because of the state mismatch rate.

Ian

--
Ian Freislich
Re: Speaking of ship blockers for 9....
  Ian,

On Tue, Aug 07, 2012 at 08:17:56PM +0200, Ian FREISLICH wrote:
I> I have a problem that's been getting progressively worse as the
I> source progresses. So much so that it's had me searching all the
I> way from 8.0-RELEASE to 10-CURRENT without luck on both amd64 and
I> i386.
I>
I> pf(4) erroneously mismatches state and then blocks an active flow.
I> It seems that 8.X does so silently and 9 to -CURRENT do so verbosely.
I> Whether silent or loud, the effect on traffic makes it impractical
I> to use FreeBSD+PF for a firewall in any setting (my use is home,
I> small office, large office and moderately large datacenter core
I> router). It appears that this has actually been a forever problem
I> that's just being tickled more now.
...
I> ...
I> state-mismatch                    277767            3.6/s
I>
I> That's 277767 flows terminated in the last almost 22 hours due to
I> this pf bug. (!!!)
I>
I> 9.1-PRERELEASE logs (as does -CURRENT):
I> Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT, if=tun0, stored af=2, a0: 10.0.2.220:60985, a1: 192.41.162.30:53, proto=17, found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.

Let me give you link to my branch of pf:

http://lists.freebsd.org/pipermail/freebsd-pf/2012-June/006643.html
http://lists.freebsd.org/pipermail/freebsd-pf/2012-June/006662.html

In that branch the code that puts the "reverse" pointer on state keys,
as well as the m_addr_changed() function and the pf_compare_state_keys()
had been cut away.

So, this exact bug definitely can't be reproduced there. However, others
may hide in :)

Let me encourage you to try and test my branch (instructions in URLs
above).

P.S. I plan to merge it to head at the end of August.

--
Totus tuus, Glebius.
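For anyone comparing stock 9.x against the branch, it may help to tally the "state key linking mismatch" messages rather than eyeball them. What follows is only a rough sketch: the regular expression is inferred from the log lines quoted in this thread, not from the pf source, and the names LINE_RE, parse_mismatches and SAMPLE are made up for illustration. In the lines posted so far every "found" key is the same pair (41.154.2.53:1701 and 41.133.165.161:59051, proto 17, i.e. UDP), while the "stored" keys are DNS lookups from 10.0.2.220; the tally makes that pattern easy to see, without saying anything about why it happens.

```python
import re
from collections import Counter

# Layout inferred from the messages quoted in this thread (an assumption,
# not taken from the pf source). \s* tolerates mail-wrapped lines.
LINE_RE = re.compile(
    r"pf: state key linking mismatch!\s+dir=(?P<dir>\w+),\s*if=\s*(?P<ifname>[^,\s]+),\s*"
    r"stored af=(?P<s_af>\d+), a0: (?P<s_a0>[^,\s]+), a1: (?P<s_a1>[^,\s]+), proto=(?P<s_proto>\d+),\s*"
    r"found af=(?P<f_af>\d+), a0: (?P<f_a0>[^,\s]+), a1: (?P<f_a1>[^,\s]+), proto=(?P<f_proto>\d+)"
)


def parse_mismatches(text):
    """Return one dict of named fields per mismatch message found in text."""
    return [m.groupdict() for m in LINE_RE.finditer(text)]


# Two of the lines posted in this thread, used as sample input.
SAMPLE = (
    "Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT, "
    "if=tun0, stored af=2, a0: 10.0.2.220:60985, a1: 192.41.162.30:53, proto=17, "
    "found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.\n"
    "Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT, "
    "if=tun0, stored af=2, a0: 10.0.2.220:52995, a1: 41.154.2.100:53, proto=17, "
    "found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.\n"
)

records = parse_mismatches(SAMPLE)

# How many messages point at each "found" key, and at each "stored" key.
found_keys = Counter((r["f_a0"], r["f_a1"], r["f_proto"]) for r in records)
stored_keys = Counter((r["s_a0"], r["s_a1"], r["s_proto"]) for r in records)

for key, count in found_keys.most_common():
    print(count, "found", key)
for key, count in stored_keys.most_common():
    print(count, "stored", key)
```

Feeding it /var/log/messages from a box before and after switching to the branch would give a simple count to compare, assuming the message format is the same on both.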
Re: Speaking of ship blockers for 9....
On 08/07/12 11:43, Garrett Cooper wrote:
> On Aug 7, 2012, at 11:17 AM, Ian FREISLICH wrote:
>> Garrett Cooper wrote:
>>> Is this in 9.1 -PRERELEASE, -RELEASE (or whatever the official
>>> label is...)? If so, it seems like this would be a ship blocker.
>>
>> I have a problem that's been getting progressively worse as the
>> source progresses. So much so that it's had me searching all the
>> way from 8.0-RELEASE to 10-CURRENT without luck on both amd64 and
>> i386.
>>
>> pf(4) erroneously mismatches state and then blocks an active flow.
>> It seems that 8.X does so silently and 9 to -CURRENT do so verbosely.
>> Whether silent or loud, the effect on traffic makes it impractical
>> to use FreeBSD+PF for a firewall in any setting (my use is home,
>> small office, large office and moderately large datacenter core
>> router). It appears that this has actually been a forever problem
>> that's just being tickled more now.
>>
>> Here's from my home firewall:
>> Status: Enabled for 7 days 02:57:58           Debug: Urgent
>>
>> State Table                          Total             Rate
>>   current entries                     1653
>>   searches                        45792251           74.4/s
>>   inserts                           428375            0.7/s
>>   removals                          426722            0.7/s
>> ...
>>   state-mismatch                      1586            0.0/s
>>
>> Here's from a moderately busy firewall:
>> Status: Enabled for 0 days 21:40:44           Debug: Urgent
>>
>> State Table                          Total             Rate
>>   current entries                   122395
>>   searches                      4428641685        56745.4/s
>>   inserts                        202644593         2596.5/s
>>   removals                       202522198         2595.0/s
>> ...
>>   state-mismatch                    277767            3.6/s
>>
>> That's 277767 flows terminated in the last almost 22 hours due to
>> this pf bug. (!!!)
>>
>> 9.1-PRERELEASE logs (as does -CURRENT):
>> Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT,
>> if=tun0, stored af=2, a0: 10.0.2.220:60985, a1: 192.41.162.30:53, proto=17,
>> found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.
>> Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT,
>> if=tun0, stored af=2, a0: 10.0.2.220:52995, a1: 41.154.2.100:53, proto=17,
>> found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.
>> Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT,
>> if=tun0, stored af=2, a0: 10.0.2.220:60095, a1: 206.223.136.200:53, proto=17,
>> found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.
>> Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT,
>> if=tun0, stored af=2, a0: 10.0.2.220:50463, a1: 206.223.136.200:53, proto=17,
>> found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.
>> Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT,
>> if=tun0, stored af=2, a0: 10.0.2.220:56748, a1: 192.41.162.30:53, proto=17,
>> found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.
>> Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT,
>> if=tun0, stored af=2, a0: 10.0.2.220:60793, a1: 192.41.162.30:53, proto=17,
>> found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.
>
> Filed a PR yet with packet captures?
>
> Thanks,
> -Garrett

I was having this problem on one machine but not another (different
pf.confs). Are you using synproxy state or modulate state? Feel OK
posting a basic pf.conf that experiences the issue? I feel like there
was something with either scrub or synproxy I had to remove to make the
hurting stop. Obviously that means something is probably borked, and I
will share in the no-PR shame.

Matt
Speaking of ship blockers for 9....
Garrett Cooper wrote:
> Is this in 9.1 -PRERELEASE, -RELEASE (or whatever the official
> label is...)? If so, it seems like this would be a ship blocker.

I have a problem that's been getting progressively worse as the
source progresses. So much so that it's had me searching all the
way from 8.0-RELEASE to 10-CURRENT without luck on both amd64 and
i386.

pf(4) erroneously mismatches state and then blocks an active flow.
It seems that 8.X does so silently and 9 to -CURRENT do so verbosely.
Whether silent or loud, the effect on traffic makes it impractical
to use FreeBSD+PF for a firewall in any setting (my use is home,
small office, large office and moderately large datacenter core
router). It appears that this has actually been a forever problem
that's just being tickled more now.

Here's from my home firewall:
Status: Enabled for 7 days 02:57:58           Debug: Urgent

State Table                          Total             Rate
  current entries                     1653
  searches                        45792251           74.4/s
  inserts                           428375            0.7/s
  removals                          426722            0.7/s
...
  state-mismatch                      1586            0.0/s

Here's from a moderately busy firewall:
Status: Enabled for 0 days 21:40:44           Debug: Urgent

State Table                          Total             Rate
  current entries                   122395
  searches                      4428641685        56745.4/s
  inserts                        202644593         2596.5/s
  removals                       202522198         2595.0/s
...
  state-mismatch                    277767            3.6/s

That's 277767 flows terminated in the last almost 22 hours due to
this pf bug. (!!!)

9.1-PRERELEASE logs (as does -CURRENT):
Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT, if=tun0, stored af=2, a0: 10.0.2.220:60985, a1: 192.41.162.30:53, proto=17, found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.
Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT, if=tun0, stored af=2, a0: 10.0.2.220:52995, a1: 41.154.2.100:53, proto=17, found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.
Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT, if=tun0, stored af=2, a0: 10.0.2.220:60095, a1: 206.223.136.200:53, proto=17, found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.
Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT, if=tun0, stored af=2, a0: 10.0.2.220:50463, a1: 206.223.136.200:53, proto=17, found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.
Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT, if=tun0, stored af=2, a0: 10.0.2.220:56748, a1: 192.41.162.30:53, proto=17, found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.
Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT, if=tun0, stored af=2, a0: 10.0.2.220:60793, a1: 192.41.162.30:53, proto=17, found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.

Ian

--
Ian Freislich
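As a sanity check on the counters above, the Rate column appears to be just the running total divided by the time pf has been enabled. The sketch below redoes that arithmetic with the figures from the two status dumps; the helper names uptime_seconds and rate are made up for illustration. Comparing mismatches to state inserts is only a rough yardstick, on the assumption that each mismatch corresponds to one affected flow.

```python
# Totals and uptimes are copied from the two status dumps in this message.
# uptime_seconds() and rate() are illustrative helpers, not part of any tool.

def uptime_seconds(days, hours, minutes, seconds):
    """Convert an 'N days HH:MM:SS' uptime into seconds."""
    return days * 86400 + hours * 3600 + minutes * 60 + seconds


def rate(total, seconds):
    """Average events per second over the whole time pf was enabled."""
    return total / seconds


home_uptime = uptime_seconds(7, 2, 57, 58)    # "Enabled for 7 days 02:57:58"
busy_uptime = uptime_seconds(0, 21, 40, 44)   # "Enabled for 0 days 21:40:44"

busy_mismatch_rate = rate(277767, busy_uptime)
busy_search_rate = rate(4428641685, busy_uptime)

# Rough yardstick only: mismatches as a share of states ever inserted.
busy_mismatch_share = 277767 / 202644593
home_mismatch_share = 1586 / 428375

print(f"busy: {busy_mismatch_rate:.1f} mismatches/s over {busy_uptime} s")
print(f"busy: {busy_search_rate:.1f} searches/s")
print(f"busy: {busy_mismatch_share:.3%} of inserts")
print(f"home: {home_mismatch_share:.3%} of inserts")
```

The busy firewall works out to 277767 / 78044 s, which rounds to the 3.6/s shown in the dump, so the total and rate columns are consistent with each other.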
Re: Speaking of ship blockers for 9....
On Aug 7, 2012, at 11:17 AM, Ian FREISLICH wrote:
> Garrett Cooper wrote:
>> Is this in 9.1 -PRERELEASE, -RELEASE (or whatever the official
>> label is...)? If so, it seems like this would be a ship blocker.
>
> I have a problem that's been getting progressively worse as the
> source progresses. So much so that it's had me searching all the
> way from 8.0-RELEASE to 10-CURRENT without luck on both amd64 and
> i386.
>
> pf(4) erroneously mismatches state and then blocks an active flow.
> It seems that 8.X does so silently and 9 to -CURRENT do so verbosely.
> Whether silent or loud, the effect on traffic makes it impractical
> to use FreeBSD+PF for a firewall in any setting (my use is home,
> small office, large office and moderately large datacenter core
> router). It appears that this has actually been a forever problem
> that's just being tickled more now.
>
> Here's from my home firewall:
> Status: Enabled for 7 days 02:57:58           Debug: Urgent
>
> State Table                          Total             Rate
>   current entries                     1653
>   searches                        45792251           74.4/s
>   inserts                           428375            0.7/s
>   removals                          426722            0.7/s
> ...
>   state-mismatch                      1586            0.0/s
>
> Here's from a moderately busy firewall:
> Status: Enabled for 0 days 21:40:44           Debug: Urgent
>
> State Table                          Total             Rate
>   current entries                   122395
>   searches                      4428641685        56745.4/s
>   inserts                        202644593         2596.5/s
>   removals                       202522198         2595.0/s
> ...
>   state-mismatch                    277767            3.6/s
>
> That's 277767 flows terminated in the last almost 22 hours due to
> this pf bug. (!!!)
>
> 9.1-PRERELEASE logs (as does -CURRENT):
> Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT,
> if=tun0, stored af=2, a0: 10.0.2.220:60985, a1: 192.41.162.30:53, proto=17,
> found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.
> Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT,
> if=tun0, stored af=2, a0: 10.0.2.220:52995, a1: 41.154.2.100:53, proto=17,
> found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.
> Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT,
> if=tun0, stored af=2, a0: 10.0.2.220:60095, a1: 206.223.136.200:53, proto=17,
> found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.
> Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT,
> if=tun0, stored af=2, a0: 10.0.2.220:50463, a1: 206.223.136.200:53, proto=17,
> found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.
> Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT,
> if=tun0, stored af=2, a0: 10.0.2.220:56748, a1: 192.41.162.30:53, proto=17,
> found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.
> Jul 22 08:54:25 brane kernel: pf: state key linking mismatch! dir=OUT,
> if=tun0, stored af=2, a0: 10.0.2.220:60793, a1: 192.41.162.30:53, proto=17,
> found af=2, a0: 41.154.2.53:1701, a1: 41.133.165.161:59051, proto=17.

Filed a PR yet with packet captures?

Thanks,
-Garrett