[Bug 246951] carp(4): Active CARP member crashes: panic, trap_pfault, ip_input || ip_output when using ipSec, AES-NI (on Intel I350)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951

Mark Johnston changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|mfc-stable12?,              |mfc-stable12-,
                   |mfc-stable11?               |mfc-stable11+

--- Comment #20 from Mark Johnston ---
Note, the change was intentionally not merged to stable/12, as the bug does not exist there. In particular, if you are seeing these panics on a stable/12-based branch, then the culprit lies elsewhere.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
[Bug 246951] carp(4): Active CARP member crashes: panic, trap_pfault, ip_input || ip_output when using ipSec, AES-NI (on Intel I350)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951

Mark Johnston changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|Open                        |Closed
           Assignee|n...@freebsd.org            |ma...@freebsd.org
         Resolution|---                         |FIXED

--- Comment #19 from Mark Johnston ---
I'm going to resolve this for now since I suspect that the patch fixes the original bug. I'll give a heads-up to the pfsense folks so that it can be integrated there.
[Bug 246951] carp(4): Active CARP member crashes: panic, trap_pfault, ip_input || ip_output when using ipSec, AES-NI (on Intel I350)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951

--- Comment #18 from commit-h...@freebsd.org ---
A commit references this bug:

Author: markj
Date: Wed Jul 8 13:40:28 UTC 2020
New revision: 363009
URL: https://svnweb.freebsd.org/changeset/base/363009

Log:
  MFC r362840:
  Fix a possible refcount leak when handling IPSec traffic.

  PR: 246951

Changes:
  stable/11/sys/netinet/ip_input.c
[Bug 246951] carp(4): Active CARP member crashes: panic, trap_pfault, ip_input || ip_output when using ipSec, AES-NI (on Intel I350)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951

--- Comment #17 from commit-h...@freebsd.org ---
A commit references this bug:

Author: markj
Date: Wed Jul 1 15:42:49 UTC 2020
New revision: 362840
URL: https://svnweb.freebsd.org/changeset/base/362840

Log:
  Fix a possible next-hop refcount leak when handling IPSec traffic.

  It may be possible to fix this by deferring the lookup, but let's
  keep the initial change simple to make MFCs easier.

  PR: 246951
  Reviewed by: melifaro
  MFC after: 1 week
  Sponsored by: The FreeBSD Foundation
  Differential Revision: https://reviews.freebsd.org/D25519

Changes:
  head/sys/netinet/ip_input.c
[Bug 246951] carp(4): Active CARP member crashes: panic, trap_pfault, ip_input || ip_output when using ipSec, AES-NI (on Intel I350)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951

--- Comment #16 from Mark Johnston ---
(In reply to igor-fbsdbugs from comment #14)
Oops, I did not realize that there are multiple reporters.

I posted a patch for HEAD here: https://reviews.freebsd.org/D25519 . There is an analogous bug in stable/11 and 12. If forwarding is enabled (default in pfsense) and ipsec is in use, then it is possible that this bug is responsible for the crashes I was looking at in comment 1.

(In reply to Ryan Moeller from comment #15)
If you're able to use DTrace to track ifa refs, then you might try looking for leaks as suggested in comment 10. Your case sounds like a different bug though.
[Bug 246951] carp(4): Active CARP member crashes: panic, trap_pfault, ip_input || ip_output when using ipSec, AES-NI (on Intel I350)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951

Ryan Moeller changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |freql...@freebsd.org

--- Comment #15 from Ryan Moeller ---
I have a FreeNAS bug report where ifa_addr is NULL on 11.3, too. I was suspecting some local patches we have for dealing with various IGMP / multicast issues, but I did not find anything in auditing those patches. I haven't found any way to explain ifa_addr being NULL in an ifma that is still linked into a list.

In my user's case the panic is cropping up in igmp_fasttimo. Though the circumstances are different, I wonder if these may be caused by the same underlying problem.

One thing I have thought to double-check next is making sure that the maddr locks match up with the correct ifp everywhere the list is modified.
[Bug 246951] carp(4): Active CARP member crashes: panic, trap_pfault, ip_input || ip_output when using ipSec, AES-NI (on Intel I350)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951

--- Comment #14 from igor-fbsdb...@grinchenko.org ---
Sorry, it is not clear whom you are referring to. We might be dealing with two (or more) distinct bugs here.

In my case IPv4 forwarding is enabled (net.inet.ip.forwarding=1) and I see ip_forward() in 50% of the backtraces I get when pfsense crashes.
[Bug 246951] carp(4): Active CARP member crashes: panic, trap_pfault, ip_input || ip_output when using ipSec, AES-NI (on Intel I350)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951

--- Comment #13 from Mark Johnston ---
(In reply to freebsd-bugzilla from comment #12)
I see at least one such bug in ip_forward() in stable/11: in the IPSEC_ENABLED case there is a missing ifa_free() call. Do you have IPv4 forwarding enabled, or do you use any forward actions in your firewall rule set?
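The missing ifa_free() described in comment #13 is an instance of a common leak pattern: a function takes a reference, then returns early on one path without dropping it. The sketch below is a userland toy model of that pattern, not the actual ip_forward() code. Every name in it (toy_ifaddr, toy_ref, toy_free, toy_forward_buggy, toy_forward_fixed, toy_leaked_refs) is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a reference-counted object such as struct ifaddr. */
struct toy_ifaddr {
    uint32_t refcnt;
};

/* Hypothetical analogues of ifa_ref() and ifa_free(). */
void toy_ref(struct toy_ifaddr *ifa)
{
    ifa->refcnt++;
}

void toy_free(struct toy_ifaddr *ifa)
{
    assert(ifa->refcnt > 0);   /* releasing an unheld reference is a bug */
    ifa->refcnt--;
}

/* Buggy shape: the IPsec early return skips the release. */
void toy_forward_buggy(struct toy_ifaddr *ifa, bool ipsec_consumed)
{
    toy_ref(ifa);
    if (ipsec_consumed)
        return;                /* reference leaked on this path */
    toy_free(ifa);
}

/* Fixed shape: every exit path drops the reference it took. */
void toy_forward_fixed(struct toy_ifaddr *ifa, bool ipsec_consumed)
{
    toy_ref(ifa);
    if (ipsec_consumed) {
        toy_free(ifa);
        return;
    }
    toy_free(ifa);
}

/* Push `packets` IPsec packets through `fwd`; return the leaked references. */
uint32_t toy_leaked_refs(void (*fwd)(struct toy_ifaddr *, bool),
                         uint32_t packets)
{
    struct toy_ifaddr ifa = { .refcnt = 1 };   /* one legitimate reference */

    for (uint32_t i = 0; i < packets; i++)
        fwd(&ifa, true);
    return ifa.refcnt - 1;
}
```

Calling toy_leaked_refs(toy_forward_buggy, n) leaves n stray references behind, while toy_leaked_refs(toy_forward_fixed, n) leaves none. In this model the leak grows only with traffic that takes the IPsec path, which would fit a crash that needs days of load to appear.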
[Bug 246951] carp(4): Active CARP member crashes: panic, trap_pfault, ip_input || ip_output when using ipSec, AES-NI (on Intel I350)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951

--- Comment #12 from freebsd-bugzilla@biscuit.ninja ---
Hello,

Just to add: we haven't seen a crash for 28 days. We have switched out IPsec for OpenVPN. The issue looks to be fairly conclusively IPsec related.

Thanks
[Bug 246951] carp(4): Active CARP member crashes: panic, trap_pfault, ip_input || ip_output when using ipSec, AES-NI (on Intel I350)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951

igor-fbsdb...@grinchenko.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |igor-fbsdb...@grinchenko.org

--- Comment #11 from igor-fbsdb...@grinchenko.org ---
Unfortunately dtrace is not available on pfsense. Is there another way to get the refcounts?

I can confirm, however, that it does take a few days for this crash to happen: usually within 4-8 days, but never immediately.

I have a handful of crash backtraces from previous crashes. I will paste the first 3 lines from each. Let me know if you need the entire dumps:

Tracing pid 12 tid 100285 td 0xf8000ccf6620
ip_input() at ip_input+0x60e/frame 0xfe0451afee70
netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfe0451afeec0
ether_demux() at ether_demux+0x173/frame 0xfe0451afeef0

Tracing pid 12 tid 100263 td 0xf8000c88f620
ip_output() at ip_output+0x1418/frame 0xfe0451a15a60
ipsec_process_done() at ipsec_process_done+0x1c8/frame 0xfe0451a15ab0
esp_output_cb() at esp_output_cb+0xeb/frame 0xfe0451a15b10

db:0:kdb.enter.default> bt
Tracing pid 12 tid 100283 td 0xf8000c67b000
ip_output() at ip_output+0x1418/frame 0xfe0451af4a60
ipsec_process_done() at ipsec_process_done+0x1c8/frame 0xfe0451af4ab0
esp_output_cb() at esp_output_cb+0xeb/frame 0xfe0451af4b10

Tracing pid 12 tid 100270 td 0xf8000c8cf000
ip_input() at ip_input+0x60e/frame 0xfe0451a38e70
netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfe0451a38ec0
ether_demux() at ether_demux+0x173/frame 0xfe0451a38ef0

Tracing pid 12 tid 100267 td 0xf8000c3f3620
ip_input() at ip_input+0x60e/frame 0xfe0451a29e70
netisr_dispatch_src() at netisr_dispatch_src+0xa8/frame 0xfe0451a29ec0
ether_demux() at ether_demux+0x173/frame 0xfe0451a29ef0

Tracing pid 18 tid 100138 td 0xf800085d1620
ip_output() at ip_output+0x1418/frame 0xfe044f4181c0
ipsec_process_done() at ipsec_process_done+0x1c8/frame 0xfe044f418210
esp_output_cb() at esp_output_cb+0xeb/frame 0xfe044f418270

Tracing pid 12 tid 100264 td 0xf8000c15f000
in_broadcast() at in_broadcast+0x43/frame 0xfe0451a1ac40
ip_output() at ip_output+0x7be/frame 0xfe0451a1ad70
ip_forward() at ip_forward+0x2b5/frame 0xfe0451a1ae10

Our pfsense host is relatively busy, pushing about 300mbit of traffic through, with about half of it going to the ipsec tunnel.
[Bug 246951] carp(4): Active CARP member crashes: panic, trap_pfault, ip_input || ip_output when using ipSec, AES-NI (on Intel I350)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951

--- Comment #10 from Mark Johnston ---
(In reply to freebsd-bugzilla from comment #9)
So from the panic sites it looks like a linked ifaddr structure has been freed. (Or zeroed, but I cannot see any code which does that.) One possible cause that we can try to rule out is a refcount overflow. You said that the uptime between panics is at least several days - is that consistently the case? In other words, have you ever seen back-to-back crashes in a short timeframe?

I'm not sure if pfsense enables DTrace, but if so we can try to probe ifa_ref(), which is helpfully not inlined. Please try this command (you might need to kldload dtraceall first):

    # dtrace -n 'fbt::ifa_ref:entry {@[args[0]->ifa_refcnt] = count();} tick-300s {exit(0);}'

It will sleep for 300s and dump a table of reference counts.
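The refcount-overflow hypothesis in comment #10 explains why the time between panics matters: a leaked reference is harmless on its own, but enough of them wrap the counter, after which an ordinary ref/free pair can release an object that is still linked. The sketch below works through that arithmetic. It assumes a 32-bit counter (an assumption about the width of ifa_refcnt, not something stated in the thread), and the function names and the leak rate used in the note after it are hypothetical.

```c
#include <stdint.h>

/* Assumption: the reference counter is 32 bits wide, so it has 2^32 states. */
#define TOY_REFCNT_STATES 4294967296.0

/*
 * Days until a 32-bit counter wraps, given a steady rate of leaked
 * references per second. The rate is a free parameter here; the thread
 * does not establish how many references leak per packet.
 */
double toy_days_until_wrap(double leaks_per_second)
{
    return TOY_REFCNT_STATES / leaks_per_second / 86400.0;
}

/*
 * Value a 32-bit counter holds after `leaked` extra references are added
 * to `start`. The sum is computed in 64 bits and then truncated, which
 * models unsigned wraparound.
 */
uint32_t toy_refcnt_after_leaks(uint32_t start, uint64_t leaked)
{
    return (uint32_t)(start + leaked);
}
```

Under these assumptions, a hypothetical 10,000 leaked references per second would wrap the counter in roughly five days, and a counter that started at 1 reads 0 again after 2^32 - 1 leaks. A leak of this kind would therefore never produce back-to-back crashes shortly after boot, which is what the question about uptime is probing.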
[Bug 246951] carp(4): Active CARP member crashes: panic, trap_pfault, ip_input || ip_output when using ipSec, AES-NI (on Intel I350)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951

--- Comment #9 from freebsd-bugzilla@biscuit.ninja ---
(In reply to Kubilay Kocak from comment #4)
Thank you. I've attached dmesg.boot output.

This is pfSense so there is no rc.conf. I've attached instead:
- ifconfig output
- interfaces and LAGGs specified in pfSense config.xml
- sysctl tunables specified in pfSense config.xml

The uptime between crashes varies between 2 and 30 days. It does not seem to correlate to any specific event that we are aware of, or even to peak throughput.

The only additional package installed on these firewalls is NRPE.

In terms of workload:
- HTTP/s traffic to and from customers
- TCP load balancing of customer HTTP/s with 10 pools, 4 virtual servers per pool. Total of around 1.5 - 2 million active sessions
- IPsec site-to-site tunnel for replication to our standby data centre
- CARP / pfSync with bandwidth/packet rates of 22-80 Mb/s, 2-8 Kpps
- AES-NI enabled for IPsec (AES256-GCM)

The firewalls are handling:
- 20-45 Mb/s (13-45 Kpps) inbound IPsec
- 30-150 Mb/s (14-55 Kpps) outbound IPsec
- 20-90 Mb/s (15-60 Kpps) inbound IP traffic
- 50-250 Mb/s (15-60 Kpps) outbound IP traffic
- 30-90k states
- ~66k Mbuf Clusters utilised (out of 1M total)

The only other thing of note that I can think of is that we have a Cassandra cluster replicating over the IPsec tunnel. That's around 256 constantly changing states as data is replicated from one data centre to another.

We have now disabled IPsec and switched to OpenVPN for the site-to-site VPN, in order to see whether the crash is reproducible without IPsec.

Additionally, I had set up a couple of FreeBSD 11.3 VMs with a site-to-site IPsec connection. I had continuous iperf running over the tunnel for 7 days without issue.

If there is any further information that I can provide, or anything I can do to assist, please don't hesitate to ask.
[Bug 246951] carp(4): Active CARP member crashes: panic, trap_pfault, ip_input || ip_output when using ipSec, AES-NI (on Intel I350)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951

--- Comment #8 from freebsd-bugzilla@biscuit.ninja ---
Created attachment 215355
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=215355&action=edit
sysctl tunables from pfSense config.xml
[Bug 246951] carp(4): Active CARP member crashes: panic, trap_pfault, ip_input || ip_output when using ipSec, AES-NI (on Intel I350)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951

--- Comment #7 from freebsd-bugzilla@biscuit.ninja ---
Created attachment 215354
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=215354&action=edit
interfaces and LAGGs specified in pfSense config.xml
[Bug 246951] carp(4): Active CARP member crashes: panic, trap_pfault, ip_input || ip_output when using ipSec, AES-NI (on Intel I350)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951

--- Comment #6 from freebsd-bugzilla@biscuit.ninja ---
Created attachment 215353
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=215353&action=edit
output from the ifconfig command with public ips obfuscated
[Bug 246951] carp(4): Active CARP member crashes: panic, trap_pfault, ip_input || ip_output when using ipSec, AES-NI (on Intel I350)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951

--- Comment #5 from freebsd-bugzilla@biscuit.ninja ---
Created attachment 215352
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=215352&action=edit
dmesg.boot written out following the last kernel panic
[Bug 246951] carp(4): Active CARP member crashes: panic, trap_pfault, ip_input || ip_output when using ipSec, AES-NI (on Intel I350)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=246951

Kubilay Kocak changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Regular crash: panic,       |carp(4): Active CARP member
                   |trap_pfault, ip_input ||    |crashes: panic,
                   |ip_output using ipSec,      |trap_pfault, ip_input ||
                   |AES-NI & CARP               |ip_output when using ipSec,
                   |                            |AES-NI (on Intel I350)
                URL|                            |https://forum.netgate.com/topic/151329/pfsense-active-carp-member-crashed-aesni_process-crypto_dispatch/12
             Status|New                         |Open
              Flags|                            |mfc-stable12?,
                   |                            |mfc-stable11?
                 CC|                            |n...@freebsd.org
           Keywords|panic                       |crash, needs-qa, regression

--- Comment #4 from Kubilay Kocak ---
@Reporter Could you please add additional information, including:

- /var/run/dmesg.boot output (as an attachment)
- /etc/rc.conf network configuration (minimal reproducer configuration, sanitised where necessary, as an attachment)
- Details of reproducibility changes isolating any of the components of the configuration (ipsec, different interface, not incl lagg(4) etc)
- Any indications of events/traffic/workloads that might be relevant to triggering the crash