Re: iflib/bridge kernel panic
On Tue, Sep 29, 2020 at 05:36:15PM -0400, Shawn Webb wrote:
> On Tue, Sep 29, 2020 at 11:20:44PM +0200, Kristof Provost wrote:
> > On 28 Sep 2020, at 16:44, Alexander Leidinger wrote:
> > > Quoting Kristof Provost (from Mon, 28 Sep 2020 13:53:16 +0200):
> > > > On 28 Sep 2020, at 12:45, Alexander Leidinger wrote:
> > > > > Quoting Kristof Provost (from Sun, 27 Sep 2020 17:51:32 +0200):
> > > > > > Here’s an early version of a task queue based approach:
> > > > > > http://people.freebsd.org/~kp/0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch
> > > > > >
> > > > > > That still needs to be cleaned up, but this should resolve
> > > > > > the sleep issue and the LOR.
> > > > >
> > > > > There are some issues... seems like inside a jail I can't ping
> > > > > systems outside of the hardware.
> > > > >
> > > > > Bridge setup:
> > > > >  - member jail A
> > > > >  - member jail B
> > > > >  - member external_if of host
> > > > >
> > > > > If I ping the router from the host, it works. If I ping from one
> > > > > jail to another, it works. If I ping from the jail to the IP of
> > > > > the external_if, it works. If I ping from a jail to the router,
> > > > > I do not get a response.
> > > >
> > > > Can you check for 'failed ifpromisc' error messages in dmesg? And
> > > > verify that all bridge member interfaces are in promiscuous mode?
> > >
> > > I have a panic for you...:
> > >  - startup still in progress = 22 jails in startup, somewhere after a
> > >    few jails started the panic happened
> > >  - tcpdump was running on the external interface
> > >  - a ping to a jail IP from another system was running, the first ping
> > >    went through, then it paniced
> > >
> > > First regarding your questions about promisc mode: no error, but the
> > > promisc mode is directly disabled again on all interfaces.
> >
> > I think I see why you had issues with the promiscuous setting. I’ve
> > updated the patch to be even more horrific than it was before.
> >
> > I can’t explain the panic, and the backtrace also doesn’t appear to be
> > directly related to this patch. Not sure what’s going on with that.
>
> I should have time to test the new patch this weekend. ${LIFE} is
> keeping me busy the past few weeks. I'm gonna add an event in my
> calendar to remind me to test the patch. heh.

Sorry for the delay. I rebuilt with the new patch this morning.
Looking good on all fronts, including LORs.

Thanks,

--
Shawn Webb
Cofounder / Security Engineer
HardenedBSD

GPG Key ID: 0xFF2E67A277F8E1FA
GPG Key Fingerprint: D206 BB45 15E0 9C49 0CF9 3633 C85B 0AF8 AB23 0FB2
https://git-01.md.hardenedbsd.org/HardenedBSD/pubkeys/src/branch/master/Shawn_Webb/03A4CBEBB82EA5A67D9F3853FF2E67A277F8E1FA.pub.asc
Re: iflib/bridge kernel panic
On Sat, Oct 3, 2020 at 2:54 PM Felix Kronlage-Dammers wrote: > > Alexander Leidinger wrote on 03.10.20 17:37: > > > Quoting Kristof Provost (from Sat, 03 Oct 2020 16:06:43 > > +0200): > > >> Okay, let’s abandon that patch. It’s ugly and it doesn’t work. > >> > >> Here’s a different approach that I’m much happier with. > >> https://people.freebsd.org/~kp/0001-bridge-Call-member-interface-ioctl-without-NET_EPOCH.patch > >> > >> > >> It passes the regression tests with WITNESS and INVARIANTS enabled, > >> and a hack in the epair ioctl() handler to make it sleep (to look a > >> bit like the Intel ioctl() handler that currently trips up if_bridge). > > Works for me. > > No crash, no LOR, promisc-mode stays enabled, jails are reachable. > > indeed! I can second that. Works nicely, my machine does not panic > anymore and machines (bhyve vms) behind the bridge are reachable. I third that, it works great for me! -Dustin ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: iflib/bridge kernel panic
Alexander Leidinger wrote on 03.10.20 17:37: > Quoting Kristof Provost (from Sat, 03 Oct 2020 16:06:43 > +0200): >> Okay, let’s abandon that patch. It’s ugly and it doesn’t work. >> >> Here’s a different approach that I’m much happier with. >> https://people.freebsd.org/~kp/0001-bridge-Call-member-interface-ioctl-without-NET_EPOCH.patch >> >> >> It passes the regression tests with WITNESS and INVARIANTS enabled, >> and a hack in the epair ioctl() handler to make it sleep (to look a >> bit like the Intel ioctl() handler that currently trips up if_bridge). > Works for me. > No crash, no LOR, promisc-mode stays enabled, jails are reachable. indeed! I can second that. Works nicely, my machine does not panic anymore and machines (bhyve vms) behind the bridge are reachable. felix -- GPG/PGP: 7A0B612C / 5F4D 9B06 C240 3250 35BF 66ED 1AD3 A9B8 7A0B 612C https://hazardous.org/ - f...@hazardous.org - fkr@irc - @felixkronlage ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: iflib/bridge kernel panic
Quoting Kristof Provost (from Sat, 03 Oct 2020 16:06:43 +0200):
> Okay, let’s abandon that patch. It’s ugly and it doesn’t work.
>
> Here’s a different approach that I’m much happier with.
> https://people.freebsd.org/~kp/0001-bridge-Call-member-interface-ioctl-without-NET_EPOCH.patch
>
> It passes the regression tests with WITNESS and INVARIANTS enabled,
> and a hack in the epair ioctl() handler to make it sleep (to look a
> bit like the Intel ioctl() handler that currently trips up if_bridge).

Works for me.
No crash, no LOR, promisc-mode stays enabled, jails are reachable.

Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org netch...@freebsd.org : PGP 0x8F31830F9F2772BF
Re: iflib/bridge kernel panic
On 30 Sep 2020, at 13:52, Alexander Leidinger wrote: Quoting Kristof Provost (from Tue, 29 Sep 2020 23:20:44 +0200): On 28 Sep 2020, at 16:44, Alexander Leidinger wrote: Quoting Kristof Provost (from Mon, 28 Sep 2020 13:53:16 +0200): On 28 Sep 2020, at 12:45, Alexander Leidinger wrote: Quoting Kristof Provost (from Sun, 27 Sep 2020 17:51:32 +0200): Here’s an early version of a task queue based approach: http://people.freebsd.org/~kp/0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch That still needs to be cleaned up, but this should resolve the sleep issue and the LOR. There are some issues... seems like inside a jail I can't ping systems outside of the hardware. Bridge setup: - member jail A - member jail B - member external_if of host If I ping the router from the host, it works. If I ping from one jail to another, it works. If I ping from the jail to the IP of the external_if, it works. If I ping from a jail to the router, I do not get a response. Can you check for 'failed ifpromisc' error messages in dmesg? And verify that all bridge member interfaces are in promiscuous mode? I have a panic for you...: - startup still in progress = 22 jails in startup, somewhere after a few jails started the panic happened - tcpdump was running on the external interface - a ping to a jail IP from another system was running, the first ping went through, then it paniced First regarding your questions about promisc mode: no error, but the promisc mode is directly disabled again on all interfaces. I think I see why you had issues with the promiscuous setting. I’ve updated the patch to be even more horrific than it was before. Hmmm same behavior as before. I haven't kept the old version of the patch, so I can't compare if I somehow downloaded the old version again, or if I got the updated one... Okay, let’s abandon that patch. It’s ugly and it doesn’t work. Here’s a different approach that I’m much happier with. 
https://people.freebsd.org/~kp/0001-bridge-Call-member-interface-ioctl-without-NET_EPOCH.patch

It passes the regression tests with WITNESS and INVARIANTS enabled, and a hack in the epair ioctl() handler to make it sleep (to look a bit like the Intel ioctl() handler that currently trips up if_bridge).

Best,
Kristof
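A rough user-space illustration of the idea named in the patch title (calling the member interface ioctl without NET_EPOCH held) could look like the sketch below. It does not reproduce the patch, whose contents are not shown in this thread. All names here (`section_enter()`, `section_exit()`, `member_ioctl()`, `set_member_inside()`, `set_member_outside()`) are hypothetical, and a plain counter stands in both for the epoch section and for the complaint WITNESS would raise.

```c
#include <assert.h>

static int nosleep_depth;	/* > 0 while inside a section that must not sleep */
static int sleep_violations;	/* sleeping calls made while inside such a section */

static void
section_enter(void)
{
	nosleep_depth++;
}

static void
section_exit(void)
{
	assert(nosleep_depth > 0);
	nosleep_depth--;
}

/* Hypothetical member-interface ioctl that may sleep. */
static int
member_ioctl(int cmd)
{
	(void)cmd;
	if (nosleep_depth > 0)
		sleep_violations++;	/* a debugging kernel would warn here */
	return (0);
}

/* Problematic shape: the ioctl is issued while still inside the section. */
static int
set_member_inside(int cmd)
{
	int error;

	section_enter();
	error = member_ioctl(cmd);
	section_exit();
	return (error);
}

/* Safer shape: leave the section, make the call, re-enter for the rest. */
static int
set_member_outside(int cmd)
{
	int error;

	section_enter();
	/* ... lookups that need the section would go here ... */
	section_exit();

	error = member_ioctl(cmd);

	section_enter();
	/* ... follow-up work that needs the section would go here ... */
	section_exit();
	return (error);
}
```

Calling `set_member_inside()` bumps `sleep_violations`, while `set_member_outside()` leaves it unchanged. The cost of the second shape is that any state looked up before leaving the section has to be revalidated after re-entering it.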
Re: iflib/bridge kernel panic
On Tue, Sep 29, 2020 at 4:21 PM Kristof Provost wrote:
> On 28 Sep 2020, at 16:44, Alexander Leidinger wrote:
> > Quoting Kristof Provost (from Mon, 28 Sep 2020 13:53:16 +0200):
> > > On 28 Sep 2020, at 12:45, Alexander Leidinger wrote:
> > > > Quoting Kristof Provost (from Sun, 27 Sep 2020 17:51:32 +0200):
> > > > > Here’s an early version of a task queue based approach:
> > > > > http://people.freebsd.org/~kp/0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch
> > > > >
> > > > > That still needs to be cleaned up, but this should resolve the
> > > > > sleep issue and the LOR.
> > > >
> > > > There are some issues... seems like inside a jail I can't ping
> > > > systems outside of the hardware.

So similar to the others, kind of. Using the original https://reviews.freebsd.org/D26418 patch, everything seems to work fine. Using the newer http://people.freebsd.org/~kp/0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch patch, bhyve VMs on the bridge attached to the igb/em(5) interfaces don't pass traffic. The bhyve VMs on the bridge attached to the cxgbe(4) interfaces, however, work fine.

-Dustin
Re: iflib/bridge kernel panic
Quoting Kristof Provost (from Tue, 29 Sep 2020 23:20:44 +0200): On 28 Sep 2020, at 16:44, Alexander Leidinger wrote: Quoting Kristof Provost (from Mon, 28 Sep 2020 13:53:16 +0200): On 28 Sep 2020, at 12:45, Alexander Leidinger wrote: Quoting Kristof Provost (from Sun, 27 Sep 2020 17:51:32 +0200): Here’s an early version of a task queue based approach: http://people.freebsd.org/~kp/0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch That still needs to be cleaned up, but this should resolve the sleep issue and the LOR. There are some issues... seems like inside a jail I can't ping systems outside of the hardware. Bridge setup: - member jail A - member jail B - member external_if of host If I ping the router from the host, it works. If I ping from one jail to another, it works. If I ping from the jail to the IP of the external_if, it works. If I ping from a jail to the router, I do not get a response. Can you check for 'failed ifpromisc' error messages in dmesg? And verify that all bridge member interfaces are in promiscuous mode? I have a panic for you...: - startup still in progress = 22 jails in startup, somewhere after a few jails started the panic happened - tcpdump was running on the external interface - a ping to a jail IP from another system was running, the first ping went through, then it paniced First regarding your questions about promisc mode: no error, but the promisc mode is directly disabled again on all interfaces. I think I see why you had issues with the promiscuous setting. I’ve updated the patch to be even more horrific than it was before. Hmmm same behavior as before. I haven't kept the old version of the patch, so I can't compare if I somehow downloaded the old version again, or if I got the updated one... 
# md5 0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch
MD5 (0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch) = 9f107739e29fad5c9bb5e75e2dae7bcc

> I can’t explain the panic, and the backtrace also doesn’t appear to be
> directly related to this patch. Not sure what’s going on with that.

Then let's hope for now it is some kind of defect which is not showing up when it works as it should... we can have a look at it again in case it reproduces with the final patch.

Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org netch...@freebsd.org : PGP 0x8F31830F9F2772BF
Re: iflib/bridge kernel panic
On Tue, Sep 29, 2020 at 11:20:44PM +0200, Kristof Provost wrote:
> On 28 Sep 2020, at 16:44, Alexander Leidinger wrote:
> > Quoting Kristof Provost (from Mon, 28 Sep 2020 13:53:16 +0200):
> > > On 28 Sep 2020, at 12:45, Alexander Leidinger wrote:
> > > > Quoting Kristof Provost (from Sun, 27 Sep 2020 17:51:32 +0200):
> > > > > Here’s an early version of a task queue based approach:
> > > > > http://people.freebsd.org/~kp/0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch
> > > > >
> > > > > That still needs to be cleaned up, but this should resolve
> > > > > the sleep issue and the LOR.
> > > >
> > > > There are some issues... seems like inside a jail I can't ping
> > > > systems outside of the hardware.
> > > >
> > > > Bridge setup:
> > > >  - member jail A
> > > >  - member jail B
> > > >  - member external_if of host
> > > >
> > > > If I ping the router from the host, it works. If I ping from one
> > > > jail to another, it works. If I ping from the jail to the IP of
> > > > the external_if, it works. If I ping from a jail to the router,
> > > > I do not get a response.
> > >
> > > Can you check for 'failed ifpromisc' error messages in dmesg? And
> > > verify that all bridge member interfaces are in promiscuous mode?
> >
> > I have a panic for you...:
> >  - startup still in progress = 22 jails in startup, somewhere after a
> >    few jails started the panic happened
> >  - tcpdump was running on the external interface
> >  - a ping to a jail IP from another system was running, the first ping
> >    went through, then it paniced
> >
> > First regarding your questions about promisc mode: no error, but the
> > promisc mode is directly disabled again on all interfaces.
>
> I think I see why you had issues with the promiscuous setting. I’ve
> updated the patch to be even more horrific than it was before.
>
> I can’t explain the panic, and the backtrace also doesn’t appear to be
> directly related to this patch. Not sure what’s going on with that.

I should have time to test the new patch this weekend. ${LIFE} is keeping me busy the past few weeks. I'm gonna add an event in my calendar to remind me to test the patch. heh.

Thanks,

--
Shawn Webb
Cofounder / Security Engineer
HardenedBSD

GPG Key ID: 0xFF2E67A277F8E1FA
GPG Key Fingerprint: D206 BB45 15E0 9C49 0CF9 3633 C85B 0AF8 AB23 0FB2
https://git-01.md.hardenedbsd.org/HardenedBSD/pubkeys/src/branch/master/Shawn_Webb/03A4CBEBB82EA5A67D9F3853FF2E67A277F8E1FA.pub.asc
Re: iflib/bridge kernel panic
On 28 Sep 2020, at 16:44, Alexander Leidinger wrote:
> Quoting Kristof Provost (from Mon, 28 Sep 2020 13:53:16 +0200):
> > On 28 Sep 2020, at 12:45, Alexander Leidinger wrote:
> > > Quoting Kristof Provost (from Sun, 27 Sep 2020 17:51:32 +0200):
> > > > Here’s an early version of a task queue based approach:
> > > > http://people.freebsd.org/~kp/0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch
> > > >
> > > > That still needs to be cleaned up, but this should resolve
> > > > the sleep issue and the LOR.
> > >
> > > There are some issues... seems like inside a jail I can't ping
> > > systems outside of the hardware.
> > >
> > > Bridge setup:
> > >  - member jail A
> > >  - member jail B
> > >  - member external_if of host
> > >
> > > If I ping the router from the host, it works. If I ping from one
> > > jail to another, it works. If I ping from the jail to the IP of
> > > the external_if, it works. If I ping from a jail to the router,
> > > I do not get a response.
> >
> > Can you check for 'failed ifpromisc' error messages in dmesg? And
> > verify that all bridge member interfaces are in promiscuous mode?
>
> I have a panic for you...:
>  - startup still in progress = 22 jails in startup, somewhere after a
>    few jails started the panic happened
>  - tcpdump was running on the external interface
>  - a ping to a jail IP from another system was running, the first ping
>    went through, then it paniced
>
> First regarding your questions about promisc mode: no error, but the
> promisc mode is directly disabled again on all interfaces.

I think I see why you had issues with the promiscuous setting. I’ve updated the patch to be even more horrific than it was before.

I can’t explain the panic, and the backtrace also doesn’t appear to be directly related to this patch. Not sure what’s going on with that.

Kristof
Re: iflib/bridge kernel panic
Quoting Kristof Provost (from Mon, 28 Sep 2020 13:53:16 +0200):
> On 28 Sep 2020, at 12:45, Alexander Leidinger wrote:
> > Quoting Kristof Provost (from Sun, 27 Sep 2020 17:51:32 +0200):
> > > Here’s an early version of a task queue based approach:
> > > http://people.freebsd.org/~kp/0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch
> > >
> > > That still needs to be cleaned up, but this should resolve
> > > the sleep issue and the LOR.
> >
> > There are some issues... seems like inside a jail I can't ping
> > systems outside of the hardware.
> >
> > Bridge setup:
> >  - member jail A
> >  - member jail B
> >  - member external_if of host
> >
> > If I ping the router from the host, it works. If I ping from one
> > jail to another, it works. If I ping from the jail to the IP of
> > the external_if, it works. If I ping from a jail to the router,
> > I do not get a response.
>
> Can you check for 'failed ifpromisc' error messages in dmesg? And
> verify that all bridge member interfaces are in promiscuous mode?

I have a panic for you...:
 - startup still in progress = 22 jails in startup, somewhere after a few jails started the panic happened
 - tcpdump was running on the external interface
 - a ping to a jail IP from another system was running, the first ping went through, then it panicked

First regarding your questions about promisc mode: no error, but the promisc mode is directly disabled again on all interfaces.

Data (external_if = igb0, jail epairs are j_X_Yif with X the ID of the jail and Y either h like host-side or j like jail-side):
---snip---
Host:
# ifconfig -a
igb0: flags=8863 metric 0 mtu 1500
        options=4a520b9
        ether [...]:a4
        inet 192.168.1.x netmask 0xff00 broadcast 192.168.1.255
        inet6 fe80::[...]a4%igb0 prefixlen 64 scopeid 0x1
        inet6 fd73:[...] prefixlen 64
        inet6 2003:[...] prefixlen 64 autoconf
        inet6 fd73:[...] prefixlen 64 autoconf
        media: Ethernet autoselect (1000baseT )
        status: active
        nd6 options=23
igb1: flags=8822 metric 0 mtu 1500
        options=4e527bb
        ether [...]:a5
        media: Ethernet autoselect
        status: no carrier
        nd6 options=29
lo0: flags=8049 metric 0 mtu 16384
        options=680003
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet 127.0.0.1 netmask 0xff00
        groups: lo
        nd6 options=21
vswitch0: flags=8843 metric 0 mtu 1500
        ether [...]:a3
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto stp-rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: j_weather_hif flags=143
                ifmaxaddr 0 port 9 priority 128 path cost 2000
        member: j_web_hif flags=143
                ifmaxaddr 0 port 8 priority 128 path cost 2000
        member: j_commit_hif flags=143
                ifmaxaddr 0 port 7 priority 128 path cost 2000
        member: j_video_hif flags=143
                ifmaxaddr 0 port 6 priority 128 path cost 2000
        member: j_dns_hif flags=143
                ifmaxaddr 0 port 5 priority 128 path cost 2000
        member: igb0 flags=143
                ifmaxaddr 0 port 1 priority 128 path cost 2
        groups: bridge
        nd6 options=9
j_dns_hif: flags=8843 metric 0 mtu 1500
        options=8
        ether [...]:0a
        hwaddr [...]:0a
        inet6 fe80::[...]0a%j_dns_hif prefixlen 64 scopeid 0x5
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T )
        status: active
        nd6 options=21
[... some more jail interfaces ...]

# dmesg | grep promis
igb0: promiscuous mode enabled
igb0: promiscuous mode disabled
j_dns_hif: promiscuous mode enabled
j_dns_hif: promiscuous mode disabled
[... some more like this ...]

# jexec 2 ifconfig -a
lo0: flags=8049 metric 0 mtu 16384
        options=680003
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
        inet 127.0.0.1 netmask 0xff00
        groups: lo
        nd6 options=21
j_dns_jif: flags=8843 metric 0 mtu 1500
        options=8
        ether [...]:0b
        hwaddr [...]:0b
        inet 192.168.1.y netmask 0xff00 broadcast 192.168.1.255
        inet6 fe80::[...]0b%j_dns_jif prefixlen 64 scopeid 0x2
        inet6 fd73:[...]:y prefixlen 64
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T )
        status: active
        nd6 options=21
---snip---

And here the backtrace of the panic:
---snip---
panic: if_setflag: decrement non-positive refcount 0 for flag 256
cpuid = 4
time = 1601300532
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe0378ea3920
vpanic() at vpanic+0x182/frame 0xfe0378ea3970
panic() at panic+0x43/frame 0xfe0378ea39d0
if_setflag() at if_setflag+0x137/frame 0xfe0378ea3a30
ifpromisc() at ifpromisc+0x2a/frame 0xfe0378ea3a60
bpf_detachd_locked() at bpf_detachd_locked+0x280/frame 0xfe0378ea3ab0
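
For readers unfamiliar with the panic string: flag 256 is 0x100, which is IFF_PROMISC, and the message says promiscuous mode was released more often than it was requested. The sketch below is a simplified user-space model of that kind of reference counting, not the kernel's `if_setflag()`. `struct fake_ifnet` and `fake_ifpromisc()` are hypothetical names, and the model returns -1 at the point where the kernel would panic. It does not diagnose which caller issued the extra release in this report.

```c
#include <assert.h>
#include <stdio.h>

#define IFF_PROMISC 0x100	/* decimal 256, the flag named in the panic message */

/* Hypothetical stand-in for the kernel's interface structure. */
struct fake_ifnet {
	int flags;	/* currently active interface flags */
	int pcount;	/* outstanding promiscuous-mode requests */
};

/*
 * Request (onswitch = 1) or release (onswitch = 0) promiscuous mode.
 * Returns 0 on success, or -1 where a release arrives without a
 * matching request (the situation the panic message describes).
 */
static int
fake_ifpromisc(struct fake_ifnet *ifp, int onswitch)
{
	if (onswitch) {
		/* First requester actually switches the flag on. */
		if (ifp->pcount++ == 0)
			ifp->flags |= IFF_PROMISC;
		return (0);
	}
	if (ifp->pcount <= 0) {
		fprintf(stderr,
		    "decrement non-positive refcount %d for flag %d\n",
		    ifp->pcount, IFF_PROMISC);
		return (-1);
	}
	/* Last requester to leave switches the flag off. */
	if (--ifp->pcount == 0)
		ifp->flags &= ~IFF_PROMISC;
	return (0);
}
```

With two requesters (say, a bridge and a packet capture), the flag stays set until both have released it. A third release on a count of zero is the unbalanced case that the model rejects.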
Re: iflib/bridge kernel panic
Quoting Kristof Provost (from Sun, 27 Sep 2020 17:51:32 +0200): On 21 Sep 2020, at 14:16, Shawn Webb wrote: On Mon, Sep 21, 2020 at 09:57:40AM +0200, Kristof Provost wrote: On 21 Sep 2020, at 2:52, Shawn Webb wrote: From latest HEAD on a Dell Precision 7550 laptop: https://gist.github.com/lattera/a0803f31f58bcf8ead51ac1ebbc447e2 The last working boot environment was 14 Aug 2020. If I get some time to bisect commits, I'll try to figure out the culprit. Try https://reviews.freebsd.org/D26418 That seems to fix the kernel panic. dmesg gets spammed with a freak ton of these LOR messages now: Here’s an early version of a task queue based approach: http://people.freebsd.org/~kp/0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch That still needs to be cleaned up, but this should resolve the sleep issue and the LOR. There are some issues... seems like inside a jail I can't ping systems outside of the hardware. Bridge setup: - member jail A - member jail B - member external_if of host If I ping the router from the host, it works. If I ping from one jail to another, it works. If I ping from the jail to the IP of the external_if, it works. If I ping from a jail to the router, I do not get a response. Bye, Alexander. -- http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF http://www.FreeBSD.orgnetch...@freebsd.org : PGP 0x8F31830F9F2772BF pgpZ4OpaNfO4d.pgp Description: Digitale PGP-Signatur
Re: iflib/bridge kernel panic
On 28 Sep 2020, at 12:45, Alexander Leidinger wrote: Quoting Kristof Provost (from Sun, 27 Sep 2020 17:51:32 +0200): Here’s an early version of a task queue based approach: http://people.freebsd.org/~kp/0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch That still needs to be cleaned up, but this should resolve the sleep issue and the LOR. There are some issues... seems like inside a jail I can't ping systems outside of the hardware. Bridge setup: - member jail A - member jail B - member external_if of host If I ping the router from the host, it works. If I ping from one jail to another, it works. If I ping from the jail to the IP of the external_if, it works. If I ping from a jail to the router, I do not get a response. Can you check for 'failed ifpromisc' error messages in dmesg? And verify that all bridge member interfaces are in promiscuous mode? Kristof ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: iflib/bridge kernel panic
On 21 Sep 2020, at 14:16, Shawn Webb wrote: On Mon, Sep 21, 2020 at 09:57:40AM +0200, Kristof Provost wrote: On 21 Sep 2020, at 2:52, Shawn Webb wrote: From latest HEAD on a Dell Precision 7550 laptop: https://gist.github.com/lattera/a0803f31f58bcf8ead51ac1ebbc447e2 The last working boot environment was 14 Aug 2020. If I get some time to bisect commits, I'll try to figure out the culprit. Try https://reviews.freebsd.org/D26418 That seems to fix the kernel panic. dmesg gets spammed with a freak ton of these LOR messages now: Here’s an early version of a task queue based approach: http://people.freebsd.org/~kp/0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch That still needs to be cleaned up, but this should resolve the sleep issue and the LOR. Best regards, Kristof ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
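
The general shape of a task queue based approach can be sketched in user space as follows: work that may sleep is only recorded while the non-sleepable lock is held, and it runs later once the lock has been dropped. This is an illustration of the pattern, not the patch linked in the message, and it does not use the kernel's taskqueue(9) API. `struct fake_bridge`, `defer_task()`, `run_deferred()` and `sleepy_member_ioctl()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 8

typedef void (*task_fn)(void *arg);

struct deferred_task {
	task_fn	 fn;
	void	*arg;
};

/* Hypothetical stand-in for a bridge softc with a pending-work list. */
struct fake_bridge {
	int			lock_held;	/* 1 while the non-sleepable lock is held */
	size_t			ntasks;		/* number of queued tasks */
	struct deferred_task	tasks[MAX_TASKS];
	int			ioctls_done;	/* handlers that have run */
	int			ioctls_under_lock; /* handlers run with the lock held */
};

/* Record work for later; safe to call with the lock held. */
static int
defer_task(struct fake_bridge *sc, task_fn fn, void *arg)
{
	if (sc->ntasks == MAX_TASKS)
		return (-1);
	sc->tasks[sc->ntasks].fn = fn;
	sc->tasks[sc->ntasks].arg = arg;
	sc->ntasks++;
	return (0);
}

/* Run all recorded work; must be called without the lock. */
static size_t
run_deferred(struct fake_bridge *sc)
{
	size_t i, n = sc->ntasks;

	assert(!sc->lock_held);
	for (i = 0; i < n; i++)
		sc->tasks[i].fn(sc->tasks[i].arg);
	sc->ntasks = 0;
	return (n);
}

/* Stand-in for an if_ioctl handler that may sleep. */
static void
sleepy_member_ioctl(void *arg)
{
	struct fake_bridge *sc = arg;

	if (sc->lock_held)
		sc->ioctls_under_lock++;
	sc->ioctls_done++;
}
```

The trade-off is that the caller no longer sees the result of the deferred call synchronously, which makes error reporting and ordering harder.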
Re: iflib/bridge kernel panic
Hi!

There is a serious issue in the kernel related to network interfaces. See my message "speedtest.net in multi connections mode causes the FreeBSD 13-CURRENT router to crash" from 28 Aug 2020.

I noticed that the kernel panics every time users on client computers run speedtest.net in a browser in multi connections mode. If my external network interface uses a VLAN, the panic occurs when the uplink runs at 100 Mbit/s. Without a VLAN, the speed test passes without any problems on a 100 Mbit/s link, but the kernel panics every time on a 1 Gbit/s uplink. During the crash, the console screen goes blank and the server (router) stops responding to the keyboard.

Can anyone run this test on their machine?

Sergei.

From: xt
Sent: Friday, September 25, 2020 8:46 PM
To: Sergey V. Dyatko; Kristof Provost
Cc: FreeBSD Current
Subject: Re: iflib/bridge kernel panic

Sergey V. Dyatko wrote:
> On Mon, 21 Sep 2020 09:57:40 +0200 "Kristof Provost" wrote:
> > On 21 Sep 2020, at 2:52, Shawn Webb wrote:
> > > From latest HEAD on a Dell Precision 7550 laptop:
> > >
> > > https://gist.github.com/lattera/a0803f31f58bcf8ead51ac1ebbc447e2
> > >
> > > The last working boot environment was 14 Aug 2020. If I get some time to
> > > bisect commits, I'll try to figure out the culprit.
> >
> > Try https://reviews.freebsd.org/D26418
> >
> > Best regards,
> > Kristof
>
> I'm not sure, but doesn't this panic have the same root as mine? Sorry, but I haven't text console and can post only screenshot[s] from IP-KVM
> https://gyazo.com/fee41c5267e9fc543d43901e498b7c94
>
> rc.conf have something like:
> clonned_interfaces="lagg0 vlan101"
> ifconfig_lagg0="laggproto lacp laggport em0 laggport em1 x.x.x.x/mask"
> ifconfig_vlan101="vlan 101 vlandev lagg0 192.168.1.29/24"
>
> without VLAN part all works fine.
>
> Installed from FreeBSD-13.0-CURRENT-amd64-20200924-3c514403bef-disc1.iso

Yes, same panic.
Re: iflib/bridge kernel panic
Sergey V. Dyatko wrote: On Mon, 21 Sep 2020 09:57:40 +0200 "Kristof Provost" wrote: On 21 Sep 2020, at 2:52, Shawn Webb wrote: From latest HEAD on a Dell Precision 7550 laptop: https://gist.github.com/lattera/a0803f31f58bcf8ead51ac1ebbc447e2 The last working boot environment was 14 Aug 2020. If I get some time to bisect commits, I'll try to figure out the culprit. Try https://reviews.freebsd.org/D26418 Best regards, Kristof I'm not sure, but doesn't this panic have the same root as mine? Sorry, but I haven't text console and can post only screenshot[s] from IP-KVM https://gyazo.com/fee41c5267e9fc543d43901e498b7c94 rc.conf have something like: clonned_interfaces="lagg0 vlan101" ifconfig_lagg0="laggproto lacp laggport em0 laggport em1 x.x.x.x/mask" ifconfig_vlan101="vlan 101 vlandev lagg0 192.168.1.29/24" without VLAN part all works fine. Installed from FreeBSD-13.0-CURRENT-amd64-20200924-3c514403bef-disc1.iso Yes, same panic. ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: iflib/bridge kernel panic
On Mon, 21 Sep 2020 09:57:40 +0200 "Kristof Provost" wrote:
> On 21 Sep 2020, at 2:52, Shawn Webb wrote:
> > From latest HEAD on a Dell Precision 7550 laptop:
> >
> > https://gist.github.com/lattera/a0803f31f58bcf8ead51ac1ebbc447e2
> >
> > The last working boot environment was 14 Aug 2020. If I get some time to
> > bisect commits, I'll try to figure out the culprit.
>
> Try https://reviews.freebsd.org/D26418
>
> Best regards,
> Kristof

I'm not sure, but doesn't this panic have the same root cause as mine? Sorry, I don't have a text console and can only post screenshot[s] from the IP-KVM:
https://gyazo.com/fee41c5267e9fc543d43901e498b7c94

rc.conf has something like:

cloned_interfaces="lagg0 vlan101"
ifconfig_lagg0="laggproto lacp laggport em0 laggport em1 x.x.x.x/mask"
ifconfig_vlan101="vlan 101 vlandev lagg0 192.168.1.29/24"

Without the VLAN part everything works fine.

Installed from FreeBSD-13.0-CURRENT-amd64-20200924-3c514403bef-disc1.iso

--
wbr, Sergey
Re: iflib/bridge kernel panic
On 23 Sep 2020, at 19:37, xto...@hotmail.com wrote: > Kristof Provost wrote: >> On 21 Sep 2020, at 2:52, Shawn Webb wrote: From latest HEAD on a Dell Precision 7550 laptop: >>> >>> https://gist.github.com/lattera/a0803f31f58bcf8ead51ac1ebbc447e2 >>> >>> The last working boot environment was 14 Aug 2020. If I get some time to >>> bisect commits, I'll try to figure out the culprit. >>> >> Try https://reviews.freebsd.org/D26418 > > Anything stopping this from being integrated? Yes, it’s not correct. I’ve got this on my todo list. I think I know how to fix it better. Best regards, Kristof ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: iflib/bridge kernel panic
Kristof Provost wrote: On 21 Sep 2020, at 2:52, Shawn Webb wrote: From latest HEAD on a Dell Precision 7550 laptop: https://gist.github.com/lattera/a0803f31f58bcf8ead51ac1ebbc447e2 The last working boot environment was 14 Aug 2020. If I get some time to bisect commits, I'll try to figure out the culprit. Try https://reviews.freebsd.org/D26418 Anything stopping this from being integrated? ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: iflib/bridge kernel panic
On Mon, Sep 21, 2020 at 09:57:40AM +0200, Kristof Provost wrote:
> On 21 Sep 2020, at 2:52, Shawn Webb wrote:
> > From latest HEAD on a Dell Precision 7550 laptop:
> >
> > https://gist.github.com/lattera/a0803f31f58bcf8ead51ac1ebbc447e2
> >
> > The last working boot environment was 14 Aug 2020. If I get some time to
> > bisect commits, I'll try to figure out the culprit.
>
> Try https://reviews.freebsd.org/D26418

That seems to fix the kernel panic. dmesg gets spammed with a freak ton of these LOR messages now:

BEGIN LOG 01
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] Sleeping on "e1000_delay" with the following non-sleepable locks held:
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] exclusive sleep mutex if_bridge (if_bridge) r = 0 (0xf8001ea07218) locked @ /usr/src/sys/net/if_bridge.c:827
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] stack backtrace:
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] #0 0x80c6c4a1 at witness_debugger+0x71
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] #1 0x80c6d5bd at witness_warn+0x40d
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] #2 0x80c09b8b at _sleep+0x5b
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] #3 0x80c0a38e at pause_sbt+0xfe
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] #4 0x80652b2d at e1000_write_phy_reg_mdic+0xed
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] #5 0x80656bde at __e1000_write_phy_reg_hv+0x1ce
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] #6 0x80640ea9 at e1000_lv_jumbo_workaround_ich8lan+0x799
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] #7 0x8062329e at em_if_init+0x151e
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] #8 0x80d347a9 at iflib_init_locked+0x2d9
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] #9 0x80d36b08 at iflib_if_ioctl+0x1b8
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] #10 0x83c582ac at bridge_set_ifcap+0x8c
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] #11 0x83c544c8 at bridge_ioctl_add+0x4c8
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] #12 0x83c560ff at bridge_ioctl+0x2df
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] #13 0x80d9f1a1 at in_control+0x341
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] #14 0x80d16266 at ifioctl+0x766
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] #15 0x80c715a0 at kern_ioctl+0x290
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] #16 0x80c71267 at sys_ioctl+0x127
Sep 21 08:08:28 hbsd-laptop-02 kernel: [25] #17 0x8122bf4c at amd64_syscall+0x14c
END LOG 01

BEGIN LOG 02
Sep 21 08:08:28 hbsd-laptop-02 kernel: [29] lock order reversal: (sleepable after non-sleepable)
Sep 21 08:08:28 hbsd-laptop-02 kernel: [29] 1st 0xf800374616a0 ure0 (ure0, sleep mutex) @ /usr/src/sys/dev/usb/usb_request.c:714
Sep 21 08:08:28 hbsd-laptop-02 kernel: [29] 2nd 0x81eb1ab8 sysctl lock (sysctl lock, sleepable rm) @ /usr/src/sys/kern/kern_sysctl.c:837
Sep 21 08:08:28 hbsd-laptop-02 kernel: [29] lock order ure0 -> sysctl lock attempted at:
Sep 21 08:08:28 hbsd-laptop-02 kernel: [29] #0 0x80c6c1dc at witness_checkorder+0xdcc
Sep 21 08:08:28 hbsd-laptop-02 kernel: [29] #1 0x80bf76bb at _rm_wlock_debug+0x6b
Sep 21 08:08:28 hbsd-laptop-02 kernel: [29] #2 0x80c0c7a6 at sysctl_add_oid+0x46
Sep 21 08:08:28 hbsd-laptop-02 kernel: [29] #3 0x83c64ea1 at ure_attach_post+0x1a91
Sep 21 08:08:28 hbsd-laptop-02 kernel: [29] #4 0x83c6a1af at ue_attach_post_task+0x2f
Sep 21 08:08:28 hbsd-laptop-02 kernel: [29] #5 0x80a2b749 at usb_process+0xf9
Sep 21 08:08:28 hbsd-laptop-02 kernel: [29] #6 0x80bb9fe5 at fork_exit+0x85
Sep 21 08:08:28 hbsd-laptop-02 kernel: [29] #7 0x81200a9e at fork_trampoline+0xe
END LOG 02

At work, I have two ethernet interfaces: the onboard em0 and a usb ethernet dongle.

Thanks,

--
Shawn Webb
Cofounder / Security Engineer
HardenedBSD

GPG Key ID: 0xFF2E67A277F8E1FA
GPG Key Fingerprint: D206 BB45 15E0 9C49 0CF9 3633 C85B 0AF8 AB23 0FB2
https://git-01.md.hardenedbsd.org/HardenedBSD/pubkeys/src/branch/master/Shawn_Webb/03A4CBEBB82EA5A67D9F3853FF2E67A277F8E1FA.pub.asc
Re: iflib/bridge kernel panic
On 21 Sep 2020, at 2:52, Shawn Webb wrote: >> From latest HEAD on a Dell Precision 7550 laptop: > > https://gist.github.com/lattera/a0803f31f58bcf8ead51ac1ebbc447e2 > > The last working boot environment was 14 Aug 2020. If I get some time to > bisect commits, I'll try to figure out the culprit. > Try https://reviews.freebsd.org/D26418 Best regards, Kristof ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: iflib/bridge kernel panic
Hi Shawn, Is it possible to reproduce the issue on FreeBSD? The excerpt you've linked to is not on FreeBSD. Conrad On Sun, Sep 20, 2020 at 5:53 PM Shawn Webb wrote: > > From latest HEAD on a Dell Precision 7550 laptop: > > https://gist.github.com/lattera/a0803f31f58bcf8ead51ac1ebbc447e2 > > The last working boot environment was 14 Aug 2020. If I get some time to > bisect commits, I'll try to figure out the culprit. > > Thanks, > > Shawn Webb > > (Sorry for the brevity. Only partially working system due to above > breakage.) > ___ > freebsd-current@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org" ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"