Re: iflib/bridge kernel panic

2020-09-28 Thread Alexander Leidinger


Quoting Kristof Provost (from Mon, 28 Sep 2020 13:53:16 +0200):



> On 28 Sep 2020, at 12:45, Alexander Leidinger wrote:
>> Quoting Kristof Provost (from Sun, 27 Sep 2020 17:51:32 +0200):
>>> Here’s an early version of a task queue based approach:
>>> http://people.freebsd.org/~kp/0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch
>>>
>>> That still needs to be cleaned up, but this should resolve the
>>> sleep issue and the LOR.
>>
>> There are some issues... seems like inside a jail I can't ping
>> systems outside of the hardware.
>>
>> Bridge setup:
>>    - member jail A
>>    - member jail B
>>    - member external_if of host
>>
>> If I ping the router from the host, it works. If I ping from one
>> jail to another, it works. If I ping from the jail to the IP of the
>> external_if, it works. If I ping from a jail to the router, I do
>> not get a response.
>
> Can you check for 'failed ifpromisc' error messages in dmesg? And
> verify that all bridge member interfaces are in promiscuous mode?


I have a panic for you:
 - startup was still in progress: 22 jails were being started, and the
   panic happened somewhere after the first few of them had started
 - tcpdump was running on the external interface
 - a ping to a jail IP from another system was running; the first
   ping went through, then the system panicked


First, regarding your questions about promiscuous mode: there are no
error messages, but promiscuous mode is disabled again immediately
after being enabled, on all interfaces.


Data (external_if = igb0; the jail epairs are named j_X_Yif, with X the ID
of the jail and Y either h for the host side or j for the jail side):

---snip---
Host:

# ifconfig -a
igb0: flags=8863 metric 0 mtu 1500
options=4a520b9
ether [...]:a4
inet 192.168.1.x netmask 0xff00 broadcast 192.168.1.255
inet6 fe80::[...]a4%igb0 prefixlen 64 scopeid 0x1
inet6 fd73:[...] prefixlen 64
inet6 2003:[...] prefixlen 64 autoconf
inet6 fd73:[...] prefixlen 64 autoconf
media: Ethernet autoselect (1000baseT )
status: active
nd6 options=23
igb1: flags=8822 metric 0 mtu 1500
options=4e527bb
ether [...]:a5
media: Ethernet autoselect
status: no carrier
nd6 options=29
lo0: flags=8049 metric 0 mtu 16384
options=680003
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
inet 127.0.0.1 netmask 0xff00
groups: lo
nd6 options=21
vswitch0: flags=8843 metric 0 mtu 1500
ether [...]:a3
id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
maxage 20 holdcnt 6 proto stp-rstp maxaddr 2000 timeout 1200
root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
member: j_weather_hif flags=143
ifmaxaddr 0 port 9 priority 128 path cost 2000
member: j_web_hif flags=143
ifmaxaddr 0 port 8 priority 128 path cost 2000
member: j_commit_hif flags=143
ifmaxaddr 0 port 7 priority 128 path cost 2000
member: j_video_hif flags=143
ifmaxaddr 0 port 6 priority 128 path cost 2000
member: j_dns_hif flags=143
ifmaxaddr 0 port 5 priority 128 path cost 2000
member: igb0 flags=143
ifmaxaddr 0 port 1 priority 128 path cost 2
groups: bridge
nd6 options=9
j_dns_hif: flags=8843 metric 0 mtu 1500
options=8
ether [...]:0a
hwaddr [...]:0a
inet6 fe80::[...]0a%j_dns_hif prefixlen 64 scopeid 0x5
groups: epair
media: Ethernet 10Gbase-T (10Gbase-T )
status: active
nd6 options=21
[... some more jail interfaces ...]

# dmesg | grep promis
igb0: promiscuous mode enabled
igb0: promiscuous mode disabled
j_dns_hif: promiscuous mode enabled
j_dns_hif: promiscuous mode disabled
[... some more like this ...]

# jexec 2 ifconfig -a
lo0: flags=8049 metric 0 mtu 16384
options=680003
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
inet 127.0.0.1 netmask 0xff00
groups: lo
nd6 options=21
j_dns_jif: flags=8843 metric 0 mtu 1500
options=8
ether [...]:0b
hwaddr [...]:0b
inet 192.168.1.y netmask 0xff00 broadcast 192.168.1.255
inet6 fe80::[...]0b%j_dns_jif prefixlen 64 scopeid 0x2
inet6 fd73:[...]:y prefixlen 64
groups: epair
media: Ethernet 10Gbase-T (10Gbase-T )
status: active
nd6 options=21
---snip---
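
The promiscuous-mode state can be cross-checked from the data above in two ways: the hex `flags=` value printed by ifconfig, and the last enable/disable message logged per interface in dmesg. The following Python is only a rough sketch of that check. The helper names are made up for illustration; the bit value 0x100 is IFF_PROMISC from FreeBSD's net/if.h, and the sample values are copied from the output above.

```python
import re

IFF_PROMISC = 0x100  # IFF_PROMISC in FreeBSD's <net/if.h>


def has_promisc(flags_hex):
    """True if the hex flags value shown by ifconfig has IFF_PROMISC set."""
    return bool(int(flags_hex, 16) & IFF_PROMISC)


def final_promisc_state(dmesg_lines):
    """Map interface name -> last promiscuous state logged for it."""
    pattern = re.compile(r"^(\S+): promiscuous mode (enabled|disabled)$")
    state = {}
    for line in dmesg_lines:
        match = pattern.match(line.strip())
        if match:
            # Later messages overwrite earlier ones, so the last one wins.
            state[match.group(1)] = match.group(2) == "enabled"
    return state


# Values copied from the ifconfig and dmesg output above.
flags = {"igb0": "8863", "vswitch0": "8843", "j_dns_hif": "8843"}
dmesg = [
    "igb0: promiscuous mode enabled",
    "igb0: promiscuous mode disabled",
    "j_dns_hif: promiscuous mode enabled",
    "j_dns_hif: promiscuous mode disabled",
]

for name, value in flags.items():
    print(name, "PROMISC" if has_promisc(value) else "not promiscuous")
print(final_promisc_state(dmesg))
```

None of the three flags values has the 0x100 bit set, which matches the dmesg messages: every interface ends up with promiscuous mode disabled.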

And here the backtrace of the panic:
---snip---
panic: if_setflag: decrement non-positive refcount 0 for flag 256
cpuid = 4
time = 1601300532
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe0378ea3920
vpanic() at vpanic+0x182/frame 0xfe0378ea3970
panic() at panic+0x43/frame 0xfe0378ea39d0
if_setflag() at if_setflag+0x137/frame 0xfe0378ea3a30
ifpromisc() at ifpromisc+0x2a/frame 0xfe0378ea3a60
bpf_detachd_locked() at bpf_detachd_locked+0x280/frame 0xfe0378ea3ab0
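
For readers unfamiliar with the panic message: flag 256 is 0x100, i.e. IFF_PROMISC, and promiscuous mode is reference-counted per interface, so every disable has to be matched by an earlier enable. The Python below is a simplified userland model of that accounting, not the kernel code. The class and attribute names are invented for illustration, and the "extra disable" in the scenario is purely hypothetical, since the backtrace alone does not show which caller left the count unbalanced.

```python
IFF_PROMISC = 0x100  # 256, the flag named in the panic message


class Panic(Exception):
    """Stands in for a kernel panic in this userland model."""


class Interface:
    """Toy model of an interface with a promiscuous-mode refcount."""

    def __init__(self, name):
        self.name = name
        self.flags = 0
        self.pcount = 0  # outstanding promiscuous-mode requests

    def ifpromisc(self, on):
        """Reference-counted toggle of IFF_PROMISC (simplified)."""
        if on:
            self.pcount += 1
            if self.pcount == 1:
                self.flags |= IFF_PROMISC
        else:
            if self.pcount <= 0:
                # Models the sanity check that fired in the backtrace.
                raise Panic(
                    "if_setflag: decrement non-positive refcount "
                    f"{self.pcount} for flag {IFF_PROMISC}"
                )
            self.pcount -= 1
            if self.pcount == 0:
                self.flags &= ~IFF_PROMISC


igb0 = Interface("igb0")
igb0.ifpromisc(True)       # e.g. a bpf consumer such as tcpdump attaches
igb0.ifpromisc(False)      # hypothetical extra, unbalanced disable
try:
    igb0.ifpromisc(False)  # the bpf detach now finds the count at 0
except Panic as exc:
    print(exc)
```

In this model the last call reproduces the wording of the panic string above, because the count has already dropped to 0 by the time the detach path asks for promiscuous mode to be switched off.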

Re: iflib/bridge kernel panic

2020-09-28 Thread Alexander Leidinger


Quoting Kristof Provost (from Sun, 27 Sep 2020 17:51:32 +0200):



> On 21 Sep 2020, at 14:16, Shawn Webb wrote:
>> On Mon, Sep 21, 2020 at 09:57:40AM +0200, Kristof Provost wrote:
>>> On 21 Sep 2020, at 2:52, Shawn Webb wrote:
>>>> From latest HEAD on a Dell Precision 7550 laptop:
>>>>
>>>> https://gist.github.com/lattera/a0803f31f58bcf8ead51ac1ebbc447e2
>>>>
>>>> The last working boot environment was 14 Aug 2020. If I get some time to
>>>> bisect commits, I'll try to figure out the culprit.
>>>
>>> Try https://reviews.freebsd.org/D26418
>>
>> That seems to fix the kernel panic. dmesg gets spammed with a freak
>> ton of these LOR messages now:
>
> Here’s an early version of a task queue based approach:
> http://people.freebsd.org/~kp/0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch
>
> That still needs to be cleaned up, but this should resolve the sleep
> issue and the LOR.


There are some issues... seems like inside a jail I can't ping systems  
outside of the hardware.


Bridge setup:
- member jail A
- member jail B
- member external_if of host

If I ping the router from the host, it works. If I ping from one jail  
to another, it works. If I ping from the jail to the IP of the  
external_if, it works. If I ping from a jail to the router, I do not  
get a response.


Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org netch...@freebsd.org : PGP 0x8F31830F9F2772BF




Re: iflib/bridge kernel panic

2020-09-28 Thread Kristof Provost

On 28 Sep 2020, at 12:45, Alexander Leidinger wrote:
> Quoting Kristof Provost (from Sun, 27 Sep 2020 17:51:32 +0200):
>> Here’s an early version of a task queue based approach:
>> http://people.freebsd.org/~kp/0001-bridge-Cope-with-if_ioctl-s-that-sleep.patch
>>
>> That still needs to be cleaned up, but this should resolve the sleep
>> issue and the LOR.
>
> There are some issues... seems like inside a jail I can't ping systems
> outside of the hardware.
>
> Bridge setup:
> - member jail A
> - member jail B
> - member external_if of host
>
> If I ping the router from the host, it works. If I ping from one jail
> to another, it works. If I ping from the jail to the IP of the
> external_if, it works. If I ping from a jail to the router, I do not
> get a response.

Can you check for 'failed ifpromisc' error messages in dmesg? And verify
that all bridge member interfaces are in promiscuous mode?
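
As background on why promiscuous mode matters for exactly this symptom: a NIC that is not promiscuous normally only accepts unicast frames addressed to its own MAC, so replies addressed to a jail's epair MAC would be dropped at the physical interface before the bridge sees them. The Python below is a toy model of that receive filter, offered as one plausible reading of the ping results, not as a diagnosis. The function name and both MAC addresses are invented placeholders, and broadcast/multicast handling is left out.

```python
def nic_accepts(dst_mac, nic_mac, promiscuous):
    """Unicast receive filter of a NIC (broadcast/multicast ignored)."""
    return promiscuous or dst_mac.lower() == nic_mac.lower()


# Placeholder addresses, not taken from the thread.
IGB0_MAC = "02:00:00:00:00:a4"  # physical bridge member on the host
JAIL_MAC = "02:00:00:00:00:0b"  # jail-side epair interface

# Router replies to the host itself: accepted even without promiscuous mode.
print("reply to host, promisc off:", nic_accepts(IGB0_MAC, IGB0_MAC, False))

# Router replies to a jail: only accepted if the NIC is promiscuous.
print("reply to jail, promisc off:", nic_accepts(JAIL_MAC, IGB0_MAC, False))
print("reply to jail, promisc on: ", nic_accepts(JAIL_MAC, IGB0_MAC, True))
```

Jail-to-jail traffic and jail-to-host traffic never pass through the physical NIC's receive filter, which would fit the report that only jail-to-router pings go unanswered.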


Kristof
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: clang build buggy code with certain CPUTYPE setting

2020-09-28 Thread Andriy Gapon
On 26/09/2020 22:55, Marek Zarychta wrote:
> Thank you for the information and for the fix. Sadly I must admit it
> doesn't work for me. I have tried two builds with fresh sources today to
> be certain and it looks like the bug is still present on FreeBSD
> 13-CURRENT r366186. Either the upstream fixed it only partially or it is
> another bug. As a workaround, I will build worlds without
> CPUTYPE?=amdfam10 for a while. I hope the problem will be resolved
> before clang 11 is MFCed to 12-STABLE.

Can you disassemble the faulting instruction in the core dump?
Can you provide full CPU ID / features information from dmesg?

-- 
Andriy Gapon