Hi folks, I have an EdgeRouter Lite running OpenBSD. I upgraded it from 6.5 to 6.6 and ran into a problem that renders my OpenBSD DSL router unusable: it fails after PPPoE is started.
One important thing to note here is that my DSL connection uses IPv6!
The configuration looks roughly like this. Of course I had to move the file away in order to boot :-)
Alternatively, the router also boots as long as the cable is pulled out or the modem is turned off.
edge# cat /etc/hostname.pppoe0
inet 0.0.0.0 255.255.255.255 NONE pppoedev cnmac1 authproto pap authname '***' authkey '***' up
inet6 eui64
!/sbin/route add -inet6 default -ifp pppoe0 fe80::%pppoe0
So! This is what happens after I start my DSL modem:
tcpdump: listening on cnmac1, link-type EN10MB
01:47:46.688711 :: > ff02::1:ffe8:bfc6: icmp6: neighbor sol: who has
fe80::3e1e:4ff:fee8:bfc6 [icmp6 cksum ok] (len 24, hlim 255)
01:48:15.864033 PPPoE-Discovery
code Initiation, version 1, type 1, id 0x0000, length 12
tag Service-Name, length 0
tag Host-Uniq, length 4 \263\007R\220
01:49:15.846567 PPPoE-Discovery
code Initiation, version 1, type 1, id 0x0000, length 12
tag Service-Name, length 0
tag Host-Uniq, length 4 \263\007R\220
01:50:15.844513 PPPoE-Discovery
code Initiation, version 1, type 1, id 0x0000, length 12
tag Service-Name, length 0
tag Host-Uniq, length 4 \263\007R\220
01:51:15.848512 PPPoE-Discovery
code Initiation, version 1, type 1, id 0x0000, length 12
tag
Trap cause = 2 Frame 0x980000000ffdb860
Trap PC 0xffffffff813ba9dc RA 0xffffffff8145fe1c fault 0x0
0xffffffff813ba928 (1,9800000007f21776,1,2) ra 0xffffffff8145fe1c sp
0x980000000ffdb9b8, sz 0
0xffffffff8145fd10 (1,9800000007f21776,1,2) ra 0xffffffff8145bca4 sp
0x980000000ffdb9b8, sz 144
0xffffffff8145ba30 (1,9800000007f21776,1,2) ra 0xffffffff8145586c sp
0x980000000ffdba48, sz 128
0xffffffff814551b0 (1,9800000007f21776,1,2) ra 0x0 sp 0x980000000ffdbac8, sz 0
User-level: pid 93301
stopped on non ddb fault
Stopped at 0xffffffff813ba9dc: lbu v1,0(a0)
ddb{1}> boot reboot
System restart.
On boot:
starting early daemons: syslogd pflogd nsd
Trap cause = 2 Frame 0x980000000ffdb860
Trap PC 0xffffffff813ba9dc RA 0xffffffff8145fe1c fault 0x0
0xffffffff813ba928 (1,980000000644b476,1,2) ra 0xffffffff8145fe1c sp
0x980000000ffdb9b8, sz 0
0xffffffff8145fd10 (1,980000000644b476,1,2) ra 0xffffffff8145bca4 sp
0x980000000ffdb9b8, sz 144
0xffffffff8145ba30 (1,980000000644b476,1,2) ra 0xffffffff8145586c sp
0x980000000ffdba48, sz 128
0xffffffff814551b0 (1,980000000644b476,1,2) ra 0x0 sp 0x980000000ffdbac8, sz 0
User-level: pid 45763
stopped on non ddb fault
Stopped at 0xffffffff813ba9dc: lbu v1,0(a0)
ddb{1}> boot reboot
System restart.
Same thing!
So I experimented a bit in order to get some more information for you:
Trap cause = 2 Frame 0x980000000ffdb860
Trap PC 0xffffffff813ba9dc RA 0xffffffff8145fe1c fault 0x0
0xffffffff813ba928 (1,9800000005b73976,1,2) ra 0xffffffff8145fe1c sp
0x980000000ffdb9b8, sz 0
0xffffffff8145fd10 (1,9800000005b73976,1,2) ra 0xffffffff8145bca4 sp
0x980000000ffdb9b8, sz 144
0xffffffff8145ba30 (1,9800000005b73976,1,2) ra 0xffffffff8145586c sp
0x980000000ffdba48, sz 128
0xffffffff814551b0 (1,9800000005b73976,1,2) ra 0x0 sp 0x980000000ffdbac8, sz 0
User-level: pid 64532
stopped on non ddb fault
Stopped at 0xffffffff813ba9dc: lbu v1,0(a0)
ddb{0}> trace
0xffffffff813ba928 (1,9800000005b73976,1,2) ra 0xffffffff8145fe1c sp 0x980000000ffdb9b8, sz 0
0xffffffff8145fd10 (1,9800000005b73976,1,2) ra 0xffffffff8145bca4 sp 0x980000000ffdb9b8, sz 144
0xffffffff8145ba30 (1,9800000005b73976,1,2) ra 0xffffffff8145586c sp 0x980000000ffdba48, sz 128
0xffffffff814551b0 (1,9800000005b73976,1,2) ra 0x0 sp 0x980000000ffdbac8, sz 0
User-level: pid 64532
ddb{0}> help
machine kill print p pprint examine
x search set write w delete
d break dwatch watch step s
continue c until next match trace
bt call ps callout show boot
help hangman dmesg
ddb{0}> examine
0xffffffff813ba9dc: 90830000
ddb{0}>
ddb{0}> continue
panic: trap
Starting stack trace...
mips trace requires a trap frame... giving up
End of stack trace.
syncing disks...
At this point the machine hung and I had to power it off.
Then I tried again to find out a bit more:
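As a side note, the word printed by examine (90830000) can be decoded by hand to double-check the disassembly. The following is only a minimal sketch, assuming the standard MIPS I-type load encoding and the n64 register names that ddb uses in its register dump; the helper names decode_load and format_load are made up for illustration and are not part of any OpenBSD tool.

```python
# Minimal MIPS I-type load decoder, just enough for the faulting instruction.
# Register names follow the n64 ABI (registers 8-11 are a4-a7), which matches
# the names shown by "show register" in ddb.
REGS = [
    "zero", "at", "v0", "v1", "a0", "a1", "a2", "a3",
    "a4", "a5", "a6", "a7", "t0", "t1", "t2", "t3",
    "s0", "s1", "s2", "s3", "s4", "s5", "s6", "s7",
    "t8", "t9", "k0", "k1", "gp", "sp", "s8", "ra",
]

# Primary opcodes of a few MIPS load instructions.
LOADS = {0x20: "lb", 0x21: "lh", 0x23: "lw", 0x24: "lbu", 0x25: "lhu"}


def decode_load(word):
    """Split a 32-bit instruction word into (mnemonic, rt, offset, base)."""
    opcode = (word >> 26) & 0x3F
    base = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    offset = word & 0xFFFF
    if offset & 0x8000:          # the 16-bit offset is sign-extended
        offset -= 0x10000
    if opcode not in LOADS:
        raise ValueError("not one of the load opcodes handled here")
    return LOADS[opcode], REGS[rt], offset, REGS[base]


def format_load(word):
    """Render the decoded load in the same style as ddb's disassembly."""
    mnemonic, rt, offset, base = decode_load(word)
    return f"{mnemonic} {rt},{offset}({base})"


if __name__ == "__main__":
    print(format_load(0x90830000))  # → lbu v1,0(a0)
```

So the trap is on a byte load through a0. The trace lines list 1 as the first argument, and the register dump further down shows a0 = 0x1, which would be consistent with a read through a bogus pointer, although I cannot confirm that from this output alone.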
ddb{1}> show register
at 0x1
v0 0x980000000ffdba28
v1 0x9800000003a5f377
a0 0x1
a1 0x9800000003a5f376
a2 0x1
a3 0x2
a4 0x1
a5 0xae79a5801a40124f
a6 0xae79a5801a40124f
a7 0xc00000000000afe0
t0 0x2
t1 0
t2 0x1
t3 0xce2d0000
s0 0xffffffff815f0000
s1 0x9800000003a5f300
s2 0xffffffff815af558
s3 0x9800000003a5f376
s4 0x1
s5 0xffffffff
s6 0x9800000003a5f370
s7 0x1
t8 0x272dce97f87b7500
t9 0xffffffff8118e380
k0 0x5b10
k1 0x5b00
gp 0xffffffff815dca20
sp 0x980000000ffdb9b8
s8 0x8851
ra 0xffffffff8145fe1c
sr 0x10008fe3
lo 0x6b9f78acc26fb8
hi 0
bad 0
cs 0x8
pc 0xffffffff813ba9dc
0xffffffff813ba9dc: lbu v1,0(a0)
ddb{1}> show panic
the kernel did not panic
ddb{1}> show mbuf
mbuf 0xffffffff813ba9dc
m_type: 0 m_flags: 0
Trap cause = 4 Frame 0x980000000ffdb3b0
Trap PC 0xffffffff8130bd34 RA 0xffffffff8130bd24 fault 0xffffffff813ba9dc
0xffffffff8130bc88 (c00000000000c470,9001070000000000,228,0) ra
0xffffffff8130bd24 sp 0x980000000ffdb508, sz 0
0xffffffff8130bc88 (c00000000000c470,9001070000000000,228,0) ra 0x0 sp
0x980000000ffdb508, sz 0
User-level: pid 88020
Caught exception in ddb.
ddb{1}>
ddb{1}> show socket
socket 0xffffffff813ba9dc
so_type: 992
so_options: 0x0008
so_linger: 0
so_state: 0x64a50001
Trap cause = 4 Frame 0x980000000ffdb340
Trap PC 0xffffffff811bfccc RA 0xffffffff811bfcb8 fault 0xffffffff813ba9e4
0xffffffff811bfbd0 (c00000000000c470,9001070000000000,228,0) ra
0xffffffff81200b9c sp 0x980000000ffdb498, sz 144
0xffffffff81200b78 (c00000000000c470,9001070000000000,228,0) ra
0xffffffff81200638 sp 0x980000000ffdb528, sz 16
0xffffffff81200240 (c00000000000c470,9001070000000000,228,0) ra
0xffffffff81201a98 sp 0x980000000ffdb538, sz 208
0xffffffff812018e8 (c00000000000c470,9001070000000000,228,0) ra
0xffffffff812ff004 sp 0x980000000ffdb608, sz 208
0xffffffff812feda0 (c00000000000c470,9001070000000000,228,0) ra
0xffffffff81049024 sp 0x980000000ffdb6d8, sz 48
0xffffffff81048e14 (c00000000000c470,9001070000000000,228,0) ra 0x0 sp
0x980000000ffdb708, sz 0
User-level: pid 88020
Caught exception in ddb.
ddb{1}>
And so on. It says that the kernel did not panic, but this trap has exactly the
same effect as a panic.
The router drops into DDB even though sysctl panic=0 is set, so that setting
does not seem to matter here. It does not restart automatically.
I collected this output by copying it with GNU screen
from a ttyUSB console (serial console connected to the EdgeRouter Lite).
Hope that helps!
To me it looks like it fails during PPPoE initialization.
I do not have an IPv4-only PPPoE connection here,
so I cannot tell whether this happens only with IPv6 PPPoE.
However, this is an upgrade from OpenBSD 6.5, which worked fine;
OpenBSD 6.6 broke it. The configuration did not change.
At first I thought it might have something to do with
a bridge on cnmac1, where my DSL modem is connected, which I used
to filter out bad BRAS MAC addresses, but it was not the bridge.
It also fails without a bridge. It is definitely the PPPoE device.
When I turn off the modem, or simply remove /etc/hostname.pppoe0
so that PPPoE is not started, as stated above, the router works
without reliability issues (so far).
--
Lars Schotte
Mudroňova 13
92101 Piešťany
