On Fri, Jun 18, 2021 at 05:06:27PM +0100, Matthew Reeve wrote:
> Hi, yes sure, here it is. Please let me know if this does not give you what
> you need.
>
> Thanks!

Thanks, that looks like an issue with slists. We had a similar issue with the
list code in the past and reworked it to be more conservative. Will check
that.

> root@OpenWrt:/tmp# gdb debug/bird bird.1623776146.6869.7.core
> GNU gdb (GDB) 10.1
> Copyright (C) 2020 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
> Type "show copying" and "show warranty" for details.
> This GDB was configured as "arm-openwrt-linux".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <https://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
>
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from debug/bird...
> [New LWP 6869]
> Core was generated by `./bird'.
> Program terminated with signal SIGBUS, Bus error.
> #0 ospf_rt_reset (p=0x1d610a0) at proto/ospf/rt.c:1646
> 1646 proto/ospf/rt.c: No such file or directory.
> (gdb) bt
> #0 ospf_rt_reset (p=0x1d610a0) at proto/ospf/rt.c:1646
> #1 ospf_rt_spf (p=0x1d610a0) at proto/ospf/rt.c:1698
> #2 ospf_rt_spf (p=0x1d610a0) at proto/ospf/rt.c:1688
> #3 ospf_disp (timer=<optimized out>) at proto/ospf/ospf.c:468
> #4 0x00061574 in timers_fire (loop=0xc4878 <main_timeloop>) at lib/timer.c:235
> #5 0x00012ca8 in io_loop () at sysdep/unix/io.c:2195
> #6 main (argc=<optimized out>, argv=<optimized out>) at sysdep/unix/main.c:939
> (gdb)
>
> On 18/06/2021 16:16, Ondrej Zajicek wrote:
> > On Mon, Jun 14, 2021 at 04:25:04PM +0100, Matthew Reeve wrote:
> > > Hi,
> > >
> > > when using bird 2.0.8 on openwrt 21.02 (and other versions) on a Netgear
> > > R7800 router, if the OSPF protocol is used, either v2 or v3, bird
> > > immediately crashes on startup with:
> > >
> > > Fri Jun 11 14:41:11 2021 daemon.info bird: Started
> > > Fri Jun 11 14:41:11 2021 kern.err kernel: [ 3500.853248] Alignment trap: not handling instruction f44c0a1f at [<00035848>]
> > > Fri Jun 11 14:41:11 2021 kern.alert kernel: [ 3500.853283] 8<--- cut here ---
> > > Fri Jun 11 14:41:11 2021 kern.alert kernel: [ 3500.859363] Unhandled fault: alignment exception (0x801) at 0x007e0624
> > > Fri Jun 11 14:41:11 2021 kern.alert kernel: [ 3500.862443] pgd = 0bbef4fd
> > > Fri Jun 11 14:41:11 2021 kern.alert kernel: [ 3500.868821] [007e0624] *pgd=5d6ca835, *pte=5c40b75f, *ppte=5c40bc7f
> > >
> > >
> > > This router uses an ARMv7 processor and the issue seems to be to do with
> > > memory alignment issues. I've debugged it and traced it to an access to
> > > the top_hash_entry struct. I've found that if I add the PACKED macro to
> > > the struct definition then it fixes the problem, as per this patch:
> > Hi
> >
> > Thanks, could you try to get backtrace from the coredump using gdb to see
> > where is the invalid access?
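
For readers following the thread, here is a minimal sketch of why marking a
struct as packed can hide an alignment fault like the one reported above. It
is not BIRD code: entry_natural, entry_packed, holder and roundtrip_packed
are hypothetical names, and it assumes GCC or Clang, where
__attribute__((packed)) lowers a struct's alignment to 1 so the compiler
stops emitting instructions that need an aligned address. Whether BIRD's
PACKED macro expands to exactly this attribute is an assumption here.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>
#include <string.h>   /* memset */

/* Hypothetical stand-in for a hash entry; NOT BIRD's top_hash_entry. */
struct entry_natural {
  uint8_t  tag;
  uint64_t key;   /* compiler may assume this field is naturally aligned */
};

/* Same fields, packed (assumption: GCC/Clang attribute syntax). */
struct entry_packed {
  uint8_t  tag;
  uint64_t key;   /* sits at offset 1, so it is usually misaligned */
} __attribute__((packed));

/* A one-byte prefix pushes the packed entry to an odd offset. */
struct holder {
  uint8_t pad;
  struct entry_packed e;
};

/* Packing removes both the padding and the alignment requirement. */
static_assert(_Alignof(struct entry_packed) == 1, "packed alignment is 1");
static_assert(sizeof(struct entry_packed) == 9, "no padding when packed");
static_assert(sizeof(struct entry_natural) > sizeof(struct entry_packed),
              "natural layout contains padding");

/* Write and read a 64-bit field that is guaranteed to be misaligned.
 * Because the type is packed, the compiler must use accesses that are
 * safe at any address instead of alignment-sensitive instructions. */
static uint64_t roundtrip_packed(uint64_t key)
{
  _Alignas(8) struct holder h;   /* h at 8k, so h.e.key lands at 8k + 2 */
  memset(&h, 0, sizeof h);
  h.e.key = key;
  return h.e.key;
}
```

Note that packing only changes what the compiler assumes about the struct. If
the pointer itself is misaligned because of how list or slist nodes are
laid out or cast, packing masks the symptom and the underlying access pattern
is still worth fixing.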

-- 
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: [email protected])
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."
