Re: alignment error rtadvd/armv7
Martin Brandenburg writes:

> On Sun, 18 Sep 2016, Jeremie Courreges-Anglas wrote:
>
>> Martin Brandenburg writes:
>>
>> > On a PandaBoard (armv7) running -current, when I run rtadvd, it crashes
>> > with a bus error shortly after printing (received a routing message). I
>> > can reproduce by sending SIGHUP to a dhclient running on the same
>> > interface.
>> >
>> > I have traced this down to the following block of code in rtadvd.c.
>> >
>> > static void
>> > rtmsg_input(void)
>> > {
>> > 	int n, type, ifindex = 0, plen;
>> > 	size_t len;
>> > 	char msg[2048], *next, *lim;
>> > 	u_char ifname[IF_NAMESIZE];
>> > 	struct prefix *prefix;
>> > 	struct rainfo *rai;
>> > 	struct in6_addr *addr;
>> > 	char addrbuf[INET6_ADDRSTRLEN];
>> >
>> > So msg is not 32-bit aligned, presumably because INET6_ADDRSTRLEN is 46.
>> > I can fix the bus error by hardcoding 48, but of course that's not
>> > right.
>> >
>> > Then msg is passed to get_next_msg (as next) where the expression
>> > rtm->rtm_hdrlen (rtm is the not-aligned msg) is the first dereference
>> > and thus the point where it crashes.
>> >
>> > I'm at the point now where I think I've found the root of the problem
>> > but don't know enough to fix it.
>> >
>> > Any thoughts?
>>
>> Thanks for the report.
>>
>> I guess that we could fix the rtm_* functions to work on an unaligned
>> input buffer, but an easier fix would be to just ask for a suitably
>> aligned input buffer, with malloc(3). Does the diff below fix your
>> problem?
>
> This fixes the problem. I let it sit in debug mode for 30 minutes (which
> is far longer than it ever lasted before) through plenty of routing
> messages, and it never crashed. I will keep monitoring, but I think it's
> good.

Committed, thanks.

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE
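For readers following along, the malloc(3) approach suggested in the thread can be sketched roughly as below. This is not the committed diff, which is not reproduced in this archive; it is only an illustration of the idea. The names demo_hdr, demo_is_aligned, demo_alloc_msgbuf and demo_hdrlen are hypothetical, and struct demo_hdr is a made-up stand-in for a routing message header, not the real struct rt_msghdr.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MSG_SIZE 2048	/* same size as msg[] in the report */

/* Hypothetical stand-in for a routing message header. */
struct demo_hdr {
	uint16_t len;
	uint16_t hdrlen;
	uint32_t type;
};

/* Nonzero if p meets the alignment requirement of struct demo_hdr. */
static int
demo_is_aligned(const void *p)
{
	return ((uintptr_t)p % alignof(struct demo_hdr)) == 0;
}

/*
 * Take the buffer from the heap instead of the stack.  malloc(3)
 * returns memory suitably aligned for any object type, so the
 * result can be viewed as a struct demo_hdr.  Caller must free(3).
 */
static char *
demo_alloc_msgbuf(void)
{
	char *buf = malloc(DEMO_MSG_SIZE);

	if (buf != NULL)
		memset(buf, 0, DEMO_MSG_SIZE);
	return buf;
}

/* Read one header field in place; only valid on an aligned buffer. */
static uint16_t
demo_hdrlen(const char *buf)
{
	const struct demo_hdr *h;

	assert(demo_is_aligned(buf));
	h = (const struct demo_hdr *)buf;
	return h->hdrlen;
}
```

A caller would obtain the buffer once with demo_alloc_msgbuf(), read into it, walk the messages, and free(3) it when done. The trade-off compared to a stack array is an allocation that can fail and has to be checked.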
Re: alignment error rtadvd/armv7
On Sun, 18 Sep 2016, Jeremie Courreges-Anglas wrote:

> Martin Brandenburg writes:
>
> > On a PandaBoard (armv7) running -current, when I run rtadvd, it crashes
> > with a bus error shortly after printing (received a routing message). I
> > can reproduce by sending SIGHUP to a dhclient running on the same
> > interface.
> >
> > I have traced this down to the following block of code in rtadvd.c.
> >
> > static void
> > rtmsg_input(void)
> > {
> > 	int n, type, ifindex = 0, plen;
> > 	size_t len;
> > 	char msg[2048], *next, *lim;
> > 	u_char ifname[IF_NAMESIZE];
> > 	struct prefix *prefix;
> > 	struct rainfo *rai;
> > 	struct in6_addr *addr;
> > 	char addrbuf[INET6_ADDRSTRLEN];
> >
> > So msg is not 32-bit aligned, presumably because INET6_ADDRSTRLEN is 46.
> > I can fix the bus error by hardcoding 48, but of course that's not
> > right.
> >
> > Then msg is passed to get_next_msg (as next) where the expression
> > rtm->rtm_hdrlen (rtm is the not-aligned msg) is the first dereference
> > and thus the point where it crashes.
> >
> > I'm at the point now where I think I've found the root of the problem
> > but don't know enough to fix it.
> >
> > Any thoughts?
>
> Thanks for the report.
>
> I guess that we could fix the rtm_* functions to work on an unaligned
> input buffer, but an easier fix would be to just ask for a suitably
> aligned input buffer, with malloc(3). Does the diff below fix your
> problem?

This fixes the problem. I let it sit in debug mode for 30 minutes (which
is far longer than it ever lasted before) through plenty of routing
messages, and it never crashed. I will keep monitoring, but I think it's
good.

Martin
alignment error rtadvd/armv7
On a PandaBoard (armv7) running -current, when I run rtadvd, it crashes
with a bus error shortly after printing (received a routing message). I
can reproduce by sending SIGHUP to a dhclient running on the same
interface.

I have traced this down to the following block of code in rtadvd.c.

static void
rtmsg_input(void)
{
	int n, type, ifindex = 0, plen;
	size_t len;
	char msg[2048], *next, *lim;
	u_char ifname[IF_NAMESIZE];
	struct prefix *prefix;
	struct rainfo *rai;
	struct in6_addr *addr;
	char addrbuf[INET6_ADDRSTRLEN];

So msg is not 32-bit aligned, presumably because INET6_ADDRSTRLEN is 46.
I can fix the bus error by hardcoding 48, but of course that's not
right.

Then msg is passed to get_next_msg (as next) where the expression
rtm->rtm_hdrlen (rtm is the not-aligned msg) is the first dereference
and thus the point where it crashes.

I'm at the point now where I think I've found the root of the problem
but don't know enough to fix it.

Any thoughts?

Martin
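As a side note on the report above: a char array has an alignment requirement of 1, so the compiler is free to place it at any address, and casting it to a struct pointer and dereferencing it is not guaranteed to work on strict-alignment hardware. Two common ways to cope, sketched here under assumptions, are to copy the header out with memcpy or to force the buffer's alignment with a union. The names demo_hdr, demo_msgbuf and demo_read_hdr are hypothetical, and struct demo_hdr is a made-up stand-in, not the real struct rt_msghdr or anything taken from rtadvd.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a routing message header. */
struct demo_hdr {
	uint16_t len;
	uint16_t hdrlen;
	uint32_t type;
};

/*
 * A union is aligned for its strictest member, so .bytes starts at
 * an address that is also valid for a struct demo_hdr.
 */
union demo_msgbuf {
	struct demo_hdr hdr;
	char bytes[2048];
};

/* Compile-time check of the claim in the comment above. */
static_assert(alignof(union demo_msgbuf) >= alignof(struct demo_hdr),
    "union must be at least as aligned as the header");

/*
 * Copy a header out of raw bytes.  memcpy places no alignment
 * requirement on its source, so raw may point anywhere.
 */
static struct demo_hdr
demo_read_hdr(const char *raw)
{
	struct demo_hdr h;

	memcpy(&h, raw, sizeof(h));
	return h;
}
```

The union keeps the buffer on the stack, while the memcpy variant corresponds to the other option raised in the replies, namely making the parsing code tolerate an unaligned input buffer, at the cost of a copy per header.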