Martin - Can I see the config.h and config.log generated for your test?
thanks! donald On Tue, Feb 23, 2016 at 4:15 PM, Martin Winter < [email protected]> wrote: > On 23 Feb 2016, at 5:23, Donald Sharp wrote: > > Martin - >> >> What's your configure line for freebsd? Can you point me at your >> config.log for this build? My freebsd vm is unhappy with me right now. >> > > That’s where I have issues. My CI integrated system spots the error (and > every time), but if > I try to run it outside, then it seems to work. > > Current config inside the CI system is as follows: > > --enable-user=quagga --enable-group=quagga > --sysconfdir=/usr/local/etc/quagga-test --localstatedir=/var/run/quagga > --enable-vtysh > --with-pkg-extra-version=-ci.NetDEF.org-20160223.101813-git.04a3aab > --enable-fpm --enable-tcp-zebra --disable-irdb --enable-isisd > --enable-isis-topology --enable-bgp-announce --enable-opaque-lsa > --without-libpam --enable-pimd --enable-rtadv --disable-snmp > --enable-tcp-zebra --prefix=/usr/local --mandir=/usr/local/man > --infodir=/usr/local/info/ --build=amd64-portbld-freebsd10.1 > > Still trying to reproduce it outside of the automated system (so I can do > more troubleshooting) > > (The automated CI system uses my CI plans to build packages, then installs > them > and uses them for the tests. When I do it outside of CI system I’m just > doing > the usual make/make install and usually with less optimization to get debug > info) > > This bug affects at least 3, probably more BGP errors > > Anyway, beside of someone looking at the specific patch and see if > something > might go wrong on FreeBSD, I just need a bit more time to reproduce it. > Hope to know more in another 8..12 hrs. > > - Martin > > > On Mon, Feb 22, 2016 at 10:26 PM, Martin Winter < >> [email protected]> wrote: >> >> On 22 Feb 2016, at 6:40, Donald Sharp wrote: >>> >>> Martin - >>> >>>> >>>> Any word on this? I'd like to push this patch if it fixes your issues. >>>> >>>> >>> Not yet. >>> I’m dealing with (most likely) several issues which seem overlap. So it >>> takes a bit >>> more time. >>> I still tend to think that Timo’s fix is good - but only addressing part >>> of the problems. >>> I’ll prefer to keep it on hold at this time until at least one more >>> FreeBSD bug is fixed. >>> >>> >>> One of the git bisect on BGP errors pointed to commit >>> >>> commit 04a3aabf58d95d01c4c8168eeff43cf9d9892eee >>> Author: Nicolas Dichtel <[email protected]> >>> Date: Thu Sep 3 10:47:43 2015 +0200 >>> >>> vrf: add a runtime check before playing with netns >>> >>> This patch adds a runtime check to determine if netns are >>> available. Some >>> systems like OpenWRT have the system call setns() but don't >>> have the kernel >>> option CONFIG_NET_NS enabled. >>> >>> Reported-by: Christian Franke <[email protected]> >>> Signed-off-by: Nicolas Dichtel <[email protected]> >>> Tested-by: Christian Franke <[email protected]> >>> >>> >>> Doing a git bisect points to this commit breaking at least 3 BGP tests. >>> I’m still trying to manually verify and understand how this breaks it, so >>> give me >>> more time. >>> >>> (Feel free to guess on how this commit could break FreeBSD… seems to be >>> related >>> to similar not correctly installing forwarding entries as well) >>> >>> - Martin >>> >>> >>> >>> On Thu, Feb 18, 2016 at 10:30 PM, Martin Winter < >>> >>>> [email protected]> wrote: >>>> >>>> On 18 Feb 2016, at 18:35, Donald Sharp wrote: >>>> >>>>> >>>>> Martin - >>>>> >>>>>> >>>>>> rt_socket.c is not compiled on linux so no need for verification. >>>>>> Once >>>>>> >>>>>> you >>>>> >>>>> have a run please let me know and I'll push your updated patch. >>>>>> >>>>>> >>>>> It’s running now. Will know soon. It definitely fixes some of the >>>>> errors >>>>> and I hope it actually fixes all of the FreeBSD errors. >>>>> >>>>> Will know in approx one more day. >>>>> >>>>> (BTW: Testing on a git commit which does no longer exist because of >>>>> rebase, >>>>> but I needed the same to compare. Lucky that I still had the VMs with >>>>> the >>>>> old git checkout running…) >>>>> >>>>> - Martin >>>>> >>>>> >>>>> donald >>>>>> >>>>>> On Thu, Feb 18, 2016 at 9:19 PM, Martin Winter < >>>>>> [email protected]> wrote: >>>>>> >>>>>> Timo, >>>>>> >>>>>>> >>>>>>> On 17 Feb 2016, at 23:01, Timo Teras wrote: >>>>>>> >>>>>>> On Wed, 17 Feb 2016 20:11:07 -0800 >>>>>>> >>>>>>> "Martin Winter" <[email protected]> wrote: >>>>>>>> >>>>>>>> This is based on Quagga proposed 6 branch of Feb 10 (git f48d5b9957 >>>>>>>> - >>>>>>>> >>>>>>>> which does no longer exist, [no] thanks to rebase on a public >>>>>>>>> git) on FreeBSD 10.2. >>>>>>>>> >>>>>>>>> Zebra seems to fail delete any routes in FreeBSD RIB. It fails with >>>>>>>>> updates (better routes to different interface) and >>>>>>>>> it fails to remove them when quagga is killed. >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks for the testing. Sounds like this is breakage caused by my >>>>>>>> atomic FIB patch which was pretty untested on *BSD. >>>>>>>> >>>>>>>> Looks like the code talking to kernel does not handle RTM_CHANGE >>>>>>>> correctly. As first test, perhaps just removing RTM_CHANGE usage >>>>>>>> might >>>>>>>> fix the issues and revert to how it used to work. >>>>>>>> >>>>>>>> Using RTM_CHANGE on kernels where it works is better, but it's left >>>>>>>> an >>>>>>>> exercise for developer who has access and will to fix it on *BSD. >>>>>>>> >>>>>>>> >>>>>>>> Thanks for the patch. Seems like you never tested it (failed to >>>>>>> >>>>>>> compile), >>>>>> >>>>> >>>>> but was close enough to guess what you probably meant. See inline on >>>>>> >>>>>>> >>>>>>> patch >>>>>> >>>>> >>>>> >>>>>> diff --git a/zebra/rt_socket.c b/zebra/rt_socket.c >>>>>>> >>>>>>> index 4d0a7db..9012280 100644 >>>>>>>> --- a/zebra/rt_socket.c >>>>>>>> +++ b/zebra/rt_socket.c >>>>>>>> @@ -68,7 +68,7 @@ sin_masklen (struct in_addr mask) >>>>>>>> >>>>>>>> /* Interface between zebra message and rtm message. */ >>>>>>>> static int >>>>>>>> -kernel_rtm_ipv4 (int cmd, struct prefix *p, struct rib *rib, int >>>>>>>> >>>>>>>> family) >>>>>>> >>>>>> >>>>> +kernel_rtm_ipv4 (int cmd, struct prefix *p, struct rib *rib) >>>>>> >>>>>>> >>>>>>>> { >>>>>>>> struct sockaddr_in *mask = NULL; >>>>>>>> @@ -245,7 +245,7 @@ sin6_masklen (struct in6_addr mask) >>>>>>>> >>>>>>>> /* Interface between zebra message and rtm message. */ >>>>>>>> static int >>>>>>>> -kernel_rtm_ipv6 (int cmd, struct prefix *p, struct rib *rib, int >>>>>>>> >>>>>>>> family) >>>>>>> >>>>>> >>>>> +kernel_rtm_ipv6 (int cmd, struct prefix *p, struct rib *rib) >>>>>> >>>>>>> { >>>>>>>> struct sockaddr_in6 *mask; >>>>>>>> struct sockaddr_in6 sin_dest, sin_mask, sin_gate; >>>>>>>> @@ -363,33 +363,32 @@ kernel_rtm_ipv6 (int cmd, struct prefix *p, >>>>>>>> >>>>>>>> struct >>>>>>> >>>>>> >>>>> rib *rib, int family) >>>>>> >>>>>>> >>>>>>>> #endif >>>>>>>> >>>>>>>> +static int >>>>>>>> +kernel_rtm_ipv6 (int cmd, struct prefix *p, struct rib *rib) >>>>>>>> >>>>>>>> >>>>>>>> I assume this should be >>>>>>> kernel_rtm - not kernel_rtm_ipv6 >>>>>>> (otherwise you have a duplicate for kernel_rtm_ipv6() and a loop >>>>>>> in case of AF_INET6) >>>>>>> >>>>>>> >>>>>>> +{ >>>>>>> >>>>>>> + switch (PREFIX_FAMILY(p)) >>>>>>>> + { >>>>>>>> + case AF_INET: >>>>>>>> + return kernel_rtm_ipv4 (cmd, p, rib); >>>>>>>> + case AF_INET6: >>>>>>>> + return kernel_rtm_ipv6 (cmd, p, rib); >>>>>>>> + } >>>>>>>> + return 0; >>>>>>>> +} >>>>>>>> + >>>>>>>> int >>>>>>>> kernel_route_rib (struct prefix *p, struct rib *old, struct rib >>>>>>>> *new) >>>>>>>> { >>>>>>>> - struct rib *rib; >>>>>>>> - int route = 0, cmd; >>>>>>>> - >>>>>>>> - if (!old && new) >>>>>>>> - cmd = RTM_ADD; >>>>>>>> - else if (old && !new) >>>>>>>> - cmd = RTM_DELETE; >>>>>>>> - else >>>>>>>> - cmd = RTM_CHANGE; >>>>>>>> - >>>>>>>> - rib = new ? new : old; >>>>>>>> + int route = 0; >>>>>>>> >>>>>>>> if (zserv_privs.change(ZPRIVS_RAISE)) >>>>>>>> zlog (NULL, LOG_ERR, "Can't raise privileges"); >>>>>>>> >>>>>>>> - switch (PREFIX_FAMILY(p)) >>>>>>>> - { >>>>>>>> - case AF_INET: >>>>>>>> - route = kernel_rtm_ipv4 (cmd, p, rib, AF_INET); >>>>>>>> - break; >>>>>>>> - case AF_INET6: >>>>>>>> - route = kernel_rtm_ipv6 (cmd, p, rib, AF_INET6); >>>>>>>> - break; >>>>>>>> - } >>>>>>>> + if (old) >>>>>>>> + route |= kernel_rtm (RTM_DELETE, p, rib); >>>>>>>> >>>>>>>> >>>>>>>> Changed to >>>>>>> route |= kernel_rtm (RTM_DELETE, p, old); >>>>>>> >>>>>>> + >>>>>>> >>>>>>> + if (new) >>>>>>>> + route |= kernel_rtm (RTM_ADD, p, rib); >>>>>>>> >>>>>>>> >>>>>>>> and changed to >>>>>>> route |= kernel_rtm (RTM_ADD, p, new); >>>>>>> >>>>>>> (You removed the declaration of “rib” above - so rib is undefined) >>>>>>> >>>>>>> >>>>>>> if (zserv_privs.change(ZPRIVS_LOWER)) >>>>>>> >>>>>>>> zlog (NULL, LOG_ERR, "Can't lower privileges"); >>>>>>>> >>>>>>>> >>>>>>>> Attached is a updated patch >>>>>>> >>>>>>> With the changes it fixes the specific issue I’ve mentioned. I have >>>>>>> not >>>>>>> verified the >>>>>>> patch on Linux yet. Will do a full run with this patch to see how >>>>>>> many >>>>>>> >>>>>>> of >>>>>> >>>>> >>>>> my approx >>>>>> >>>>>>> 20..30 failing FreeBSD testcases it fixes (I assume many to all…) >>>>>>> >>>>>>> - Martin >>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>>> Quagga-dev mailing list >>>>>>> [email protected] >>>>>>> https://lists.quagga.net/mailman/listinfo/quagga-dev >>>>>>> >>>>>>> >>>>>>> >>>>>
_______________________________________________ Quagga-dev mailing list [email protected] https://lists.quagga.net/mailman/listinfo/quagga-dev
