Re: Important bgpd fix

Stuart Henderson Sat, 11 Jan 2014 07:08:56 -0800

I'm running this on one router without seeing any problems yet,
however it does not have any graceful-restart peers so it's not exactly
a great test.


Has anyone else tried this at all yet?


On 2014/01/02 00:03, Claudio Jeker wrote:
> There is a somewhat critical bug in bgpd which got hit by local friends
> a few weeks ago. The problem is that on session with the graceful restart
> capability stale routes are not properly flushed. This can lead to bad
> FIB entries and black holes. This happens when a router does not reconnect
> before the timeout fires.
> 
> As a quick fix disabling the graceful reload capability with
> "announce restart no" will skip the probelmatic code and no stale routes
> will remain. The session must be cleared first before the change becomes
> active.
> 
> The proper fix is attached. in short this removes a check for the
> CAPA_GR_FORWARD flag in the timeout and IMSG_SESSION_RESTARTED handler.
> CAPA_GR_RESTARTING is indicating that bgpd is currently doing a graceful
> restart for this neighbor and it is set for any session that must issue
> a flush of stale routes (either after a successful restart or because of
> hitting a timeout or change of configuration). So whenever the SE issues
> an IMSG_SESSION_FLUSH or IMSG_SESSION_RESTARTED message the
> CAPA_GR_RESTARTING flag needs to be cleared.
> 
> Please test and/or verify that my logic is now correct.
> -- 
> :wq Claudio
> 
> PS: I'm away for a bit more then a week so I will not be able to answer
> any questions anytime soon.
> 
> Index: rde.c
> ===================================================================
> RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
> retrieving revision 1.326
> diff -u -p -r1.326 rde.c
> --- rde.c     13 Nov 2013 20:41:01 -0000      1.326
> +++ rde.c     1 Jan 2014 02:30:04 -0000
> @@ -3282,6 +3282,7 @@ peer_stale(u_int32_t id, u_int8_t aid)
>               return;
>       }
>  
> +     /* flush the now even staler routes out */
>       if (peer->staletime[aid])
>               peer_flush(peer, aid);
>       peer->staletime[aid] = now = time(NULL);
> @@ -3317,7 +3318,14 @@ peer_recv_eor(struct rde_peer *peer, u_i
>  {
>       peer->prefix_rcvd_eor++;
>  
> -     /* First notify SE to remove possible race with the timeout. */
> +     /*
> +      * First notify SE to avert a possible race with the restart timeout.
> +      * If the timeout fires before this imsg is processed by the SE it will
> +      * result in the same operation since the timeout issues a FLUSH which
> +      * does the same as the RESTARTED action (flushing stale routes).
> +      * The logic in the SE is so that only one of FLUSH or RESTARTED will
> +      * be sent back to the RDE and so peer_flush is only called once.
> +      */
>       if (imsg_compose(ibuf_se, IMSG_SESSION_RESTARTED, peer->conf.id,
>           0, -1, &aid, sizeof(aid)) == -1)
>               fatal("imsg_compose error");
> Index: session.c
> ===================================================================
> RCS file: /cvs/src/usr.sbin/bgpd/session.c,v
> retrieving revision 1.333
> diff -u -p -r1.333 session.c
> --- session.c 13 Nov 2013 20:41:01 -0000      1.333
> +++ session.c 1 Jan 2014 02:36:31 -0000
> @@ -1729,12 +1729,11 @@ session_graceful_stop(struct peer *p)
>  
>       for (i = 0; i < AID_MAX; i++) {
>               /*
> -              * Only flush if the peer is restarting and the peer indicated
> -              * it hold the forwarding state. In all other cases the
> -              * session was already flushed when the session came up.
> +              * Only flush if the peer is restarting and the timeout fired.
> +              * In all other cases the session was already flushed when the
> +              * session went down or when the new open message was parsed.
>                */
> -             if (p->capa.neg.grestart.flags[i] & CAPA_GR_RESTARTING &&
> -                 p->capa.neg.grestart.flags[i] & CAPA_GR_FORWARD) {
> +             if (p->capa.neg.grestart.flags[i] & CAPA_GR_RESTARTING) {
>                       log_peer_warnx(&p->conf, "graceful restart of %s, "
>                           "time-out, flushing", aid2str(i));
>                       if (imsg_compose(ibuf_rde, IMSG_SESSION_FLUSH,
> @@ -2520,12 +2519,16 @@ capa_neg_calc(struct peer *p)
>        */
>  
>       for (i = 0; i < AID_MAX; i++) {
> +             int8_t  negflags;
> +
>               /* disable GR if the AFI/SAFI is not present */
>               if (p->capa.peer.grestart.flags[i] & CAPA_GR_PRESENT &&
>                   p->capa.neg.mp[i] == 0)
>                       p->capa.peer.grestart.flags[i] = 0;     /* disable */
>               /* look at current GR state and decide what to do */
> -             if (p->capa.neg.grestart.flags[i] & CAPA_GR_RESTARTING) {
> +             negflags = p->capa.neg.grestart.flags[i];
> +             p->capa.neg.grestart.flags[i] = p->capa.peer.grestart.flags[i];
> +             if (negflags & CAPA_GR_RESTARTING) {
>                       if (!(p->capa.peer.grestart.flags[i] &
>                           CAPA_GR_FORWARD)) {
>                               if (imsg_compose(ibuf_rde, IMSG_SESSION_FLUSH,
> @@ -2533,12 +2536,10 @@ capa_neg_calc(struct peer *p)
>                                       return (-1);
>                               log_peer_warnx(&p->conf, "graceful restart of "
>                                   "%s, not restarted, flushing", aid2str(i));
> -                     }
> -                     p->capa.neg.grestart.flags[i] =
> -                         p->capa.peer.grestart.flags[i] | CAPA_GR_RESTARTING;
> -             } else
> -                     p->capa.neg.grestart.flags[i] =
> -                         p->capa.peer.grestart.flags[i];
> +                     } else
> +                             p->capa.neg.grestart.flags[i] |=
> +                                 CAPA_GR_RESTARTING;
> +             }
>       }
>       p->capa.neg.grestart.timeout = p->capa.peer.grestart.timeout;
>       p->capa.neg.grestart.restart = p->capa.peer.grestart.restart;
> @@ -2911,9 +2912,7 @@ session_dispatch_imsg(struct imsgbuf *ib
>                       if (aid >= AID_MAX)
>                               fatalx("IMSG_SESSION_RESTARTED: bad AID");
>                       if (p->capa.neg.grestart.flags[aid] &
> -                         CAPA_GR_RESTARTING &&
> -                         p->capa.neg.grestart.flags[aid] &
> -                         CAPA_GR_FORWARD) {
> +                         CAPA_GR_RESTARTING) {
>                               log_peer_warnx(&p->conf,
>                                   "graceful restart of %s finished",
>                                   aid2str(aid));
>

Re: Important bgpd fix

Reply via email to