On 2018/02/09 03:39, Claudio Jeker wrote:
> On netsplits it can happen that on join multiple ospfd end up as DR.
> In my case with 3 routers the one cut off stays DR even though the rest of
> the network already has a DR and BDR.

This is very likely what I've seen. My layout is roughly
like this:

site a router 1  -------------  site b router 3
  |                                        |
  |                                        |
site a router 2  -------------  site b router 4

and it's usually one of the links between site a and site b that drops
out and later comes back, followed by the multiple DR confusion.
It's hard to say which is the "cut off" router in that case, as they
all have alternative links.

> Looking into this it seems that in some cases we don't issue an
> IF_EVT_NBR_CHNG and so the re-evaluation of DR/BDR does not happen.
> Looking at hello.c and the rfc seems to suggest that the following case is
> currently not handled:
> 
>             o   Bidirectional communication has been established to a
>                 neighbor.  In other words, the state of the neighbor has
>                 transitioned to 2-Way or higher.
> 
> The other cases in the RFC seem to be covered.
> The following diff fixes this and seems to solve the problem I'm seeing.
> 
> Since this is one of those bits that always caused trouble I would like
> more tests and maybe someone is brave enough to OK the diff.

I'm running this on a handful of routers. It's too early to say whether
it fixes things for me, but I've not seen problems yet. I'm not quite
feeling brave enough for an OK until I've seen it running for longer,
but the diff certainly makes sense to me.
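
As a rough illustration of the idea in the diff (flag a neighbor change
only when the neighbor actually moves to 2-Way because our own router ID
shows up in its hello), here is a standalone toy sketch. It is not ospfd
code: the toy_* names, the three-state enum and the return-value
convention are all made up for this example, and 1-Way handling is left
out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified neighbor states; not ospfd's definitions. */
enum toy_nbr_state { TOY_DOWN, TOY_INIT, TOY_2WAY };

struct toy_nbr {
	enum toy_nbr_state state;
};

/*
 * Process the router IDs listed in a received hello.
 * Returns 1 if the neighbor just became bidirectional, meaning the
 * caller should re-evaluate DR/BDR, and 0 otherwise.
 */
static int
toy_recv_hello(struct toy_nbr *nbr, uint32_t my_id,
    const uint32_t *listed, size_t n)
{
	size_t i;

	assert(nbr != NULL);

	/* Any hello from the neighbor moves it at least to Init. */
	if (nbr->state == TOY_DOWN)
		nbr->state = TOY_INIT;

	for (i = 0; i < n; i++) {
		if (listed[i] != my_id)
			continue;
		/* seen myself */
		if (nbr->state == TOY_INIT) {
			nbr->state = TOY_2WAY;
			return 1;	/* transition happened: flag it */
		}
		return 0;		/* already 2-Way: nothing new */
	}

	/* Not listed; 1-Way handling is omitted in this sketch. */
	return 0;
}
```

In this toy, the first hello that lists our ID returns 1 and later ones
return 0, so the caller would trigger a DR/BDR re-evaluation once per
transition to 2-Way rather than once per newly created neighbor.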


> :wq Claudio
> 
> Index: hello.c
> ===================================================================
> RCS file: /cvs/src/usr.sbin/ospfd/hello.c,v
> retrieving revision 1.21
> diff -u -p -r1.21 hello.c
> --- hello.c   18 Nov 2014 20:54:29 -0000      1.21
> +++ hello.c   9 Feb 2018 02:11:55 -0000
> @@ -188,7 +188,6 @@ recv_hello(struct iface *iface, struct i
>               nbr->dr.s_addr = hello.d_rtr;
>               nbr->bdr.s_addr = hello.bd_rtr;
>               nbr->priority = hello.rtr_priority;
> -             nbr_change = 1;
>       }
>  
>       /* actually the neighbor address shouldn't be stored on virtual links */
> @@ -201,8 +200,10 @@ recv_hello(struct iface *iface, struct i
>               memcpy(&nbr_id, buf, sizeof(nbr_id));
>               if (nbr_id == ospfe_router_id()) {
>                       /* seen myself */
> -                     if (nbr->state & NBR_STA_PRELIM)
> +                     if (nbr->state & NBR_STA_PRELIM) {
>                               nbr_fsm(nbr, NBR_EVT_2_WAY_RCVD);
> +                             nbr_change = 1;
> +                     }
>                       break;
>               }
>               buf += sizeof(nbr_id);
> 
