Hi,

Thanks so much for the diff, it appears to have resolved the issue.

We are now trying to establish whether we need the fix widely deployed or
only on the box that originates with the large LSA updates, pushing it over
the 1500mtu.

We are going to run some tests, but our expectation is that when the DR
sends the message from the originating router on to its neighbors that they
will then see the same issue.

Out of interest is there any way of just announcing a single network.

In this particular case the large LS-Update is caused because we have many
interfaces, but these are all carp so will failover in one hit anyway. We
have allocated 10.128.0.0/16 to this firewall so there are many networks,
but anything in our network with a destination of 10.128.0.0/16 can end up
here.

We tried something like *redistribute 10.128.0.0/16 <http://10.128.0.0/16>
depend on carp0*, but what that appears to do is limit advertisements to
the subnets that fall within that range, so we still have a very large LSA
update anyway.

Just wondering if there was any workaround, as it would just simplify
processing etc.

It is probably a non issue anyway now, with the fix, but just interested if
anyone has done anything similar.

Regards

Richard



On Wed, May 6, 2020 at 9:52 AM Claudio Jeker <cje...@diehard.n-r-g.com>
wrote:

> On Wed, May 06, 2020 at 09:33:11AM +0100, Richard Chivers wrote:
> > Hi,
> >
> > Some progress has been made, we can now replicate this consistently and
> it
> > appears that whenever a LS update exceeds the mtu (1500) we get this
> issue
> > of lsa_check bad age.
> >
> > When running with the diff Claudio sent we start getting a bunch of
> errors
> > complaining about:
> >
> > recv_ls_update: bad packet size, neighbor ID x.x.x.x
> > lsa_check: bad packet size
> >
> > We don't ever move to a state of FULL/DR or similar.
> >
> > Does anyone have any suggestions? We are just starting to look at the
> wider
> > code to see if we can comprehend what may be occurring, but it will
> > likely be a steep learning curve :)
> >
>
> Just realized that my diff was wrong since ibuf_reserve() would change the
> write position of the buffer and so you end up with some empty space in
> the buffer.
>
> Here is a better diff. This is using ibuf_size to get the current write
> position and then ibuf_seek() to write the age back into the right spot.
> Using the position instead of the pointer has the benefit that a realloc()
> in ibuf_add() will not result in the stale pointer to lsage that the
> current code has.
>
> I have currently no ospf setup so my testing is limited.
> --
> :wq Claudio
>
> Index: lsupdate.c
> ===================================================================
> RCS file: /cvs/src/usr.sbin/ospfd/lsupdate.c,v
> retrieving revision 1.47
> diff -u -p -r1.47 lsupdate.c
> --- lsupdate.c  19 Nov 2019 09:55:55 -0000      1.47
> +++ lsupdate.c  6 May 2020 08:48:19 -0000
> @@ -175,8 +175,8 @@ int
>  add_ls_update(struct ibuf *buf, struct iface *iface, void *data,
> u_int16_t len,
>      u_int16_t older)
>  {
> -       void            *lsage;
> -       u_int16_t        age;
> +       size_t          ageoff;
> +       u_int16_t       age;
>
>         if ((size_t)iface->mtu < sizeof(struct ip) + sizeof(struct
> ospf_hdr) +
>             sizeof(u_int32_t) + ibuf_size(buf) + len + MD5_DIGEST_LENGTH) {
> @@ -186,7 +186,7 @@ add_ls_update(struct ibuf *buf, struct i
>                         return (0);
>         }
>
> -       lsage = ibuf_reserve(buf, 0);
> +       ageoff = ibuf_size(buf);
>         if (ibuf_add(buf, data, len)) {
>                 log_warn("add_ls_update");
>                 return (0);
> @@ -198,7 +198,7 @@ add_ls_update(struct ibuf *buf, struct i
>         if ((age += older + iface->transmit_delay) >= MAX_AGE)
>                 age = MAX_AGE;
>         age = htons(age);
> -       memcpy(lsage, &age, sizeof(age));
> +       memcpy(ibuf_seek(buf, ageoff, sizeof(age)), &age, sizeof(age));
>
>         return (1);
>  }
>

Reply via email to