On Mon, Mar 06, 2017 at 14:46 +0100, Claudio Jeker wrote:
> On Mon, Mar 06, 2017 at 01:04:28PM +0100, Mike Belopuhov wrote:
> > On Fri, Mar 03, 2017 at 16:58 +0100, [email protected] wrote:
> > > Hi,
> > >
> > > here is a patch which fixes an integer overflow in the computation of
> > > adaptive timeouts. The effect of this integer overflow is that
> > > depending on timeout values, legitimate established connection states
> > > can be evicted immediately (although other states with shorter
> > > timeouts would be available but that's irrelevant).
> > >
> > > Examples of such values would be:
> > >
> > > timeout tcp.established 86400s (default)
> > > limit states 200000
> > >
> > > leads to a computed timeout of exactly 0 when the number of states is
> > > 140579.
> > >
> > > While this condition is true, if the flush routine stumbles upon an
> > > established connection, it is immediately dropped. This is not
> > > guaranteed to happen because the flush routine checks only
> > > (100/interval)% of the states each second.
> > >
> > > Here are values for a lower number of states in order to help
> > > reproduce the problem:
> > >
> > > timeout tcp.established 1048576s
> > > limit states 6830
> > >
> > > the 4100th state may lead to the eviction of an established connection.
> > >
> > > We tried to provide the least intrusive patch and it works for us;
> > > feel free to use it.
> > >
> > > Regards,
> > > Mathieu
> > >
> > >
> > > --- sys/net/pf.c Mon Jul 18 15:17:44 2016
> > > +++ sys/net/pf.c Fri Jan 20 13:10:04 2017
> > > @@ -1223,7 +1223,7 @@
> > > if (states >= end)
> > > return (0);
> > >
> > > - timeout = timeout * (end - states) / (end - start);
> > > + timeout = (u_int64_t) timeout * (end - states) / (end - start);
> > > }
> > >
> > > return (state->expire + timeout);
> > > EOF
> > >
> >
> > I think your diff looks alright. I've double checked that it fixes the
> > overflow you're describing.
> >
>
> Agreed. I was toying a bit more around and came up with this version. Not
> sure if it is better. The idea is that (end - states) is very large when
> walking into adaptive timeouts and shrinking. I changed it to (states -
> start) which grows the bigger the table becomes.
> Also using more u_int32_t seems to be a good idea.
>
In several (start, end) ranges of number of states that I've tested,
there's virtually no difference between these two versions so I'd
rather not complicate the code unless there's no compelling reason
to do so. Do you have a test in mind where this avoids some corner
case?
Either way, your or the submitted version are OK mikeb.
> --
> :wq Claudio
>
> Index: net/pf.c
> ===================================================================
> RCS file: /cvs/src/sys/net/pf.c,v
> retrieving revision 1.1015
> diff -u -p -r1.1015 pf.c
> --- net/pf.c 9 Feb 2017 15:19:32 -0000 1.1015
> +++ net/pf.c 6 Mar 2017 13:40:23 -0000
> @@ -1199,7 +1199,7 @@ pf_purge_thread(void *v)
> int32_t
> pf_state_expires(const struct pf_state *state)
> {
> - int32_t timeout;
> + u_int32_t timeout;
> u_int32_t start;
> u_int32_t end;
> u_int32_t states;
> @@ -1225,10 +1225,16 @@ pf_state_expires(const struct pf_state *
> states = pf_status.states;
> }
> if (end && states > start && start < end) {
> + u_int64_t t;
> +
> if (states >= end)
> return (0);
>
> - timeout = timeout * (end - states) / (end - start);
> + t = (u_int64_t)timeout * (states - start) / (end - start);
> + if (t >= timeout)
> + timeout = 0;
> + else
> + timeout -= t;
> }
>
> return (state->expire + timeout);