On Fri, Sep 02, 2022 at 03:04:34PM +0200, YASUOKA Masahiko wrote:
> Hi,
> 
> The diff is to fix a problem in a complex setup.
> 
> Normal setup of divert-reply for TCP connection:
> 
>   client --- relayd  --- server
> 
> - transparently forward TCP connections
> - divert-reply is configured the outbound connection to the server
>   - so that the PF state is removed when the PCB is deleted
>   - otherwise if packets from server is comming after the PCB is
>     deleted, they are accidentally forwarded directly to the client
> 
> In addtion to this, "match out nat-to" is configured for the outbound
> connection instead of dropping "transparent".  The purpose of doing
> this is to expand the space of ephemeral ports of NAT.  Ephemeral
> ports of PCB is limitted in one 2^16 space, but ephemeral ports of PF
> is limitted in 2^16 for each remote address.
> 
> In this case, if the PF state is dropped immediately after the PCB is
> dropped, the port number of NAT might be reused quickly, then a
> problem can happen on the server side since the port is used for the
> old connection.
> 
> So the diff is to keep the state until timeout.
> 
> comment?

How does that work together with port reuse?

One reason I have introduced pf_remove_divert_state() is to behave
correctly in case the client does port reuse.

When client creates and closes a lot of connections it will reuse
its port before the timeout triggers.

We have code in pf_state_key_attach(), pf_test_state() and tcp_input()
to remove old states and create new ones in that case.  For divert
the old state has to be removed, so that the new packet reaches the
listen state.

I don't know if this still works with your diff.  Have you considered
it?

Reuse is tested in /usr/src/regress/sys/net/pf_divert/.  But I do
not use nat or rdr there.  So my test may not cover the code in
your diff as you check "key[PF_SK_WIRE] != si->s->key[PF_SK_STACK]".

I am running regress test with diff right now, we will see if it
still works.

bluhm

> Index: sys/net/pf.c
> ===================================================================
> RCS file: /cvs/src/sys/net/pf.c,v
> retrieving revision 1.1138
> diff -u -p -r1.1138 pf.c
> --- sys/net/pf.c      30 Aug 2022 11:53:03 -0000      1.1138
> +++ sys/net/pf.c      2 Sep 2022 12:54:36 -0000
> @@ -1148,6 +1148,8 @@ pf_find_state(struct pf_pdesc *pd, struc
>  
>       if (s == NULL)
>               return (PF_DROP);
> +     if (ISSET(s->state_flags, PFSTATE_INP_UNLINKED))
> +             return (PF_DROP);
>  
>       if (s->rule.ptr->pktrate.limit && pd->dir == s->direction) {
>               pf_add_threshold(&s->rule.ptr->pktrate);
> @@ -1461,7 +1463,23 @@ pf_remove_divert_state(struct pf_state_k
>               if (sk == si->s->key[PF_SK_STACK] && si->s->rule.ptr &&
>                   (si->s->rule.ptr->divert.type == PF_DIVERT_TO ||
>                   si->s->rule.ptr->divert.type == PF_DIVERT_REPLY)) {
> -                     pf_remove_state(si->s);
> +                     if (si->s->key[PF_SK_STACK]->proto == IPPROTO_TCP &&
> +                         si->s->key[PF_SK_WIRE] != si->s->key[PF_SK_STACK]) {
> +                             /*
> +                              * If the local address is translated, keep
> +                              * the state for "tcp.closed" seconds to
> +                              * prevent its source port from being reused.
> +                              */
> +                             if (si->s->src.state < TCPS_FIN_WAIT_2 ||
> +                                 si->s->dst.state < TCPS_FIN_WAIT_2) {
> +                                     pf_set_protostate(si->s, PF_PEER_BOTH,
> +                                         TCPS_TIME_WAIT);
> +                                     si->s->timeout = PFTM_TCP_CLOSED;
> +                                     si->s->expire = getuptime();
> +                             }
> +                             si->s->state_flags |= PFSTATE_INP_UNLINKED;
> +                     } else
> +                             pf_remove_state(si->s);
>                       break;
>               }
>       }
> Index: sys/net/pfvar.h
> ===================================================================
> RCS file: /cvs/src/sys/net/pfvar.h,v
> retrieving revision 1.509
> diff -u -p -r1.509 pfvar.h
> --- sys/net/pfvar.h   20 Jul 2022 09:33:11 -0000      1.509
> +++ sys/net/pfvar.h   2 Sep 2022 12:54:37 -0000
> @@ -784,6 +784,7 @@ struct pf_state {
>  #define      PFSTATE_RANDOMID        0x0080
>  #define      PFSTATE_SCRUB_TCP       0x0100
>  #define      PFSTATE_SETPRIO         0x0200
> +#define      PFSTATE_INP_UNLINKED    0x0400
>  #define      PFSTATE_SCRUBMASK 
> (PFSTATE_NODF|PFSTATE_RANDOMID|PFSTATE_SCRUB_TCP)
>  #define      PFSTATE_SETMASK   (PFSTATE_SETTOS|PFSTATE_SETPRIO)
>       u_int8_t                 log;
> 
> 
> 

Reply via email to