Hello,

thank you for detailed bug report. It clearly shows there is something odd. 

On Mon, May 08, 2023 at 01:30:24AM +0300, Joosep wrote:

</snip>

> When looking through the pf debug-level logs, 2 events stand up - first one
> has no problems, second one fails.Note that source port 53081 for nat exist
> in
> 2 different events (this particular test run was 66 dns requests long
> on an otherwise idle system):
> May  8 03:35:53 lab73 /bsd: pf: key search, in on vlan5: UDP wire: (0)
> 10.10.11.3:25217 10.10.10.1:53
> May  8 03:35:53 lab73 /bsd: pf: key setup: UDP wire: (0) 10.10.11.3:25217
> 10.10.10.1:53 stack: (0) -
> May  8 03:35:53 lab73 /bsd: pf: key search, out on em2: UDP wire: (0)
> 10.10.10.1:53 10.10.11.3:25217
> May  8 03:35:53 lab73 /bsd: pf: pf_map_addr: selected address 10.10.10.2
> May  8 03:35:53 lab73 /bsd: pf: key setup: UDP wire: (0) 10.10.10.1:53
> 10.10.10.2:53081 stack: (0) 10.10.10.1:53 10.10.11.3:25217
> May  8 03:35:53 lab73 /bsd: pf: key search, in on em2: UDP wire: (0)
> 10.10.10.1:53 10.10.10.2:53081
> May  8 03:35:53 lab73 /bsd: pf: key search, out on vlan5: UDP wire: (0)
> 10.10.11.3:25217 10.10.10.1:53
> .....
> May  8 03:36:02 lab73 /bsd: pf: key search, in on vlan5: UDP wire: (0)
> 10.10.11.3:17189 10.10.10.1:53
> May  8 03:36:02 lab73 /bsd: pf: key setup: UDP wire: (0) 10.10.11.3:17189
> 10.10.10.1:53 stack: (0) -
> May  8 03:36:02 lab73 /bsd: pf: key search, out on em2: UDP wire: (0)
> 10.10.10.1:53 10.10.11.3:17189
> May  8 03:36:02 lab73 /bsd: pf: pf_map_addr: selected address 10.10.10.2
> May  8 03:36:02 lab73 /bsd: pf: key setup: UDP wire: (0) 10.10.10.1:53
> 10.10.10.2:53081 stack: (0) 10.10.10.1:53 10.10.11.3:17189
> May  8 03:36:02 lab73 /bsd: pf: wire key attach failed on all: UDP out
> wire: (0) 10.10.10.1:53 10.10.10.2:53081 1:0 @3, existing: UDP out wire:
> (0) 10.10.10.1:53 10.10.10.2:53081 stack: (0) 10.10.10.1:53 10.10.11.3:25217
> 2:1 @3
> May  8 03:36:07 lab73 /bsd: pf: key search, in on vlan5: UDP wire: (0)
> 10.10.11.3:17189 10.10.10.1:53
> May  8 03:36:07 lab73 /bsd: pf: key search, out on em2: UDP wire: (0)
> 10.10.10.1:53 10.10.11.3:17189
> May  8 03:36:07 lab73 /bsd: pf: pf_map_addr: selected address 10.10.10.2
> May  8 03:36:07 lab73 /bsd: pf: key setup: UDP wire: (0) 10.10.10.1:53
> 10.10.10.2:57351 stack: (0) 10.10.10.1:53 10.10.11.3:17189
> May  8 03:36:07 lab73 /bsd: pf: key search, in on em2: UDP wire: (0)
> 10.10.10.1:53 10.10.10.2:57351
> May  8 03:36:07 lab73 /bsd: pf: key search, out on vlan5: UDP wire: (0)
> 10.10.11.3:17189 10.10.10.1:53
>
> It seems to me (i might be wrong), that system selects source ports
> for nat in a way, that makes collisions likely.
>

this is very odd. source port for NAT is selected in pf_get_sport()
function found in pf_lb.c. the relevant part reads as follows:

182         do {
183                 key.af = pd->naf;
184                 key.proto = pd->proto;
185                 key.rdomain = pd->rdomain;
...
203                 } else if (low == 0 && high == 0) {
204                         key.port[sidx] = pd->nsport;
...
215                 } else {
216                         u_int32_t tmp;
217
218                         if (low > high) {
219                                 tmp = low;
220                                 low = high;
221                                 high = tmp;
222                         }
223                         /* low < high */
224                         cut = arc4random_uniform(1 + high - low) + low;
225                         /* low <= cut <= high */
226                         for (tmp = cut; tmp <= high && tmp <= 0xffff; 
++tmp) {
227                                 key.port[sidx] = htons(tmp);
228                                 if (pf_find_state_all(&key, dir, NULL) ==
229                                     NULL && !in_baddynamic(tmp, pd->proto)) 
{
230                                         *nport = htons(tmp);
231                                         return (0);
232                                 }
233                         }
234                         tmp = cut;
235                         for (tmp -= 1; tmp >= low && tmp <= 0xffff; --tmp) {
236                                 key.port[sidx] = htons(tmp);
237                                 if (pf_find_state_all(&key, dir, NULL) ==
238                                     NULL && !in_baddynamic(tmp, pd->proto)) 
{
239                                         *nport = htons(tmp);
240                                         return (0);
241                                 }
242                         }
243                 }

    If I understand code in parse.y right, the low is PF_NAT_PROXY_PORT_LOW 
(50001)
    and high is PF_NAT_PROXY_PORT_HIGH (65535), therefore port selection process
    happens in else branch 216 - 243.  There is a for loop at lines 235 - 240
    which tries really hard to find unused port. The code uses 
pf_find_state_all()
    to make sure the selected port is unused in state table.

    if this code suggests port number then pf_find_state_all() somehow 
guaranties
    the port can not be found in table. the odd thing is that 
pf_state_key_attach()
    reports that port number exists already in table (is in conflict).

    both calls: call to pf_get_sport() and call to pf_state_key_attach() happen
    under PF_LOCK(), so they are atomic nothing else can mess up with table.

    after walking through the code I suspect we fail to initialize look up for
    pf_find_state_all() properly, we forgot to compute a toeplitz hash. Can you
    give a try to diff below to see if it fixes NAT?

thanks a lot
regards
sashan

--------8<---------------8<---------------8<------------------8<--------

diff --git a/sys/net/pf_lb.c b/sys/net/pf_lb.c
index a3d8d6333cd..b893dd819df 100644
--- a/sys/net/pf_lb.c
+++ b/sys/net/pf_lb.c
@@ -196,18 +196,24 @@ pf_get_sport(struct pf_pdesc *pd, struct pf_rule *r,
                        /* XXX bug: icmp states dont use the id on both
                         * XXX sides (traceroute -I through nat) */
                        key.port[sidx] = pd->nsport;
+                       key.hash = pf_pkt_hash(key.af, key.proto, &key.addr[0],
+                           &key.addr[1], key.port[0], key.port[1]);
                        if (pf_find_state_all(&key, dir, NULL) == NULL) {
                                *nport = pd->nsport;
                                return (0);
                        }
                } else if (low == 0 && high == 0) {
                        key.port[sidx] = pd->nsport;
+                       key.hash = pf_pkt_hash(key.af, key.proto, &key.addr[0],
+                           &key.addr[1], key.port[0], key.port[1]);
                        if (pf_find_state_all(&key, dir, NULL) == NULL) {
                                *nport = pd->nsport;
                                return (0);
                        }
                } else if (low == high) {
                        key.port[sidx] = htons(low);
+                       key.hash = pf_pkt_hash(key.af, key.proto, &key.addr[0],
+                           &key.addr[1], key.port[0], key.port[1]);
                        if (pf_find_state_all(&key, dir, NULL) == NULL) {
                                *nport = htons(low);
                                return (0);
@@ -225,6 +231,9 @@ pf_get_sport(struct pf_pdesc *pd, struct pf_rule *r,
                        /* low <= cut <= high */
                        for (tmp = cut; tmp <= high && tmp <= 0xffff; ++tmp) {
                                key.port[sidx] = htons(tmp);
+                               key.hash = pf_pkt_hash(key.af, key.proto,
+                                   &key.addr[0], &key.addr[1], key.port[0],
+                                   key.port[1]);
                                if (pf_find_state_all(&key, dir, NULL) ==
                                    NULL && !in_baddynamic(tmp, pd->proto)) {
                                        *nport = htons(tmp);
@@ -234,6 +243,9 @@ pf_get_sport(struct pf_pdesc *pd, struct pf_rule *r,
                        tmp = cut;
                        for (tmp -= 1; tmp >= low && tmp <= 0xffff; --tmp) {
                                key.port[sidx] = htons(tmp);
+                               key.hash = pf_pkt_hash(key.af, key.proto,
+                                   &key.addr[0], &key.addr[1], key.port[0],
+                                   key.port[1]);
                                if (pf_find_state_all(&key, dir, NULL) ==
                                    NULL && !in_baddynamic(tmp, pd->proto)) {
                                        *nport = htons(tmp);
diff --git a/sys/net/pfvar_priv.h b/sys/net/pfvar_priv.h
index 741125e447e..147cd7d8f75 100644
--- a/sys/net/pfvar_priv.h
+++ b/sys/net/pfvar_priv.h
@@ -406,6 +406,9 @@ void                        pf_state_peer_hton(const struct 
pf_state_peer *,
                            struct pfsync_state_peer *);
 void                   pf_state_peer_ntoh(const struct pfsync_state_peer *,
                            struct pf_state_peer *);
+u_int16_t              pf_pkt_hash(sa_family_t, uint8_t,
+                           const struct pf_addr *, const struct pf_addr *,
+                           uint16_t, uint16_t);
 
 #endif /* _KERNEL */
 

Reply via email to