Ok, here goes.
Describe the general setup, what interfaces you have, where those specific connections flow through (what interface does the first SYN arrive through, which interface is connected to the default gateway, what interface should it be routed to).
It is a three-legged firewall. rl0 has IP xxx.5.11.200 and is connected to the default gateway, which has IP xxx.5.11.2. rl1 is an internal interface. rl2 is another external interface with IP yyy.253.135.162; behind it is a Cisco router with IP yyy.253.135.161. So the firewall has two ways of reaching the Internet: through rl0 or through rl2.
The problematic machine has IP xxx.5.11.201. Its packets arrive on rl0, but we want that specific machine to use the gateway reachable through rl2 rather than the default gateway. To make this work I use a nat rule and a route-to rule.
When the first SYN arrives on the first interface, does the rule translate it, does the rule create state (pfctl -vvss output might help)? Do any other rules (matching on other interfaces) try to create state, too? Any other translations?
These are the rules that affect all connections for that interface and those IPs.
nat on rl0 from xxx.5.11.201 to any -> rl2
nat on rl2 from xxx.5.11.201 to any -> rl2
block in log all
block out log all
pass out log quick on rl0 route-to (rl2 yyy.253.135.161) from yyy.253.135.162 to any label bp-traffic
pass out quick on rl0 all keep state
block in log quick on rl0 from xxx.5.11.201 to 192.169.69.0/24
pass in log quick on rl0 proto tcp from xxx.5.11.201 to yyy.253.135.165 port 3128 flags S/SA keep state label biopolis-squid
pass in log quick on rl0 from xxx.5.11.201 to any keep state
For the state insert failures you get from /var/log/messages with pfctl -xm, can you try to provide one example of a single connection, including tcpdump of the first SYN on all interfaces, any states that are related to that connection (pfctl -vvss) and the state failure message itself? A state insert fails when there is another state entry with conflicting key (source/destination address/port), which can occur when translations and route-to mess up.
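As a rough illustration of the conflicting-key condition described above, the following Python sketch models a state table as a dictionary keyed on protocol, gateway-side address and external address. It is a toy model, not pf's actual data structure: the class and field names are hypothetical, and the idea that a second, untranslated state is attempted after the translated one already exists is an assumption, not something confirmed here. The addresses are the ones that appear in the error message and state listing of this report.

```python
from typing import Dict, NamedTuple


class State(NamedTuple):
    """Simplified state entry (hypothetical model, not pf's struct)."""
    lan: str   # original source address:port
    gwy: str   # source address:port after translation
    ext: str   # external peer address:port
    note: str  # free-form description of where the state came from


class StateTable:
    """Toy table with one index, loosely modeled on 'tree_ext_gwy'."""

    def __init__(self) -> None:
        self._ext_gwy: Dict[tuple, State] = {}

    def insert(self, proto: str, state: State) -> bool:
        # The key ignores the lan side, so two states with different
        # lan addresses but the same gwy/ext pair collide.
        key = (proto, state.ext, state.gwy)
        if key in self._ext_gwy:
            return False  # corresponds to "state insert failed"
        self._ext_gwy[key] = state
        return True


table = StateTable()

# State with translation: lan and gwy differ.
translated = State(
    lan="xxx.5.11.201:22828",
    gwy="yyy.253.135.162:58415",
    ext="zzz.13.199.159:443",
    note="created with nat applied",
)

# Assumed second attempt: the packet already carries the translated
# source, so lan == gwy, as in the logged failure message.
untranslated = State(
    lan="yyy.253.135.162:58415",
    gwy="yyy.253.135.162:58415",
    ext="zzz.13.199.159:443",
    note="assumed second state for the same packet",
)

first_ok = table.insert("tcp", translated)
second_ok = table.insert("tcp", untranslated)
print(first_ok, second_ok)
```

In this model the second insert is rejected because both entries share the same (protocol, ext, gwy) key even though their lan sides differ, which matches the shape of the logged message (lan and gwy identical, ext unchanged).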
I run "telnet zzz.13.199.159 443" from xxx.5.11.201. The connection should
come in through the firewall on rl0, have nat and route-to applied, and
then go out on rl2. This is what happens.
1. This is the error message I receive
Sep 20 09:39:53 ouzo /bsd: pf: state insert failed: tree_ext_gwy lan: yyy.253.135.162:58415 gwy: yyy.253.135.162:58415 ext: zzz.13.199.159:443
Sep 20 09:39:53 ouzo /bsd: pf: state insert failed: tree_ext_gwy lan: yyy.253.135.162:58415 gwy: yyy.253.135.162:58415 ext: zzz.13.199.159:443
2. This is the output from the logging interface.
# tcpdump -e -n -i pflog0 host zzz.13.199.159
09:39:53.205971 rule 18/0(match): pass in on rl0: xxx.5.11.201.22828 > zzz.13.199.159.443: S 406304494:406304494(0) win 16384 <mss 1460,nop,nop,sackOK,[|tcp]> (DF) [tos 0x10]
09:39:53.206013 rule 2/0(match): pass out on rl0: yyy.253.135.162.58415 > zzz.13.199.159.443: S 406304494:406304494(0) win 16384 <mss 1460,nop,nop,sackOK,[|tcp]> (DF) [tos 0x10]
3. This is the output from the incoming interface. You can see that the SYN packet has to be transmitted twice.
# tcpdump -n -i rl0 host zzz.13.199.159
tcpdump: listening on rl0
09:39:53.205942 xxx.5.11.201.22828 > zzz.13.199.159.443: S 406304494:406304494(0) win 16384 <mss 1460,nop,nop,sackOK,nop,wscale 0,nop,nop,timestamp 531607929 0> (DF) [tos 0x10]
09:39:59.198391 xxx.5.11.201.22828 > zzz.13.199.159.443: S 406304494:406304494(0) win 16384 <mss 1460,nop,nop,sackOK,nop,wscale 0,nop,nop,timestamp 531607941 0> (DF) [tos 0x10]
09:39:59.993125 zzz.13.199.159.443 > xxx.5.11.201.22828: S 3673602358:3673602358(0) ack 406304495 win 5792 <mss 1460,sackOK,timestamp 158589077 531607941,nop,wscale 0> (DF)
09:39:59.995990 xxx.5.11.201.22828 > zzz.13.199.159.443: . ack 1 win 17376 <nop,nop,timestamp 531607943 158589077> (DF) [tos 0x10]
4. This is the output from the outgoing interface rl2. The first SYN packet is never visible on this link but the second comes through.
# tcpdump -n -i rl2 host zzz.13.199.159
tcpdump: listening on rl2
09:39:59.199400 yyy.253.135.162.58415 > zzz.13.199.159.443: S 406304494:406304494(0) win 16384 <mss 1460,nop,nop,sackOK,nop,wscale 0,nop,nop,timestamp 531607941 0> (DF) [tos 0x10]
09:39:59.993089 zzz.13.199.159.443 > yyy.253.135.162.58415: S 3673602358:3673602358(0) ack 406304495 win 5792 <mss 1460,sackOK,timestamp 158589077 531607941,nop,wscale 0> (DF)
09:39:59.996031 yyy.253.135.162.58415 > zzz.13.199.159.443: . ack 1 win 17376 <nop,nop,timestamp 531607943 158589077> (DF) [tos 0x10]
5. Here are the two states that are created for this connection
tcp xxx.5.11.201:22828 -> yyy.253.135.162:58415 -> zzz.13.199.159:443 ESTABLISHED:ESTABLISHED
[406304495 + 5792] wscale 0 [3673602359 + 621381321] wscale 0
age 00:00:07, expires in 23:59:59, 3:1 pkts, 180:60 bytes, rule 2
tcp zzz.13.199.159:443 <- xxx.5.11.201:22828 ESTABLISHED:ESTABLISHED
[3673602359 + 621381321] wscale 0 [406304495 + 5792] wscale 0
age 00:00:07, expires in 23:59:59, 3:1 pkts, 180:60 bytes, rule 18
I'll have to walk through the code manually to find out what is broken. What I need is all the information related to one such connection (what interfaces packets flow through, what rules they match there, and which states they create).
I hope this helps. I'm happy to provide more information if needed. Testing is also no problem.
Thanks a lot, Nickus
