Hi all,

I'm hoping someone can help me understand and resolve an issue I'm experiencing while trying to do outbound load balancing.
Here is the setup. I have a LAN interface and 3 WAN connections:

-em0 (192.168.10.1) is the LAN interface.
-em1 (192.168.130.10) (gateway: 192.168.130.1) is the satellite interface.
-em2 (192.168.140.10) (gateway: 192.168.140.1) is the wifi interface.
-em3 (192.168.120.10) (gateway: 192.168.120.1) is the cellular interface.

I've got multi-path routing working on the gateway itself. I can start a ping on the gateway, drop any connection, and it will just switch to another one that's up. It's great. I need to be able to do the same thing for a PC behind the gateway, but I cannot seem to make that happen.

I'm following http://openbsd.das.ufsc.br/faq/pf/pools.html, specifically the guide at the bottom for load balancing outgoing traffic.

My test is pretty simple:

====
Start ifstated in debug mode with *only* em0 and em3 online.
------------------------------
$ sudo ifstated -d
initial state: auto
changing state to auto
running /usr/local/bin/unmetered > /dev/null
changing state to cellular-only
running pfctl -a optimal_route -F rules
rules cleared
running echo "pass in on em0 from (em0:network) route-to (em3 192.168.120.1)" | pfctl -a optimal_route -f -
started

Start pinging from the LAN PC behind the gateway and monitor the external gateway machines for icmp packets with tcpdump. The em3 logs show these:

21:24:23.547251 192.168.140.10 > resolver2.opendns.com: icmp: echo request
21:24:23.564112 resolver2.opendns.com > 192.168.140.10: icmp: echo reply

And we are good.
------------------------------
Then turn on em2. ifstated logs:

changing state to auto
running /usr/local/bin/unmetered > /dev/null
changing state to cellular-and-wifi
running pfctl -a optimal_route -F rules
rules cleared
running echo "pass in on em0 from (em0:network) route-to { (em2 192.168.140.1), (em3 192.168.120.1) } round-robin" | pfctl -a optimal_route -f -

Now both em2 and em3 are available. Traffic is still pinging through the em3 interface, which is fine - I would expect that.
------------------------------
Then turn OFF em3.
ifstated logs:

changing state to auto
running /usr/local/bin/unmetered > /dev/null
changing state to wifi-only
running pfctl -a optimal_route -F rules
rules cleared
running echo "pass in on em0 from em0:network route-to (em2 192.168.140.1)" | pfctl -a optimal_route -f -

Now here's where the problem comes in. When I turn off the em3 interface, pings do in fact start going through the em2 interface.

Here is the tcpdump from the gateway on the em3 interface before it went down:

21:37:17.958682 192.168.140.10 > resolver2.opendns.com: icmp: echo request
21:37:17.978438 resolver2.opendns.com > 192.168.140.10: icmp: echo reply

Here is the tcpdump from the gateway on the em2 interface after em3 went down:

21:37:19.530809 192.168.140.10 > resolver2.opendns.com: icmp: echo request
21:37:20.541179 192.168.140.10 > resolver2.opendns.com: icmp: echo request

As you can see, it started routing traffic out em2, but for whatever reason it's still trying to NAT the traffic to my em3 interface instead of the em2 interface, so the replies never reach the PC behind the gateway. If I stop the ping session and start it again without making any changes, it now goes through the em3 and things seem to be working fine.

I'm assuming that I'm doing something ignorant, but I'm not entirely sure where to look. Do you have any ideas?
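For reference, the rules that ifstated loaded in the three steps of the test above all follow one pattern. Below is a minimal sketch of that pattern as a shell function. The name build_route_rule is hypothetical and not part of my actual setup, em0 is hard-coded as the LAN interface, and the function only prints the rule text so it can be read before anything is loaded into pf.

```shell
#!/bin/sh
# Hypothetical helper (not part of the real config): print the pf rule that
# sends LAN traffic out through one or more "interface gateway" pairs.
# Assumes em0 is the LAN interface, as in the setup described above.
build_route_rule() {
    if [ "$#" -eq 0 ]; then
        echo "build_route_rule: need at least one 'if gw' pair" >&2
        return 1
    fi
    if [ "$#" -eq 1 ]; then
        # A single uplink needs no pool syntax.
        printf 'pass in on em0 from (em0:network) route-to (%s)\n' "$1"
        return 0
    fi
    # Several uplinks: build a round-robin pool of the form "{ (a), (b) }".
    pool=""
    for pair in "$@"; do
        if [ -z "$pool" ]; then
            pool="($pair)"
        else
            pool="$pool, ($pair)"
        fi
    done
    printf 'pass in on em0 from (em0:network) route-to { %s } round-robin\n' "$pool"
}

# Print the rules for the three steps of the test (nothing is loaded into pf).
build_route_rule "em3 192.168.120.1"
build_route_rule "em2 192.168.140.1" "em3 192.168.120.1"
build_route_rule "em2 192.168.140.1"
```

Piping the output into pfctl -a optimal_route -f - would load a rule the same way the run lines in my ifstated.conf do.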
-----------------------------------------------------------------------------------------------
/etc/pf.conf
-----------------------------------------------------------------------------------------------
lan_net="192.168.10.0/24"
int_if="em0"
sat_if="em1"
wif_if="em2"
cel_if="em3"
sat_gw="192.168.130.1"
wif_gw="192.168.140.1"
cel_gw="192.168.120.1"

tcp_services="{ 22, 113 }"
icmp_types="echoreq"

#Options
set block-policy return
#set loginterface egress
set skip on lo

#Block everything by default
block

#FTP Proxy
anchor "ftp-proxy/*"
pass in quick on $int_if inet proto tcp to any port ftp \
    divert-to 127.0.0.1 port 8021

#NATs
match out on $wif_if from $lan_net nat-to ($wif_if)
match out on $sat_if from $lan_net nat-to ($sat_if)
match out on $cel_if from $lan_net nat-to ($cel_if)

#Allow all lan traffic
pass out on $int_if to $lan_net

#Allow from lan to gateway
pass in quick on $int_if from $lan_net to $int_if

#Dynamically load the rules that dictate which route we take.
#Influenced by ifstated
anchor optimal_route

#do our filtering inbound from the lan.
#allow all outbound
pass out on $sat_if
pass out on $wif_if
pass out on $cel_if

#If an IP is from neighbor router networks, send it back through that interface.
#pass out on { $sat_if $cel_if } from $wif_if route-to ($wif_if $wif_gw)
#pass out on { $wif_if $cel_if } from $sat_if route-to ($sat_if $sat_gw)
#pass out on { $sat_if $wif_if } from $cel_if route-to ($cel_if $cel_gw)
pass out on { $sat_if $cel_if } from ($wif_if:network) route-to ($wif_if $wif_gw)
pass out on { $wif_if $cel_if } from ($sat_if:network) route-to ($sat_if $sat_gw)
pass out on { $sat_if $wif_if } from ($cel_if:network) route-to ($cel_if $cel_gw)

#Filter rules
antispoof quick for { lo $int_if }
pass in on egress inet proto tcp from any to (egress) port $tcp_services
pass in inet proto icmp all icmp-type $icmp_types

-----------------------------------------------------------------------------------------------
/etc/ifstated.conf
-----------------------------------------------------------------------------------------------
init-state auto

int_if = "em0"
sat_if = "em1"
wif_if = "em2"
cel_if = "em3"

# The "net" addresses are other addresses which can be used to determine
# whether we have connectivity. Make sure the hosts are always up, or
# test multiple ip's, 'or'-ing the tests.
net = '( "ping -q -c 1 -w 1 192.168.6.8 > /dev/null" every 10 && \
    "ping -q -c 1 -w 1 192.168.3.8 > /dev/null" every 10)'

unmetered = '("/usr/local/bin/unmetered > /dev/null" every 10 )'

state auto {
    #unmetered is a script located at /usr/local/bin/unmetered.
    #It merely checks to see if we are between the hours of 12am and 5am
    if ($sat_if.link.up && $unmetered) && $wif_if.link.down {
        set-state unmetered-satellite-only
    }
    if ($sat_if.link.up && $unmetered) && $wif_if.link.up {
        set-state unmetered-satellite-and-wifi
    }
    if ($sat_if.link.up && ! $unmetered) && $wif_if.link.up {
        set-state metered-satellite-and-wifi
    }
    if $wif_if.link.up && $cel_if.link.up && $sat_if.link.up {
        set-state all
    }
    if $cel_if.link.up && $sat_if.link.up && $wif_if.link.down {
        set-state cellular-and-satellite
    }
    if $cel_if.link.up && $sat_if.link.down && $wif_if.link.up {
        set-state cellular-and-wifi
    }
    if $wif_if.link.up && $sat_if.link.down && $cel_if.link.down {
        set-state wifi-only
    }
    if $wif_if.link.down && $sat_if.link.down && $cel_if.link.up {
        set-state cellular-only
    }
    if $wif_if.link.down && $sat_if.link.up && $cel_if.link.down {
        set-state satellite-only
    }
}

state unmetered-satellite-only {
    init {
        run 'pfctl -a optimal_route -F rules'
        run 'echo "pass in on em0 from 192.168.10.0/24 route-to (em1 192.168.130.1)" | pfctl -a optimal_route -f -'
    }
    if ! $unmetered
        set-state auto
    if $sat_if.link.down
        set-state auto
}

state unmetered-satellite-and-wifi {
    init {
        run 'pfctl -a optimal_route -F rules'
        run 'echo "pass in on em0 from 192.168.10.0/24 route-to { (em1 192.168.130.1), (em2 192.168.140.1) } round-robin" | pfctl -a optimal_route -f -'
    }
    if ! $unmetered
        set-state auto
    if $sat_if.link.down || $wif_if.link.down
        set-state auto
}

state metered-satellite-and-wifi {
    init {
        run 'pfctl -a optimal_route -F rules'
        run 'echo "pass in on em0 from 192.168.10.0/24 route-to { (em2 192.168.140.1) } round-robin" | pfctl -a optimal_route -f -'
    }
    if $unmetered
        set-state auto
    if $cel_if.link.up
        set-state auto
    if $sat_if.link.down || $wif_if.link.down
        set-state auto
}

state all {
    init {
        run 'pfctl -a optimal_route -F rules'
        run 'echo "pass in on em0 from 192.168.10.0/24 route-to { (em3 192.168.120.1), (em2 192.168.140.1) } round-robin" | pfctl -a optimal_route -f -'
    }
    if $unmetered
        set-state auto
    if $wif_if.link.down || $cel_if.link.down || $sat_if.link.down
        set-state auto
}

state cellular-and-satellite {
    init {
        run 'pfctl -a optimal_route -F rules'
        run 'echo "pass in on em0 from 192.168.10.0/24 route-to { (em3 192.168.120.1), (em1 192.168.130.1) }" | pfctl -a optimal_route -f -'
    }
    if $wif_if.link.up
        set-state auto
    if $cel_if.link.down || $sat_if.link.down
        set-state auto
    if $unmetered
        set-state auto
}

state cellular-and-wifi {
    init {
        run 'pfctl -a optimal_route -F rules'
        #run 'echo "match out on { em2 em3 } from (em0:network) nat-to { 192.168.140.1, 192.168.120.1 } " | pfctl -a optimal_route -f -'
        #run 'echo "match out on em3 from em0:network nat-to (em3)" | pfctl -a optimal_route -f -'
        run 'echo "pass in on em0 from (em0:network) route-to { (em2 192.168.140.1), (em3 192.168.120.1) } round-robin" | pfctl -a optimal_route -f -'
        #run 'echo "match out on { em2 em3 } from em0 nat-to { (em2), (em3)}" | pfctl -a optimal_route -f -'
        #run 'echo "pass out on { em1 } from em0:network route-to { (em2 192.168.140.1), (em3 192.168.120.1) }" | pfctl -a optimal_route -f -'
        #run 'echo "pass in on em0 from 192.168.10.0/24 route-to { (em3 192.168.120.1), (em2 192.168.140.1) }" | pfctl -a optimal_route -f -'
    }
    if $sat_if.link.up
        set-state auto
    if $cel_if.link.down || $wif_if.link.down
        set-state auto
}

state satellite-only {
    init {
        run 'pfctl -a optimal_route -F rules'
        #run 'echo "match out on em1 from (em0:network) nat-to (em1)" | pfctl -a optimal_route -f -'
        run 'echo "pass in on em0 from (em0:network) route-to (em1 192.168.130.1)" | pfctl -a optimal_route -f -'
        #run 'echo "pass out on { em2 em3 } from { em0:network } route-to { (em1 192.168.130.1) }" | pfctl -a optimal_route -f -'
        #run 'echo "pass in on em0 from 192.168.10.0/24 route-to (em1 192.168.130.1)" | pfctl -a optimal_route -f -'
    }
    if $wif_if.link.up || $cel_if.link.up
        set-state auto
    if $sat_if.link.down
        set-state auto
    if $unmetered
        set-state auto
}

state cellular-only {
    init {
        run 'pfctl -a optimal_route -F rules'
        #run 'echo "match out on em3 from (em0:network) nat-to (em3)" | pfctl -a optimal_route -f -'
        run 'echo "pass in on em0 from (em0:network) route-to (em3 192.168.120.1)" | pfctl -a optimal_route -f -'
        #run 'echo "pass out on { em1 em2 } from { em0:network } route-to { (em3 192.168.120.1) }" | pfctl -a optimal_route -f -'
        #run 'echo "pass in on em0 from 192.168.10.0/24 route-to (em3 192.168.120.1)" | pfctl -a optimal_route -f -'
    }
    if $wif_if.link.up || $sat_if.link.up
        set-state auto
    if $cel_if.link.down
        set-state auto
}

state wifi-only {
    init {
        run 'pfctl -a optimal_route -F rules'
        #run 'echo "match out on em2 from (em0:network) nat-to (em2)" | pfctl -a optimal_route -f -'
        run 'echo "pass in on em0 from em0:network route-to (em2 192.168.140.1)" | pfctl -a optimal_route -f -'
        #run 'echo "pass in on em2 from 192.168.10.0/24 route-to (em3 192.168.120.1)" | pfctl -a optimal_route -f -'
        #run 'echo "pass out on { em1 em3 } from { em0:network } route-to (em2 192.168.140.1)" | pfctl -a optimal_route -f -'
    }
    if $cel_if.link.up || $sat_if.link.up
        set-state auto
    if $wif_if.link.down
        set-state auto
}

Sorry for the giant email; I tried to attach what was relevant. If you need to inspect additional files, I can provide them.

Thank you for your time,
--Brad