vpn bridge misbehavior
Greetings all,

Last week I described briefly a problem with *return* TCP/IP traffic only, across a LAN-to-LAN VPN network bridge, only on the first connection. I appreciate your responses and so now as you've requested I have composed a detailed network topology and configuration document in order to properly describe the problem in detail.

This email best viewed with a fixed-width font. (Pardon the ASCII art.)

#
#THE NETWORK:
#

                      PUBLIC INTERNET
      |                      |                      |
 -----------            -----------            -----------
|firewall 3 |__link_A__|firewall 1 |__link_B__|firewall 2 |
|OpenBSD 4.0|          |OpenBSD 3.9|          |OpenBSD 4.0|
 -----------            -----------            -----------
      |                      |                      |
172.18.1.0/24        192.168.248.0/21       192.168.254.0/24
172.18.2.0/24                |                      |
172.18.3.0/24           -----------            -----------
                       | router 2  |          | router 1  |
                       |OpenBSD 4.0|          |OS X 10.4.8|
                        -----------            -----------
                             |                      |
                         X.X.X.X/X            172.17.1.0/24
                                              172.17.2.0/24
                                              172.17.3.0/24

#--
#Internal IP addresses:
#--
firewall 1: 192.168.250.1
firewall 2: 192.168.254.1
firewall 3: 172.18.1.1, 172.18.2.1, 172.18.3.1
router 1:   192.168.254.2, 172.17.1.1, 172.17.2.1, 172.17.3.1
router 2:   192.168.250.3, X.X.X.X

The network behind firewall 1 is the primary network. Software developers sit within its private address space. Firewall 1 is also running two OpenVPN server instances; one instance is configured as a routed tunnel, one instance is configured as a bridged tunnel.

Firewalls 2 and 3 are both at remote locations. They each run an OpenVPN client instance which connects back to firewall 1. In the diagram above link A represents the routed OpenVPN tunnel and link B represents the bridged tunnel.

You'll notice that firewall 2, which connects to the bridged tunnel, handles a private IP range that is in fact a subset of firewall 1's. The art department's desktops lie within that 192.168.254.0/24 range at firewall 2's remote site. It should be noted here that developers also regularly connect to either tunnel when working from home or the road.
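The subset relationship between the two address ranges can be checked mechanically. The following is a minimal sketch in Python using only the addresses listed in the post; the variable and function names are illustrative and not part of the poster's setup.

```python
import ipaddress

# Ranges as given in the post.
primary = ipaddress.ip_network("192.168.248.0/21")  # behind firewall 1
remote = ipaddress.ip_network("192.168.254.0/24")   # behind firewall 2 (bridged)

# Internal addresses as listed in the post.
hosts = {
    "firewall 1": "192.168.250.1",
    "firewall 2": "192.168.254.1",
    "router 1": "192.168.254.2",
    "router 2": "192.168.250.3",
}


def in_primary(addr: str) -> bool:
    """Return True if addr falls inside firewall 1's /21."""
    return ipaddress.ip_address(addr) in primary


# The /24 at the remote site should be contained in the /21.
print("remote is a subset of primary:", remote.subnet_of(primary))

for name, addr in hosts.items():
    in_remote = ipaddress.ip_address(addr) in remote
    print(f"{name:10s} {addr:15s} in /21: {in_primary(addr)}  in /24: {in_remote}")
```

Because the /24 sits inside the /21, hosts on either side of link B would treat each other as on-link, which fits the use of a bridged rather than a routed tunnel there.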
The bridged tunnel is configured to forward all traffic while the routed tunnel is configured to only forward appropriate traffic.

Router 1 sits within the range of addresses that firewall 2 bridges to firewall 1's network and routes traffic for 3 separate subnets (sub-subnets?) of Intel Mac mini development server clusters (God, please don't ask why).

Router 2 sits within the main part of firewall 1's local network and acts as an OpenVPN client just for routing traffic to/from a network at another remote site whose administration is not within my jurisdiction.

Theoretically TCP/IP traffic should be able to pass from any part of the network to any other part of the network and back. For that matter, this all in fact seems to work correctly despite the fact that I'm a total rookie network admin... with one notable exception, which brings us to the problem.

#
#THE PROBLEM:
#

Now for the fun part. I've been adding routes and configurations and pf rules and such as we build this network out. Some branches (notably the 172.17 and 172.18 subnets) are very new additions. The only part where known problems persist (and here lies the point of this whole email) is with access from developer desktops behind firewall 1 (also all OS X 10.4.8 machines, fwiw) to the Mac mini dev cluster behind router 1.

The really odd part of this problem is that it's only a problem for any given client the first time it connects to a Mac dev server in any given day. What's even weirder is that it doesn't appear to be a problem with the Macs themselves or with router 1, which is also a Mac. Traffic reaches the Mac mini server, and tcpdump verifies that it comes all the way back through the tunnel and appears on tun1 of firewall 1, but it fails to get passed over the bridge (bridge0) between firewall 1's tunnel endpoint and its local ethernet device (sis0). This first connection always times out; however, immediately retrying always works.
Pinging always works though and pinging first always eliminates the first-connection failure of the following TCP/IP connections - but for only that client and the strange little cycle seems to reset itself sometime between the end of the business day and
Re: vpn bridge misbehavior
On Thu, Feb 01, 2007 at 05:25:05PM -0800, Jonathan Whiteman wrote:

> Greetings. Is there a commonly known cause of *return* TCP/IP traffic to reach but be dropped rather than passed back across a bridge (ala bridgename.bridge0) but... get this... only on the first try?
>
> I'd like to get into a detailed explanation of the network topology I'm working with here but I don't want to scare off anyone by opening with a 3 page email. The bridge seems to work fine for everything except every 24 hours or so (may be less... like say 2-8 hours actually?) individual clients trying to access services on a *certain* cluster of servers on the other side of the bridge have to either first ping the server (which always works) or else just accept that their first connection attempt WILL time out but the second one WILL succeed.
>
> Obvious issues like the server machines or even just their network devices going to sleep because of misconfigured power management have already been excluded as a possibility because tcpdumping on both devices in the bridge clearly shows missing return traffic only being passed back to the other device AFTER the first attempt.
>
> Anyway, any advice is greatly appreciated.

While OpenBSD doesn't do that, ISTR some other VPN implementations offering to open tunnels 'on demand' (and, presumably, close them when not needed). Could this be involved in this case?

Still, I don't know why that would only be a problem one way, but if this seems to depend on the tunnel in use, something like this might be the case.

Joachim
Re: vpn bridge misbehavior
> Is there a commonly known cause of *return* TCP/IP traffic to reach but be dropped rather than passed back across a bridge (ala bridgename.bridge0) but... get this... only on the first try?

if this is a long-lived TCP connection, perhaps firewall states have timed out. if so, adjusting timers may help, either for certain rules, or use the 'set optimization' shortcuts (these set up default values for tcp.first, tcp.opening etc - src/sbin/pfctl/pfctl.c shows that aggressive sets tcp.established to 5h, normal 24h, conservative 5d).

without the 3 page email it's guesswork though.

> I'd like to get into a detailed explanation of the network topology I'm working with here but I don't want to scare off anyone by opening with a 3 page email.

people can always skip the email, most people who will be able to help would prefer to have the information in one place rather than back-and-forth to find it out. in most cases, actual configs and output from system commands work better than descriptions.

note that the process of gathering all the relevant information for a good list post will often highlight the actual problem :-)
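To make the state-timeout suggestion concrete, here is a small Python sketch that compares an idle period against the tcp.established values quoted above (5h, 24h, 5d). It is a simplified model for reasoning only: it assumes a state is dropped once it has been idle longer than the timeout and ignores pf's other timers and adaptive timeouts. The function names are made up for this illustration.

```python
# tcp.established defaults per 'set optimization' profile, as quoted in the
# reply above (in seconds).
TCP_ESTABLISHED_SECONDS = {
    "aggressive": 5 * 60 * 60,         # 5 hours
    "normal": 24 * 60 * 60,            # 24 hours
    "conservative": 5 * 24 * 60 * 60,  # 5 days
}


def state_expired(idle_seconds: int, profile: str = "normal") -> bool:
    """Simplified model: a state is gone once idle longer than the timeout."""
    return idle_seconds > TCP_ESTABLISHED_SECONDS[profile]


def profiles_surviving(idle_seconds: int) -> list:
    """List the profiles under which an idle connection keeps its state."""
    return [
        name
        for name, limit in TCP_ESTABLISHED_SECONDS.items()
        if idle_seconds <= limit
    ]


# Example: a connection left idle for 8 hours (the upper end of the
# "2-8 hours" window mentioned in the original question).
idle = 8 * 60 * 60
for name in TCP_ESTABLISHED_SECONDS:
    print(f"{name:12s} expired after 8h idle: {state_expired(idle, name)}")
print("profiles that keep the state:", profiles_surviving(idle))
```

Under this model only the aggressive profile would drop a state within the 2-8 hour window described, so the active optimization setting is worth checking before adjusting individual timers.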
vpn bridge misbehavior
Greetings.

Is there a commonly known cause of *return* TCP/IP traffic to reach but be dropped rather than passed back across a bridge (ala bridgename.bridge0) but... get this... only on the first try?

I'd like to get into a detailed explanation of the network topology I'm working with here but I don't want to scare off anyone by opening with a 3 page email. The bridge seems to work fine for everything except every 24 hours or so (may be less... like say 2-8 hours actually?) individual clients trying to access services on a *certain* cluster of servers on the other side of the bridge have to either first ping the server (which always works) or else just accept that their first connection attempt WILL time out but the second one WILL succeed.

Obvious issues like the server machines or even just their network devices going to sleep because of misconfigured power management have already been excluded as a possibility because tcpdumping on both devices in the bridge clearly shows missing return traffic only being passed back to the other device AFTER the first attempt.

Anyway, any advice is greatly appreciated.

thanks,
~jon
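The "first attempt times out, second succeeds" pattern can be recorded with a small probe run from an affected client. This is a minimal sketch in Python; the function name, default timeout, and attempt count are assumptions chosen for illustration, not anything taken from the poster's setup.

```python
import socket
import time


def probe(host: str, port: int, attempts: int = 2, timeout: float = 5.0) -> list:
    """Try to open a TCP connection several times in a row.

    Returns one (succeeded, elapsed_seconds) tuple per attempt, so a
    first-attempt timeout followed by a quick success is easy to spot.
    """
    results = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            # create_connection raises OSError (including timeouts) on failure.
            with socket.create_connection((host, port), timeout=timeout):
                ok = True
        except OSError:
            ok = False
        results.append((ok, time.monotonic() - start))
    return results
```

Calling, for example, `probe("172.17.1.10", 22)` (a hypothetical host and port in the dev cluster range) once from a client that has been idle and once right after pinging the server would show whether the ping really changes the outcome of the first attempt.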
Re: vpn bridge misbehavior
Hi,

On 2/2/07, Jonathan Whiteman wrote:

> I'd like to get into a detailed explanation of the network topology I'm working with here but I don't want to scare off anyone by opening with a 3 page email.

Your subject implies that you built a layer-2 LAN-to-LAN bridge over an (IPsec or OpenVPN?) VPN tunnel. Although I am not an OpenBSD or VPN specialist yet, I would love to see your detailed three page description, as I am currently experimenting with a similar setup (and got a prototype to work last night in the lab).

Rolf