I'm having a problem setting up connection tracking using the providers 
file for a very specific scenario.  I'll try to be as brief as possible, 
but I'll give you some background first to give you an idea of what I 
am trying to achieve.

We have 2 data centres, we'll call them DC-A and DC-B.  These data 
centres run application servers that clients connect to from various 
locations via a hardware VPN solution.  For data centre redundancy, the 
VPN devices are duplicated over each data centre.  The clients, each on 
their own private 10.x.x.x/24 networks, have a hardware VPN device that 
maintains 2 tunnels, one to the VPN devices in each data centre.  The 
client end device is configured to use the tunnel into DC-A as the 
primary route and the tunnel into DC-B as backup route in the event that 
DC-A becomes unreachable.  There are about 300 client locations so there 
are 9 VPN devices at each data centre to cover the load.  The following 
diagram shows the layout of the network showing the 2 data centres and 
an example client:

 +------------+   +---------+                      +---------+   +------------+
 |VPN Devs x 9+---+ VPN LAN |                      | VPN LAN +---|VPN Devs x 9|
 +------------+   +----+----+                      +----+----+   +------------+
                        |eth4                        eth4|
                        |                                |
+-------+ bond0 +------+------+ eth2        eth2 +------+------+ bond0 +-------+
|App LAN+-------+ DC-A Router +------------------+ DC-B Router +-------+App LAN|
+-------+       +------+------+      (neos)      +------+------+       +-------+
                       |eth1                        eth1|
                       |                                |
                +------+--------------------------------+------+
                |                   Internet                   |
                +----------------------+-----------------------+
                                       |
                                       |
                                  +----+-----+
                                  |VPN Device|
                                  +----+-----+
                                       |
                                  +----+-----+
                                  |Client LAN|
                                  +----------+

Each of the routers is running Gentoo Linux with kernel 2.6.18 and 
Shorewall 3.2.8.

Here is what I am trying to achieve and the problems I am having:

As mentioned above, the clients will route via DC-A as the primary 
route.  I want client devices to be able to route to servers in both 
DC-A and DC-B.  There is a private (neos) link between the 2 data 
centres so that when clients connect to DC-A, they can reach servers in 
the App LAN in DC-B.    There is a static route for each client's 
10.x.x.x/24 subnet pointing to the VPN Device attached to eth4 at each 
data centre.

Traffic initiated from the servers (e.g. print jobs) works no problem.  
If the traffic is initiated from a server in DC-A's App LAN, it is 
routed via the local VPN device down to the client.  The client VPN 
device tracks the session to ensure that replies are routed back up the 
DC-A tunnel.  The same goes for traffic initiated from a server in DC-B, 
it is routed via the local VPN device at DC-B and session tracked by the 
client VPN device back to DC-B.

The problem I am having is with traffic initiated from the clients back 
to the servers.  If the client is trying to reach a server in DC-A, I 
don't have a problem: traffic comes into the data centre via eth4 to 
the server attached to bond0 and is routed back via eth4 down to the 
client.  If, however, the server is attached to bond0 at DC-B, the 
traffic arrives via eth4 in DC-A and is then routed via eth2 (the 
private link to the other data centre), but the route back is via the 
local eth4 interface in DC-B, and this doesn't work.  I'm not sure why, 
but my guess is that I am breaking some fundamental routing rule, i.e. 
traffic arriving with a certain source address on one interface is 
being replied to via another interface.  When I run some tcpdumps on 
the routers, the traffic only gets as far as eth2 on DC-B's router, so 
it is probably being stopped at that point.  Rules are configured to 
allow this traffic, and there is nothing in the Shorewall logs to 
indicate that it is being dropped.
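One thing I suspect is reverse path filtering on DC-B's router: the 
client packets arrive on eth2 with 10.x.x.x sources, but the main 
routing table sends those sources out via eth4, so with strict 
rp_filter the kernel would silently drop them before forwarding (which 
would explain why nothing shows in the Shorewall logs).  A sketch of 
how I'd check it (standard kernel sysctls, nothing Shorewall-specific):

```shell
# rp_filter=1 makes the kernel drop packets whose source address would
# not be routed back out the interface they arrived on -- exactly the
# asymmetric pattern described above.
sysctl net.ipv4.conf.all.rp_filter
sysctl net.ipv4.conf.eth2.rp_filter

# Log any such drops to the kernel log to confirm the theory:
sysctl -w net.ipv4.conf.eth2.log_martians=1

# Temporarily relax the filter on eth2 only to test (Shorewall's
# "routefilter" interface option controls the same setting):
sysctl -w net.ipv4.conf.eth2.rp_filter=0
```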

My thought then was to use the providers file to set up a provider 
against eth2 on DC-B and connection track the sessions to ensure that 
the replies are routed back via eth2 to DC-A.  My providers file on the 
router at DC-B is set up as follows:

#NAME   NUMBER  MARK    DUPLICATE       INTERFACE       GATEWAY         OPTIONS         COPY
neos    1       1       main            eth2            192.168.0.134   track           bond0
#LAST LINE -- ADD YOUR ENTRIES ABOVE THIS LINE -- DO NOT REMOVE
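For reference, my understanding of what that entry should make 
Shorewall install is roughly the following (an illustrative sketch 
only; the exact chain names, mark masks, and rule priorities depend on 
the Shorewall version):

```shell
# Policy routing: packets carrying fwmark 1 are looked up in table 1
# (the "neos" table built from the DUPLICATE/COPY columns):
ip rule add fwmark 1 table 1

# "track" support in the mangle table: restore any saved connection
# mark first, then mark new connections arriving on eth2 and save the
# mark on the conntrack entry, so that reply packets inherit it and
# their routing decision lands in the neos table:
iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark --mask 0xFF
iptables -t mangle -A PREROUTING -i eth2 -m mark --mark 0/0xFF -j MARK --set-mark 1
iptables -t mangle -A PREROUTING -i eth2 -j CONNMARK --save-mark --mask 0xFF
```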

eth2 on DC-B has an IP of 192.168.0.133/30 and 192.168.0.134 is the IP 
of eth2 on DC-A.  I have only copied bond0 into the neos routing table 
so as not to include the various 10.x.x.x/24 routes set up against 
eth4.  There is a static route on eth2 to route all 10.0.0.0/8 traffic 
via 192.168.0.134, so in theory, if the traffic is forced to use the 
neos routing table, it will always send 10.0.0.0/8 traffic back to 
DC-A.  My routing tables are as follows:

main routing table:

192.168.0.132/30 dev eth2  proto kernel  scope link  src 192.168.0.133
xx.xx.xx.192/29 dev eth1  proto kernel  scope link  src xx.xx.xx.197
xx.xx.xx.208/28 dev eth4  proto kernel  scope link  src xx.xx.xx.222
172.30.16.0/24 dev bond0  proto kernel  scope link  src 172.30.16.254
10.1.0.0/24 via xx.xx.xx.216 dev eth4  metric 1
10.34.17.0/24 via xx.xx.xx.211 dev eth4  metric 1
10.38.1.0/24 via xx.xx.xx.217 dev eth4  metric 1
10.10.1.0/24 via xx.xx.xx.213 dev eth4  metric 1
10.0.0.0/24 via xx.xx.xx.210 dev eth4  metric 1
172.30.0.0/21 via 192.168.0.134 dev eth2  metric 1
192.168.192.0/18 via 192.168.0.134 dev eth2  metric 1
192.168.0.0/17 via 192.168.0.134 dev eth2  metric 1
172.31.0.0/16 via 192.168.0.134 dev eth2  metric 1
172.16.0.0/16 via 192.168.0.134 dev eth2  metric 1
10.0.0.0/8 via 192.168.0.134 dev eth2  metric 1
127.0.0.0/8 dev lo  scope link
default via xx.xx.xx.193 dev eth1

The first 3 parts of the public IPs have been replaced with xx to 
protect the innocent.  Public IPs are used on eth4 (VPN device LAN) and 
eth1 (Internet gateway).  I have removed most of the 10.x.x.x/24 routes 
from the output for clarity.


neos routing table:

192.168.0.134 dev eth2  scope link  src 192.168.0.133
192.168.0.132/30 dev eth2  proto kernel  scope link  src 192.168.0.133
172.30.16.0/24 dev bond0  proto kernel  scope link  src 172.30.16.254
172.30.0.0/21 via 192.168.0.134 dev eth2  metric 1
192.168.192.0/18 via 192.168.0.134 dev eth2  metric 1
192.168.0.0/17 via 192.168.0.134 dev eth2  metric 1
172.31.0.0/16 via 192.168.0.134 dev eth2  metric 1
172.16.0.0/16 via 192.168.0.134 dev eth2  metric 1
10.0.0.0/8 via 192.168.0.134 dev eth2  metric 1
default via 192.168.0.134 dev eth2
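To verify the setup at runtime on DC-B, I have been checking roughly 
the following (a sketch; the "mark=" field in the conntrack entries 
only appears non-zero if connection marking is actually happening):

```shell
# Confirm the fwmark rule and the neos table are in place:
ip rule ls
ip route ls table neos

# Watch the mangle-table packet counters to see whether the
# mark/save/restore rules are actually matching anything:
iptables -t mangle -L -n -v

# Check whether client connections are carrying a non-zero connection
# mark (2.6.18 exposes conntrack via /proc/net/ip_conntrack):
grep mark=1 /proc/net/ip_conntrack
```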


Unfortunately, when I tested it, it didn't work.  The problem seems to 
be that the tracking isn't working for traffic arriving on eth2 in 
DC-B.  When I run tcpdumps again, the traffic gets as far as eth2 on 
DC-B but no further, which is the same result I got without providers.  
The version of Shorewall that I'm running also has the bug regarding 
the proper handling of PREROUTING marks when HIGH_ROUTE_MARKS=No.  I 
have patched the /usr/share/shorewall/compiler script by commenting 
out the erroneous line and replacing it with the correct line as 
follows:

    #
    # Route marks are restored in PREROUTING/OUTPUT prior to these rules. We only send
    # packets that are not part of a marked connection to the 'tcpre/tcout' chains.
    #
    if [ -n "$ROUTEMARK_INTERFACES" -a -z "$TC_EXPERT" ]; then
        #mark_part="-m mark --mark 0/0xFF00"
        [ -n "$HIGH_ROUTE_MARKS" ] && mark_part="-m mark --mark 0/0xFF00" || mark_part="-m mark --mark 0/0xFF"
        #
        # But let marks in tcpre override those assigned by 'track'
        #
        for interface in $ROUTEMARK_INTERFACES; do
            run_iptables -t mangle -A PREROUTING -i $interface -j tcpre
        done
    fi

I restarted Shorewall but got the same problem.  I have had providers 
working in the past for multiple providers as explained in the docs, 
so I know that the connection tracking works in that scenario.  In 
that scenario, however, there are no conflicting routes between the 
provider's routing table and the main routing table, so I'm wondering 
whether what I am trying to achieve is simply not possible.  Shorewall 
is also configured in the standard way to apply the packet marks in 
the PREROUTING chain, which should apply them before the routing 
decision is made, according to the "Shorewall and Routing" 
documentation.
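To narrow down exactly where the packets die on DC-B, one more thing I 
could try is a pair of temporary LOG rules (a sketch; the log prefixes 
are just ones I made up):

```shell
# Log client packets as they arrive on eth2 in mangle PREROUTING
# (which runs before the routing decision) and again in the filter
# FORWARD chain (which runs after it).  If the first rule logs and the
# second never does, the packets are being discarded by the routing
# code itself (e.g. rp_filter) rather than by any firewall rule:
iptables -t mangle -A PREROUTING -i eth2 -s 10.0.0.0/8 \
    -j LOG --log-prefix "neos-pre: "
iptables -I FORWARD -i eth2 -s 10.0.0.0/8 \
    -j LOG --log-prefix "neos-fwd: "

# Then watch the kernel log while generating traffic from a client:
tail -f /var/log/messages | grep neos-
```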

I may also just be missing something fundamental.

Any pointers in the right direction would be greatly appreciated.  
Apologies for not being as brief as promised.

regards,
Aidan


_______________________________________________
Shorewall-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/shorewall-users
