GitHub user dsclose opened a pull request:

    https://github.com/apache/cloudstack/pull/1519

    Cloudstack 9339: Virtual Routers don't handle Multiple Public Interfaces

    This PR addresses CLOUDSTACK-9339 and may need a code review from someone 
familiar with the System VM scripts. In particular, this PR has not been tested 
in a VPC RvR context. So far it has only been demonstrated on standalone 
routers and RvR routers.
    
    - **d582358: Leave public interfaces down in backup redundant routers.** 
Previously backup routers were bringing all interfaces up and thus arping 
public IPs away from the master router.
    - **9ee1eb6: Add the default gateway to the main routing table when 
interfaces are configured.** The gateway for the first public IP was always 
being added to the main routing table. Sometimes a router would consequently 
add the gateway for an IP other than the default source-NAT IP. This would 
prevent outbound connectivity for guest VMs.
    - **ad9d72f: Add default gateway to device-specific routing tables.** 
Link-level routes were being put into the device-specific routing tables 
(accessed via firewall marks) but these are unnecessary. Instead, the default 
gateway is needed to allow the kernel to make an appropriate routing decision.
    - **8db879e: Only mark guest connections when they are part of a 
static-NAT.** Guest connections were being marked with a zero. This added no 
functionality and prevented static-NAT rules from routing outbound traffic 
properly as device-specific routing tables would not be used. Instead, all 
traffic would be routed out via the default public interface.
    - **788b1be: Allow forwarding and collect network stats on any public 
interface.** Forwarding rules and network stats were limited to eth2 on RvR 
networks. This needed to be decoupled from eth2 and reapplied to whichever 
interface was under consideration.
    - **b19e8aa: Ensure that CONNMARK --restore-mark only appears once.** This 
is a bit of a hack and could do with being improved. The CONNMARK rule was not 
being picked up by the de-duplication logic in CsNetfilter and was being added 
twice. This caused checksum errors on packets traversing NAT.
    - **bf285e1: Transition to master state should add all necessary routes.** 
Now that backup routers keep their interfaces down, the route logic executed at 
configuration-time cannot be applied. Instead, once the interface is brought up 
during a transition to master, routers must re-evaluate what routes are needed 
and add them. Unfortunately I couldn't see a way to re-use the existing route 
logic with the variables that I had in scope so there is some duplication. In 
some cases, routers did not successfully arp IPs away from the old master so 
some arp logic was added. During a failover most connections with guest VMs 
will be maintained with only minor packet loss. SSH sessions implemented via 
port-forwarding rules on an interface other than the source-NAT interface 
consistently get dropped, however, so the failover isn't quite seamless. It's 
possible that there's an easy fix for that.
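
    As a rough illustration of the routing changes described above (this is 
not the code in this PR), the sketch below builds the commands that the 
device-specific tables and static-NAT marks boil down to. The function names, 
the `Table_<dev>` naming, the use of the interface number as the firewall mark 
and the example addresses are all assumptions made for the example. The named 
table is also assumed to be registered in /etc/iproute2/rt_tables already.

```python
# Illustrative sketch only. Helper names, the "Table_<dev>" convention and the
# mark values are assumptions, not taken from the System VM scripts.


def device_table_commands(dev, gateway, mark):
    """Return iproute2 commands for one public interface.

    The rule sends packets carrying `mark` to the device-specific table, and
    that table holds a default route via the interface's gateway, so the
    kernel can pick the right way out for marked traffic.
    """
    table = "Table_%s" % dev  # assumed table naming
    return [
        "ip rule add fwmark %s table %s" % (mark, table),
        "ip route add default via %s dev %s table %s" % (gateway, dev, table),
    ]


def static_nat_mark_rule(guest_ip, mark):
    """Return an iptables rule marking new connections from a static-NAT guest.

    Only guests with a static-NAT get a mark. Other guest traffic stays
    unmarked and follows the main routing table.
    """
    return (
        "iptables -t mangle -A PREROUTING -s %s -m state --state NEW "
        "-j CONNMARK --set-mark %s" % (guest_ip, mark)
    )


if __name__ == "__main__":
    # Hypothetical second public interface eth3 with gateway 203.0.113.1.
    for cmd in device_table_commands("eth3", "203.0.113.1", 3):
        print(cmd)
    print(static_nat_mark_rule("10.1.1.25", 3))
```

    The helpers only return strings, so they can be inspected without 
touching a router. Actually applying the commands would need root on the 
router and is left out on purpose.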
    
    I expect that a number of tests may need to be modified/written as part of 
this PR. Any feedback or pointers would be useful as initially I'll be relying 
on the CI failures to tell me where to look.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dsclose/cloudstack CLOUDSTACK-9339

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/cloudstack/pull/1519.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1519
    
----
commit d5823589c8f14f943d921ef0072119b3643893ef
Author: dean.close <dean.cl...@icloudhosting.com>
Date:   2016-04-27T08:30:30Z

    CLOUDSTACK-9339: Leave public interfaces down in backup redundant routers.

commit 9ee1eb6be9ccc6b297a2d0bd74c6ded629a04461
Author: dean.close <dean.cl...@icloudhosting.com>
Date:   2016-04-27T08:35:49Z

    CLOUDSTACK-9339: Add the default gateway to the main routing table when the 
interfaces are configured.

commit ad9d72f288070096c5817d9dac36b1cf8c37b7b2
Author: dean.close <dean.cl...@icloudhosting.com>
Date:   2016-04-27T08:49:25Z

    CLOUDSTACK-9339: Add default gateway to device-specific routing tables.

commit 8db879e3f0a880836ba11129ce66649fc1260d95
Author: dean.close <dean.cl...@icloudhosting.com>
Date:   2016-04-27T09:07:06Z

    CLOUDSTACK-9339: Only mark guest connections when they are part of a 
static-NAT.

commit 788b1be3366435bb8c1dc3c0a7f14e6a24ac45c0
Author: dean.close <dean.cl...@icloudhosting.com>
Date:   2016-04-27T09:08:23Z

    CLOUDSTACK-9339: Allow forwarding and collect network stats on any public 
interface.

commit b19e8aab2725cdad048162505153dfcd10fd8810
Author: dean.close <dean.cl...@icloudhosting.com>
Date:   2016-04-27T09:10:04Z

    CLOUDSTACK-9339: Ensure that CONNMARK --restore-mark only appears once. 
This avoids checksum errors.

commit bf285e185d674238f228a546b600aed694699166
Author: dean.close <dean.cl...@icloudhosting.com>
Date:   2016-04-27T09:23:41Z

    CLOUDSTACK-9339: Transition to master state should add all necessary routes 
and arp IPs.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
