Dean,
I fixed this yesterday.
The commit is:
---
diff --git a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsAddress.py b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsAddress.py
index 5f63c06..5256d03 100755
--- a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsAddress.py
+++ b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsAddress.py
@@ -27,7 +27,7 @@ from CsRoute import CsRoute
 from CsRule import CsRule
 VRRP_TYPES = ['guest']
-PUBLIC_INTERFACE = ['eth1']
+VPC_PUBLIC_INTERFACE = ['eth1']
 class CsAddress(CsDataBag):
@@ -323,7 +323,7 @@ class CsIP:
                 # If redundant only bring up public interfaces that are not eth1.
                 # Reason: private gateways are public interfaces.
                 # master.py and keepalived will deal with eth1 public interface.
-                if self.cl.is_redundant() and (not self.is_public() or self.getDevice() not in PUBLIC_INTERFACE):
+                if self.cl.is_redundant() and (not self.is_public() or (self.config.is_vpc() and self.getDevice() not in VPC_PUBLIC_INTERFACE)):
                     CsHelper.execute(cmd2)
                 # if not redundant bring everything up
                 if not self.cl.is_redundant():
---
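To make the effect of the changed condition easier to see, here is a
minimal sketch that pulls it out into a plain function. The name
brings_up and its boolean parameters are hypothetical stand-ins for
self.cl.is_redundant(), self.is_public(), self.config.is_vpc() and
self.getDevice(); it only illustrates the expression in the diff and is
not code from the repository.

```python
VPC_PUBLIC_INTERFACE = ['eth1']


def brings_up(redundant, is_public, is_vpc, device):
    """Patched condition from check_is_up as a pure function (hypothetical helper)."""
    return redundant and (
        not is_public
        or (is_vpc and device not in VPC_PUBLIC_INTERFACE)
    )


# Each case is (is_public, is_vpc, device) on a redundant router.
cases = [
    (False, False, 'eth0'),  # non-public interface
    (True, False, 'eth2'),   # public interface, non-VPC router
    (True, True, 'eth1'),    # public interface on eth1, VPC router
    (True, True, 'eth2'),    # public interface other than eth1, VPC router
]
results = {case: brings_up(True, *case) for case in cases}
for case, up in results.items():
    print(case, '->', 'bring up' if up else 'leave down')
```

Under these assumptions, this branch no longer brings up a public
interface on a redundant non-VPC router, whatever its device name, so
that would be left to keepalived and master.py.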
- Wei
2016-04-01 20:13 GMT+02:00 Dean Close <[email protected]>:
> Hi guys,
>
> I had been investigating a possible bug with the way interfaces are
> managed on virtual routers. The public interfaces are being brought up on
> backup routers and (because they boot second) they arp the IPs away from
> the master. I'd been examining an idea for a fix but whilst doing so I
> found that the system appears to be designed to bring up these interfaces.
>
> I suspect that a few things need to be reworked - but the changes
> necessary go so far against what has been implemented that I wanted to open
> this up before doing the work.
>
> Hopefully if I go through my findings you guys can help me see what I
> might be getting wrong.
>
> The following was correct for pre-4.6 redundant routers:
>
> 1. Both routers get configured with IP addresses, routes and iptables
> rules.
> 2. Public interfaces are initially set as DOWN.
> 3. Keepalived runs a VRRP instance on the private interface (eth0) to
> negotiate MASTER/BACKUP roles.
> 4. Keepalived manages the virtual IP on eth0 used as the public gateway
> for the guest VMs.
> 5. Keepalived uses a master notify script to bring up the public
> interfaces.
>
> The above was true for pre-4.6 routers. Now, however, things appear to
> work differently:
>
> 1. Both routers get configured as before.
> 2. All interfaces apart from eth1 (the Hypervisor-link interface) are
> set as UP.
> 3. Keepalived runs a VRRP instance on the first public interface (eth2)
> to negotiate MASTER/BACKUP roles.
> 4. Keepalived manages the virtual IP as before.
> 5. Keepalived uses a master notify script to bring up the public
> interfaces (unnecessary)
> 6. Keepalived uses a backup notify script to bring down the public
> interfaces (unused)
>
> This is unexpected for the following reasons:
>
> 1. The keepalived notify script brings the public interfaces down when
> transitioning to BACKUP - so how can we expect to run a VRRP instance over
> eth2?
> 2. If interfaces are down when transitioning to BACKUP, why are they not
> expected to be down to begin with? (Before the router has become MASTER)
> 3. Why are we running a VRRP instance over an interface with an IP that
> will clash with another host on the network?
>
> The following method from the CsIP class in /opt/cloud/bin/cs/CsAddress.py
> confuses matters further:
>
>     def check_is_up(self):
>         """ Ensure device is up """
>         cmd = "ip link show %s | grep 'state DOWN'" % self.getDevice()
>         for i in CsHelper.execute(cmd):
>             if " DOWN " in i:
>                 cmd2 = "ip link set %s up" % self.getDevice()
>                 # If redundant only bring up public interfaces that are not eth1.
>                 # Reason: private gateways are public interfaces.
>                 # master.py and keepalived will deal with eth1 public interface.
>                 if self.cl.is_redundant() and (not self.is_public() or self.getDevice() not in PUBLIC_INTERFACE):
>                     CsHelper.execute(cmd2)
>                 # if not redundant bring everything up
>                 if not self.cl.is_redundant():
>                     CsHelper.execute(cmd2)
>
> The comments refer to eth1 as a public interface when this is the link to
> the hypervisor. Indeed, PUBLIC_INTERFACE is defined on line 31 as ['eth1'].
> But keepalived and master.py don't influence eth1 at all. This looks like a
> mistake.
>
> Lastly, the logic of this line looks flawed:
>
> if self.cl.is_redundant() and (not self.is_public() or self.getDevice() not in PUBLIC_INTERFACE)
>
> As PUBLIC_INTERFACE is limited to eth1, the `not self.is_public()` will be
> ignored. Public IPs will never be assigned to eth1, so this line evaluates
> as:
>
>
> if self.cl.is_redundant() and (self.getDevice() not in PUBLIC_INTERFACE)
>
> which reduces even further to:
>
> if self.cl.is_redundant() and self.is_control()
>
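One way to check a reduction like this is to enumerate the cases. The
sketch below rewrites the quoted condition as a pure function; the name
original_condition and its parameters are hypothetical stand-ins for the
CsIP methods, and the device names are only examples.

```python
from itertools import product

PUBLIC_INTERFACE = ['eth1']


def original_condition(redundant, is_public, device):
    """Pre-patch condition from check_is_up as a pure function (hypothetical helper)."""
    return redundant and (not is_public or device not in PUBLIC_INTERFACE)


# Enumerate every combination for a redundant router.
table = {
    (is_public, device): original_condition(True, is_public, device)
    for is_public, device in product([False, True], ['eth0', 'eth1', 'eth2'])
}
for (is_public, device), up in sorted(table.items()):
    print('is_public=%-5s device=%s -> %s' % (is_public, device, up))

# The only combinations that are not brought up.
left_down = [key for key, up in table.items() if not up]
```

In this enumeration the only combination left down is a public IP on
eth1. If public IPs are never assigned to eth1, that case cannot occur,
which suggests the condition behaves like self.cl.is_redundant() on its
own.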
>
> What would need doing
> ---------------------
>
> 1. The keepalived.conf template would need to be changed to run the VRRP
> instance over eth0.
> 2. The check_is_up method of the CsIP class should be renamed to
> 'bring_up_interfaces'. For redundant routers it should ignore IPs that pass
> is_public or needs_vrrp.
> 3. The arpPing method should do nothing if the interface is down.
> 4. The PUBLIC_INTERFACE constant should be either renamed or dropped
> altogether.
> 5. Other things that I haven't considered?
>
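Items 2 and 3 above could be sketched roughly as follows. This is only
an illustration of the proposed decision logic: FakeIP, should_bring_up
and should_arp_ping are hypothetical names, and the public and vrrp
fields stand in for what is_public and needs_vrrp would report.

```python
from dataclasses import dataclass


@dataclass
class FakeIP:
    """Hypothetical stand-in for CsIP, holding only what the sketch needs."""
    device: str
    public: bool = False
    vrrp: bool = False
    link_up: bool = False


def should_bring_up(ip, redundant):
    """Item 2: on redundant routers, skip IPs that are public or need VRRP."""
    if not redundant:
        return True
    return not (ip.public or ip.vrrp)


def should_arp_ping(ip):
    """Item 3: do nothing when the interface is down."""
    return ip.link_up


control = FakeIP('eth1')
public = FakeIP('eth2', public=True)
print(should_bring_up(control, redundant=True))   # control link is brought up
print(should_bring_up(public, redundant=True))    # left for keepalived to manage
print(should_arp_ping(public))                    # link is down, so no ARP ping
```

Keeping the decision in small functions like these would make it easy
to test separately from the code that runs the ip commands.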
>
> I'd really appreciate any feedback on this. It's possible that I've got it
> all wrong but I'm suspecting not. I just don't want to tread on anyone's
> toes by submitting a PR that goes against what appears to be an explicit
> design decision.
>
>
> Kind regards,
>
> Dean Close
> iCloudHosting.com
> http://www.icloudhosting.com
> Tel: 01582 227927
>
> Unit 2, Smallmead Road, Reading RG2 0QS