Tim Gipson created CLOUDSTACK-10074:
---------------------------------------
Summary: VR Failover causes failed VPC routers
Key: CLOUDSTACK-10074
URL: https://issues.apache.org/jira/browse/CLOUDSTACK-10074
Project: CloudStack
Issue Type: Bug
Security Level: Public (Anyone can view this level - this is the default.)
Components: Virtual Router
Affects Versions: 4.8.2.0
Environment: CentOS 7 for management, KVM hypervisors
Reporter: Tim Gipson
Attachments: iptables_broken.txt, iptables_working.txt,
management-server20170913-2pm.log, VR_backup_cloud.log, VR_master_cloud.log
I’ve found what I think may be an issue with redundant VPC router pairs in
CloudStack. The issue was first noticed when routers were failing over from
master to backup. When the backup router became master, everything continued
to work properly and traffic flowed as normal. However, when it failed back
from the new master to the original master, the virtual router stopped passing
traffic on any network interface, and every failover after that resulted in
virtual routers that were not passing traffic.
I can reproduce this behavior by doing a manual failover (logging in and
issuing a reboot command on the router) from master to backup and then back to
the original master. From what I can tell, the iptables rules on the router
are somehow modified during the failover (or a manual reboot) in a way that
leaves them completely nonfunctional. I did a side-by-side comparison of the
iptables rules before and after a failover (or a manual reboot) and there are
definite differences. Sometimes rules are changed, sometimes they are
duplicated, and some rules are missing from iptables entirely.
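
A rough sketch of how the before/after comparison could be automated is shown
below. It assumes both dumps are in iptables-save format (a "*table" line
followed by "-A ..." rule lines); the attached files may not use that format.
The function names parse_rules and compare_rules are made up for illustration
and are not part of CloudStack or iptables. A changed rule is reported as one
missing rule plus one added rule.

```python
from collections import Counter


def parse_rules(dump_text):
    """Count each rule per table in iptables-save style text (assumed format)."""
    rules = Counter()
    table = ""  # rules seen before any "*table" line are filed under ""
    for raw in dump_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("*"):
            table = line[1:]  # e.g. "*filter" -> "filter"
        elif line.startswith("-A "):
            rules[(table, line)] += 1
    return rules


def compare_rules(working_text, broken_text):
    """Report rules that are missing, added, or duplicated in the broken dump."""
    working = parse_rules(working_text)
    broken = parse_rules(broken_text)
    return {
        # present in the working dump, absent from the broken one
        "missing": sorted(r for r in working if r not in broken),
        # present only in the broken dump
        "added": sorted(r for r in broken if r not in working),
        # present in both, but more copies in the broken dump
        "duplicated": sorted(
            r for r, n in broken.items() if r in working and n > working[r]
        ),
    }
```

To try it, read the text of a working dump and a broken dump (for example,
iptables_working.txt and iptables_broken.txt) and pass both strings to
compare_rules.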