The intention of this proposal is to find a way forward to reduce maintenance downtime for virtual routers. There are two parts to this proposal:
1. Dealing with legacy routers and replacing them before shutting down.
2. Unifying router embodiments and making use of redundancy mechanisms to quickly fail over from old to new.

Ad 1.
It will always be possible that a router is too old and will not be able to talk to the new version that is to replace it. This might be due to a keepalived update or replacement, or just because it is very old. So although unifying the routers and making them redundancy-enabled will solve a lot of use cases, it will never cover every conceivable situation, not even in systems upgraded to a version in which all intended functionality has been implemented. Dealing with any older router is to work as follows:

1. A check will be done to make sure the old VR is still up.
   * If it is not, there is nothing to consider: it will be replaced as quickly as possible. Possible improvements here are the iptables configuration speedup and other generic optimisations unrelated to the upgrade itself.
   * If it is, we need to walk on eggshells when provisioning the new one 😉
2. A new VR will be instantiated.
3. Configuration data will be sent but not applied.
4. The interfaces will be added and, if need be, brought down.
5. All configuration is applied.
6. The old VR is killed.
7. The interfaces on the new VR are brought up.

Ad 2.
This is a long-term goal. At the moment we have five (or debatably six) different incarnations of the virtual router:
* Basic zone DHCP server
* Shared network ‘router’
* VR
* rVR
* VPC
* rVPC

A first set of steps will be to reduce this to
* shared networks (where a basic zone is an automatic implementation of a single shared network in a zone)
* VR (which is always redundancy-enabled but may have only one instance)
* VPC (as above)

The next step is then to unify VR and VPC, as a VR is really only a VPC with just one network. The final step is to unify a shared network with a VPC, and this one is so far ahead that I don’t want to make too many statements about it now.
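To make the ordering in the seven-step replacement sequence for legacy routers a bit more concrete, here is a minimal Python sketch. It is only an in-memory illustration: the `Router` class, the `replace_router` function and the step labels are hypothetical names made up for this mail, not existing CloudStack code, and the fast path for an old VR that is already down is my assumption of what "replaced as quickly as possible" could mean.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Router:
    """Hypothetical stand-in for a virtual router (not a CloudStack class)."""
    name: str
    running: bool = True
    interfaces_up: bool = False
    pending_config: Optional[dict] = None   # sent but not yet applied
    applied_config: Optional[dict] = None


def replace_router(old: Router, config: dict, steps: List[str]) -> Router:
    """Replace `old` by a new router, recording each step in `steps`."""
    # Step 1: check whether the old VR is still up.
    old_is_up = old.running
    steps.append("old VR up" if old_is_up else "old VR down")

    # Step 2: instantiate the new VR.
    new = Router(name=old.name + "-new")
    steps.append("instantiate new VR")

    if not old_is_up:
        # Assumed fast path: nothing to protect, so configure and start at once.
        new.applied_config = dict(config)
        new.interfaces_up = True
        steps.append("apply config")
        steps.append("bring interfaces up")
        return new

    # Step 3: send the configuration data, but do not apply it yet.
    new.pending_config = dict(config)
    steps.append("send config")

    # Step 4: add the interfaces and keep them down.
    new.interfaces_up = False
    steps.append("add interfaces (down)")

    # Step 5: apply all configuration while the old VR still serves traffic.
    new.applied_config = new.pending_config
    new.pending_config = None
    steps.append("apply config")

    # Step 6: kill the old VR.
    old.running = False
    old.interfaces_up = False
    steps.append("kill old VR")

    # Step 7: only now bring the interfaces on the new VR up.
    new.interfaces_up = True
    steps.append("bring interfaces up")
    return new
```

The point of the ordering is that the old and the new VR never have their interfaces up at the same time, and that the gap without a router is limited to the time between steps 6 and 7. Whether real interfaces can be added in a down state and configured that way is exactly what still has to be tried out.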
We will have to find the exact implementation hazards that we will face in that final step along the way. I think we will be at least a year in by the time we reach that point.

As ShapeBlue, we will be starting a short PoC on the first part. We will try to figure out whether the process described for the first part is feasible, or whether we need to wait with configuring the interfaces until the last moment and then do a ‘blind’ start.

daan.hoogl...@shapeblue.com
www.shapeblue.com
53 Chandos Place, Covent Garden, London WC2N 4HS UK
@shapeblue