What does VMware support say? We use SRM, and it lets us manipulate all of the network properties of a VM at failover. On the network side, why not leverage something like OTV? Or define the same network at the alternate site on a different VLAN that is kept shut down and brought up only in the event of a failover?
On Fri, Jun 24, 2016 at 9:37 AM Michael Leone <[email protected]> wrote:

> On Fri, Jun 24, 2016 at 12:04 PM, Rubens Almeida <[email protected]>
> wrote:
>
>> Here's my 2 cents on this matter: I'm still waiting to see when a Windows
>> server host will handle 2 gateways without trouble. I see this at every
>> customer I'm assigned to as SME in my day job. Every one of them has this
>> kind of issue to one degree or another. What I do is: on the production
>> NIC I set the customer's gateway. On all other NICs, no gateway at all.
>> If needed, I then set persistent routes pointing to the respective
>> gateway handling that specific network. Hope that helps!
>>
>
> As I said, there are no other NICs. Also, in case of disaster, I don't
> want to have to edit 175 VMs to set addressing on a previously unused NIC
> (script-based or not). I need an automatic dead-gateway detection and
> failover, apparently.
>
>> Rubens
>>
>> On Fri, Jun 24, 2016 at 12:23 PM, Michael Leone <[email protected]>
>> wrote:
>>
>>> Here's my setup: I have a lot of VMware VMs. We also use their SRM (Site
>>> Recovery Manager) for Disaster Recovery. Basically, SRM lets the VMs fail
>>> over to another site, in case of disaster. They will keep their current IP
>>> addressing.
>>>
>>> So what we did was set 2 gateways on each VM - first entry is x.x.x.1,
>>> which is the gateway at the production site. Second entry is x.x.x.2, which
>>> is the gateway at the recovery site. This way, if the VMs did fail over,
>>> they would still be able to find a gateway and continue to work (since
>>> theoretically x.x.x.1 would not be available, being a smoldering pile of
>>> ash or whatever). Note that these are all 1 NIC machines, no multi-homing.
>>> And all static addressing, no DHCP.
>>>
>>> I seem to recall testing this a couple years ago, and it worked fine.
>>> However, I'm old, so who knows how faulty my memory is ...
>>>
>>> Here's the problem - yesterday the recovery site went down. Mind you,
>>> the main production site stayed up, and in fact, has never gone down. But
>>> then I started getting weird calls - I couldn't ping some VMs, yet others on
>>> the same subnet as I am had no difficulties.
>>>
>>> Eventually, what I had to do was delete the x.x.x.2 gateway entry from
>>> the problematical machines, flush their DNS cache, and then everyone could
>>> access these VMs again.
>>>
>>> But why? Since the main production site switch never went down, none of
>>> the VMs should have been using the recovery site as a gateway; they should
>>> all have been using x.x.x.1, and the fact that x.x.x.2 was unavailable
>>> should not have mattered to them in the slightest.
>>>
>>> And even if they were using the recovery site x.x.x.2 as gateway, once
>>> it dropped, the VM should have still been able to use the other entry, the
>>> production site switch x.x.x.1, as a gateway and continued to be available.
>>>
>>> So, 3 questions then:
>>>
>>> 1. Am I wrong in believing that a Windows machine (Win 2008 R2 and Win
>>> 2012 R2) will use the gateways in the order listed? (i.e., use x.x.x.1
>>> first, and not try to use x.x.x.2 unless x.x.x.1 is unavailable). Seems
>>> most of my VMs worked this way, but not all, yet all are configured the
>>> same way.
>>>
>>> 2. And, if the gateway in use (for example, x.x.x.2) becomes
>>> unavailable, I thought Windows would automatically try the other entry,
>>> without any user intervention. Is this not so?
>>>
>>> 3. What I want is for the VMs to use the first gateway listed. If
>>> it can't reach or use that, then I want it to automatically use the next
>>> entry in the gateway list. Is this possible? If so, then how?
>>>
>>> Thanks for any help.
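
For what it's worth, the behavior asked for in question 3 (use the first gateway listed, and move to the next one only when the first cannot be reached) can be sketched in a few lines of Python. This is only a model of the desired selection logic, not a description of what Windows actually does with multiple default gateways. The names `pick_gateway` and `is_reachable` are hypothetical, and the 192.0.2.x addresses are documentation-range placeholders standing in for x.x.x.1 and x.x.x.2.

```python
from typing import Callable, Iterable, Optional


def pick_gateway(gateways: Iterable[str],
                 is_reachable: Callable[[str], bool]) -> Optional[str]:
    """Return the first gateway, in listed order, that the probe reports reachable.

    `is_reachable` is a hypothetical probe supplied by the caller
    (for example a wrapper around a ping or ARP check).
    Returns None when no gateway in the list answers.
    """
    for gw in gateways:
        if is_reachable(gw):
            return gw
    return None


if __name__ == "__main__":
    # Placeholder addresses: production gateway first, recovery gateway second.
    gateways = ["192.0.2.1", "192.0.2.2"]

    # Simulate the production gateway being down.
    down = {"192.0.2.1"}
    print(pick_gateway(gateways, lambda gw: gw not in down))
```

A real deployment would still need something to run the probe on a schedule and rewrite the default route with the result; that part is deliberately left out here.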

