On Fri, Jun 24, 2016 at 12:04 PM, Rubens Almeida <[email protected]> wrote:
> Here's my 2 cents on this matter: I'm still waiting to see when a Windows
> server host will handle 2 gateways without trouble. I'm used to seeing this
> at every customer I'm assigned to as SME in my day job. Every one of them
> has this kind of issue to one degree or another. What I do is: on the
> production NIC I set the customer's gateway. On all other NICs, no gateway
> at all. If needed, I then set persistent routes pointing to the
> respective gateway handling that specific network. Hope that helps!
>

As I said, there are no other NICs. Also, in case of disaster, I don't want
to have to edit 175 VMs to set addressing on a previously unused NIC
(script-based or not). I need automatic dead-gateway detection and
failover, apparently.

> Rubens
>
> On Fri, Jun 24, 2016 at 12:23 PM, Michael Leone <[email protected]>
> wrote:
>
>> Here's my setup: I have a lot of VMware VMs. We also use their SRM (Site
>> Recovery Manager) for Disaster Recovery. Basically, SRM lets the VMs fail
>> over to another site, in case of disaster. They will keep their current IP
>> addressing.
>>
>> So what we did was set 2 gateways on each VM - first entry is x.x.x.1,
>> which is the gateway at the production site. Second entry is x.x.x.2, which
>> is the gateway at the recovery site. This way, if the VMs did fail over,
>> they would still be able to find a gateway and continue to work (since
>> theoretically x.x.x.1 would not be available, being a smoldering pile of
>> ash or whatever). Note that these are all 1 NIC machines, no multi-homing.
>> And all static addressing, no DHCP.
>>
>> I seem to recall testing this a couple years ago, and it worked fine.
>> However, I'm old, so who knows how faulty my memory is ...
>>
>> Here's the problem - yesterday the recovery site went down. Mind you, the
>> main production site stayed up, and in fact, has never gone down. But then
>> I started getting weird calls - I couldn't ping some VMs, yet others on the
>> same subnet as I am had no difficulties.
>>
>> Eventually, what I had to do was delete the x.x.x.2 gateway entry from
>> the problematical machines, flush their DNS cache, and then everyone could
>> access these VMs again.
>>
>> But why? Since the main production site switch never went down, none of
>> the VMs should have been using the recovery site as a gateway; they should
>> all have been using x.x.x.1, and the fact that x.x.x.2 was unavailable
>> should not have mattered to them in the slightest.
>>
>> And even if they were using the recovery site x.x.x.2 as gateway, once
>> it dropped, the VM should have still been able to use the other entry, the
>> production site switch x.x.x.1, as a gateway and continued to be available.
>>
>> So, 3 questions then:
>>
>> 1. Am I wrong in believing that a Windows machine (Win 2008 R2 and Win
>> 2012 R2) will use the gateways in the order listed? (i.e., use x.x.x.1
>> first, and not try to use x.x.x.2 unless x.x.x.1 is unavailable). Seems
>> most of my VMs worked this way, but not all, yet all are configured the
>> same way.
>>
>> 2. And, if the gateway in use (for example, x.x.x.2) becomes unavailable,
>> I thought Windows would automatically try the other entry, without any user
>> intervention. Is this not so?
>>
>> 3. What I want is for the VMs to use the first gateway listed. If it
>> can't reach or use that, then I want it to automatically use the next entry
>> in the gateway list. Is this possible? If so, then how?
>>
>> Thanks for any help.
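
For reference, the behaviour asked for in question 3 (use the first listed
gateway, fall through to the next only when the first does not answer) can be
sketched as a small probe script. This is only an illustration of the ordering
logic, not how Windows itself does dead-gateway detection. The function names
ping_once and pick_gateway are made up for this sketch, and it assumes the
gateways answer ICMP echo. It does not touch the routing table; actually
switching the default route is left out.

```python
import platform
import subprocess
from typing import Callable, Optional, Sequence


def ping_once(host: str, timeout_s: int = 1) -> bool:
    """Send one ICMP echo via the system ping and report whether it succeeded.

    Assumption: the gateway answers ping. Note that Windows ping can exit
    with code 0 on a "Destination host unreachable" reply, so treat a True
    result here as a hint rather than proof.
    """
    if platform.system() == "Windows":
        # -n: number of echo requests, -w: timeout in milliseconds
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]
    else:
        # Linux-style flags: -c count, -W timeout in seconds
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), host]
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


def pick_gateway(
    gateways: Sequence[str],
    is_reachable: Callable[[str], bool] = ping_once,
) -> Optional[str]:
    """Return the first gateway, in listed order, that passes the probe.

    Returns None when no gateway in the list is reachable.
    """
    for gateway in gateways:
        if is_reachable(gateway):
            return gateway
    return None
```

Calling pick_gateway(["192.0.2.1", "192.0.2.2"]) would probe the production
gateway first and only consider the recovery one if that probe fails. The two
addresses are placeholders standing in for x.x.x.1 and x.x.x.2.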

