Here's my setup: I have a lot of VMware VMs. We also use their SRM (Site
Recovery Manager) for Disaster Recovery. Basically, SRM lets the VMs fail
over to another site, in case of disaster. They will keep their current IP
addressing.

So what we did was set 2 gateways on each VM - first entry is x.x.x.1,
which is the gateway at the production site. Second entry is x.x.x.2, which
is the gateway at the recovery site. This way, if the VMs did fail over,
they would still be able to find a gateway and continue to work (since
theoretically x.x.x.1 would not be available, being a smoldering pile of
ash or whatever). Note that these are all single-NIC machines, no
multi-homing. And all static addressing, no DHCP.

I seem to recall testing this a couple years ago, and it worked fine.
However, I'm old, so who knows how faulty my memory is ...

Here's the problem - yesterday the recovery site went down. Mind you, the
main production site stayed up, and in fact, has never gone down. But then
I started getting weird calls - I couldn't ping some VMs, yet others on the
same subnet as I am had no difficulties.

Eventually, what I had to do was delete the x.x.x.2 gateway entry from the
problematic machines, flush their DNS cache, and then everyone could
access these VMs again.

But why? Since the main production site switch never went down, none of
the VMs should have been using the recovery site as a gateway; they should
all have been using x.x.x.1, and the fact that x.x.x.2 was unavailable
should not have mattered to them in the slightest.

And even if they were using the recovery site x.x.x.2 as gateway, once it
dropped, the VM should have still been able to use the other entry, the
production site switch x.x.x.1, as a gateway and continued to be available.

So, 3 questions then:

1. Am I wrong in believing that a Windows machine (Win 2008 R2 and Win 2012
R2) will use the gateways in the order listed? (i.e., use x.x.x.1 first,
and not try to use x.x.x.2 unless x.x.x.1 is unavailable). Seems most of my
VMs worked this way, but not all, yet all are configured the same way.

2. And, if the gateway in use (for example, x.x.x.2) becomes unavailable, I
thought Windows would automatically try the other entry, without any user
intervention. Is this not so?

3. What I want is for the VMs to use the first gateway listed. If a VM
can't reach or use that, then I want it to automatically use the next entry
in the gateway list. Is this possible? If so, then how?
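
To make question 3 concrete, here is a rough Python sketch of the selection
logic I'm hoping for. It is only an illustration of the behavior I want, not
a claim about how Windows actually picks a gateway. The names pick_gateway
and is_reachable are made up for this example, and the 192.0.2.x addresses
are documentation placeholders standing in for x.x.x.1 and x.x.x.2.

```python
from typing import Callable, Optional, Sequence


def pick_gateway(
    gateways: Sequence[str],
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Return the first gateway in list order that passes the reachability check.

    `is_reachable` is a hypothetical probe supplied by the caller (for
    example a ping or ARP check); it is not a real Windows API.
    """
    for gateway in gateways:
        if is_reachable(gateway):
            return gateway
    # No gateway in the list responded.
    return None


if __name__ == "__main__":
    # Placeholder addresses: .1 = production site, .2 = recovery site.
    configured = ["192.0.2.1", "192.0.2.2"]

    # Simulate yesterday's outage: only the recovery-site gateway is down.
    down = {"192.0.2.2"}
    print(pick_gateway(configured, lambda gw: gw not in down))
```

In yesterday's situation (production gateway up, recovery gateway down), this
logic would keep every VM on the production gateway, which is what I expected
to see.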

Thanks for any help.
