Why aren't you using SRM to modify the network settings during failover? 

- Sean

> On Jun 24, 2016, at 8:30 AM, Michael Leone <[email protected]> wrote:
> 
>> On Fri, Jun 24, 2016 at 12:04 PM, Rubens Almeida <[email protected]> 
>> wrote:
>> Here's my 2 cents on this matter: I'm still waiting to see when a Windows 
>> server host will handle 2 gateways without trouble. I see this at every 
>> customer I'm assigned to as an SME in my day job. Every one of them has 
>> this kind of issue to one degree or another. What I do is: on the production 
>> NIC I set the customer's gateway. On all other NICs, no gateway at all. If 
>> needed, I then set persistent routes pointing to the respective gateway 
>> handling that specific network. Hope that helps!
> 
> As I said, there are no other NICs. Also, in case of disaster, I don't want 
> to have to edit 175 VMs, to set addressing on a previously unused NIC 
> (script-based or not). I need an automatic dead-gateway detection and 
> failover, apparently.
> 
> 
>> 
>> Rubens
>> 
>>> On Fri, Jun 24, 2016 at 12:23 PM, Michael Leone <[email protected]> wrote:
>>> Here's my setup: I have a lot of VMware VMs. We also use their SRM (Site 
>>> Recovery Manager) for Disaster Recovery. Basically, SRM lets the VMs fail 
>>> over to another site, in case of disaster. They will keep their current IP 
>>> addressing.
>>> 
>>> So what we did was set 2 gateways on each VM - first entry is x.x.x.1, 
>>> which is the gateway at the production site. Second entry is x.x.x.2, which 
>>> is the gateway at the recovery site. This way, if the VMs did fail over, 
>>> they would still be able to find a gateway and continue to work (since 
>>> theoretically x.x.x.1 would not be available, being a smoldering pile of 
>>> ash or whatever). Note that these are all 1 NIC machines, no multi-homing. 
>>> And all static addressing, no DHCP.
>>> 
>>> I seem to recall testing this a couple years ago, and it worked fine. 
>>> However, I'm old, so who knows how faulty my memory is ...
>>> 
>>> Here's the problem - yesterday the recovery site went down. Mind you, the 
>>> main production site stayed up, and in fact, has never gone down. But then 
>>> I started getting weird calls - I couldn't ping some VMs, yet others on the 
>>> same subnet as I am had no difficulties.
>>> 
>>> Eventually, what I had to do was delete the x.x.x.2 gateway entry from the 
>>> problematical machines, flush their DNS cache, and then everyone could 
>>> access these VMs again.
>>> 
>>> But why? Since the main production site switch never went down, none of 
>>> the VMs should have been using the recovery site as a gateway; they should 
>>> all have been using x.x.x.1, and the fact that x.x.x.2 was unavailable 
>>> should not have mattered to them in the slightest.
>>> 
>>> And even if they were using the recovery site x.x.x.2 as gateway, once it 
>>> dropped, the VM should have still been able to use the other entry, the 
>>> production site switch x.x.x.1, as a gateway and continued to be available.
>>> 
>>> So, 3 questions then:
>>> 
>>> 1. Am I wrong in believing that a Windows machine (Win 2008 R2 and Win 2012 
>>> R2) will use the gateways in the order listed? (i.e., use x.x.x.1 first, 
>>> and not try to use x.x.x.2 unless x.x.x.1 is unavailable). Seems most of my 
>>> VMs worked this way, but not all, yet all are configured the same way.
>>> 
>>> 2. And, if the gateway in use (for example, x.x.x.2) becomes unavailable, I 
>>> thought Windows would automatically try the other entry, without any user 
>>> intervention. Is this not so?
>>> 
>>> 3. What I want is for the VMs to use the first gateway listed. If it 
>>> can't reach or use that, then I want it to automatically use the next entry 
>>> in the gateway list. Is this possible? If so, then how?
>>> 
>>> Thanks for any help.
> 
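
For reference, the behavior asked for in question 3 (use the first gateway
listed, and move to the next one only when it cannot be reached) can be
sketched as below. This is only an illustration of the desired selection
logic, not a description of what Windows actually does. The function name,
the addresses 10.0.0.1 and 10.0.0.2 (standing in for x.x.x.1 and x.x.x.2),
and the stubbed reachability check are all assumptions made for the example.

```python
from typing import Callable, Optional, Sequence


def pick_gateway(
    gateways: Sequence[str],
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Return the first gateway, in list order, that passes the check.

    Returns None when no gateway in the list is reachable.
    """
    for gateway in gateways:
        if is_reachable(gateway):
            return gateway
    return None


# Hypothetical addresses standing in for x.x.x.1 (production site)
# and x.x.x.2 (recovery site).
GATEWAYS = ["10.0.0.1", "10.0.0.2"]

# Stubbed reachability checks; a real check would probe the gateway itself.
both_up = pick_gateway(GATEWAYS, lambda gw: True)
production_down = pick_gateway(GATEWAYS, lambda gw: gw != "10.0.0.1")
recovery_down = pick_gateway(GATEWAYS, lambda gw: gw != "10.0.0.2")
```

With these stubs, the production gateway is chosen whenever it is reachable,
whether or not the recovery gateway is up, which is the behavior the
original post expected.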
