OK, so I had a network outage yesterday, thankfully on a Sunday, so no 
productivity was lost. Here is my setup:

All HP switches. My core switch is an 8212zl, and my physical VMware servers 
and NetApp storage are connected to 2 stacked 3800 switches. The stack is 
trunked to the core switch with 2 * 10G links, plus 2 * 1 gig copper links as 
failover.

Here is what happened. Just after 2 PM I started getting e-mails about servers 
being off-line for more than 5 min, and the list just kept growing. Shortly 
before, I had done some UPS power balancing and had to shut down a few items 
for the move, so I figured maybe I had disturbed a power cable while making the 
changes. I drove back and physically checked everything, and everything looked 
good. I could ping the gateway from some servers but not from others, so the 
whole thing was very strange. Finally we figured it out: one of the 10 Gb trunk 
links had failed, but the core switch did not realize it was down, so it kept 
sending traffic over the dead link. That is what caused the strange network 
behaviour.
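
To illustrate why only some servers lost the gateway, here is a rough Python 
sketch. It is not the switch's actual algorithm: it assumes the trunk picks a 
member link per source/destination address pair, and the hash, the MAC 
addresses and the link names are all made up for the example.

```python
def pick_member(src_mac, dst_mac, members):
    """Choose a trunk member from the address pair (simplified stand-in hash)."""
    src_last = int(src_mac.split(":")[-1], 16)
    dst_last = int(dst_mac.split(":")[-1], 16)
    return members[(src_last ^ dst_last) % len(members)]


def reachable(servers, gateway, believed_up, actually_up):
    """Map each server to True if its flow lands on a link that really passes traffic."""
    return {s: pick_member(s, gateway, believed_up) in actually_up for s in servers}


# Hypothetical addresses and link names, for illustration only.
servers = [f"00:00:00:00:00:{i:02x}" for i in range(1, 5)]
gateway = "00:00:00:00:00:fe"

# Static trunk: the core still believes both 10G members are up, but 10G-2 is dead.
static = reachable(servers, gateway, ["10G-1", "10G-2"], {"10G-1"})

# With failure detection: the dead member is removed, so every flow uses 10G-1.
detected = reachable(servers, gateway, ["10G-1"], {"10G-1"})

print(static)
print(detected)
```

In the first case the flows that happen to hash onto the dead member are 
silently dropped while the rest keep working, which matches what I saw. The 
second case only shows what happens once the failure is detected by some 
mechanism; whether LACP alone would have detected this particular failure is 
exactly what I am asking.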

OK, so now my monitoring guy says that if the trunk had been configured as LACP 
there would have been no outage, and that they configure all switch-to-switch 
trunking with LACP. I asked the networking guy who did the initial 
configuration, and his comment is:

LACP is an industry standard and is widely used when you interface servers to 
switches, or to different vendors' switches / other networking gear. When you 
have the same make (HP or say Cisco), most folks use Cisco EtherChannel / 
port-channel (which also works with HP), or in HP language, a trunk. I have 
never come across anything like this, so I will not comment on whether LACP 
would have prevented this kind of issue.

If there is a fiber issue, you can have a unidirectional link, and then it is 
the UDLD feature, with LACP also enabled, that helps. But a unidirectional 
fiber failure is extremely rare; otherwise why would 98% of Cisco networks not 
use LACP?

The issue here is that you have someone else managing the network and you use 
me to help you set up the network, so there will always be a conflict of 
interest and differences in viewpoints.

So is there a correct answer here, or was I just extremely unlucky with a 
hardware failure that did not fail over?

__________________________________
Stefan Jafs

