Hi All,

I’m fielding a worrisome emergency where a stack of two switches (serving 
the MGMT & SAN networks) has partially failed (one of the stack members was 
ejected).

Here’s what I’ve got:


  *   The Xen hosts’ 2 MGMT NICs are bonded (active-passive) and connected to 
both switches
  *   The Xen hosts’ 2 SAN NICs are NOT bonded and connected to both switches
  *   PUB/GUEST NICs are connected to a different set of stacked switches (and  
are fine)

Amazingly, CloudStack has survived (guest VMs are still running, and there have 
been no disk issues).

However, I’ve got one compute cluster (of 6 Xen hosts) in an alert state in the 
CloudStack UI (as the pool master is affected), so I cannot manage guests 
there. I also cannot get to those hosts’ MGMT interfaces (I’m going to hop on 
the hosts today and get more intel).

A replacement switch is arriving in 24 hours & I’m preparing the switch-swap 
process.

I’d REALLY like to shut everything down (guest VMs) before mucking about with 
the switch stack serving the SAN network, but I think I have to get the host 
MGMT NICs sorted first. My suspicion is that the bonded MGMT NICs haven’t 
failed over (due to the nature of the switch failure, maybe?), so I’m 
considering pulling the MGMT NIC connections on the failed switch to see if I 
can get a path back.
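
One way the failover suspicion could be checked from a host console is to read 
the bond status and see which NIC the bond still treats as active. The sketch 
below is only an illustration and rests on assumptions: it assumes the hosts 
use the Linux bonding driver (which exposes status text under 
/proc/net/bonding/), whereas hosts on the Open vSwitch backend report bond 
state differently. The SAMPLE text, the interface names and the function name 
are made up for the example and are not taken from these hosts.

```python
# Sketch: summarize Linux bonding driver status text (assumed format).
# SAMPLE is hypothetical output, not captured from the affected hosts.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth0
MII Status: up

Slave Interface: eth0
MII Status: up
Link Failure Count: 0

Slave Interface: eth1
MII Status: down
Link Failure Count: 1
"""


def parse_bond_status(text):
    """Return the bond mode, the active slave, and per-slave link details."""
    status = {"mode": None, "active_slave": None, "slaves": {}}
    current = None  # name of the slave section being read, if any
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            status["mode"] = value
        elif key == "Currently Active Slave":
            status["active_slave"] = value
        elif key == "Slave Interface":
            current = value
            status["slaves"][current] = {}
        elif current is not None and key in ("MII Status", "Link Failure Count"):
            status["slaves"][current][key] = value
    return status


bond = parse_bond_status(SAMPLE)
print("mode:", bond["mode"])
print("active slave:", bond["active_slave"])
for name, info in bond["slaves"].items():
    print(name, "link", info.get("MII Status"),
          "failures", info.get("Link Failure Count"))
```

On a real host the text would come from the bond's status file rather than 
SAMPLE. If the active slave is the NIC cabled to the failed switch and its 
link still reads "up", that would be consistent with the bond never having 
seen a link-down event, and pulling that cable should force the failover.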

Anyway, not really much of an ask here, talking myself through it as I ride the 
tiger.

Thanks.
David

David Merrill
Senior Systems Engineer,
Managed and Private/Hybrid Cloud Services
OTELCO
92 Oak Street, Portland ME 04101
office 207.772.5678
www.otelco.com/business/managed-services

Confidentiality Message
The information contained in this e-mail transmission may be confidential and 
legally privileged. If you are not the intended recipient, you are notified 
that any dissemination, distribution, copying or other use of this information, 
including attachments, is prohibited. If you received this message in error, 
please call me at 207.772.5678 so this error can be 
corrected.