Been there, done that, a year ago: whole DC down == everything shut down (storage, S3, CloudStack with all its VMs, and so on). It went very successfully, but it was not fun...
On Mon, 17 Jun 2019 at 14:33, David Merrill <david.merr...@otelco.com> wrote:

> Hi All,
>
> I’m fielding a worrisome emergency situation where a set of (2) stacked
> switches (serving the MGMT & SAN networks) has partially failed (one of the
> stack members was ejected).
>
> Here’s what I’ve got:
>
> * The Xen hosts’ 2 MGMT NICs are bonded (active-passive) and connected
>   to both switches
> * The Xen hosts’ 2 SAN NICs are NOT bonded and connected to both
>   switches
> * PUB/GUEST NICs are connected to a different set of stacked switches
>   (and are fine)
>
> Amazingly CloudStack has survived (guest VMs are still running, there
> have been no disk issues).
>
> However I’ve got one compute-cluster (of 6 Xen hosts) in an alert state
> (as the pool master is affected) in the CloudStack UI (and cannot manage
> guests there) and I cannot get to their MGMT interfaces (going to hop on
> the hosts today and get more intel).
>
> A replacement switch is arriving in 24 hours & I’m preparing the
> switch-swap process.
>
> I’d REALLY like to shut everything down (guest VMs) before mucking about
> with the switch-stack serving the SAN network, but I think I have to get
> the host MGMT NICs sorted first (my suspicion is that the bonded MGMT NICs
> haven’t failed over – due to the nature of the switch failure maybe? – I’m
> considering pulling the MGMT NIC connections on the failed switch to see if
> I can get a path back).
>
> Anyway, not really much of an ask here, talking myself through it as I
> ride the tiger.
>
> Thanks.
> David
>
> David Merrill
> Senior Systems Engineer,
> Managed and Private/Hybrid Cloud Services
> OTELCO
> 92 Oak Street, Portland ME 04101
> office 207.772.5678
> www.otelco.com/business/managed-services

--

Andrija Panić