
Re: [Cloud] [Cloud-announce] Cloud VPS single hypervisor failure and (some) down instances (possibly resolved)

2018-02-14 Thread Andrew Bogott

The host in question has been repaired and restarted; all hosted VMs should now be up and running. We're not 100% certain that we've addressed the root cause of the problem, so we will see if it dies again. In the meantime, though, everything should be back to normal. Sorry for the downtime.

Re: [Cloud] [Cloud-announce] Cloud VPS single hypervisor failure and (some) down instances

2018-02-14 Thread Andrew Bogott

On 2/14/18 6:58 AM, Chase Pettet wrote: We lost a KVM host at around 7:20 UTC. Because we use local storage for instances, a number of them are down. Toolforge suffered a few losses but it seems to have been few enough that GridEngine and Kubernetes users are unaffected at thi