After various failed measures, we're now trying to revert to the older kernel and switch back between NFS servers yet again. So Tools NFS (and various associated services) will probably break, at least for a few minutes.

With luck this will get us into a stable place, but I'll update again regardless.

-Andrew


On 6/29/17 3:27 PM, Andrew Bogott wrote:
The Tools cluster is suffering from several maladies right now. Existing services seem to be mostly fine, but any Kubernetes services that tried to restart in the last few hours probably failed to start, and new things are still failing to start. Similarly, web services and other tools are failing to restart in several cases.

There are various theories as to what's going on -- most likely it's a kernel-version incompatibility with the newly upgraded NFS server. There was also an earlier LDAP outage, which is better understood and should be resolved by now.

We apologize for the inconvenience, and are working frantically to restore stability. There will be a follow-up email when things are resolved.

-Andrew




_______________________________________________
Labs-announce mailing list
labs-annou...@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/labs-announce
_______________________________________________
Labs-l mailing list
Labs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/labs-l
