Thanks for the updates, and for continuing to work on this after business hours (at least for folks in SF).
Pine

On Thu, Jun 29, 2017 at 6:25 PM, Andrew Bogott <abog...@wikimedia.org> wrote:

> After various failed measures, we're now trying to revert back to the
> older kernel and switching back between NFS servers yet again. So Tools
> NFS (and various associated services) will probably break, at least for a
> few minutes.
>
> With luck this will get us into a stable place, but I'll update again
> regardless.
>
> -Andrew
>
>
> On 6/29/17 3:27 PM, Andrew Bogott wrote:
>
>> The tools cluster is suffering from several maladies right now.
>> Existing services seem to be mostly fine, but any kubernetes services that
>> tried to restart in the last few hours probably failed to start, and new
>> things are still failing to start. Similarly, web services and other tools
>> are failing to restart in several cases.
>>
>> There are various theories as to what's going on -- most likely it's
>> a kernel-version incompatibility with the newly upgraded NFS server. There
>> was an earlier ldap outage which is better understood and should be
>> resolved by now.
>>
>> We apologize for the inconvenience, and are working frantically to
>> restore stability. There will be a follow-up email when things are
>> resolved.
>>
>> -Andrew
>>
>
> _______________________________________________
> Labs-announce mailing list
> labs-annou...@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/labs-announce
> _______________________________________________
> Labs-l mailing list
> Labs-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/labs-l