On 18 August 2015 at 10:41, Merlijn van Deen <[email protected]> wrote:
> On 18 August 2015 at 03:22, Thomas Tanon <[email protected]> wrote:
>
>> Is it related to the current high load on the Tool Labs grid?
>
> This was caused by three of the 10 nodes being out of rotation (one
> disabled for the restart today, two had not come back up correctly after
> the earlier reboots). Those two have been restarted, and an extra execution
> node has been added, so we should be ok for now. We'll take more care about
> making sure the hosts come back up after the coming reboots.

A post-mortem and a list of actionables for this outage are now available at
https://wikitech.wikimedia.org/wiki/Incident_documentation/20150817-ToolLabs-WebgridOutage

Best,
Merlijn
_______________________________________________
Labs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/labs-l
