We're currently having a Labs outage. The NFS server became non-responsive, causing a cascading failure. I'm suspending instances until load comes down. Once load is under control, I'll slowly resume instances. Soon, we'll be doing the following things to ensure this doesn't continue to occur:

1. We're moving away from glusterfs to local storage on the virtual nodes until we find a more appropriate solution.
2. We're getting rid of the labs-nfs1 instance and will move the home directories to project storage.
3. We're adding more (and better) hardware, which will lead to less swapping, which in turn will lead to less IO.

Sorry about the experience as of late. I'm looking forward to improving the situation for us.

- Ryan

_______________________________________________
Labs-l mailing list
https://lists.wikimedia.org/mailman/listinfo/labs-l
