Some of you may have noticed some annoyance with the NFS filesystems lately. While we seem to have successfully solved the problem that caused a complete crash every 14 days, there is a lingering issue with the controller on the file server that causes intermittent stalls in disk IO.
In practice, this should have no impact on your running tools (or interactive sessions) except for disk access "freezing" for periods of 2-3 minutes at irregular intervals. The number of stalls seems to be related to write traffic, but it never gets much worse than 2-3 times per hour (annoying though they be).
In an attempt to solve the issue this afternoon, I tweaked some driver settings on the file server but accidentally brought the filesystems back up in the wrong order, making files appear unavailable for a brief period (12s) and necessitating a reboot of the Tool Labs cluster. Sadly, this was in vain since the underlying issue remains.
It is not yet clear whether the issue is caused by the driver or by a hardware problem, but my efforts remain focused on solving it for good. In the meantime, I thank you for your patience as performance remains impacted.
-- Marc
_______________________________________________
Labs-l mailing list
https://lists.wikimedia.org/mailman/listinfo/labs-l
