/me slaps Coren for not using SAL, which is where I look first when I see a rebooted server :P
On Thu, Jul 18, 2013 at 11:06 PM, Ken Snider <[email protected]> wrote:
> FYI. Thanks for the update, Marc!
>
> --Ken.
>
> (Sent from iPhone)
>
> On 2013-07-18, at 1:13 PM, "Marc A. Pelletier" <[email protected]> wrote:
>
>> Some of you may have noticed some annoyance with the NFS filesystems
>> lately. While we seem to have successfully solved the problem that had
>> it crash completely every 14 days, there is a lingering issue with the
>> controller on the file server that causes intermittent stalls in the
>> disk IO.
>>
>> In practice, this should have no impact on your running tools (or
>> interactive session) except for disk access "freezing" for periods of
>> 2-3 minutes at irregular intervals. The number of stalls seems to be
>> related to write traffic, but never gets much worse than 2-3 times per
>> hour (annoying though they be).
>>
>> In an attempt to solve the issue this afternoon, I tweaked some driver
>> settings on the file server but accidentally brought the filesystems
>> back up in the wrong order, making files appear unavailable for a
>> brief period (12s) and necessitating a reboot of the Tool Labs cluster.
>>
>> Sadly, this was in vain since the underlying issue remains. It is not
>> yet clear if the issue is caused by the driver or a hardware problem,
>> but my efforts remain focused on solving the issue for good.
>>
>> In the meantime, I thank you for your patience as performance remains
>> impacted.
>>
>> -- Marc
>>
>> _______________________________________________
>> Labs-l mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/labs-l
