Hi Tom: There was a known issue with 1.6.7.1. What I did was downgrade to 1.6.6, and everything worked well. Alternatively, you can try upgrading, but there is definitely something wrong with that version...
If you like, I can help you offline. I should be free this weekend (I have a long weekend).

On Thu, Jul 2, 2009 at 8:22 AM, Thomas Roth <[email protected]> wrote:
> Hi all,
>
> our MDT gets stuck and unresponsive with very high loads (Lustre
> 1.6.7.1, Kernel 2.6.22, 8 Core, 32GB RAM). The only thing calling
> attention is one ll_mt_?? process running with 100% cpu. Nothing unusual
> happening on the cluster before that.
> After reboot as well as after moving the service to another server, this
> behavior reappears. The initial stages - mounting MGS, mounting MDT,
> recovery - work fine, but then the load goes up and the system is
> rendered unusable.
>
> Atm, I don't know what to do, except shutting down all servers and
> possibly do a writeconf everywhere.
>
> I see that a similar problem was reported by Mag in March this year, but
> no clues or solutions appeared.
> Any ideas?
>
> Yours,
> Thomas

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
