You may also want to check and, if necessary, limit the lru_size on your clients. I believe there are guidelines in the ops manual. We have ~750 clients and limit ours to 600 per OST. That, combined with setting zone_reclaim_mode=0, should make a big difference.
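As a rough sketch of how the two settings mentioned above could be applied (not a tested recipe for your site): the 600 is simply the per-OST value we use, the `*osc*` namespace pattern is the one I recall from the ops manual, and the helper names `run`, `limit_client_lru`, and `disable_zone_reclaim` as well as the `DRY_RUN` and `LRU_LIMIT` variables are made up for illustration. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch only. LRU_LIMIT=600 is the per-OST value quoted in this message;
# pick a value that suits your own client count and memory.
LRU_LIMIT=${LRU_LIMIT:-600}
# DRY_RUN=1 (the default) prints the commands instead of executing them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# On a Lustre client: cap the lock LRU of every OSC namespace (one per OST).
# The pattern is quoted so that lctl, not the shell, expands the wildcards.
limit_client_lru() {
    run lctl set_param "ldlm.namespaces.*osc*.lru_size=$LRU_LIMIT"
}

# On the node showing the problem: turn off NUMA zone reclaim.
disable_zone_reclaim() {
    run sysctl -w vm.zone_reclaim_mode=0
}

limit_client_lru
disable_zone_reclaim
```

Run it with DRY_RUN=0 as root to actually apply the changes. Neither `lctl set_param` nor `sysctl -w` survives a reboot, so anything you want to keep would also need to go into your usual boot-time configuration.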
Regards,

Charlie Taylor
UF HPC Center

On Feb 1, 2012, at 2:04 PM, Carlos Thomaz wrote:

> Hi David,
>
> You may be facing the same issue discussed on previous threads, which is
> the issue regarding the zone_reclaim_mode.
>
> Take a look at the previous thread where myself and Kevin replied to
> Vijesh Ek.
>
> If you don't have access to the previous emails, look at your kernel
> settings for the zone reclaim:
>
> cat /proc/sys/vm/zone_reclaim_mode
>
> It should be set to 0.
>
> Also, look at the number of Lustre OSS service threads. It may be set
> too high...
>
> Rgds.
> Carlos.
>
> --
> Carlos Thomaz | HPC Systems Architect
> Mobile: +1 (303) 519-0578
> [email protected] | Skype ID: carlosthomaz
> DataDirect Networks, Inc.
> 9960 Federal Dr., Ste 100 Colorado Springs, CO 80921
> ddn.com <http://www.ddn.com/> | Twitter: @ddn_limitless
> <http://twitter.com/ddn_limitless> | 1.800.TERABYTE
>
> On 2/1/12 11:57 AM, "David Noriega" <[email protected]> wrote:
>
>> indicates the system was overloaded (too many service threads, or
>>
>
> _______________________________________________
> Lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/mailman/listinfo/lustre-discuss

Charles A. Taylor, Ph.D.
Associate Director, UF HPC Center
(352) 392-4036
