While doing a full scan of a "large" Lustre filesystem (10TB), I noticed the client hung the host.
I did a simple 'find /oagre/lustre/fs ' and it naturally took 36 hours, since there are many small files. But the host hung, with ll_socket<pid of find> and the 'find' process each consuming 100% CPU. No commands worked, although I was still able to ssh into the box. We are using Lustre 1.6.5.1.

Is this a known issue? Could this be the statahead issue mentioned in the previous threads?

Sorry if this is redundant. TIA

On Thu, Sep 11, 2008 at 7:50 PM, Mag Gam <[EMAIL PROTECTED]> wrote:
> I have 32GB on the MDS. So, where do I start? :-)
>
>
> On Thu, Sep 11, 2008 at 6:05 PM, Andreas Dilger <[EMAIL PROTECTED]> wrote:
>> On Sep 11, 2008 06:28 -0400, Mag Gam wrote:
>>> I have a filesystem with over 1M directories, which are filled with
>>> hourly temperatures of a controller environment going back years. They
>>> are hosted on our Lustre filesystem, and I constantly do an fstat()
>>> and fstat64() to get each directory's create time. I was wondering if
>>> there is a way to speed up this operation. Is it possible for me to
>>> increase the MDS cache? Are there any tricks I can perform to speed
>>> this operation up?
>>
>> To cache 1M directory entries would need in the range of 6GB of RAM.
>>
>> Cheers, Andreas
>> --
>> Andreas Dilger
>> Sr. Staff Engineer, Lustre Group
>> Sun Microsystems of Canada, Inc.
>>
>
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
