This happened again :-( Anyone have any insight on a problem similar to this?
TIA

On Mon, Sep 15, 2008 at 8:18 PM, Mag Gam <[EMAIL PROTECTED]> wrote:
> While doing a large scan of a "large" lustre filesystem (10TB) I
> noticed the client hung the host.
>
> I did a simple 'find /oagre/lustre/fs ' and it naturally took 36 hours
> since there are many small files. But we noticed the host crashed with
> ll_socket<pid of find> and 'find' process taking up 100% CPU. No
> commands were working, but I was able to ssh into the box. We are
> using Lustre 1.6.5.1. Is this a known issue? Could this be a statahead
> issue mentioned in the previous threads?
>
> Sorry if this is redundant.
>
> TIA
>
> On Thu, Sep 11, 2008 at 7:50 PM, Mag Gam <[EMAIL PROTECTED]> wrote:
>> I have 32GB on the MDS. So, where do I start? :-)
>>
>> On Thu, Sep 11, 2008 at 6:05 PM, Andreas Dilger <[EMAIL PROTECTED]> wrote:
>>> On Sep 11, 2008 06:28 -0400, Mag Gam wrote:
>>>> I have a filesystem with over 1m directories which are filled with
>>>> hourly temperatures of a controller environment for years. They are
>>>> being hosted on our lustre filesystem, and I constantly do a fstat()
>>>> and fstat64() to get the directory's create time. I was wondering if
>>>> there is a way to speed this operation? Is it possible for me to
>>>> increase the mds cache? Are there any tricks I can perform to speed
>>>> this operation up?
>>>
>>> To cache 1M directory entries would need in the range of 6GB of RAM.
>>>
>>> Cheers, Andreas
>>> --
>>> Andreas Dilger
>>> Sr. Staff Engineer, Lustre Group
>>> Sun Microsystems of Canada, Inc.
>>>
>>
>
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
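
For the original question about reading timestamps from roughly 1M directories, here is a minimal sketch of a single-pass traversal, assuming Python 3 on the client. It is not something proposed in the thread and it does not address the hang itself. The function name directory_times is made up for illustration. The idea is to stat only the directories and skip the per-file stat that a plain scan may trigger; whether that actually reduces MDS load on Lustre 1.6.x is an assumption that would need to be measured. Note also that on Linux st_ctime is the inode change time, not a creation time.

```python
import os
from typing import Iterator, Tuple


def directory_times(root: str) -> Iterator[Tuple[str, float, float]]:
    """Yield (path, st_mtime, st_ctime) for root and every directory below it.

    Hypothetical helper: only directories are stat-ed; regular files are
    classified from the directory entry where the filesystem provides the
    entry type (otherwise is_dir() falls back to a stat call).
    """
    stack = [root]
    while stack:
        path = stack.pop()
        try:
            st = os.lstat(path)  # one stat call per directory
            with os.scandir(path) as entries:
                for entry in entries:
                    # Do not follow symlinks, to avoid loops and repeats.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            # Directory vanished or is unreadable; skip it and keep going.
            continue
        yield path, st.st_mtime, st.st_ctime
```

Usage would be along the lines of `for path, mtime, ctime in directory_times("/some/mount"): ...`, writing the results to a local file once so that later lookups do not have to touch the filesystem metadata again.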
