On Jul 30, 2009 09:52 +0200, Guillaume Demillecamps wrote:
>> On Jul 22, 2009 11:45 +0200, Guillaume Demillecamps wrote:
>>> Lustre 1.8.0 on all servers / clients involved in this. OS is SLES 10
>>> SP2 with an un-patched kernel on the clients. I have, however, put the
>>> same kernel revision (downloaded from suse.com) on the clients as the
>>> version used in the Lustre-patched MGS/MDS/OSS servers. The file system
>>> is only several GBs, with ~500000 files. All inter-connections are
>>> through TCP.
>>>
>>> We have some “manual” replication of an active Lustre file system to a
>>> passive Lustre file system. We have “sync” clients that basically just
>>> mount both file systems and run large sync jobs from the active Lustre
>>> to the passive Lustre. So far, so good (apart from it being quite a
>>> slow process). However, my issue is that Lustre’s memory use grows so
>>> high that rsync cannot get enough RAM to finish its job before kswapd
>>> kicks in and slows things down drastically.
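To see which slab caches are actually behind that kind of growth, the
client's /proc/slabinfo can be summarized directly. A minimal sketch in
Python, assuming the 2.6-era slabinfo column layout (name, active_objs,
num_objs, objsize, ...); the helper name is made up for illustration:

    #!/usr/bin/env python
    # Summarize /proc/slabinfo by total memory held per cache.
    # Assumes 2.6 kernel columns: name active_objs num_objs objsize ...

    def top_slab_consumers(path="/proc/slabinfo", count=10):
        rows = []
        with open(path) as f:
            for line in f:
                # Skip the "slabinfo - version" line and the "# name" legend.
                if line.startswith("slabinfo") or line.startswith("#"):
                    continue
                fields = line.split()
                name = fields[0]
                num_objs = int(fields[2])   # total allocated objects
                objsize = int(fields[3])    # size of one object in bytes
                rows.append((num_objs * objsize, name, num_objs, objsize))
        rows.sort(reverse=True)
        for total, name, num_objs, objsize in rows[:count]:
            print("%-22s %9d objs x %4d B = %8.1f MB"
                  % (name, num_objs, objsize, total / 1048576.0))

    if __name__ == "__main__":
        top_slab_consumers()

Run on a sync client while the job is active, the caches quoted below
should land near the top of that list.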
> # name <active> <total> <size> <obj/slab>: slabdata <active> <num>
> lustre_inode_cache  385652  385652  960   4 : slabdata  96413  96413
> lov_oinfo          2929548 2929548  320  12 : slabdata 244129 244129
> ldlm_locks          136262  254424  512   8 : slabdata  31803  31803
> ldlm_resources      136183  256120  384  10 : slabdata  25612  25612

This shows that we have 385k Lustre inodes, yet there are 2.9M "lov_oinfo"
structs (there should only be a single one per inode). I'm not sure why
that is happening, but it is consuming about 1GB of RAM (2929548 structs
x 320 bytes = ~937MB). The 385k inode count is reasonable, given you have
500k files, per above. There are 136k locks, which is also fine (probably
so much lower than the inode count because of your short lock expiry
time).

So it seems like a problem of some kind, and it probably deserves a bug
report.
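Until the extra lov_oinfo allocations are tracked down, one workaround
sometimes used between sync passes is to have the kernel reclaim the
dentry/inode slabs (which also releases the lov_oinfo structs attached to
cached inodes) and to drop unused DLM locks from the client lock LRUs. A
minimal sketch, assuming a 2.6.16+ kernel (for /proc/sys/vm/drop_caches)
and the Lustre 1.8 /proc layout; it papers over the symptom rather than
fixing it:

    #!/usr/bin/env python
    # Workaround sketch: reclaim inode/dentry slabs and drop unused client
    # DLM locks between sync passes. Run as root on the sync client.
    import glob

    # "2" asks the kernel to reclaim dentries and inodes (and the slab
    # objects, such as lov_oinfo, that hang off the cached inodes).
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("2\n")

    # Writing "clear" to lru_size discards the unused locks in that
    # namespace's LRU (Lustre 1.8 /proc layout assumed).
    for path in glob.glob("/proc/fs/lustre/ldlm/namespaces/*/lru_size"):
        with open(path, "w") as f:
            f.write("clear\n")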
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.