On Tue, May 20, 2008 at 08:14:01PM +0200, Reto Gantenbein wrote:
> Hello Lustre users
>
> We have a Lustre setup consisting of 2 Lustre servers exporting 7 OSTs
> and 1 MGS/MDT. The storage devices are FibreChannel-RAIDs connected
> through a QLogic SANbox 5602 with the two servers. The servers run the
> Lustre-patched 2.6.18 Vanilla sources. We use Lustre 1.6.4.3.
>
> On Client side we have about 180 hosts with Lustre patchless Client and
> Gentoo kernel 2.6.22.
>
> We experienced a major problem when accessing many files at a time.
> Especially a `find` does perform very very slowly in our setup. The 4
> core MGS/MDT server has nearly 100% system CPU time and the client
> nearly the same amount CPU wait time. It's similar when doing a `ls` or
> deleting files in a directory with thousands of files.

High CPU usage while deleting a large directory is a known (and fixed)
issue:
https://bugzilla.lustre.org/show_bug.cgi?id=15029
https://bugzilla.lustre.org/show_bug.cgi?id=13918

The fix is:
https://bugzilla.lustre.org/attachment.cgi?id=13806&action=edit

It seems the fix was not included in 1.6.4.3, but I'm not 100% sure.

Isaac

>
> I did already search through the list, but couldn't find anything
> similar.
>
> It seems to me like the MGS/MDT is the bottleneck here, but I have no
> idea about tuning it. The MDT storage is a Transtec 4Gbps FibreChannel
> SAS Raid. It's configured to use Raid level 1+0.
>
> Does anybody have hints, recommendations, experiences to this topic?
>
> Kind regards
> Reto Gantenbein
>
>
> --
> Universität Bern
> Abt. Informatikdienste
> Gruppe Zentrale Systeme
>
> Reto Gantenbein
> Administrator UBELIX
>
> Gesellschaftsstrasse 6
> CH-3012 Bern
> Raum -104
> Tel. +41 (0)31 631 87 97
>
> _______________________________________________
> Lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
