On 2010-06-24, at 00:54, Maxence Dunnewind wrote:
> I'm using Lustre 1.8.3 with an SSI (single system image). I'm trying to run 
> some compilation benchmarks. My process is simple:
> download 2.6.34 kernel
> extract it on a lustre mount
> make defconfig
> time make -j X
> 
> for reference, if I use only one client, with a local filesystem, it takes 
> about 3min50. The same client alone with a mounted lustre partition (with 
> local lock) takes more than 10 minutes.
> 
> Using lustre on my SSI, I have these results : 
> -j 4 : 9min37
> -j 8 : 5min34
> -j 12 : 4min42
> -j 16 : 4min19
> 
> So even with 16 processes (on 4 nodes), I can't compile faster than 1 local 
> node ...
> I tried 
> http://blogs.sun.com/atulvid/entry/improving_performance_of_small_files
> but it does not change anything.
> 
> My lustre setup : 
> 1 mgs + mds
> 3 OST
> 
> Is there some other way to optimize it, or is Lustre just bad at concurrent 
> access to small files?

I don't think it is realistic to expect a cache-coherent distributed 
filesystem that can scale to tens of thousands of clients to also perform as 
fast as a single client on a local filesystem.  That Lustre is completing in 
4:19 vs. 3:50 on the local filesystem (about 13% slower) is a pretty good 
result for Lustre, I think.
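
As a rough sketch of the arithmetic behind that comparison, the timings quoted 
above can be converted to seconds and expressed relative to the local 
filesystem run.  The names below (to_seconds, slowdown_percent, LUSTRE_SSI) are 
made up for illustration; only the timings come from this thread.

```python
def to_seconds(minutes, seconds):
    """Convert a 'XminYY' style timing into seconds."""
    return minutes * 60 + seconds


# Timings reported in the quoted message.
LOCAL = to_seconds(3, 50)  # one client, local filesystem
LUSTRE_SSI = {             # make -j N on the SSI, Lustre mount
    4: to_seconds(9, 37),
    8: to_seconds(5, 34),
    12: to_seconds(4, 42),
    16: to_seconds(4, 19),
}


def slowdown_percent(elapsed, baseline=LOCAL):
    """How much slower `elapsed` is than `baseline`, in percent."""
    return (elapsed - baseline) / baseline * 100.0


for jobs, elapsed in sorted(LUSTRE_SSI.items()):
    print(f"-j {jobs:2d}: {elapsed:3d} s, "
          f"{slowdown_percent(elapsed):6.1f}% slower than local")
```

For the -j 16 run this works out to 259 s against 230 s, or roughly 12.6% 
slower than the local build.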

We're of course working on improving Lustre performance for this kind of 
situation, but it isn't really a priority for most of our customers.  I don't 
want to discourage you from using Lustre, and of course I'd also like Lustre to 
be faster than even the local filesystem but you should also look at the other 
benefits.

A fairer test might be to do the local-node compile, and then copy the 
kernel and all the modules to each of the client nodes, since Lustre is also 
making the output files available on all of the clients.  It is also worthwhile 
(if you have the time) to determine whether your "make -j 16" is CPU bound or 
IO bound on the local filesystem.  You might try pre-staging all of the input 
files on the client nodes, and have the compiler output go into a separate 
directory (not sure if this is possible with Linux kernel compiles) so that the 
output files created during the run do not invalidate the directory caches.
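
One way to approach the CPU-bound vs. IO-bound question is to compare the CPU 
time consumed by the build with its wall-clock time.  The following is only a 
minimal sketch: cpu_utilisation is a hypothetical helper, and it assumes the 
build's child processes are accounted on the node that launched them, which 
may not hold on an SSI where processes can run on other nodes.

```python
import os
import subprocess
import time


def cpu_utilisation(cmd, ncpus):
    """Run `cmd` and return (CPU time of its process tree) / (wall * ncpus).

    Values near 1.0 suggest the run is CPU bound; values well below 1.0
    suggest it spends much of its time waiting (IO, locks, network).
    """
    before = os.times()
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    wall = time.monotonic() - start
    after = os.times()
    # children_* counters cover child processes that have been waited for.
    cpu = ((after.children_user - before.children_user)
           + (after.children_system - before.children_system))
    return cpu / (wall * ncpus)
```

For example, cpu_utilisation(["make", "-j", "16"], ncpus=os.cpu_count()) run 
in the kernel source tree gives a single-node estimate.  On the separate output 
directory: the kernel build system does accept "make O=<dir>" to place 
generated files outside the source tree.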


For comparison, on the same two systems (local fs vs. Lustre) try writing 
32*10GB files from 32 clients (use rsh or NFS or whatever you want to transport 
data from clients to local filesystem) and see how performance compares. :-)

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.
