It will continue downward as the number of files in the directory increases.
Interestingly, GPFS stat performance increased as the number of files
increased.  My tests were on 128 nodes * 8 processes/node * 10-500 files
per process (roughly 10K to 512K files in total).
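
If anyone wants to reproduce a single-node version of this measurement, a
minimal sketch along these lines works (the file naming, the fixed phase
ordering, and the lack of any multi-node coordination are all
simplifications; the actual runs used a parallel harness):

    /* statbench.c - time create/stat/unlink over N files in one
     * directory.  Single-node sketch only; does not reproduce the
     * 128-node x 8-process runs described above.
     * Build: cc -O2 -o statbench statbench.c
     * Usage: ./statbench /path/to/testdir 30000
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/stat.h>
    #include <sys/time.h>

    static double now(void)
    {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return tv.tv_sec + tv.tv_usec / 1e6;
    }

    int main(int argc, char **argv)
    {
        if (argc != 3) {
            fprintf(stderr, "usage: %s dir nfiles\n", argv[0]);
            return 1;
        }
        int n = atoi(argv[2]);
        char path[4096];
        struct stat sb;

        double t0 = now();
        for (int i = 0; i < n; i++) {        /* create phase */
            snprintf(path, sizeof path, "%s/f.%d", argv[1], i);
            int fd = open(path, O_CREAT | O_WRONLY, 0644);
            if (fd < 0) { perror(path); return 1; }
            close(fd);
        }
        double t1 = now();
        for (int i = 0; i < n; i++) {        /* stat phase */
            snprintf(path, sizeof path, "%s/f.%d", argv[1], i);
            if (stat(path, &sb) < 0) { perror(path); return 1; }
        }
        double t2 = now();
        for (int i = 0; i < n; i++) {        /* unlink phase */
            snprintf(path, sizeof path, "%s/f.%d", argv[1], i);
            unlink(path);
        }
        double t3 = now();

        printf("create: %.0f files/s\n", n / (t1 - t0));
        printf("stat:   %.0f files/s\n", n / (t2 - t1));
        printf("unlink: %.0f files/s\n", n / (t3 - t2));
        return 0;
    }

Run it against a directory on the filesystem under test, e.g.
"./statbench /lustre/scratch/stattest 30000" (path is a placeholder). Note
that the stat phase here hits a warm client cache right after the creates;
drop caches between phases if you want cold-cache numbers.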

- Richard


On 9/10/10 11:11 AM, "Michael Robbert" <[email protected]> wrote:

> We have been struggling with our Lustre performance for some time now,
> especially with large directories. I recently did some informal benchmarking
> (on a live system, so I know the results are not scientifically valid) and
> noticed a huge drop in read (stat operation) performance past 20k files in a
> single directory. I'm using bonnie++ with the IO tests disabled (-s 0), just
> creating, reading, and deleting 40kB files in a single directory (an example
> invocation is below). I've done this for directory sizes of 2,000 to 40,000
> files. Create performance is a flat line of ~150 files/sec across the board.
> Delete performance is all over the place, but no higher than 3,000 files/sec.
> The really interesting data point is read performance, which for these tests
> is just a stat of the file, not a read of its data. For the smaller
> directories it is relatively consistent at just below 2,500 files/sec, but
> when I jump from 20,000 files to 30,000 files the performance drops to around
> 100 files/sec. We were assuming this was somewhat expected behavior and are
> in the process of trying to get our users to change their code. Then
> yesterday I was browsing the Lustre Operations Manual and found section 33.8,
> which says Lustre is tested with directories as large as 10 million files in
> a single directory and still gets lookups at a rate of 5,000 files/sec. That
> leaves me wondering two things: how can we get 5,000 files/sec for anything,
> and why is our performance dropping off so suddenly after 20k files?
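> 
> For reference, an invocation along these lines exercises the 30,000-file
> case (the test directory is a placeholder): -s 0 skips the IO tests, and
> -n takes a file count in multiples of 1024 plus max/min file sizes in
> bytes, so max=min=40960 forces 40kB files:
> 
>     bonnie++ -d /lustre/scratch/bonnie -s 0 -n 30:40960:40960
> 
> (-n 30 means 30*1024 = 30,720 files in a single directory.)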
> 
> Here is our setup:
> All IO servers are Dell PowerEdge 2950s: two sockets with quad-core Xeon
> X5355s @ 2.66GHz (8 cores total) and 16GB of RAM.
> The data is on DDN S2A 9550s with an 8+2 RAID configuration, connected
> directly with 4Gb Fibre Channel.
> They are running RHEL 4.5, Lustre 1.6.7.2-ddn3, kernel
> 2.6.18-128.7.1.el5.ddn1.l1.6.7.2.ddn3smp.
> 
> As a side note, the users' code is Parflow, developed at LLNL. The files are
> SILO files. We have as many as 1.4 million files in a single directory, and
> we now have half a billion files that we need to deal with in one way or
> another. The code has already been modified to split the files on newer runs
> into multiple subdirectories (a sketch of that kind of layout is below), but
> we're still dealing with tens of thousands of files in a single directory.
> The users have been able to run these data sets on Lustre systems at LLNL
> three orders of magnitude faster.
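> 
> One way to do that kind of splitting is to hash each file name into a
> fixed set of subdirectories; a toy sketch (the names and the NDIRS value
> are illustrative, not Parflow's actual scheme):
> 
>     /* subdirs.c - cap per-directory file counts by hashing each
>      * file name into one of NDIRS subdirectories. */
>     #include <stdio.h>
> 
>     #define NDIRS 256            /* ~ total_files/NDIRS files per dir */
> 
>     int main(void)
>     {
>         const char *names[] = { "press.00001.silo", "satur.00002.silo" };
>         for (int i = 0; i < 2; i++) {
>             unsigned h = 5381;                    /* djb2 string hash */
>             for (const char *p = names[i]; *p; p++)
>                 h = h * 33 + (unsigned char)*p;
>             printf("run/d%03u/%s\n", h % NDIRS, names[i]);
>         }
>         return 0;
>     }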
> 
> Thanks,
> Mike Robbert
> HPC & Networking Engineer
> Colorado School of Mines


====================================================

Richard Hedges
Customer Support and Test - File Systems Project
Development Environment Group - Livermore Computing
Lawrence Livermore National Laboratory
7000 East Avenue, MS L-557
Livermore, CA    94551

v:    (925) 423-2699
f:    (925) 423-6961
E:    [email protected]

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
