OK, I'll try a test with all files on the same OST. How about we both do some tests with the Linux kernel?
git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git

This is pretty comparable to the repo I'm working with:

$ du -hsc linux-stable
4.8G    linux-stable
4.8G    total
$ find linux-stable | wc -l
76026
$

This took a very long time for me to clone from the web. But just cloning from disk to disk on a local SSD (git clone linux-stable linux-stable2) takes about 2 minutes - about the same as the repo I've been using. I just cloned it from the local SSD to Lustre and it took about 11.5 minutes. That timing is in line with what I reported earlier if you scale by the number of files. (By the way, I must have added an extra 0 in my plot heading for the number of files - the repo I've been using has about 47,000 files, not 469,000. My apologies for the misinformation in this thread!)

I would also love to know what your "out of the box" io500 MD test numbers look like (./io500.sh config-minimal.ini), as those should be good data to compare against too.

-----Original Message-----
From: lustre-discuss <[email protected]> on behalf of Michael Di Domenico <[email protected]>
Date: Thursday, January 14, 2021 at 8:58 AM
Cc: "[email protected]" <[email protected]>
Subject: [EXTERNAL] Re: [lustre-discuss] Tuning for metadata performance

On Thu, Jan 14, 2021 at 10:36 AM Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] <[email protected]> wrote:
>
> By a "single OSS", do you mean the same OSS for all files? Or just one OSS for each individual file (but not necessarily the same OSS for all files)? I think you mean the latter. All the Lustre results I've sent so far effectively use a single OSS (but not the same OSS) for all or almost all of the files. Our default PFL uses a single OST up to 32 MB, 4 OSTs up to 1 GB, and 8 OSTs beyond that. In the git repo I've been using for this test, there are only 6 files bigger than 32 MB. And there was the test where I explicitly set the stripe count to 1 for all files (ephemeral1s).
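[Editorial sketch: the layouts described above - the default PFL, the stripe-count-1 "ephemeral1s" run, and the single-OST variant proposed later in the thread - can be expressed as directory defaults with lfs setstripe. Directory paths and the OST index are hypothetical placeholders, not values from the thread.]

```shell
# Default PFL described above: 1 OST up to 32 MB, 4 OSTs up to 1 GB,
# 8 OSTs beyond that, set as the default layout on a test directory.
# (/mnt/lustre/repos is a hypothetical path.)
lfs setstripe -E 32M -c 1 -E 1G -c 4 -E -1 -c 8 /mnt/lustre/repos

# The "ephemeral1s" run: force stripe count 1 for everything created
# under this directory.
lfs setstripe -c 1 /mnt/lustre/ephemeral1s

# The single-OSS/OST experiment: stripe count 1 AND a fixed starting
# OST index (0 here, chosen arbitrarily), so every file lands on the
# same OST.
lfs setstripe -c 1 -i 0 /mnt/lustre/single-ost-test
```

Files created under each directory afterward inherit that layout, so a fresh git clone into the directory is enough to run the comparison; lfs getstripe on a few files confirms the layout took effect.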
I mean locate all the files of the git repository on a single OSS, maybe even a single OST - not a mapping of one file per OSS, or striping files across OSSs. The theory, at least, is that by having all the files on a single OSS/OST there might be a reduction in RPCs, and/or an added cache effect from the disks, and/or some readahead. I don't know, though; it's just a stab in the dark.

My hope is that by using a single client, single MDT, and single OSS/OST you can push closer to the performance of NFS. I suspect the added overhead of the MDT reaching across to the OSS is going to prevent this, but it would be interesting nonetheless if we could move the needle at all. If true, it might just be that the added RPC overhead of the MDS/OSS, which NFS doesn't have to contend with, means the performance is what it is. There's probably a way to measure the RPC latency at various points from client to disk through Lustre, but I don't know how.

I'll give you this: it's an interesting research experiment. If you could come up with a way to replicate your git repo with fluff data, I could try to recreate the experiment and see how our results differ.

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
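[Editorial sketch on the RPC-latency question above: Lustre clients keep per-target RPC statistics, including a req_waittime histogram, that can be read and cleared with lctl. The clone paths below are hypothetical; the parameter names are the standard client-side mdc/osc stats.]

```shell
# Metadata RPC stats (client -> MDS) and data RPC stats (client -> OSS):
lctl get_param mdc.*.stats
lctl get_param osc.*.stats

# Clear the counters, run the workload, then read back the accumulated
# wait times to see where the RPC latency goes during a clone.
# (/ssd/linux-stable and /mnt/lustre are hypothetical paths.)
lctl set_param mdc.*.stats=clear osc.*.stats=clear
git clone /ssd/linux-stable /mnt/lustre/linux-stable
lctl get_param mdc.*.stats | grep -E 'req_waittime|req_active'
```

Comparing req_waittime between the default-layout and single-OST runs would show whether pinning everything to one OSS/OST actually reduces per-RPC latency or just the RPC count.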

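[Editorial sketch for the "replicate your git repo with fluff data" request above: a small script that generates a throwaway tree with a file mix loosely resembling the repo described in the thread - mostly small files plus a handful over 32 MB. All names, counts, and sizes here are illustrative assumptions, not measurements of the real repo.]

```shell
#!/bin/bash
# Generate a fluff-data tree: many small files, a few large ones.
set -e
root=${1:-fluff-repo}
ndirs=${2:-20}       # number of directories (illustrative default)
nfiles=${3:-10}      # small files per directory (illustrative default)

mkdir -p "$root"
for ((i = 0; i < ndirs; i++)); do
    d="$root/dir$i"
    mkdir -p "$d"
    for ((j = 0; j < nfiles; j++)); do
        # most files are small (1-16 KB), like source files
        dd if=/dev/urandom of="$d/file$j" bs=1024 \
           count=$((RANDOM % 16 + 1)) status=none
    done
done

# a few large files, mimicking the handful over 32 MB in the real repo
for ((k = 0; k < 3; k++)); do
    dd if=/dev/urandom of="$root/big$k" bs=1M count=40 status=none
done

echo "created $(find "$root" -type f | wc -l) files under $root"
```

Scaling ndirs/nfiles up toward the real repo's ~47,000 files, then running git init and a commit inside the tree, would give both sides the same clonable workload to compare against NFS and the single-OST layout.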