OK, I'll try a test with all files on the same OST. How about we both do some tests with the Linux kernel?
git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git

This is pretty comparable to the repo I'm working with:

$ du -hsc linux-stable
4.8G    linux-stable
4.8G    total
$ find linux-stable | wc -l
76026
$

This took a very long time for me to clone from the web. But just cloning from disk to disk on a local SSD (git clone linux-stable linux-stable2) takes about 2 minutes - about the same as the repo I've been using. I just cloned it from the local SSD to Lustre and it took about 11.5 minutes. That timing is in line with what I reported earlier if you scale by the number of files. (By the way, I must have added an extra 0 in my plot heading for the number of files - the repo I've been using has about 47,000 files, not 469,000. My apologies for the misinformation in this thread!)

I would also love to know what your "out of the box" io500 MD test numbers look like (./io500.sh config-minimal.ini), as those should be good data to compare against too.

-----Original Message-----
From: lustre-discuss <[email protected]> on behalf of Michael Di Domenico <[email protected]>
Date: Thursday, January 14, 2021 at 8:58 AM
Cc: "[email protected]" <[email protected]>
Subject: [EXTERNAL] Re: [lustre-discuss] Tuning for metadata performance

On Thu, Jan 14, 2021 at 10:36 AM Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] <[email protected]> wrote:
>
> By a "single OSS", do you mean the same OSS for all files? Or just one OSS for each individual file (but not necessarily the same OSS for all files)? I think you mean the latter. All the Lustre results I've sent so far effectively use a single OSS (but not the same OSS) for all or almost all of the files. Our default PFL uses a single OST up to 32 MB, 4 OSTs up to 1 GB, and 8 OSTs beyond that. In the git repo I've been using for this test, there are only 6 files bigger than 32 MB. And there was the test where I explicitly set the stripe count to 1 for all files (ephemeral1s).
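[Editorial sketch: the layouts described above - the default PFL, the stripe-count-1 "ephemeral1s" run, and the single-OST variant proposed later in the thread - can be expressed as directory defaults with lfs setstripe. Directory paths and the OST index are hypothetical placeholders, not values from the thread.]

```shell
# Default PFL described above: 1 OST up to 32 MB, 4 OSTs up to 1 GB,
# 8 OSTs beyond that, set as the default layout on a test directory.
# (/mnt/lustre/repos is a hypothetical path.)
lfs setstripe -E 32M -c 1 -E 1G -c 4 -E -1 -c 8 /mnt/lustre/repos

# The "ephemeral1s" run: force stripe count 1 for everything created
# under this directory.
lfs setstripe -c 1 /mnt/lustre/ephemeral1s

# The single-OSS/OST experiment: stripe count 1 AND a fixed starting
# OST index (0 here, chosen arbitrarily), so every file lands on the
# same OST.
lfs setstripe -c 1 -i 0 /mnt/lustre/single-ost-test
```

Files created under each directory afterward inherit that layout, so a fresh git clone into the directory is enough to run the comparison; lfs getstripe on a few files confirms the layout took effect.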
I mean locate all the files of the git repository on a single OSS, maybe even a single OST - not a mapping of one file per OSS, or striping files across OSSs. The theory, at least, is that by having all the files on a single OSS/OST there might be a reduction in RPCs, and/or an added cache effect from the disks, and/or some readahead. I don't know, though; it's just a stab in the dark.

My hope is that by using a single client, single MDT, and single OSS/OST you can push closer to the performance of NFS. I suspect the added overhead of the MDT reaching across to the OSS is going to prevent this, but it would be interesting nonetheless if we could move the needle at all. If true, it might just be that the added RPC overhead of the MDS/OSS, which NFS doesn't have to contend with, means the performance is what it is. There's probably a way to measure the RPC latency at various points from client to disk through Lustre, but I don't know how.

I'll give you this: it's an interesting research experiment. If you could come up with a way to replicate your git repo with fluff data, I could try to recreate the experiment and see how our results differ.

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
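[Editorial sketch on the RPC-latency question above: Lustre clients keep per-target RPC statistics, including a req_waittime histogram, that can be read and cleared with lctl. The clone paths below are hypothetical; the parameter names are the standard client-side mdc/osc stats.]

```shell
# Metadata RPC stats (client -> MDS) and data RPC stats (client -> OSS):
lctl get_param mdc.*.stats
lctl get_param osc.*.stats

# Clear the counters, run the workload, then read back the accumulated
# wait times to see where the RPC latency goes during a clone.
# (/ssd/linux-stable and /mnt/lustre are hypothetical paths.)
lctl set_param mdc.*.stats=clear osc.*.stats=clear
git clone /ssd/linux-stable /mnt/lustre/linux-stable
lctl get_param mdc.*.stats | grep -E 'req_waittime|req_active'
```

Comparing req_waittime between the default-layout and single-OST runs would show whether pinning everything to one OSS/OST actually reduces per-RPC latency or just the RPC count.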

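[Editorial sketch for the "replicate your git repo with fluff data" request above: a small script that generates a throwaway tree with a file mix loosely resembling the repo described in the thread - mostly small files plus a handful over 32 MB. All names, counts, and sizes here are illustrative assumptions, not measurements of the real repo.]

```shell
#!/bin/bash
# Generate a fluff-data tree: many small files, a few large ones.
set -e
root=${1:-fluff-repo}
ndirs=${2:-20}       # number of directories (illustrative default)
nfiles=${3:-10}      # small files per directory (illustrative default)

mkdir -p "$root"
for ((i = 0; i < ndirs; i++)); do
    d="$root/dir$i"
    mkdir -p "$d"
    for ((j = 0; j < nfiles; j++)); do
        # most files are small (1-16 KB), like source files
        dd if=/dev/urandom of="$d/file$j" bs=1024 \
           count=$((RANDOM % 16 + 1)) status=none
    done
done

# a few large files, mimicking the handful over 32 MB in the real repo
for ((k = 0; k < 3; k++)); do
    dd if=/dev/urandom of="$root/big$k" bs=1M count=40 status=none
done

echo "created $(find "$root" -type f | wc -l) files under $root"
```

Scaling ndirs/nfiles up toward the real repo's ~47,000 files, then running git init and a commit inside the tree, would give both sides the same clonable workload to compare against NFS and the single-OST layout.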