Re: [lustre-discuss] Tuning for metadata performance

Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] Tue, 12 Jan 2021 14:55:47 -0800

No, I haven't.  By what means do you suggest analyzing the OP calls?  Just an 
strace? Or the server-side debug commands as outlined in 
https://doc.lustre.org/lustre_manual.xhtml#dbdoclet.50438274_62472 ?


We also have jobstats enabled and are outputting these to a file for later 
analysis.  So if I submitted this test in a slurm job, I'd get stats like:

$ grep MDT /aerolab/admin/slurm/19435076_qsmoore
        MDT:snapshot_time      : 2021-01-09 05:29:59
        MDT:setattr            : 110
        MDT:getattr            : 20908
        MDT:mkdir              : 11
        MDT:getxattr           : 20424
        MDT:mknod              : 48
        MDT:close              : 19829
        MDT:unlink             : 9
        MDT:open               : 20188
$

But you must be referring to an external tool like strace so I could do the 
same thing on both lustre and NFS.  
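
For the jobstats side, here is a minimal Python sketch that ranks the MDT 
counters from a file like the one above, so the dominant operations stand out. 
The function name rank_mdt_ops is hypothetical, and the parsing assumes the 
"MDT:<op> : <count>" layout shown in the grep output; it is not a Lustre tool.

```python
import re

# Matches lines such as "        MDT:getattr            : 20908".
# Lines whose value is not a plain integer (e.g. snapshot_time) are skipped.
MDT_LINE = re.compile(r"^\s*MDT:(\w+)\s*:\s*(\d+)\s*$")


def rank_mdt_ops(text):
    """Return (op, count) pairs sorted by descending count."""
    counts = {}
    for line in text.splitlines():
        m = MDT_LINE.match(line)
        if m:
            counts[m.group(1)] = int(m.group(2))
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)


# Counters copied from the jobstats output quoted in this message.
SAMPLE = """\
        MDT:snapshot_time      : 2021-01-09 05:29:59
        MDT:setattr            : 110
        MDT:getattr            : 20908
        MDT:mkdir              : 11
        MDT:getxattr           : 20424
        MDT:mknod              : 48
        MDT:close              : 19829
        MDT:unlink             : 9
        MDT:open               : 20188
"""

if __name__ == "__main__":
    for op, n in rank_mdt_ops(SAMPLE):
        print(f"{op:10s} {n:8d}")
```

For the job above, the four largest counters are getattr, getxattr, open and 
close, each around 20,000. These are server-side counts only, so they do not 
replace a client-side trace that can be run identically on lustre and NFS.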

-----Original Message-----
From: Michael Di Domenico <[email protected]>
Date: Tuesday, January 12, 2021 at 10:48 AM
To: "Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.]" 
<[email protected]>
Cc: "[email protected]" <[email protected]>
Subject: Re: [EXTERNAL] Re: [lustre-discuss] Tuning for metadata performance

    have you run any analysis on the "A clone of these repo takes 550
    seconds on lustre", where you track the exact OP calls on lustre to
    see if it's a general slowness or if there is a specific OP that git
    is abusing?  i wonder if there's something specific that git is doing
    that lustre is unhappy with versus continuing to poke at the hardware
    or software tuning.

    though less likely, i'd also be curious if you have any
    security/audit controls turned on on the clients.  i have some silly
    ones where i'm at that slow things down on lustre but not nfs because
    of how the kernel treats the filesystem

    i don't have any git repos even close to that size so i can't perform
    the same analysis where i'm at.


    On Mon, Jan 11, 2021 at 1:45 PM Vicker, Darby J. (JSC-EG111)[Jacobs
    Technology, Inc.] <[email protected]> wrote:
    >
    > Sure.  It's a custom configuration on commodity hardware, which is quite
    > a bit newer than the lustre servers.  The overall setup is a bit
    > complicated to support HA - two servers with an external JBOD, with ZFS
    > to manage the drives and the file system, and PCS to do the failover.
    > But none of that is too relevant in terms of performance, so here are
    > the hardware specs.
    >
    > Servers:
    > 192 GB DDR4 2666 MHz ECC Memory
    > 16 total physical cores (2x Intel Xeon Gold 6144 CPU @ 3.50GHz)
    > LSI SAS Card (can't find exact model but very similar to the cards in
    > the lustre servers)
    >
    > JBOD:
    > Supermicro 3.5"
    > 24x 10TB 7200 RPM Seagate HDD's
    >
    > ZFS is used to configure the drives in a RAID10 with a zfs file system
    > built on the zpool.  This is exported via NFS.  The only NFS tuning we
    > are doing is to increase RPCNFSDCOUNT to 128 and export with async.
    >
    > So the HW configuration is overall fairly similar.  This is another
    > reason I'm hopeful that we'd be able to get our lustre MD performance
    > as good as or better than the NFS server, given that the lustre MDS has
    > SSDs and the NFS server has HDDs.
    >
    >
    > -----Original Message-----
    > From: lustre-discuss <[email protected]> on behalf
    > of Michael Di Domenico <[email protected]>
    > Date: Monday, January 11, 2021 at 8:07 AM
    > Cc: "[email protected]" <[email protected]>
    > Subject: [EXTERNAL] Re: [lustre-discuss] Tuning for metadata performance
    >
    > perhaps i missed it somewhere, but in order to do a fair comparison
    > can you detail the hardware/software behind the nfs server?
    >
    >

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
