Stephen,

I looked into client requests, and the load doesn't seem to lean heavily on any
one NSD server. That said, this is an eyeball assessment based on reviewing IO
request percentages to the different NSD servers from just a few nodes.
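
For anyone curious, here's a rough sketch of how I'd tally those percentages
from a client. It assumes mmdiag --iohist on your build prints the NSD server
in the last column of each I/O record - check the header line and adjust the
awk field if yours differs:

  /usr/lpp/mmfs/bin/mmdiag --iohist | awk '
    /^[0-9]/ { n[$NF]++; total++ }   # count recent I/Os per NSD server
    END { for (s in n) printf "%-20s %5.1f%%\n", s, 100*n[s]/total }'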

By the way, I later discovered our TSM/NSD server couldn't handle restoring a
read-only file and ended up growing my output file into the GBs with repeated
prompts for my response... that seems to have contributed to some unnecessarily
high write IO.

However, I still can't understand why write IO operations are 5x more latent
than read operations to the same class of disks.
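
If it helps quantify that 5x, this is the sort of thing I'd run on each NSD
server to average iohist service times by direction. Treat the field numbers
as placeholders - the columns holding the R/W flag and the millisecond service
time vary by release, so verify both against your iohist header line:

  /usr/lpp/mmfs/bin/mmdiag --iohist | awk '
    $2 == "R" || $2 == "W" {             # direction flag (verify column)
      ms[$2] += $6; n[$2]++              # service time in ms (verify column)
    }
    END { for (d in ms) printf "%s: %.2f ms avg over %d I/Os\n",
                                d, ms[d]/n[d], n[d] }'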

Maybe it's time for a GPFS support ticket...
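
If we do open one, I'll collect a gpfs.snap first so support has data in
hand - something like the following, with -N limiting collection to the NSD
servers in question:

  # Collect cluster diagnostic data for the IBM ticket; -N restricts
  # collection to the named nodes (nsd02-ib omitted while it's offline).
  /usr/lpp/mmfs/bin/gpfs.snap -N nsd01-ib,nsd03-ib,nsd04-ib,tsm01-ib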


Thanks,


Oluwasijibomi (Siji) Saula

HPC Systems Administrator  /  Information Technology



Research 2 Building 220B / Fargo ND 58108-6050

p: 701.231.7749 / www.ndsu.edu






________________________________
From: gpfsug-discuss-boun...@spectrumscale.org 
<gpfsug-discuss-boun...@spectrumscale.org> on behalf of 
gpfsug-discuss-requ...@spectrumscale.org 
<gpfsug-discuss-requ...@spectrumscale.org>
Sent: Wednesday, June 3, 2020 9:19 PM
To: gpfsug-discuss@spectrumscale.org <gpfsug-discuss@spectrumscale.org>
Subject: gpfsug-discuss Digest, Vol 101, Issue 9

Send gpfsug-discuss mailing list submissions to
        gpfsug-discuss@spectrumscale.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://gpfsug.org/mailman/listinfo/gpfsug-discuss
or, via email, send a message with subject or body 'help' to
        gpfsug-discuss-requ...@spectrumscale.org

You can reach the person managing the list at
        gpfsug-discuss-ow...@spectrumscale.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of gpfsug-discuss digest..."


Today's Topics:

   1. Re: Client Latency and High NSD Server Load Average
      (Stephen Ulmer)


----------------------------------------------------------------------

Message: 1
Date: Wed, 3 Jun 2020 22:19:49 -0400
From: Stephen Ulmer <ul...@ulmer.org>
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Subject: Re: [gpfsug-discuss] Client Latency and High NSD Server Load
        Average
Message-ID: <c1fb88ba-d37e-4fb6-9095-31b632dac...@ulmer.org>
Content-Type: text/plain; charset="utf-8"

Note that if nsd02-ib is offline, nsd03-ib is now servicing all of the NSDs for
*both* servers, and if nsd03-ib gets busy enough to appear offline, then
nsd04-ib would be next in line to get the load of all 3. The two servers with
the problems are in line after the one that is off.

This is based on the candy striping of the NSD server order (which I think most 
of us do).

NSD fail-over is "straightforward," so to speak - the last I checked, it is
really fail-over in the listed order, not load balancing among the servers
(which is why you stripe them). I do *not* know if individual clients make the
decision that the I/O for a disk should go through the "next" NSD server, or if
it is done cluster-wide (in the case of intermittently super-slow I/O).
Hopefully someone with source code access will answer that, because now I'm
curious...

Check what path the clients are using to the NSDs, i.e. which server. See if 
you are surprised. :)
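
One way to do that from a client is mmlsdisk with the -m flag, which reports
per disk whether I/O is satisfied locally or via an NSD server (and which
one):

  # Run on a client node; 'gpfs1' is the file system name from Siji's
  # output below.
  /usr/lpp/mmfs/bin/mmlsdisk gpfs1 -m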

 --
Stephen


> On Jun 3, 2020, at 6:03 PM, Saula, Oluwasijibomi 
> <oluwasijibomi.sa...@ndsu.edu> wrote:
>
>
> Frederick,
>
> Yes on both counts! - mmdf is showing pretty uniform usage (i.e., 5 NSDs out
> of 30 report 65% free; all others are uniform at 58% free)...
>
> NSD servers per disk are listed in round-robin fashion as well, for example:
>
>  gpfs1         tier2_001    nsd02-ib,nsd03-ib,nsd04-ib,tsm01-ib,nsd01-ib
>  gpfs1         tier2_002    nsd03-ib,nsd04-ib,tsm01-ib,nsd01-ib,nsd02-ib
>  gpfs1         tier2_003    nsd04-ib,tsm01-ib,nsd01-ib,nsd02-ib,nsd03-ib
>  gpfs1         tier2_004    tsm01-ib,nsd01-ib,nsd02-ib,nsd03-ib,nsd04-ib
>
> Any other potential culprits to investigate?
>
> I do notice nsd03/nsd04 have long waiters, but nsd01 doesn't (nsd02-ib is 
> offline for now):
> [nsd03-ib ~]# mmdiag --waiters
> === mmdiag: waiters ===
> Waiting 6.5113 sec since 17:17:33, monitored, thread 4175 NSDThread: for I/O 
> completion
> Waiting 6.3810 sec since 17:17:33, monitored, thread 4127 NSDThread: for I/O 
> completion
> Waiting 6.1959 sec since 17:17:34, monitored, thread 4144 NSDThread: for I/O 
> completion
>
> nsd04-ib:
> Waiting 13.1386 sec since 17:19:09, monitored, thread 9971 NSDThread: for I/O 
> completion
> Waiting 10.3562 sec since 17:19:12, monitored, thread 9958 NSDThread: for I/O 
> completion
> Waiting 10.0338 sec since 17:19:12, monitored, thread 9951 NSDThread: for I/O 
> completion
>
> tsm01-ib:
> Waiting 8.1211 sec since 17:20:24, monitored, thread 3644 NSDThread: for I/O 
> completion
> Waiting 7.6690 sec since 17:20:24, monitored, thread 3641 NSDThread: for I/O 
> completion
> Waiting 7.4969 sec since 17:20:24, monitored, thread 3658 NSDThread: for I/O 
> completion
> Waiting 7.3573 sec since 17:20:24, monitored, thread 3642 NSDThread: for I/O 
> completion
>
> nsd01-ib:
> Waiting 0.2548 sec since 17:21:47, monitored, thread 30513 NSDThread: for I/O 
> completion
> Waiting 0.1502 sec since 17:21:47, monitored, thread 30529 NSDThread: for I/O 
> completion
>
>
> Thanks,
>
> Oluwasijibomi (Siji) Saula
> HPC Systems Administrator  /  Information Technology
>
> Research 2 Building 220B / Fargo ND 58108-6050
> p: 701.231.7749 / www.ndsu.edu
>
>
>
>
>
> From: gpfsug-discuss-boun...@spectrumscale.org 
> <gpfsug-discuss-boun...@spectrumscale.org> on behalf of 
> gpfsug-discuss-requ...@spectrumscale.org 
> <gpfsug-discuss-requ...@spectrumscale.org>
> Sent: Wednesday, June 3, 2020 4:56 PM
> To: gpfsug-discuss@spectrumscale.org <gpfsug-discuss@spectrumscale.org>
> Subject: gpfsug-discuss Digest, Vol 101, Issue 6
>
> Send gpfsug-discuss mailing list submissions to
>         gpfsug-discuss@spectrumscale.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> or, via email, send a message with subject or body 'help' to
>         gpfsug-discuss-requ...@spectrumscale.org
>
> You can reach the person managing the list at
>         gpfsug-discuss-ow...@spectrumscale.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of gpfsug-discuss digest..."
>
>
> Today's Topics:
>
>    1. Introducing SSUG::Digital
>       (Simon Thompson (Spectrum Scale User Group Chair))
>    2. Client Latency and High NSD Server Load Average
>       (Saula, Oluwasijibomi)
>    3. Re: Client Latency and High NSD Server Load Average
>       (Frederick Stock)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 03 Jun 2020 20:11:17 +0100
> From: "Simon Thompson (Spectrum Scale User Group Chair)"
>         <ch...@spectrumscale.org>
> To: "gpfsug-discuss@spectrumscale.org"
>         <gpfsug-discuss@spectrumscale.org>
> Subject: [gpfsug-discuss] Introducing SSUG::Digital
> Message-ID: <ab923605-e4fe-45ec-a1ea-b61a4a147...@spectrumscale.org>
> Content-Type: text/plain; charset="utf-8"
>
> Hi All,
>
>
>
> I'm happy that we can finally announce SSUG::Digital, which will be a series
> of online sessions based on the types of topics we present at our in-person
> events.
>
>
>
> I know it's taken us a while to get this up and running, but we've been
> working on trying to get the format right. So save the date for the first
> SSUG::Digital event, which will take place on Thursday 18th June 2020 at 4pm
> BST. That's:
> San Francisco, USA at 08:00 PDT
> New York, USA at 11:00 EDT
> London, United Kingdom at 16:00 BST
> Frankfurt, Germany at 17:00 CEST
> Pune, India at 20:30 IST
> We estimate about 90 minutes for the first session, and please forgive any 
> teething troubles as we get this going!
>
>
>
> (I know the times don't work for everyone in the global community!)
>
>
>
> Each of the sessions we run over the next few months will be a different 
> Spectrum Scale Experts or Deep Dive session.
>
> More details at:
>
> https://www.spectrumscaleug.org/introducing-ssugdigital/
>
>
>
> (We'll announce the speakers and topic of the first session in the next few
> days...)
>
>
>
> Thanks to Ulf, Kristy, Bill, Bob and Ted for their help and guidance in 
> getting this going.
>
>
>
> We're keen to include some user talks and site updates later in the series, 
> so please let me know if you might be interested in presenting in this format.
>
>
>
> Simon Thompson
>
> SSUG Group Chair
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: 
> <http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20200603/e839fc73/attachment-0001.html>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 3 Jun 2020 21:45:05 +0000
> From: "Saula, Oluwasijibomi" <oluwasijibomi.sa...@ndsu.edu>
> To: "gpfsug-discuss@spectrumscale.org"
>         <gpfsug-discuss@spectrumscale.org>
> Subject: [gpfsug-discuss] Client Latency and High NSD Server Load
>         Average
> Message-ID:
>         
> <dm6pr08mb5324b014bc4aa03ccf25557598...@dm6pr08mb5324.namprd08.prod.outlook.com>
>
> Content-Type: text/plain; charset="iso-8859-1"
>
>
> Hello,
>
> Has anyone faced a situation where a majority of NSD servers have a high
> load average and a minority don't?
>
> Also, is 10x higher NSD server latency for write operations than for read
> operations expected in any circumstance?
>
> We are seeing client latency between 6 and 9 seconds and are wondering if 
> some GPFS configuration or NSD server condition may be triggering this poor 
> performance.
>
>
>
> Thanks,
>
>
> Oluwasijibomi (Siji) Saula
>
> HPC Systems Administrator  /  Information Technology
>
>
>
> Research 2 Building 220B / Fargo ND 58108-6050
>
> p: 701.231.7749 / www.ndsu.edu
>
>
>
>
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: 
> <http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20200603/2ac14173/attachment-0001.html>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 3 Jun 2020 21:56:04 +0000
> From: "Frederick Stock" <sto...@us.ibm.com>
> To: gpfsug-discuss@spectrumscale.org
> Cc: gpfsug-discuss@spectrumscale.org
> Subject: Re: [gpfsug-discuss] Client Latency and High NSD Server Load
>         Average
> Message-ID:
>         
> <of4256061c.b3ca8966-on0025857c.00786c34-0025857c.00787...@notes.na.collabserv.com>
>
> Content-Type: text/plain; charset="us-ascii"
>
> An HTML attachment was scrubbed...
> URL: 
> <http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20200603/c252f3b9/attachment.html>
>
> ------------------------------
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
> End of gpfsug-discuss Digest, Vol 101, Issue 6
> **********************************************
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
<http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20200603/aaf42a58/attachment.html>

------------------------------

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


End of gpfsug-discuss Digest, Vol 101, Issue 9
**********************************************
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
