Hi Fred,

I have a total of 48 NSDs served up by 8 NSD servers.  12 of those NSDs are in 
our small /home filesystem, which is performing just fine.  The other 36 are in 
our ~1 PB /scratch and /data filesystem, which is where the problem is.  Our 
maximum filesystem block size (maxblocksize) is set to 16 MB, but that 
filesystem uses a 1 MB block size.

nsdMaxWorkerThreads is set to 1024 as shown below.  Since each NSD server 
serves an average of 6 NSDs, and 6 x 12 = 72 threads, we’re comfortably under 
that limit if I’m understanding the calculation correctly.  Even counting all 
48 NSDs, 48 x 12 = 576 is still under 1024, so we should be good, right?

Your help is much appreciated!  Thanks again…

Kevin

On Jul 3, 2018, at 4:53 PM, Frederick Stock 
<[email protected]> wrote:

How many NSDs are served by the NSD servers and what is your maximum file 
system block size?  Have you confirmed that you have sufficient NSD worker 
threads to handle the maximum number of IOs you are configured to have active?  
That would be the number of NSDs served times 12 (you have 12 threads per 
queue).

Fred
__________________________________________________
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
[email protected]



From:        "Buterbaugh, Kevin L" 
<[email protected]>
To:        gpfsug main discussion list 
<[email protected]>
Date:        07/03/2018 05:41 PM
Subject:        Re: [gpfsug-discuss] High I/O wait times
Sent by:        
[email protected]
________________________________



Hi Fred,

Thanks for the response.  I have been looking at the “mmfsadm dump nsd” data 
from the two NSD servers that serve up the two NSDs that most commonly 
experience high wait times (although, again, this varies from time to time).  
In addition, I have been reading:

https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/NSD%20Server%20Design%20and%20Tuning

And:

https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/NSD%20Server%20Tuning

Which seem to be the most relevant documents on the Wiki.

I would like to do a more detailed analysis of the “mmfsadm dump nsd” output, 
but my preliminary look at it seems to indicate that I/Os are queueing in the 
50 - 100 range on the small queues and in the 60 - 200 range on the large 
queues.

In addition, I am regularly seeing all 12 threads on the LARGE queues active, 
while it is much rarer that I see all - or even close to all - of the threads 
on the SMALL queues active.
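
In case it is useful, this is roughly how I have been skimming that output 
(just a sketch - I am grepping on the word “pending” since that is what the 
queue lines appear to contain, and the exact layout may differ by release):

  # take a timestamped snapshot so successive runs can be compared
  mmfsadm dump nsd > /tmp/nsd.dump.$(date +%H%M%S)
  # pull out the queue lines that mention pending requests
  grep -i pending /tmp/nsd.dump.*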

As for the parameters Scott and Yuri mentioned, on our cluster they are set 
as follows:

[common]
nsdMaxWorkerThreads 640
[<all the GPFS servers listed here>]
nsdMaxWorkerThreads 1024
[common]
nsdThreadsPerQueue 4
[<all the GPFS servers listed here>]
nsdThreadsPerQueue 12
[common]
nsdSmallThreadRatio 3
[<all the GPFS servers listed here>]
nsdSmallThreadRatio 1
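
(For reference, changing any of these per node class would be done with 
mmchconfig along these lines - the node list is elided to match the stanza 
above, and I believe the nsd* settings only take effect after GPFS is 
restarted on those nodes, so please double-check that in the docs first:)

  mmchconfig nsdThreadsPerQueue=12,nsdSmallThreadRatio=1 -N <NSD server list>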

So to me it sounds like I need more resources on the LARGE queue side of things 
… i.e. it sure doesn’t sound like I want to change my small thread ratio.  
Increasing the number of threads might help, but that also takes more pagepool, 
and I’ve got limited RAM in these (old) NSD servers.  I do have nsdBufSpace set 
to 70, but I’ve only got 16 - 24 GB of RAM in each of these NSD servers.  And a 
while back I did try increasing the pagepool on them (very slightly) and ended 
up causing problems because they then ran out of physical RAM.
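
To put some rough numbers on that (the pagepool figure below is purely 
hypothetical, not necessarily what we are running):

  # nsdBufSpace is the percentage of the pagepool reserved for NSD buffers
  PAGEPOOL_MB=4096      # hypothetical 4 GB pagepool on a 16 GB server
  NSDBUFSPACE=70
  echo $(( PAGEPOOL_MB * NSDBUFSPACE / 100 ))   # ~2867 MB for NSD buffers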

Thoughts?  Followup questions?  Thanks!

Kevin

On Jul 3, 2018, at 3:11 PM, Frederick Stock 
<[email protected]> wrote:

Are you seeing similar values for all the nodes or just some of them?  One 
possible issue is how the NSD queues are configured on the NSD servers.  You 
can see this with the output of "mmfsadm dump nsd".  There are queues for LARGE 
IOs (greater than 64K) and queues for SMALL IOs (64K or less).  Check the 
highest pending values to see if many IOs are queueing.  There are a couple of 
options to fix this but rather than explain them I suggest you look for 
information about NSD queueing on the developerWorks site.  There has been 
information posted there that should prove helpful.
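
A quick way to eyeball the pending values (just a sketch - the exact wording 
in the dump output can vary between releases, so adjust the pattern as needed):

  mmfsadm dump nsd | grep -i 'highest pending'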

Fred
__________________________________________________
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
[email protected]



From:        "Buterbaugh, Kevin L" 
<[email protected]>
To:        gpfsug main discussion list 
<[email protected]>
Date:        07/03/2018 03:49 PM
Subject:        [gpfsug-discuss] High I/O wait times
Sent by:        
[email protected]
________________________________



Hi all,

We are experiencing some high I/O wait times (5 - 20 seconds!) on some of our 
NSDs as reported by “mmdiag --iohist” and are struggling to understand why.  One 
of the confusing things is that, while certain NSDs tend to show the problem 
more than others, the problem is not consistent … i.e. the problem tends to 
move around from NSD to NSD (and storage array to storage array) whenever we 
check … which is sometimes just a few minutes apart.
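
(We have mostly just been eyeballing the output, but a rough filter like the 
one below also works.  The awk field number is a guess - check the header line 
of “mmdiag --iohist” on your release before trusting it.)

  # flag I/Os slower than 1000 ms; $6 as the time-in-ms column is hypothetical
  mmdiag --iohist | awk 'NR > 2 && $6+0 > 1000'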

In the past when I have seen “mmdiag --iohist” report high wait times like this 
it has *always* been hardware related.  In our environment, the most common 
cause has been a battery backup unit on a storage array controller going bad 
and the storage array switching to write straight to disk.  But that’s *not* 
happening this time.

Is there anything within GPFS itself - i.e. outside of a hardware issue - that 
I should be looking for?  Thanks!

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
[email protected] - (615)875-9633


_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
