[gpfsug-discuss] Filesystem Operation error

2018-07-03 Thread Grunenberg, Renar
Hello all, here is a short story from yesterday on version 5.0.1.1. We had a 3-node cluster (2 nodes for I/O and the third for a quorum buster function). An admin made a mistake and deleted the 3rd node (VM). We restored it from a VM snapshot, no problem. The only point here is that we lost

Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 78, Issue 6

2018-07-03 Thread Michael L Taylor
Hi Giuseppe, The GUI happens to document some of the zimon metrics in the KC here: https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1hlp_monperfmetrics.htm Hopefully that gets you a bit more of what you need but does not cover everything. Today's

Re: [gpfsug-discuss] High I/O wait times

2018-07-03 Thread Steve Crusan
Kevin, While this is happening, are you able to grab latency stats per LUN (hardware vendor agnostic) to see if there are any outliers? Also, when looking at the mmdiag output, are both reads and writes affected? Depending on the storage hardware, your writes might be hitting cache, so

Re: [gpfsug-discuss] High I/O wait times

2018-07-03 Thread Buterbaugh, Kevin L
Hi Fred, I have a total of 48 NSDs served up by 8 NSD servers. 12 of those NSDs are in our small /home filesystem, which is performing just fine. The other 36 are in our ~1 PB /scratch and /data filesystem, which is where the problem is. Our max filesystem block size parameter is set to 16

Re: [gpfsug-discuss] High I/O wait times

2018-07-03 Thread Frederick Stock
How many NSDs are served by the NSD servers and what is your maximum file system block size? Have you confirmed that you have sufficient NSD worker threads to handle the maximum number of IOs you are configured to have active? That would be the number of NSDs served times 12 (you have 12
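A quick back-of-the-envelope check of that rule of thumb, using the node counts from this thread (a sketch only; the parameter name nsdMaxWorkerThreads is my assumption for the tunable this would be compared against):

    # Sanity-check the "NSDs served times 12" sizing rule quoted above.
    # 48 NSDs and 8 NSD servers are the figures given elsewhere in this thread.
    nsds_total = 48
    nsd_servers = 8
    ios_per_nsd = 12   # active IOs each NSD can have outstanding, per the rule

    nsds_per_server = nsds_total / nsd_servers
    threads_needed = nsds_per_server * ios_per_nsd
    print(f"{nsds_per_server:.0f} NSDs per server -> at least {threads_needed:.0f} "
          "NSD worker threads per server (compare against nsdMaxWorkerThreads)")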

Re: [gpfsug-discuss] High I/O wait times

2018-07-03 Thread Buterbaugh, Kevin L
Hi Fred, Thanks for the response. I have been looking at the “mmfsadm dump nsd” data from the two NSD servers that serve up the two NSDs that most commonly experience high wait times (although, again, this varies from time to time). In addition, I have been reading:

Re: [gpfsug-discuss] High I/O wait times

2018-07-03 Thread Frederick Stock
Are you seeing similar values for all the nodes or just some of them? One possible issue is how the NSD queues are configured on the NSD servers. You can see this with the output of "mmfsadm dump nsd". There are queues for LARGE IOs (greater than 64K) and queues for SMALL IOs (64K or less).
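The LARGE/SMALL split described here is purely a size threshold; a minimal sketch of that classification (only the 64K boundary comes from the message, the rest is illustrative):

    # Classify an I/O the way the NSD server queues are split:
    # LARGE for transfers greater than 64K, SMALL for 64K or less.
    SMALL_IO_LIMIT = 64 * 1024

    def nsd_queue_class(io_size_bytes: int) -> str:
        return "LARGE" if io_size_bytes > SMALL_IO_LIMIT else "SMALL"

    for size in (4096, 65536, 65537, 1048576):
        print(f"{size:>8} bytes -> {nsd_queue_class(size)} queue")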

[gpfsug-discuss] High I/O wait times

2018-07-03 Thread Buterbaugh, Kevin L
Hi all, We are experiencing some high I/O wait times (5 - 20 seconds!) on some of our NSDs as reported by "mmdiag --iohist" and are struggling to understand why. One of the confusing things is that, while certain NSDs tend to show the problem more than others, the problem is not consistent …
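To illustrate what waits of this magnitude look like when scanning iohist output for outliers, a small sketch that flags slow I/Os; the records below are made-up sample data, not the actual mmdiag --iohist format:

    # Flag I/Os whose service time exceeds a threshold, in the spirit of
    # scanning "mmdiag --iohist" for outliers. Sample tuples are invented.
    WAIT_THRESHOLD_MS = 5000   # 5 seconds; the thread reports waits of 5 - 20 s

    sample_iohist = [
        ("nsd12", "read", 12873.4),
        ("nsd07", "write", 3.2),
        ("nsd12", "write", 6541.0),
        ("nsd03", "read", 1.8),
    ]

    slow = [rec for rec in sample_iohist if rec[2] >= WAIT_THRESHOLD_MS]
    for nsd, io_type, wait_ms in sorted(slow, key=lambda r: -r[2]):
        print(f"{nsd}: {io_type} waited {wait_ms / 1000:.1f} s")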

Re: [gpfsug-discuss] preventing HSM tape recall storms

2018-07-03 Thread Christof Schmitt
> HSM over LTFS-EE runs the risk of a recall storm if files which have been migrated to tape
> are then shared by Samba to Macs and PCs.
> MacOS Finder and Windows Explorer will want to display all the thumbnail images of a
> folder's contents, which will recall lots of files from tape.

SMB clients
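The reply is cut off above; one approach often discussed for this (my assumption here, not confirmed by the truncated text) is to tell the Samba vfs_gpfs module not to recall migrated files when SMB clients open them, so offline files stay on tape, e.g.:

    # smb.conf share stanza (illustrative; share name and path are hypothetical)
    [scratch]
        path = /gpfs/scratch
        vfs objects = gpfs
        gpfs:recalls = no   # do not trigger HSM recalls for offline files opened over SMB

On a CES / Spectrum Scale protocol cluster this would presumably be set through the mmsmb export options rather than by editing smb.conf directly.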

[gpfsug-discuss] preventing HSM tape recall storms

2018-07-03 Thread Cameron Dunn
HSM over LTFS-EE runs the risk of a recall storm if files which have been migrated to tape are then shared by Samba to Macs and PCs. MacOS Finder and Windows Explorer will want to display all the thumbnail images of a folder's contents, which will recall lots of files from tape. According to