Re: [gpfsug-discuss] Long IO waiters and IBM Storwize V5030

2021-05-28 Thread Andrew Beattie
Hi Oluwasijibomi, if you set up a Storage Insights Standard account you can monitor the performance of your 5030 and pull the performance metrics of the block storage array when you see poor performance in your Scale cluster. This will give you some idea as to what is happening, but the 5030

Re: [gpfsug-discuss] Long IO waiters and IBM Storwize V5030

2021-05-28 Thread Uwe Falke
You say you see that every few months: does that mean that, under roughly the same load, the system sometimes chokes and sometimes behaves OK? Have you checked the v5k event log for anything going on (write performance may suffer if the write cache is off, which might happen if the buffer
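A minimal sketch of that check from the Storwize CLI, assuming SSH access to the V5030 (the user, host and volume names below are placeholders, and the exact output fields vary by firmware level):

    # (superuser@v5030 and gpfs_data_vol01 are placeholders)
    # recent events on the V5030 - look for battery, write-cache or array entries
    ssh superuser@v5030 lseventlog
    # detailed view of one volume; the cache field shows whether caching is still readwrite
    ssh superuser@v5030 lsvdisk gpfs_data_vol01

Anything sitting in the event log as an unfixed error is worth chasing before tuning on the GPFS side.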

Re: [gpfsug-discuss] Long IO waiters and IBM Storwize V5030

2021-05-28 Thread Uwe Falke
Hi, an odd prefetch strategy would affect read performance, but write latency is claimed to be even worse ... Have you simply checked what the actual IO performance of the v5k box is under that load, and how it compares to its nominal performance and to that of its disks? How is the storage organised?
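One hedged way to get at that from the Scale side (the file system name gpfs01 is a placeholder; mmdiag --iohist reports recent NSD I/O with per-I/O service times, and mmlsnsd/mmlsdisk show how the NSDs are laid out):

    # (gpfs01 is a placeholder file system name)
    # recent I/O history on this NSD server, including the service time per I/O
    mmdiag --iohist
    # which NSD servers serve which disks, and how they map into the file system
    mmlsnsd -f gpfs01
    mmlsdisk gpfs01 -L

Comparing the service times in --iohist against what the V5030 and its drives should nominally deliver is a quick first sanity check.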

[gpfsug-discuss] Long IO waiters and IBM Storwize V5030

2021-05-28 Thread Saula, Oluwasijibomi
Hi Folks, so we are experiencing some very long IO waiters in our GPFS cluster:

# mmdiag --waiters
=== mmdiag: waiters ===
Waiting 17.3823 sec since 10:41:01, monitored, thread 21761 NSDThread: for I/O completion
Waiting 16.6140 sec since 10:41:02, monitored, thread 21730 NSDThread: for
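A small sketch for watching whether the waiters track back-end latency (the five-second interval and the head count are arbitrary choices, not recommendations):

    # print the current waiters every five seconds
    while true; do date; mmdiag --waiters | head -20; sleep 5; done
    # per-I/O service times as seen by this NSD server
    mmdiag --iohist

If the iohist service times spike whenever the long NSDThread waiters appear, the delay is most likely on the storage side rather than in GPFS itself.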

Re: [gpfsug-discuss] Long IO waiters and IBM Storwize V5030

2021-05-28 Thread Jan-Frode Myklebust
One thing to check: Storwize/SVC code will *always* guess wrong on prefetching for GPFS. You can see this as a much higher read data throughput on the mdisks than on the vdisks in the web UI. To fix it, disable cache_prefetch with "chsystem -cache_prefetch off". This being a global setting, you
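For reference, a hedged sketch of checking and changing that setting on the Storwize/SVC CLI (the user and host names are placeholders, and whether lssystem exposes the field depends on the code level):

    # (superuser@v5030 is a placeholder)
    # show the current prefetch setting, if this code level reports it
    ssh superuser@v5030 lssystem | grep -i cache_prefetch
    # disable prefetching globally - this affects every host attached to the box
    ssh superuser@v5030 chsystem -cache_prefetch off

Because it is global, it is worth confirming that no other, non-GPFS workloads on the same system rely on prefetching before turning it off.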

Re: [gpfsug-discuss] Mmrestripefs -R --metadata-only - how to estimate remaining execution time

2021-05-28 Thread Eric Horst
Yes Heiner, my experience is that the inode count in those operations is inodes * snapshots = total. I observed that as it starts processing the snapshot inodes it moves faster, as inode usage in snapshots is more sparse. -Eric On Fri, May 28, 2021 at 1:56 AM Billich Heinrich Rainer (ID SD) <
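A rough back-of-the-envelope sketch of that estimate (gpfs01 is a placeholder file system name, and the multiplier is Eric's observed rule of thumb rather than a documented formula):

    # (gpfs01 is a placeholder)
    # allocated and maximum inodes in the file system
    mmdf gpfs01 -F
    # how many snapshots exist
    mmlssnapshot gpfs01
    # total inodes the restripe walks is then roughly: allocated_inodes * number_of_snapshots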

Re: [gpfsug-discuss] Mmrestripefs -R --metadata-only - how to estimate remaining execution time

2021-05-28 Thread Billich Heinrich Rainer (ID SD)
Hello, I just noticed: maybe mmrestripefs does some extra processing on snapshots? The output looks much more as expected on filesystems with no snapshots present; both the number of inodes and the MB of data processed allow estimating the remaining runtime. Unfortunately all our large

[gpfsug-discuss] Mmrestripefs -R --metadata-only - how to estimate remaining execution time

2021-05-28 Thread Billich Heinrich Rainer (ID SD)
Hello, I want to estimate how much longer a running mmrestripefs -R --metadata-only --qos maintenance job will take to finish. We switched from '-m 1' to '-m 2' and now run mmrestripefs to get the second copy of all metadata. I know that the % values in the output are useless, but
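For context, a hedged sketch of the overall sequence being described (gpfs01 is a placeholder file system name; option names as in the mmchfs/mmrestripefs documentation):

    # (gpfs01 is a placeholder)
    # set the default number of metadata replicas to two for new allocations
    mmchfs gpfs01 -m 2
    # confirm the new default metadata replication factor
    mmlsfs gpfs01 -m
    # rewrite existing metadata so the second copy is actually created,
    # throttled to the maintenance QoS class
    mmrestripefs gpfs01 -R --metadata-only --qos maintenance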

Re: [gpfsug-discuss] Ransom attacks

2021-05-28 Thread macthev
Take a look at IAM nodes.

> On 28 May 2021, at 01:10, Henrik Morsing wrote:
>
> Hi,
>
> It struck me that switching a Spectrum Protect solution from tapes to a GPFS
> filesystem offers much less protection against ransom encryption should the
> SP server be
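Assuming "IAM nodes" refers to Spectrum Scale's immutability (IAM) modes on filesets, a minimal sketch of enabling one (the file system and fileset names are placeholders; check the documented modes and their retention semantics before relying on this):

    # (gpfs01 and sp_backup are placeholders)
    # put an independent fileset into a non-compliant immutability mode
    mmchfileset gpfs01 sp_backup --iam-mode noncompliant
    # the detailed listing shows the fileset attributes, including IAM mode on recent releases
    mmlsfileset gpfs01 sp_backup -L

Files in such a fileset can be made immutable for a retention period, which is the property that helps against ransomware encrypting backup data.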

Re: [gpfsug-discuss] Ransom attacks

2021-05-28 Thread Jonathan Buzzard
On 28/05/2021 07:46, Henrik Morsing wrote: That might not make sense if GPFS is holding the SP backup data, but SP can do its own replication too - and could replicate using storage from a second GPFS file system off-site. Take snapshots of this second storage, as well as SP database, and

Re: [gpfsug-discuss] Ransom attacks

2021-05-28 Thread Henrik Morsing
On Thu, May 27, 2021 at 02:17:37PM -0400, Lindsay Todd wrote: That might not make sense if GPFS is holding the SP backup data, but SP can do its own replication too - and could replicate using storage from a second GPFS file system off-site. Take snapshots of this second storage, as well as SP
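A hedged sketch of the snapshot step being described, on the Scale side (gpfs02 and the snapshot names are placeholders; scheduling and retention would normally be driven by cron or the GUI):

    # (gpfs02 and the snapshot names are placeholders)
    # point-in-time snapshot of the file system holding the replicated SP data
    mmcrsnapshot gpfs02 spbackup_$(date +%Y%m%d)
    # list snapshots and remove ones that have aged out
    mmlssnapshot gpfs02
    mmdelsnapshot gpfs02 spbackup_20210401

Snapshots are read-only, so a compromised SP server cannot encrypt them in place, though they do consume space as the live data changes.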