Re: [gpfsug-discuss] Unkillable snapshots

Sven Oehme Thu, 20 Feb 2020 14:29:15 -0800

A "filesystem quiesce failed" error has nothing to do with open files. 
What it means is that the filesystem couldn’t flush dirty data and metadata 
within a defined time to take a snapshot. This can be caused by maxfilestocache 
or pagepool settings that are too high. 
To give you a simplified example (it's more complex than that, but good enough 
to make the point): assume you have 100 nodes, each with a 16 GB pagepool, and 
your storage system can write data out at 10 GB/sec. It will take 160 seconds 
to flush all the data (assuming you did normal buffered I/O).
If I remember correctly (talking from memory here), the default timeout is 60 
seconds. Given that you can’t write that fast, it will always time out in this 
scenario. 
There is one other case where this can happen: a client is badly connected 
(flaky network or slow connection). Even if your storage system is fast 
enough, that node is too slow to de-stage within that time, while everybody 
else can and the storage is not the bottleneck. Other than that, the only 
solutions are to a) buy faster storage or b) reduce pagepool and 
maxfilestocache, which will reduce the overall performance of the system.
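
The back-of-the-envelope estimate above can be written down as a small Python sketch. It is only the simplified model from the example (every pagepool fully dirty, storage as the single bottleneck), and the function names are made up for illustration. The 60-second timeout is the figure recalled from memory above, not a verified default.

```python
# Simplified model: every node's pagepool is fully dirty and the storage
# back end is the only bottleneck. Real behaviour is more complex.

# Assumption: the timeout value quoted from memory in this message.
QUIESCE_TIMEOUT_SEC = 60


def flush_time_seconds(nodes, pagepool_gb, storage_gb_per_sec):
    """Worst-case time to write out all dirty pagepool data."""
    return nodes * pagepool_gb / storage_gb_per_sec


def max_pagepool_gb(nodes, storage_gb_per_sec, timeout_sec):
    """Largest per-node pagepool that could still be flushed in time."""
    return storage_gb_per_sec * timeout_sec / nodes


def fits_in_timeout(nodes, pagepool_gb, storage_gb_per_sec, timeout_sec):
    """True if the worst-case flush completes within the timeout."""
    return flush_time_seconds(nodes, pagepool_gb, storage_gb_per_sec) <= timeout_sec


if __name__ == "__main__":
    t = flush_time_seconds(100, 16, 10)
    print(f"worst-case flush time: {t:.0f} s")
    print("fits in timeout:", fits_in_timeout(100, 16, 10, QUIESCE_TIMEOUT_SEC))
    print(f"max pagepool per node: {max_pagepool_gb(100, 10, QUIESCE_TIMEOUT_SEC):.0f} GB")
```

With the numbers from the example this gives 160 seconds against a 60-second limit. Under the same assumptions, the pagepool would have to drop to about 6 GB per node to fit.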


Sven



> On Feb 20, 2020, at 5:14 PM, Nathan Falk <[email protected]> wrote:
> 
> Good point, Simon. Yes, it is a "file system quiesce" not a "fileset 
> quiesce" so it is certainly possible that mmfsd is unable to quiesce because 
> there are processes keeping files open in another fileset.
> 
> 
> 
> Nate Falk
> IBM Spectrum Scale Level 2 Support
> Software Defined Infrastructure, IBM Systems
> 
> 
> 
> 
> From:        Simon Thompson <[email protected]>
> To:        gpfsug main discussion list <[email protected]>
> Date:        02/20/2020 04:39 PM
> Subject:        [EXTERNAL] Re: [gpfsug-discuss] Unkillable snapshots
> Sent by:        [email protected]
> 
> 
> Hi Nate,
> So we're trying to clean up snapshots from the GUI ... we've found that if it 
> fails to delete one night for whatever reason, it then doesn't go back 
> another day and clean up 😊
> But yes, essentially running this by hand to clean up.
> What I have found is that lsof hangs on some of the "suspect" nodes. But if I 
> strace it, it's hanging on a process which is using a different fileset. For 
> example, the file-set we can't delete is:
> rds-projects-b which is mounted as /rds/projects/b
> But on some suspect nodes, 'strace lsof /rds' hangs at a process which 
> has open files in:
> /rds/projects/g which is a different file-set.
> What I'm wondering is whether it's these hanging processes in the "g" fileset 
> that are killing us rather than something in the "b" fileset. Looking at the 
> "g" processes, they look like a weather model and look to be dumping a lot of 
> files in a shared directory, so I wonder if the mmfsd process is busy 
> servicing that and so, whilst it hasn't got "b" locks, it's just too slow to 
> respond?
> Does that sound plausible?
> Thanks
> Simon
> 
> 
> From: [email protected] 
> <[email protected]> on behalf of [email protected] 
> <[email protected]>
> Sent: 20 February 2020 21:26:39
> To: gpfsug main discussion list
> Subject: Re: [gpfsug-discuss] Unkillable snapshots
>  
> Hello Simon,
> 
> Sadly, that "1036" is not a node ID, but just a counter.
> 
> These are tricky to troubleshoot. Usually, by the time you realize it's 
> happening and try to collect some data, things have already timed out.
> 
> Since this mmdelsnapshot isn't something that's on a schedule from cron or 
> the GUI and is a command you are running, you could try some heavy-handed 
> data collection.
> 
> You suspect a particular fileset already, so maybe have a 'mmdsh -N all lsof 
> /path/to/fileset' ready to go in one window, and the 'mmdelsnapshot' ready to 
> go in another window? When the mmdelsnapshot times out, you can find the 
> nodes it was waiting on in the file system manager mmfs.log.latest and see 
> what matches up with the open files identified by lsof.
> 
> It sounds like you already know this, but the <c0n42> type of internal node 
> names in the log messages can be translated with 'mmfsadm dump tscomm' or 
> also plain old 'mmdiag --network'.
> 
> Thanks,
> Nate Falk
> IBM Spectrum Scale Level 2 Support
> Software Defined Infrastructure, IBM Systems
> 
> 
> 
> 
> 
> 
> From:        Simon Thompson <[email protected]>
> To:        gpfsug main discussion list <[email protected]>
> Date:        02/20/2020 03:14 PM
> Subject:        [EXTERNAL] Re: [gpfsug-discuss] Unkillable snapshots
> Sent by:        [email protected]
> 
> Hmm ... mmdiag --tokenmgr shows:
> 
> 
>    Server stats: requests 195417431 ServerSideRevokes 120140
>           nTokens 2146923 nranges 4124507
>           designated mnode appointed 55481 mnode thrashing detected 1036
> So how do I convert "1036" to a node?
> Simon
> 
> 
> 
> From: [email protected] 
> <[email protected]> on behalf of Simon Thompson 
> <[email protected]>
> Sent: 20 February 2020 19:45:02
> To: gpfsug main discussion list
> Subject: [gpfsug-discuss] Unkillable snapshots
>  
> Hi,
> We have a snapshot which is stuck in the state "DeleteRequired". When 
> deleting, it goes through the motions but eventually gives up with:
> 
> 
> Unable to quiesce all nodes; some processes are busy or holding required 
> resources.
> mmdelsnapshot: Command failed. Examine previous error messages to determine 
> cause.
> And in the mmfslog on the FS manager there are a bunch of retries and 
> "failure to quesce" on nodes. However, in each retry it's never the same set 
> of nodes. I suspect we have one HPC job somewhere killing us.
> What's interesting is that we can delete other snapshots OK; it appears to be 
> one particular fileset.
> My old go-to "mmfsadm dump tscomm" isn't showing any particular node, and 
> waiters just tend to point to the FS manager node.
> So ... any suggestions? I'm assuming it's some workload holding a lock open or 
> some such, but tracking it down is proving elusive!
> Generally the FS is also "lumpy" ... at times it feels like using a terminal 
> over a wifi connection on a train. I guess it's all related though.
> Thanks
> Simon
> 
> 
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> 
> 
> 

