Hmm ... mmdiag --tokenmgr shows:
Server stats: requests 195417431 ServerSideRevokes 120140
nTokens 2146923 nranges 4124507
designated mnode appointed 55481 mnode thrashing detected 1036
So how do I convert "1036" to a node?
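
A minimal sketch for reading the figures above, assuming they are cumulative counters (so "1036" would be a number of detected thrashing events, not a node identifier; the output itself does not confirm this). It only parses the three lines pasted above. The names `SAMPLE` and `parse_counters` are made up for illustration and are not part of any GPFS tooling.

```python
import re

# The three lines pasted above from "mmdiag --tokenmgr".
SAMPLE = """\
Server stats: requests 195417431 ServerSideRevokes 120140
nTokens 2146923 nranges 4124507
designated mnode appointed 55481 mnode thrashing detected 1036
"""

# A label (letters and spaces) followed by an integer value.
PAIR = re.compile(r"([A-Za-z][A-Za-z ]*?)\s+(\d+)")


def parse_counters(text):
    """Return {label: value} for every 'label number' pair in the text."""
    counters = {}
    for line in text.splitlines():
        for label, value in PAIR.findall(line):
            counters[label.strip()] = int(value)
    return counters


stats = parse_counters(SAMPLE)
# Share of mnode appointments that were flagged as thrashing (assumed meaning).
ratio = stats["mnode thrashing detected"] / stats["designated mnode appointed"]
print(f"thrashing / appointed = {ratio:.2%}")
```

Read this way, the number says how often thrashing was detected but not where, so it cannot be mapped to a node on its own.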
Simon
________________________________
From: [email protected]
<[email protected]> on behalf of Simon Thompson
<[email protected]>
Sent: 20 February 2020 19:45:02
To: gpfsug main discussion list
Subject: [gpfsug-discuss] Unkillable snapshots
Hi,
We have a snapshot which is stuck in the state "DeleteRequired". When deleting,
it goes through the motions but eventually gives up with:
Unable to quiesce all nodes; some processes are busy or holding required
resources.
mmdelsnapshot: Command failed. Examine previous error messages to determine
cause.
And in the mmfslog on the FS manager there are a bunch of retries and "failure
to quiesce" on nodes. However, in each retry it's never the same set of nodes. I
suspect we have one HPC job somewhere killing us.
What's interesting is that we can delete other snapshots OK; the problem
appears to be limited to one particular fileset.
My old go-to, "mmfsadm dump tscomm", isn't showing any particular node, and
the waiters just tend to point to the FS manager node.
So ... any suggestions? I'm assuming it's some workload holding a lock open or
some such, but tracking it down is proving elusive!
Generally the FS is also "lumpy" ... at times it feels like using a terminal
over a wifi connection on a train. I guess it's all related, though.
Thanks
Simon