Hmm ... mmdiag --tokenmgr shows:

    Server stats: requests 195417431 ServerSideRevokes 120140
           nTokens 2146923 nranges 4124507
           designated mnode appointed 55481 mnode thrashing detected 1036
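
To watch those figures change over time, the output can be split into named counters. This is only a minimal sketch: it assumes the layout stays exactly as pasted above, and it assumes "mnode thrashing detected 1036" is an event count like its neighbours rather than a node identifier, which is not confirmed here. The names `parse_tokenmgr_stats` and `counter_deltas` are made up for illustration.

```python
import re

# Sample text copied from the mmdiag --tokenmgr output quoted above.
SAMPLE = """\
Server stats: requests 195417431 ServerSideRevokes 120140
       nTokens 2146923 nranges 4124507
       designated mnode appointed 55481 mnode thrashing detected 1036
"""

# A label (letters and spaces) followed by whitespace and an integer.
PAIR = re.compile(r"([A-Za-z][A-Za-z ]*?)\s+(\d+)")


def parse_tokenmgr_stats(text):
    """Return {label: int} for every 'label number' pair in the text."""
    counters = {}
    for line in text.splitlines():
        # Drop a leading 'Server stats:' style prefix if one is present.
        line = line.split(":", 1)[-1]
        for label, value in PAIR.findall(line):
            counters[label.strip()] = int(value)
    return counters


def counter_deltas(before, after):
    """Difference per label for labels present in both captures."""
    return {k: after[k] - before[k] for k in after if k in before}


if __name__ == "__main__":
    for label, value in parse_tokenmgr_stats(SAMPLE).items():
        print(f"{label}: {value}")
```

Parsing two captures taken a few minutes apart and passing them to `counter_deltas` would show whether the thrashing figure is still climbing.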


So how do I convert "1036" to a node?


Simon

________________________________
From: gpfsug-discuss-boun...@spectrumscale.org 
<gpfsug-discuss-boun...@spectrumscale.org> on behalf of Simon Thompson 
<s.j.thomp...@bham.ac.uk>
Sent: 20 February 2020 19:45:02
To: gpfsug main discussion list
Subject: [gpfsug-discuss] Unkillable snapshots


Hi,


We have a snapshot which is stuck in the state "DeleteRequired". When we try 
to delete it, the command goes through the motions but eventually gives up with:

Unable to quiesce all nodes; some processes are busy or holding required 
resources.
mmdelsnapshot: Command failed. Examine previous error messages to determine 
cause.


And in the mmfslog on the FS manager there are a bunch of retries and "failure 
to quiesce" messages for various nodes. However, in each retry it's never the 
same set of nodes. I suspect we have one HPC job somewhere killing us.


What's interesting is that we can delete other snapshots OK, so the problem 
appears to be confined to one particular fileset.


My old go-to, "mmfsadm dump tscomm", isn't showing any particular node, and 
the waiters just tend to point to the FS manager node.


So ... any suggestions? I'm assuming it's some workload holding a lock open or 
some such, but tracking it down is proving elusive!


Generally the FS is also "lumpy" ... at times it feels like using a terminal 
over a wifi connection on a train. I guess it's all related though.


Thanks


Simon
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
