On Fri, 22 Jun 2018 13:28:02 -, "Oesterlin, Robert" said:
> [root@nrg1-gpfs01 ~]# mmchmgr dataeng nrg1-gpfs05
> Sending migrate request to current manager node 10.30.43.136 (nrg1-gpfs13).
> Node 10.30.43.136 (nrg1-gpfs13) resigned as manager for dataeng.
> Node 10.30.43.136 (nrg1-gpfs13)
We were adding disks to one of our larger filesystems today. During the
"checking allocation map for storage pool system" phase we had to interrupt the
command, since it was causing slowdowns on our filesystem.
Now commands like mmrepquota, mmdf, etc. are timing out with "tsaddisk
command is running".
Hi Bob,
Also, tracing waiters on the cluster can help you understand whether
something is blocking this kind of operation.
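As a starting point, a minimal sketch of inspecting waiters (node names here are examples from the thread; adjust to your cluster, and note the full path is only needed if /usr/lpp/mmfs/bin is not on the remote PATH):

```shell
# Run as root on the node you suspect is blocked; long-lived waiters
# usually point at the node or operation that is stuck.
mmdiag --waiters

# To sweep several nodes in one pass over ssh:
for node in nrg1-gpfs05 nrg1-gpfs13; do
    echo "=== $node ==="
    ssh "$node" /usr/lpp/mmfs/bin/mmdiag --waiters
done
```

Waiters that persist across repeated samples (rather than churning) are the ones worth chasing.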
Beyond the command output, which is usually too terse to reveal what is
actually happening, do the logs on the nodes in the cluster give you any
further
Two thoughts:
1) Has your config data update fully propagated after the mmchnode? We've
(rarely) seen weird things happen when that process isn't complete yet,
or when a node simply didn't get the update (try md5sum'ing the
mmsdrfs file on nrg1-gpfs13 and comparing it to the cluster
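A sketch of that comparison, assuming the default mmsdrfs location (/var/mmfs/gen/mmsdrfs) and passwordless ssh between nodes; the node names are examples:

```shell
# Compare the config file checksum on the suspect node against a node
# known to be current; mismatched sums mean the update never landed.
for node in nrg1-gpfs01 nrg1-gpfs13; do
    ssh "$node" md5sum /var/mmfs/gen/mmsdrfs
done

# If the sums differ, mmsdrrestore can re-pull the configuration
# onto the stale node from a node that has the current copy.
```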
Yep. And nrg1-gpfs13 isn’t even a manager node anymore!
[root@nrg1-gpfs01 ~]# mmchmgr dataeng nrg1-gpfs05
Sending migrate request to current manager node 10.30.43.136 (nrg1-gpfs13).
Node 10.30.43.136 (nrg1-gpfs13) resigned as manager for dataeng.
Node 10.30.43.136 (nrg1-gpfs13) appointed as
Hi Bob,
Have you tried explicitly moving it to a specific manager node? That's what I
always do ... for some reason I personally never let GPFS pick when I'm moving
the management functions. Thanks...
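For example, with the file system and target node names taken from the thread (verify with mmlsmgr before and after the move):

```shell
# Check which node currently holds the manager role for each file system
mmlsmgr

# Move the manager for dataeng explicitly to a named node rather than
# letting GPFS choose one
mmchmgr dataeng nrg1-gpfs05

# Confirm the move took effect for that file system
mmlsmgr dataeng
```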
Kevin
On Jun 22, 2018, at 8:13 AM, Oesterlin, Robert <robert.oester...@nuance.com> wrote:
Any idea why I can't force the file system manager off this node? I turned off
the manager role on the node (mmchnode --client) and used mmchmgr to move the
other file systems off, but I can't move this one. There are 6 other good
choices for file system managers. I've never seen this message