On 21/11/2020 00:37, Peter van Hooft wrote:
Hello, Is it possible to find out the progress of the 'mmchdisk /dev/fs start -a' command when the controlling terminal had been lost?
I don't think so. You are lucky it is still running.
We can see the task running on the fs manager node with 'mmdiag --commands' with attributes 'hold PIT/disk waitTime 0'. We are starting to worry the mmchdisk is taking too long, and continuously see waiters like

Waiting 3.1946 sec since 01:28:23, ignored, thread 22092 TSCHDISKCmdThread: on ThCond 0x180267573D0 (SGManagementMgrDataCondvar), reason 'waiting for stripe group to recover'

Thanks for any hints.
Not that this is going to help this time, but it is why you should *ALWAYS*, without exception, run these sorts of commands within a screen/tmux session, so that when you lose the connection to the server you can just reconnect and pick it up again.
This is introductory system administration 101. No critical or long-running command should ever be dependent on a remote controlling terminal. If you can't run them locally then run them in a screen or tmux session.
There are plenty of good howtos for both screen and tmux on the internet. Depending on which distribution you use, I would note that Red Hat have very annoyingly, and for completely specious reasons, removed screen from RHEL8 and left tmux. So if you are starting from scratch, tmux is the one to learn :-(
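For what it is worth, a minimal sketch of what that looks like with tmux. The function name run_in_tmux, the session name, the log path and the DRY_RUN switch are all made up for illustration; only the 'tmux new-session -d -s' and 'tmux attach -t' invocations are standard tmux usage. Adapt it to your site rather than taking it as the one right way.

```shell
#!/bin/sh
# Sketch: start a long-running command in a detached tmux session so it does
# not depend on the SSH connection that launched it.
# run_in_tmux, the session name and the log path are illustrative assumptions.

run_in_tmux() {
    session=$1   # tmux session name to create
    log=$2       # file that receives a copy of the command's output
    shift 2
    # The remaining arguments form the command line; tmux hands it to a shell,
    # so the redirection and the pipe to tee are interpreted there.
    cmd="$* 2>&1 | tee -a $log"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print what would be run instead of running it (hypothetical switch).
        printf 'tmux new-session -d -s %s %s\n' "$session" "'$cmd'"
    else
        tmux new-session -d -s "$session" "$cmd"
    fi
}

# Example use (not executed here):
#   run_in_tmux mmchdisk /var/tmp/mmchdisk.log mmchdisk /dev/fs start -a
# After losing the connection, log back in and reattach with:
#   tmux attach -t mmchdisk
```

Copying the output to a log file as well means there is still something to look at even if the session itself goes away, for example after the command finishes and tmux closes the window.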
JAB.

-- 
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
