Re: [gpfsug-discuss] gpfs client cluster, lost quorum, ccr issues

2018-06-27 Thread KG
Can you also check the time differences between nodes? We had a situation recently where the server time mismatch caused failures. On Thu, Jun 28, 2018 at 2:50 AM, Kevin D Johnson wrote: > You can also try to convert to the old primary/secondary model to back it > away from the default CCR

Re: [gpfsug-discuss] gpfs client cluster, lost quorum, ccr issues

2018-06-27 Thread Kevin D Johnson
You can also try to convert to the old primary/secondary model to back it away from the default CCR configuration.   mmchcluster --ccr-disable -p servername   Then, temporarily go with only one quorum node and add more once the cluster comes back up.  Once the cluster is back up and has at least

Re: [gpfsug-discuss] gpfs client cluster, lost quorum, ccr issues

2018-06-27 Thread IBM Spectrum Scale
Hi Renata, You may want to reduce the set of quorum nodes. If your version supports the --force option, you can run mmchnode --noquorum -N --force It is a good idea to configure tiebreaker disks in a cluster that has only 2 quorum nodes. Regards, The Spectrum Scale (GPFS) team
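The advice above comes down to majority arithmetic, which a short sketch can make concrete. This is only an illustration, assuming node quorum is a simple majority of the defined quorum nodes (half of them, rounded down, plus one) and ignoring tiebreaker disks entirely. The function names are hypothetical and are not part of any GPFS tool.

```python
def quorum_needed(quorum_nodes: int) -> int:
    """Simple majority: half of the defined quorum nodes (rounded down) plus one."""
    return quorum_nodes // 2 + 1


def has_quorum(quorum_nodes: int, reachable: int) -> bool:
    """True if enough of the defined quorum nodes are reachable."""
    return reachable >= quorum_needed(quorum_nodes)


# Assumed scenario from this thread: 3 quorum nodes defined, 1 still usable.
print(has_quorum(3, 1))   # False: 2 of 3 are required

# After reducing the quorum set to the single surviving node.
print(has_quorum(1, 1))   # True

# With 2 quorum nodes both must be up, so one failure loses quorum.
print(quorum_needed(2))   # 2
```

Under this assumption, a two-node quorum set cannot survive the loss of either node, which is consistent with the recommendation to configure tiebreaker disks in that case.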

Re: [gpfsug-discuss] gpfs client cluster, lost quorum, ccr issues

2018-06-27 Thread Renata Maria Dart
Hi Simon, yes I ran mmsdrrestore -p and that helped to create the /var/mmfs/ccr directory which was missing. But it didn't create a ccr.nodes file, so I ended up scp'ng that over by hand which I hope was the right thing to do. The one host that is no longer in service is still in that

Re: [gpfsug-discuss] gpfs client cluster, lost quorum, ccr issues

2018-06-27 Thread Renata Maria Dart
Hi, any gpfs commands fail with: [root@ocio-gpu01 ~]# mmlsmgr get file failed: Not enough CCR quorum nodes available (err 809) gpfsClusterInit: Unexpected error from ccr fget mmsdrfs. Return code: 158 mmlsmgr: Command failed. Examine previous error messages to determine cause. The two "working"

Re: [gpfsug-discuss] gpfs client cluster, lost quorum, ccr issues

2018-06-27 Thread Iban Cabrillo
Hi, have you checked whether any manager node is available?
# mmlsmgr
If not, could you try to assign a new cluster or file system manager:
mmchmgr gpfs_fs manager_node
mmchmgr -c cluster_manager_node
Cheers.

Re: [gpfsug-discuss] gpfs client cluster, lost quorum, ccr issues

2018-06-27 Thread Simon Thompson
Have you tried running mmsdrrestore on the reinstalled node to re-add it to the cluster, and then tried starting up gpfs on it? https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1pdg_mmsdrrest.htm Simon From:

[gpfsug-discuss] gpfs client cluster, lost quorum, ccr issues

2018-06-27 Thread Renata Maria Dart
Hi, we have a client cluster of 4 nodes with 3 quorum nodes. One of the quorum nodes is no longer in service and the other was reinstalled with a newer OS, both without informing the gpfs admins. Gpfs is still "working" on the two remaining nodes, that is, they continue to have access to the

Re: [gpfsug-discuss] Snapshot handling in mixed Windows/MacOS environments

2018-06-27 Thread Christof Schmitt
Hi, we currently support the SMB protocol method of querying snapshots, which is used by the Windows "Previous versions" dialog. Mac clients unfortunately do not implement these explicit queries. Browsing the snapshot directories with the @GMT names through SMB is currently not supported. Could

Re: [gpfsug-discuss] PM_MONITOR refresh task failed

2018-06-27 Thread Sobey, Richard A
Hi Andreas, Output of the debug log – no clue, but maybe you can interpret it better  [root@icgpfsq1 ~]# /usr/lpp/mmfs/gui/cli/runtask pm_monitor --debug debug: locale=en_US debug: Raising event: gui_pmcollector_connection_ok, for node: localhost.localdomain err:

Re: [gpfsug-discuss] PM_MONITOR refresh task failed

2018-06-27 Thread Andreas Koeninger
Hi Richard, if you double-click the event there should be some additional help available. The steps under "User Action" will hopefully help to identify the root cause: 1.) Check if there is additional information available by executing '/usr/lpp/mmfs/gui/cli/lstasklog [taskname]'. 2.) Run the

Re: [gpfsug-discuss] PM_MONITOR refresh task failed

2018-06-27 Thread Sobey, Richard A
Hi Renar, No, it all runs over the same network. Thanks, Richard From: gpfsug-discuss-boun...@spectrumscale.org [mailto:gpfsug-discuss-boun...@spectrumscale.org] On Behalf Of Grunenberg, Renar Sent: 27 June 2018 12:29 To: 'gpfsug main discussion list' Subject: Re: [gpfsug-discuss] PM_MONITOR

[gpfsug-discuss] Snapshot handling in mixed Windows/MacOS environments

2018-06-27 Thread Altenburger Ingo (ID SD)
Hi all, our (Windows) users are familiar with the 'previous versions' self-recovery feature. We honor this by creating regular snapshots with the default @GMT prefix (non-@-heading prefixes are not visible in 'previous versions'). Unfortunately, MacOS clients having the same share mounted via

Re: [gpfsug-discuss] PM_MONITOR refresh task failed

2018-06-27 Thread Grunenberg, Renar
Hello Richard, do you have a private admin-interface LAN in your cluster? If yes, then the logic for querying the collector node and the corresponding CCR value are wrong. Can you run 'mmperfmon query cpu'? If not, then you hit a problem that I had yesterday. Renar Grunenberg Abteilung Informatik – Betrieb

[gpfsug-discuss] PM_MONITOR refresh task failed

2018-06-27 Thread Sobey, Richard A
Hi all, I'm getting the following error in the GUI, running 5.0.1: "The following GUI refresh task(s) failed: PM_MONITOR". As yet, this is the only node I've upgraded to 5.0.1 - the rest are running (healthily, according to the GUI) 4.2.3.7. I'm not sure if this version mismatch is relevant