Hi Aaron,

in my setup I have no way to define a tiebreaker disk. So if one node goes down, I would change the role of this node:
mmchnode --nonquorum -N nodename --force

After that I can start the filesystem and mount it.

Thanks,
Matthias

Best Regards

Matthias Knigge
R&D File Based Media Solutions

Rohde & Schwarz
GmbH & Co. KG
Hanomaghof 1
30449 Hannover
Telefon +49 511 67 80 7 213
Fax +49 511 37 19 74
Internet: [email protected]
------------------------------------------------------------
Geschäftsführung / Executive Board: Christian Leicher (Vorsitzender / Chairman), Peter Riedel,
Sitz der Gesellschaft / Company's Place of Business: München,
Registereintrag / Commercial Register No.: HRA 16 270,
Persönlich haftender Gesellschafter / Personally Liable Partner: RUSEG Verwaltungs-GmbH,
Sitz der Gesellschaft / Company's Place of Business: München,
Registereintrag / Commercial Register No.: HRB 7 534,
Umsatzsteuer-Identifikationsnummer (USt-IdNr.) / VAT Identification No.: DE 130 256 683,
Elektro-Altgeräte Register (EAR) / WEEE Register No.: DE 240 437 86

-----Original Message-----
From: [email protected] <[email protected]> On Behalf Of Aaron Knister
Sent: Friday, September 07, 2018 3:35 PM
To: [email protected]
Subject: *EXT* [Newsletter] Re: [gpfsug-discuss] Problem with mmlscluster and callback scripts

Hi Matthias,

Looks like you lost quorum in the cluster (you've got to have (n/2+1) quorum nodes up if you're using node-based quorum). Do you have a tiebreaker disk defined? (i.e. mmlsconfig tiebreakerdisk).

-Aaron

On 9/7/18 7:51 AM, Matthias Knigge wrote:
> Hello together,
>
> I am using the version 5.0.2.0 of GPFS and have problems with the
> command mmlscluster and callback-scripts. It is a small cluster of two
> nodes only.
> If I shut down one of the nodes, mmlscluster sometimes
> reports the following output:
>
> [root@gpfs-tier1 gpfs5.2]# mmgetstate
>
>  Node number  Node name   GPFS state
> -------------------------------------------
>        1      gpfs-tier1  arbitrating
>
> [root@gpfs-tier1 gpfs5.2]# mmlscluster
> ssh: connect to host gpfs-tier2 port 22: No route to host
> mmlscluster: Unable to retrieve GPFS cluster files from node gpfs-tier2
> mmlscluster: Command failed. Examine previous error messages to determine cause.
>
> Normally the output is like this:
>
> [root@gpfs-tier1 gpfs5.2]# mmlscluster
>
> GPFS cluster information
> ========================
>   GPFS cluster name:         TIERCLUSTER.gpfs-tier1
>   GPFS cluster id:           12458173498278694815
>   GPFS UID domain:           TIERCLUSTER.gpfs-tier1
>   Remote shell command:      /usr/bin/ssh
>   Remote file copy command:  /usr/bin/scp
>   Repository type:           server-based
>
> GPFS cluster configuration servers:
> -----------------------------------
>   Primary server:    gpfs-tier2
>   Secondary server:  gpfs-tier1
>
>  Node  Daemon node name  IP address      Admin node name  Designation
> ----------------------------------------------------------------------
>    1   gpfs-tier1        192.168.178.10  gpfs-tier1       quorum-manager
>    2   gpfs-tier2        192.168.178.11  gpfs-tier2       quorum-manager
>
> [root@gpfs-tier1 gpfs5.2]# mmlscallback
> NodeDownCallback
>         command       = /var/mmfs/rs/nodedown.ksh
>         priority      = 1
>         event         = quorumNodeLeave
>         parms         = %eventNode %quorumNodes
>
> NodeUpCallback
>         command       = /var/mmfs/rs/nodeup.ksh
>         priority      = 1
>         event         = quorumNodeJoin
>         parms         = %eventNode %quorumNodes
>
> If I shut down the filesystem via mmshutdown the callback script works,
> but if I shut down the whole node the scripts do not run.
>
> The latest log entry in mmfs.log.latest shows only this information:
>
> 2018-09-07_13:12:36.724+0200: [I] Cluster Manager connection broke.
> Probing cluster TIERCLUSTER.gpfs-tier1
> 2018-09-07_13:12:37.226+0200: [E] Unable to contact enough other quorum nodes during cluster probe.
> 2018-09-07_13:12:37.226+0200: [E] Lost membership in cluster TIERCLUSTER.gpfs-tier1. Unmounting file systems.
> 2018-09-07_13:12:38.448+0200: [N] Connecting to 192.168.178.11 gpfs-tier2 <c0p1>
>
> Could anybody help me in this case? I want to start a script when
> one node goes down or comes up, to change the roles so that the
> filesystem can be started. The callback events NodeLeave and NodeJoin
> do not run either.
>
> Any more information required? If yes, please let me know!
>
> Many thanks in advance and a nice weekend!
>
> Matthias
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss

--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776
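
A small worked example of the (n/2+1) rule mentioned in the reply may help show why a two-node cluster loses quorum as soon as one node goes down. This is only a sketch: it assumes the division is integer division, and the function names quorum_needed and tolerated_failures are made up for illustration; they are not GPFS commands.

```shell
#!/bin/sh
# Sketch of the node-based quorum arithmetic, assuming the (n/2+1) rule
# from the thread is evaluated with integer division.
# quorum_needed and tolerated_failures are hypothetical helper names.

# Minimum number of quorum nodes that must be up for n quorum nodes.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# Number of quorum nodes that can be down before quorum is lost.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 2 3 5; do
    echo "$n quorum nodes: need $(quorum_needed "$n") up, can lose $(tolerated_failures "$n")"
done
```

Under that assumption, two quorum nodes require both to be up, which matches the "arbitrating" state seen on gpfs-tier1 after gpfs-tier2 was shut down; with three quorum nodes, one could fail.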
