Hi, Tina. I hope the issues have been resolved. You said, "I have a number of nodes which now have stale file handles." In my experience, that happens fairly often when an NSD node goes down, even if another NSD node serves the needed data and metadata. I was almost always able to fix the stale file handle problem by running the mmmount command again on the clients that were reporting stale file handle errors. I might have had to run mmumount first, but probably not.
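Roughly, the remount I have in mind looks like the sketch below. The file system name (gpfs01) and the client names are made up, so substitute your own. I am also assuming you run it from a node that is allowed to issue commands to the clients with -N; running mmmount locally on each affected client should work just as well. By default the script only prints the commands, and you have to set DRY_RUN=0 for it to actually run them.

```shell
#!/bin/sh
# Hypothetical names: replace with your own file system device and client list.
FS="${FS:-gpfs01}"
NODES="${NODES:-client01,client02}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Remount the file system on the affected clients.
# Pass "unmount-first" only if a plain mmmount does not clear the stale handles.
remount() {
    if [ "${1:-}" = "unmount-first" ]; then
        run mmumount "$FS" -N "$NODES"
    fi
    run mmmount "$FS" -N "$NODES"
}

remount "$@"
```

I would try it without the unmount-first argument first, since in my experience mmmount alone is usually enough.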
Regarding adding a quorum node, I'm not sure it will have much benefit, because you will still have a single point of failure: you only have one working NSD node. If you can get the failed NSD node back up quickly, you might not need to add another quorum node.

Wally

------- Original Message -------
On Thursday, September 7th, 2023 at 9:39 AM, [email protected] wrote:

> Today's Topics:
>
> 1. Question regarding quorum nodes (Tina Friedrich)
> 2. Re: Question regarding quorum nodes (Danny Lang)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 7 Sep 2023 14:28:55 +0100
> From: Tina Friedrich
> Subject: [gpfsug-discuss] Question regarding quorum nodes
>
> Hello All,
>
> I hope someone can answer this quickly!
>
> We have - it seems - just lost one of our NSDs. The other took over as it
> should - but unfortunately, the protocol nodes (i.e. I have a number of
> nodes which now have stale file handles. (The file system is accessible
> on the remaining NSD server and a number of other clients.)
>
> Unfortunately, in the 'home' cluster - the one that owns the disks - I
> only have five servers; three of which are quorum nodes, and of course
> the failed NSD server is one of them.
>
> The question is - can I add quorum nodes with one node down, and can I
> remove quorum functionality from a failed node?
>
> Tina
>
> ------------------------------
>
> Message: 2
> Date: Thu, 7 Sep 2023 13:39:00 +0000
> From: Danny Lang
> Subject: Re: [gpfsug-discuss] Question regarding quorum nodes
>
> Hi Tina,
>
> The command you're looking for is:
>
> mmchnode
>
> https://www.ibm.com/docs/en/storage-scale/4.2.0?topic=commands-mmchnode-command
>
> This will allow you to add and remove quorum nodes.
>
> ------
>
> I would advise checking everything prior to running commands.
>
> Thanks
> Danny
>
> ------------------------------
>
> End of gpfsug-discuss Digest, Vol 138, Issue 10
> ***********************************************

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
