It does sound like "mmvdisk rg change --restart" is the "varyon" command you're looking for, but it's not clear why it's failing. I would start by checking whether there are any lower-level issues with your cluster. Are your nodes healthy at the GPFS level? Does "mmnetverify -N all" say the network is OK? Does "mmhealth node show -N all" report any issues? Have you checked mmfs.log.latest?
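Roughly the checks I'd run first, adjusting to your environment (mmgetstate added as the basic daemon-state check; the log path assumes the default /var/adm/ras location):

  # GPFS daemon state on all nodes
  mmgetstate -a

  # basic node-to-node network verification
  mmnetverify -N all

  # health monitor view of all nodes
  mmhealth node show -N all

  # recent entries in the daemon log on the recovery group servers
  tail -n 200 /var/adm/ras/mmfs.log.latest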
On Thu, Aug 24, 2023 at 1:41 PM Walter Sklenka <[email protected]> wrote:
>
> From: Walter Sklenka
> Sent: Thursday, 24 August 2023 12:02
> To: '[email protected]' <[email protected]>
> Subject: FW: ESS 3500-C5 : rg has resigned permanently
>
> Hi!
>
> Does anyone perhaps have experience with ESS 3500 (no hybrid config, only NL-SAS with 5 enclosures)?
>
> We have issues with a shared recovery group. After creating it we ran a test with only one node active (maybe not an optimal idea), but since then the recovery group has been down.
>
> We have created a PMR but have not received any response so far.
>
> The RG has no vdisks of any filesystem:
>
> [gpfsadmin@hgess02-m ~]$ sudo mmvdisk rg change --rg ess3500_hgess02_n1_hs_hgess02_n2_hs --restart
> mmvdisk:
> mmvdisk:
> mmvdisk: Unable to reset server list for recovery group 'ess3500_hgess02_n1_hs_hgess02_n2_hs'.
> mmvdisk: Command failed. Examine previous error messages to determine cause.
>
> We also tried:
>
> 2023-08-21_16:57:26.174+0200: [I] Command: tsrecgroupserver ess3500_hgess02_n1_hs_hgess02_n2_hs -f -l root hgess02-n2-hs.invalid
> 2023-08-21_16:57:26.201+0200: [I] Recovery group ess3500_hgess02_n1_hs_hgess02_n2_hs has resigned permanently
> 2023-08-21_16:57:26.201+0200: [E] Command: err 2: tsrecgroupserver ess3500_hgess02_n1_hs_hgess02_n2_hs -f -l root hgess02-n2-hs.invalid
> 2023-08-21_16:57:26.201+0200: Specified entity, such as a disk or file system, does not exist.
> 2023-08-21_16:57:26.207+0200: [I] Command: tsrecgroupserver ess3500_hgess02_n1_hs_hgess02_n2_hs -f -l LG001 hgess02-n2-hs.invalid
> 2023-08-21_16:57:26.207+0200: [E] Command: err 212: tsrecgroupserver ess3500_hgess02_n1_hs_hgess02_n2_hs -f -l LG001 hgess02-n2-hs.invalid
> 2023-08-21_16:57:26.207+0200: The current file system manager failed and no new manager will be appointed. This may cause nodes mounting the file system to experience mount failures.
> 2023-08-21_16:57:26.213+0200: [I] Command: tsrecgroupserver ess3500_hgess02_n1_hs_hgess02_n2_hs -f -l LG002 hgess02-n2-hs.invalid
> 2023-08-21_16:57:26.213+0200: [E] Command: err 212: tsrecgroupserver ess3500_hgess02_n1_hs_hgess02_n2_hs -f -l LG002 hgess02-n2-hs.invalid
> 2023-08-21_16:57:26.213+0200: The current file system manager failed and no new manager will be appointed. This may cause nodes mounting the file system to experience mount failures.
>
> For us it is crucial to know what we can do if this happens again (the RG has no vdisks yet, so it is not critical).
>
> Do you know: is there an undocumented way to "vary on", or reactivate, a recovery group?
>
> The doc
> https://www.ibm.com/docs/en/ess/6.1.6_lts?topic=rgi-recovery-group-issues-shared-recovery-groups-in-ess
> says to mmshutdown and mmstartup, but the RGCM says nothing.
>
> When trying to execute any vdisk command it only says "rg down"; we have no idea how we could recover from that without deleting the RG (I hope this never happens once we have vdisks on it).
>
> Have a nice day
> Walter
>
> Kind regards
> Walter Sklenka
> Technical Consultant
>
> EDV-Design Informationstechnologie GmbH
> Giefinggasse 6/1/2, A-1210 Wien
> Tel: +43 1 29 22 165-31
> Fax: +43 1 29 22 165-90
> E-Mail: [email protected]
> Internet: www.edv-design.at
>
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
