[gpfsug-discuss] FW: ESS 3500-C5 : rg has resigned permanently

Walter Sklenka Thu, 24 Aug 2023 04:43:03 -0700


Kind regards
Walter Sklenka
Technical Consultant


EDV-Design Informationstechnologie GmbH
Giefinggasse 6/1/2, A-1210 Wien
Tel: +43 1 29 22 165-31
Fax: +43 1 29 22 165-90
E-Mail: [email protected]
Internet: www.edv-design.at

From: Walter Sklenka
Sent: Thursday, 24 August 2023 12:02
To: '[email protected]' <[email protected]>
Subject: FW: ESS 3500-C5 : rg has resigned permanently

Hi!
Does anyone happen to have experience with the ESS 3500 (non-hybrid 
configuration, NL-SAS only, with 5 enclosures)?

We have issues with a shared recovery group. After creating it, we ran a test 
in which only one node was set active (maybe not an optimal idea).
Since then the recovery group has been down.
We have opened a PMR but have not received any response so far.

The RG does not contain vdisks of any file system.
[gpfsadmin@hgess02-m ~]$ ^C
[gpfsadmin@hgess02-m ~]$ sudo mmvdisk rg change --rg ess3500_hgess02_n1_hs_hgess02_n2_hs --restart
mmvdisk:
mmvdisk:
mmvdisk: Unable to reset server list for recovery group 'ess3500_hgess02_n1_hs_hgess02_n2_hs'.
mmvdisk: Command failed. Examine previous error messages to determine cause.

We also tried
2023-08-21_16:57:26.174+0200: [I] Command: tsrecgroupserver ess3500_hgess02_n1_hs_hgess02_n2_hs -f -l root hgess02-n2-hs.invalid
2023-08-21_16:57:26.201+0200: [I] Recovery group ess3500_hgess02_n1_hs_hgess02_n2_hs has resigned permanently
2023-08-21_16:57:26.201+0200: [E] Command: err 2: tsrecgroupserver ess3500_hgess02_n1_hs_hgess02_n2_hs -f -l root hgess02-n2-hs.invalid
2023-08-21_16:57:26.201+0200: Specified entity, such as a disk or file system, does not exist.
2023-08-21_16:57:26.207+0200: [I] Command: tsrecgroupserver ess3500_hgess02_n1_hs_hgess02_n2_hs -f -l LG001 hgess02-n2-hs.invalid.
2023-08-21_16:57:26.207+0200: [E] Command: err 212: tsrecgroupserver ess3500_hgess02_n1_hs_hgess02_n2_hs -f -l LG001 hgess02-n2-hs.invalid
2023-08-21_16:57:26.207+0200: The current file system manager failed and no new manager will be appointed. This may cause nodes mounting the file system to experience mount failures.
2023-08-21_16:57:26.213+0200: [I] Command: tsrecgroupserver ess3500_hgess02_n1_hs_hgess02_n2_hs -f -l LG002 hgess02-n2-hs.invalid
2023-08-21_16:57:26.213+0200: [E] Command: err 212: tsrecgroupserver ess3500_hgess02_n1_hs_hgess02_n2_hs -f -l LG002 hgess02-n2-hs.invalid
2023-08-21_16:57:26.213+0200: The current file system manager failed and no new manager will be appointed. This may cause nodes mounting the file system to experience mount failures.
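
For triaging a log excerpt like the one above, a small script can collect the failing tsrecgroupserver calls per log group and flag any recovery group reported as permanently resigned. The following is only a minimal Python sketch: the regular expressions are inferred solely from the lines quoted in this message (they are an assumption, not a documented GPFS log format), the function name summarize is hypothetical, and each log entry is assumed to be on a single line, so mail-wrapped lines would need to be rejoined first.

```python
import re
from collections import Counter

# Patterns inferred only from the log lines quoted in this message.
# They are an assumption, not a documented log format.
ERR_RE = re.compile(
    r"^(?P<ts>\S+): \[E\] Command: err (?P<code>\d+): "
    r"tsrecgroupserver (?P<rg>\S+) -f -l (?P<lg>\S+) "
)
RESIGNED_RE = re.compile(
    r"^(?P<ts>\S+): \[I\] Recovery group (?P<rg>\S+) has resigned permanently"
)


def summarize(lines):
    """Count (log group, error code) pairs and collect resigned recovery groups."""
    errors = Counter()
    resigned = set()
    for line in lines:
        m = ERR_RE.match(line)
        if m:
            errors[(m["lg"], int(m["code"]))] += 1
            continue
        m = RESIGNED_RE.match(line)
        if m:
            resigned.add(m["rg"])
    return errors, resigned


RG = "ess3500_hgess02_n1_hs_hgess02_n2_hs"
NODE = "hgess02-n2-hs.invalid"
# Sample entries copied from the excerpt above, one entry per line.
SAMPLE = [
    f"2023-08-21_16:57:26.201+0200: [I] Recovery group {RG} has resigned permanently",
    f"2023-08-21_16:57:26.201+0200: [E] Command: err 2: tsrecgroupserver {RG} -f -l root {NODE}",
    f"2023-08-21_16:57:26.207+0200: [E] Command: err 212: tsrecgroupserver {RG} -f -l LG001 {NODE}",
    f"2023-08-21_16:57:26.213+0200: [E] Command: err 212: tsrecgroupserver {RG} -f -l LG002 {NODE}",
]

if __name__ == "__main__":
    errors, resigned = summarize(SAMPLE)
    for (log_group, code), count in sorted(errors.items()):
        print(f"log group {log_group}: err {code} seen {count} time(s)")
    for rg in sorted(resigned):
        print(f"recovery group {rg}: resigned permanently")
```

This only summarizes what the log already says; it does not interpret the error codes or change the state of the recovery group.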


For us it is crucial to know what we can do if this happens again (the RG has 
no vdisks yet, so it is not critical right now).

Do you know whether there is an undocumented way to "vary on" or otherwise 
reactivate a recovery group?
The documentation at
https://www.ibm.com/docs/en/ess/6.1.6_lts?topic=rgi-recovery-group-issues-shared-recovery-groups-in-ess
says to run mmshutdown and mmstartup, but the RGCM says nothing.
Any vdisk command we try only reports "rg down", and we have no idea how we 
could recover from that without deleting the RG (I hope this never happens 
once we have vdisks on it).



Have a nice day
Walter





_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
