OK – I’ve run across this before, and it’s because of a bug (as I recall) 
having to do with CCR and quorum. What I think you can do is set the cluster to 
non-CCR (mmchcluster --ccr-disable) with all the nodes down, bring it back up, 
and then re-enable CCR.
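
A minimal sketch of that sequence, assuming mmshutdown -a and mmstartup -a are used to stop and start GPFS cluster-wide and that mmchcluster --ccr-enable is the matching re-enable step. The run and ccr_reset helpers are hypothetical names for illustration, and the script only prints the commands unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Sketch only: by default this prints the commands instead of running them.
# Set DRY_RUN=0 to execute them on a node with the GPFS commands in PATH.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: the disable/re-enable sequence, stopping at the first failure.
ccr_reset() {
    run mmshutdown -a &&              # make sure GPFS is down on all nodes
    run mmchcluster --ccr-disable &&  # switch the cluster to non-CCR
    run mmstartup -a &&               # bring the cluster back up
    run mmchcluster --ccr-enable      # re-enable CCR
}

ccr_reset
```

Depending on the release, mmchcluster --ccr-disable may need additional options, so it is worth checking the mmchcluster documentation before running this with DRY_RUN=0.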

I’ll see if I can find this in one of the recent 4.2 release notes.


Bob Oesterlin
Sr Principal Storage Engineer, Nuance


From: <[email protected]> on behalf of "Buterbaugh, 
Kevin L" <[email protected]>
Reply-To: gpfsug main discussion list <[email protected]>
Date: Tuesday, September 19, 2017 at 4:03 PM
To: gpfsug main discussion list <[email protected]>
Subject: [EXTERNAL] [gpfsug-discuss] CCR cluster down for the count?

Hi All,

We have a small test cluster that is CCR enabled.  It only had/has 3 NSD 
servers (testnsd1, 2, and 3) and maybe 3-6 clients.  testnsd3 died a while 
back.  I did nothing about it at the time because it was due to be life-cycled 
as soon as I finished a couple of higher priority projects.

Yesterday, testnsd1 also died, which took the whole cluster down.  So now 
resolving this has become higher priority… ;-)

I took two other boxes and set them up as testnsd1 and 3, respectively.  I’ve 
done a “mmsdrrestore -p testnsd2 -R /usr/bin/scp” on both of them.  I’ve also 
done a “mmccr setup -F” and copied the ccr.disks and ccr.nodes files from 
testnsd2 to them.  And I’ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to 
testnsd1 and 3.  In case it’s not obvious from the above, networking is fine … 
ssh without a password between those 3 boxes is fine.

However, when I try to start up GPFS … or run any GPFS command, I get:

/root
root@testnsd2# mmstartup -a
get file failed: Not enough CCR quorum nodes available (err 809)
gpfsClusterInit: Unexpected error from ccr fget mmsdrfs.  Return code: 158
mmstartup: Command failed. Examine previous error messages to determine cause.
/root
root@testnsd2#

I’ve got to run to a meeting right now, so I hope I’m not leaving out any 
crucial details here … does anyone have an idea what I need to do?  Thanks…

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
[email protected] - (615)875-9633



_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
