Bob (or anyone?), have you tried mmsdrrestore to see if it's working in 4.1.1?

# mmsdrrestore -p PRIMARY -R /usr/bin/scp
Fri 3 Jul 11:56:05 BST 2015: mmsdrrestore: Processing node PRIMARY
ccrio initialization failed (err 811)
mmsdrrestore: Unable to retrieve GPFS cluster files from CCR.
mmsdrrestore: Unexpected error from updateMmfsEnvironment. Return code: 1
mmsdrrestore: Command failed. Examine previous error messages to determine cause.

It seems to copy the mmsdrfs file to the local node, into /var/mmfs/gen/mmsdrfs, but then fails to actually work.

Simon

From: Bob Oesterlin <[email protected]>
Reply-To: gpfsug main discussion list <[email protected]>
Date: Thursday, 2 July 2015 20:03
To: gpfsug main discussion list <[email protected]>
Subject: Re: [gpfsug-discuss] 4.1.1 protocol support

On Thu, Jul 2, 2015 at 1:52 PM, Simon Thompson (Research Computing - IT Services) <[email protected]> wrote:

> I do note that it needs CCR enabled, which we currently don't have. Now I
> think this was because we saw issues with mmsdrrestore when adding a node
> that had been reinstalled back into the cluster. I need to check if that is
> still the case (we work on being able to pull clients, NSDs etc. from the
> cluster, using xcat to reprovision and then a config tool to do the relevant
> bits to rejoin the cluster … makes it easier for us to stage kernel, GPFS
> and OFED updates, as we just blat on a new image).

Yes, and this is why we couldn't use CCR - our compute nodes are netboot, so they go through an mmsdrrestore every time they reboot. Now, they have fixed this in 4.1.1, which means if you can get (the cluster) to 4.1.1 and turn on CCR, mmsdrrestore should work.

Note to self: Test this out in your sandbox cluster. :-)

Bob Oesterlin
