Yeah, that did it; it was set to the default value of "no". What exactly does "no" mean as opposed to "yes"? The docs at https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adm_tuningguide.htm
aren't very forthcoming on this. (Note: it looks like in multi-cluster environments we also have to set this in the client clusters.)

Simon

From: "robert.oester...@nuance.com" <robert.oester...@nuance.com>
Date: Friday, 13 April 2018 at 21:17
To: "gpfsug-discuss@spectrumscale.org" <gpfsug-discuss@spectrumscale.org>
Cc: "Simon Thompson (IT Research Support)" <s.j.thomp...@bham.ac.uk>
Subject: Re: [gpfsug-discuss] Replicated and non replicated data

Add:

unmountOnDiskFail=meta

to your config. You can add it with "-I" to have it take effect w/o reboot.

Bob Oesterlin
Sr Principal Storage Engineer, Nuance

From: <gpfsug-discuss-boun...@spectrumscale.org> on behalf of "Simon Thompson (IT Research Support)" <s.j.thomp...@bham.ac.uk>
Reply-To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date: Friday, April 13, 2018 at 3:06 PM
To: "gpfsug-discuss@spectrumscale.org" <gpfsug-discuss@spectrumscale.org>
Subject: [EXTERNAL] [gpfsug-discuss] Replicated and non replicated data

I have a question about file systems with replicated and non-replicated data.

We have a file system where metadata is set to copies=2 and data copies=2; we then use a placement policy to selectively replicate some data only once, based on fileset. We also place the non-replicated data into a specific pool (6tnlsas) to ensure we know where it is placed.

My understanding was that, in doing this, if we took the disks with the non-replicated data offline, we'd still have the FS available for users, as the metadata is replicated. Sure, accessing a non-replicated data file would give an IO error, but the rest of the FS should be up.

We had a situation today where we wanted to take stg01 offline, so we tried using "mmchdisk stop -d ...". Once we got to about disk stg01-01_12_12, GPFS would refuse to stop any more disks and complain about too many disks; similarly, if we shut down the NSD servers hosting the disks, the file system would get an SGPanic and be force-unmounted.
First, am I correct in thinking that a FS with non-replicated data but replicated metadata should still be accessible (apart from the non-replicated data) when the LUNs hosting it are down? If so, any suggestions why my FS is panicking when we take down the one set of disks?

I thought at first we had some non-replicated metadata, so I tried "mmrestripefs -R --metadata-only" to force it to ensure 2 replicas, but this didn't help. We're running 5.0.0.2 on the NSD server nodes. (The first time we went round this, we didn't have a FS descriptor disk, but you can see below that we added this.)

Thanks

Simon
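For readers following along, a minimal sketch of the change Bob suggests (it can only run on a live Spectrum Scale cluster with admin rights; the exact output is cluster-specific). Per the thread, the default unmountOnDiskFail=no lets a large enough disk failure panic and force-unmount the file system, while =meta keeps it mounted as long as metadata remains available, returning I/O errors only for files whose non-replicated data is on the downed disks:

```shell
# Sketch only -- requires a live Spectrum Scale cluster.
# Bob's "-I" applies the change immediately but does NOT persist it across
# a GPFS restart; "-i" applies immediately AND persists it.
mmchconfig unmountOnDiskFail=meta -i

# Confirm the current value:
mmlsconfig unmountOnDiskFail
```

As Simon notes at the top of the thread, in a multi-cluster setup the setting also needs to be applied on the remote (client) clusters that mount the file system.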
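The selective-replication setup Simon describes (two metadata copies, a single data copy for some filesets, steered into the 6tnlsas pool) would typically be expressed with placement rules along these lines. This is a hedged sketch: the fileset name 'onecopy', the fallback pool 'sas1', and the device name 'gpfs01' are hypothetical; only '6tnlsas' and the mmrestripefs invocation come from the thread:

```shell
# Sketch of a GPFS ILM placement policy; names other than '6tnlsas'
# are assumptions, not taken from the thread.
cat > policy.txt <<'EOF'
/* Files in the hypothetical 'onecopy' fileset: one data copy, placed in 6tnlsas */
RULE 'nonreplicated' SET POOL '6tnlsas' REPLICATE (1) FOR FILESET ('onecopy')
/* Everything else: file system default (two copies) in a hypothetical pool */
RULE 'default' SET POOL 'sas1'
EOF

# Install the policy, check default replica counts, and re-replicate
# metadata as tried in the thread (assumed device name gpfs01):
mmchpolicy gpfs01 policy.txt
mmlsfs gpfs01 -m -r                       # default metadata/data replicas
mmrestripefs gpfs01 -R --metadata-only    # restore metadata replication
```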
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss