On 11/09/2020 15:25, Stephen Ulmer wrote:

On Sep 9, 2020, at 10:04 AM, Skylar Thompson wrote:

On Wed, Sep 09, 2020 at 12:02:53PM +0100, Jonathan Buzzard wrote:
On 08/09/2020 18:37, IBM Spectrum Scale wrote:
I think it is incorrect to assume that a command that continues
after detecting the working directory has been removed is going to
cause damage to the file system.

No I am not assuming it will cause damage. I am making the fairly reasonable
assumption that any command which fails has an increased probability of
causing damage to the file system over one that completes successfully.

I think there is another angle here, which is that this command's output
has the possibility of triggering an "oh ----" (fill in your preferred
colorful metaphor here) moment, followed up by a panicked Ctrl-C. That
reaction has the possibility of causing its own problems (e.g. I am not sure
whether mmafmctl touches CCR, but aborting it midway could leave CCR inconsistent).
I'm with Jonathan here: the command should fail with an informative
message, and the admin can correct the problem (just cd somewhere else).
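
A minimal sketch of that fail-fast behaviour, in Python purely for illustration (nothing here is taken from the Scale commands; the function names are hypothetical, and it assumes a Linux-like system where getcwd fails with ENOENT once the working directory has been removed):

```python
import os
import sys


def working_directory_exists() -> bool:
    """Return True if the process's current working directory still exists."""
    try:
        os.getcwd()
    except FileNotFoundError:
        # Assumption: the platform reports a removed cwd as ENOENT,
        # which Python surfaces as FileNotFoundError.
        return False
    return True


def require_working_directory() -> None:
    """Exit with an informative message before any real work starts."""
    if not working_directory_exists():
        sys.exit(
            "error: the current working directory no longer exists; "
            "cd to an existing directory and rerun the command"
        )
```

Calling require_working_directory() before anything critical starts means the admin sees one clear error up front, rather than alarming output partway through a long operation.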


I’m now (genuinely) curious as to which Spectrum Scale commands *actually* depend on the working directory existing, and why. They shouldn’t depend on anything but existing well-known directories (logs, SDR, /tmp, et cetera) and any files or directories passed as arguments to the command. This is the Unix way.

It seems like the *right* solution is to armor commands against doing something “bad” if they lose a resource required to complete their task. If $PWD goes away because an admin’s home goes away in the middle of a long restripe, it’s better to complete the work and let them look in the logs. It's not Scale’s problem if something not affecting its work happens.

Maybe I’ve got a blind spot here...

This jogged my memory: best practice would be to call chdir to set the working directory to "/" very early on, before anything critical is started.

I am 99.999% sure that it's covered in Stevens (I can't check as I am away for the weekend), so really there is no excuse. If / goes away then really, really bad things have happened and it all becomes moot anyway.
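
A minimal sketch of the chdir idea, in Python only for illustration (the function name is hypothetical and this is not how any Scale command is known to be written):

```python
import os


def enter_safe_working_directory(path: str = "/") -> str:
    """Switch to a directory that is expected to outlive the command.

    Intended to be called very early, before any critical work begins,
    so that losing the caller's original directory cannot affect the run.
    """
    os.chdir(path)
    # Return the directory actually in effect so the caller can log it.
    return os.getcwd()
```

One caveat with this approach: any relative paths given as arguments would need to be resolved to absolute paths before the chdir, otherwise they would be reinterpreted relative to "/".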


JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss