Re: [gpfsug-discuss] Question about Policies - using mmapplypolicy/EXTERNAL LIST/mmxargs

2019-12-30 Thread Jonathan Buzzard
On 29/12/2019 14:24, Marc A Kaplan wrote:
> Correct, you may need to use similar parsing/quoting techniques in your 
> renaming scripts.
> Just remember, in Unix/Posix/Linux the only 2 special characters/codes 
> in path names are '/' and \0. The former delimits directories and the 
> latter marks the end of the string.
> And technically the latter isn't ever in a path name, it's only used by 
> system APIs to mark the end of a string that is the pathname argument.

I am not sure even that is entirely true. Certainly MacOS X in the past 
would allow '/' in file names. You find this out when a MacOS user tries 
to migrate their files to an SMB-based file server and the process trips 
up because they have named a whole bunch of files in the format

 "My Results 30/12/2019.txt"

At this juncture I note that MacOS is certified Unix :-)

I think it is more a file system limitation than anything else. I wonder 
what happens when you mount an HFS+ file system containing files named 
like that on Linux... I would at this point note that the vast majority 
of "wacky" file names originate from MacOS (both Classic and X) users.

Also, while you are otherwise technically correct about what is allowed 
in a file name, just try creating a file name with a newline character in 
it using either a GUI tool or the command line. You have to be really 
determined to achieve it. I have also seen \007 (the ASCII BEL character) 
in a file name; I mean, really.
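
The claim about what is and is not allowed can be checked without going 
anywhere near a shell. The following is a minimal sketch in Python, 
assuming a Linux box and an ordinary POSIX file system (other platforms 
and file systems may well refuse these names). The function name `demo` 
and the file names are made up for illustration.

```python
import os
import tempfile

def demo() -> dict:
    """Try a few awkward file names in a scratch directory and report what happened."""
    results = {}
    with tempfile.TemporaryDirectory() as scratch:
        # A newline and a BEL (\007) are accepted as ordinary bytes in a name.
        for name in ("two\nlines.txt", "bell\007.txt"):
            with open(os.path.join(scratch, name), "w") as handle:
                handle.write("data\n")
        results["created"] = sorted(os.listdir(scratch))

        # NUL can never be part of a name: Python refuses before calling the OS.
        try:
            open(os.path.join(scratch, "nul\0.txt"), "w")
            results["nul"] = "created"
        except ValueError:
            results["nul"] = "rejected"

        # '/' is always a separator, so this asks for a file inside the
        # directories 30/12, which do not exist.
        try:
            open(os.path.join(scratch, "30/12/2019.txt"), "w")
            results["slash"] = "created"
        except FileNotFoundError:
            results["slash"] = "no such directory"
    return results

if __name__ == "__main__":
    print(demo())
```

Going through the system call interface like this is what makes it easy; 
the "really determined" part only applies once a GUI or a shell is in 
the way.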

Our training for new HPC users has a section covering file names, which 
includes advising users not to use "wacky" characters in them, as we 
don't guarantee their continued survival. That is, if we do something on 
the file system and such files get "lost" as a result, it is the user's 
own fault.

In my view, restricting yourself to the following is entirely sensible:

https://docs.microsoft.com/en-us/rest/api/storageservices/naming-and-referencing-shares--directories--files--and-metadata

Also, while Unix is generally case sensitive, creating files that would 
clash if accessed case-insensitively is really dumb and should be avoided. 
Again, if it causes you problems in future, it sucks to be you.
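
Names that would clash under case-insensitive access are easy to find 
ahead of time. Here is a minimal sketch in Python; the function name 
`case_clashes` is hypothetical, and `str.casefold()` is only an 
approximation of whatever folding rules a given SMB server or file 
system actually applies.

```python
from collections import defaultdict

def case_clashes(names):
    """Return groups of names that differ only by case."""
    groups = defaultdict(list)
    for name in names:
        # Names that fold to the same key would collide on a
        # case-insensitive server or file system.
        groups[name.casefold()].append(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]

if __name__ == "__main__":
    print(case_clashes(["Results.txt", "results.TXT", "notes.txt"]))
```

Feed it the entries of one directory at a time, or full paths if clashes 
between directory names matter as well.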


JAB.

-- 
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Question about Policies - using mmapplypolicy/EXTERNAL LIST/mmxargs

2019-12-30 Thread Marc A Kaplan
Also see if your distribution includes samples/ilm/mmxcp
which, if you are determined to cp or mv from one path to another, shows a
way to do that easily in Perl,
using code similar to the aforementioned bin/mmxargs.

Here is the path changing part...

   ...

   # Any ' within the name, like x'y, becomes x'\''y;
   # then we quote all names passed to commands.
   $src =~ s/'/'\\''/g;
   my @src = split('/',$src);
   # Join components $strip+1 through the second-to-last (the final one is dropped).
   my $sra = join('/', @src[$strip+1..$#src-1]);
   $newtarg = "'" . $target . '/' . $sra . "'";

 ...
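
For anyone not reading Perl, the same idea can be sketched in Python. 
This is an illustration of the quoting and path-rebuilding technique in 
the excerpt, not a port of mmxcp: the function names `sh_quote` and 
`new_target` are hypothetical, the meaning of `strip` (the number of 
leading directories to drop) is inferred from the excerpt, and unlike 
the excerpt this version also escapes quotes in the target.

```python
def sh_quote(name: str) -> str:
    """Single-quote a name for a POSIX shell.

    Each embedded single quote is closed, backslash-escaped and reopened.
    """
    return "'" + name.replace("'", "'\\''") + "'"

def new_target(src: str, target: str, strip: int) -> str:
    """Rebuild the directory part of src under target, ready for a shell command."""
    parts = src.split("/")
    # An absolute path yields an empty first element, hence strip + 1;
    # the final element (the file name) is left off.
    kept = "/".join(parts[strip + 1:-1])
    return sh_quote(target + "/" + kept)

if __name__ == "__main__":
    print(sh_quote("x'y"))
    print(new_target("/gpfs/users/joe/exp 1/results.txt", "/gpfs/archive", 2))
```

Inside single quotes a POSIX shell treats every character literally, so 
the single quote itself is the only thing that needs this treatment.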


Re: [gpfsug-discuss] Question about Policies - using mmapplypolicy/EXTERNAL LIST/mmxargs

2019-12-29 Thread Marc A Kaplan

Correct, you may need to use similar parsing/quoting techniques in your
renaming scripts.
Just remember, in Unix/Posix/Linux the only 2 special characters/codes in
path names are  '/'  and \0.  The former delimits directories and the
latter marks the end of the string.
And technically the latter isn't  ever in a path name, it's only used by
system APIs to mark the end of a string that is the pathname argument.

Happy New Year,





Re: [gpfsug-discuss] Question about Policies - using mmapplypolicy/EXTERNAL LIST/mmxargs

2019-12-29 Thread Jonathan Buzzard
On 28/12/2019 19:49, Marc A Kaplan wrote:
> The script in mmfs/bin/mmxargs handles mmapplypolicy EXTERNAL LIST file 
> lists perfectly. No need to worry about whitespaces and so forth.
> Give it a look-see and a try
> 

Indeed, but I get the feeling from the original post that you will need 
to munge the path/file names to produce a new directory path that the 
files are to be moved to. At this point the whole issue of "wacky" 
directory and file names will rear its ugly head.

So for example

/gpfs/users/joeblogs/experiment`1234?/results *-12-2019.txt

would need moving to something like

/gpfs/users/joeblogs/experiment`1234?/old_data/results *-12-2019.txt

That is a pit of woe unless you are confident that users are being 
sensible, or you just forget about wacky named files.
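
Much of the woe comes from passing such names through a shell. A minimal 
sketch in Python of the move described above, assuming the new location 
is always an old_data directory next to the file; the function name 
`move_to_old_data` is hypothetical.

```python
import os
from pathlib import Path

def move_to_old_data(path: Path) -> Path:
    """Move path into an old_data directory alongside it; return the new location."""
    destination_dir = path.parent / "old_data"
    destination_dir.mkdir(exist_ok=True)
    destination = destination_dir / path.name
    # A plain rename; no shell ever sees the name, so backticks,
    # '?', '*' and spaces need no quoting at all.
    os.rename(path, destination)
    return destination
```

Called with Path("/gpfs/users/joeblogs/experiment`1234?/results *-12-2019.txt") 
it would produce the second path shown above. Note that on POSIX systems 
os.rename silently replaces an existing file at the destination, so a 
real script would want to check for that first.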

In a similar vein, in the past I have had results coming off a piece of 
experimental equipment zipped up every 30 days. Each run on the equipment 
puts its results in a different directory. So for example the directory

/gpfs/users/joeblogs/nmr_spectroscopy/2019/results-1229-01/

would be zipped up to

/gpfs/users/joeblogs/nmr_spectroscopy/2019/results-1229-01.zip

and the original directory removed. This works well because both Windows 
Explorer and Finder will allow you to click into the zip files to see 
the contents. However, the script that did this worked on the principle 
of a very strict naming convention; if it was not adhered to, the folders 
were not zipped up.
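
A minimal sketch of that kind of script in Python, not the script 
actually used. The pattern is an assumption modelled on the 
results-1229-01 example, the function name `zip_run_directories` is 
hypothetical, and the 30-day age check is left out for brevity.

```python
import re
import shutil
from pathlib import Path

# Assumed convention, modelled on "results-1229-01": results-NNNN-NN
RUN_DIR = re.compile(r"results-\d{4}-\d{2}")

def zip_run_directories(parent: Path) -> list:
    """Zip each conforming run directory under parent, then remove the original."""
    archived = []
    for entry in sorted(parent.iterdir()):
        if not entry.is_dir() or not RUN_DIR.fullmatch(entry.name):
            continue  # anything off-convention is left alone
        # Writes <entry>.zip next to the directory, with the directory
        # name as the top level inside the archive.
        shutil.make_archive(str(entry), "zip",
                            root_dir=str(parent), base_dir=entry.name)
        shutil.rmtree(entry)
        archived.append(entry.name + ".zip")
    return archived
```

Because only names matching the pattern are touched, nothing with a 
"wacky" name ever reaches the archiving step; it simply stays where it 
is.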

Given the original poster's institution, a good guess is that something 
like this is what they want to achieve.


JAB.

-- 
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG


Re: [gpfsug-discuss] Question about Policies - using mmapplypolicy/EXTERNAL LIST/mmxargs

2019-12-28 Thread Marc A Kaplan

The script in mmfs/bin/mmxargs handles mmapplypolicy EXTERNAL LIST file
lists perfectly.  No need to worry about whitespaces and so forth.
Give it a look-see and a try

-- marc of GPFS -


From:   Jonathan Buzzard 
To: "gpfsug-discuss@spectrumscale.org"

Date:   12/28/2019 10:17 AM
Subject:[EXTERNAL] Re: [gpfsug-discuss] Question about Policies
Sent by:gpfsug-discuss-boun...@spectrumscale.org



On 27/12/2019 14:20, david_john...@brown.edu wrote:
> You would want to look for examples of external scripts that work on the
> result of running the policy engine in listing mode.  The one issue that
> might need some attention is the way that gpfs quotes unprintable
> characters in the pathname. So the policy engine generates the list and
> your external script does the moving.
>

In my experience a good starting point would be to scan the list of
files from the policy engine and separate the files out into "normal",
that is, files using basic ASCII and no special characters, and the rest,
also known as the "wacky pile".
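
A minimal sketch of that split in Python. The definition of "normal" 
here (ASCII letters, digits, '.', '_', '-' and '/') is an assumption and 
deliberately strict, the function name `split_piles` is hypothetical, 
and the input is assumed to be path names already extracted and decoded 
from the policy engine's list file.

```python
import re

# Assumed definition of "normal": ASCII letters, digits, '.', '_', '-' and '/'.
SANE_PATH = re.compile(r"[A-Za-z0-9._/-]+")

def split_piles(paths):
    """Split an iterable of path strings into (normal, wacky) lists."""
    normal, wacky = [], []
    for path in paths:
        # fullmatch requires every character to be in the allowed set.
        (normal if SANE_PATH.fullmatch(path) else wacky).append(path)
    return normal, wacky

if __name__ == "__main__":
    normal, wacky = split_piles([
        "/gpfs/users/joeblogs/nmr_spectroscopy/2019/results-1229-01.zip",
        "/gpfs/users/joeblogs/experiment`1234?/results *-12-2019.txt",
    ])
    print(len(normal), "normal,", len(wacky), "wacky")
```

Under this definition even a space lands a file in the wacky pile; add 
it to the character class if that turns out to be too strict for the 
data in question.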

Given that you are UK based, it is not unreasonable to expect all path
and file names to be in English. There might (and if not, there probably
should) be an institutional policy mandating it. Not much use if a
researcher saves everything in Greek, then gets knocked over by a bus, and
the person picking up the work is Spanish, for example.

Hopefully the "wacky pile" is small; however, expect to find all sorts of
bizarre file and path names in it. We are talking wildcards, backticks,
even newline characters, to name but a few.

Depending on the amount of data in the "wacky" pile you might just want
to forget about moving them, as they are orders of magnitude more
difficult to deal with than files with "sane" path and file names and
can rapidly soak up large chunks of time trying to deal with them in
scripts.

JAB.

--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG