On 09/06/2020 14:07, Stephen Ulmer wrote:
Jonathan brings up a good point that you’ll only get one shot at this — if you’re using the file system as your record of who owns what.

Not strictly true. If my existing UIDs are in the range 10000-19999 and my target UIDs are in the range 50000-99999, for example, then I get an infinite number of shots at it.

It is only if the target and source ranges overlap at all that there is a problem, and that should be easy to work out in advance.
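Working that out in advance is a one-line comparison. A minimal sketch, assuming each side can be described as a single inclusive range (the struct and function names here are made up for illustration):

```c
#include <stdbool.h>
#include <sys/types.h>

/* Inclusive ID range, e.g. {10000, 19999}. Hypothetical helper type. */
struct id_range {
    uid_t lo;
    uid_t hi;
};

/* Two inclusive ranges overlap when each one starts at or before
 * the point where the other one ends. */
static bool ranges_overlap(struct id_range a, struct id_range b)
{
    return a.lo <= b.hi && b.lo <= a.hi;
}
```

If the real IDs are sparse, checking the actual sets of old and new IDs against each other would be stricter than comparing the enclosing ranges.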

If it were me and there was overlap between the input and output states, I would go via an intermediate state where there is no overlap. Linux has had 32-bit UIDs for a very long time now (since the 2.4 kernel series), so non-overlapping mappings are perfectly possible to arrange.
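To make the intermediate state concrete, here is a rough sketch that splits one direct mapping into two passes through a scratch range. The names and the scratch base are assumptions for illustration; the caller has to pick a scratch range that touches neither the old nor the new IDs.

```c
#include <stddef.h>
#include <sys/types.h>

/* One "old ID -> new ID" entry. Hypothetical helper type. */
struct id_pair {
    uid_t from;
    uid_t to;
};

/* Split a direct mapping into two passes via a scratch range.
 * Pass 1 moves every old ID to scratch_base + i, pass 2 moves each
 * scratch ID to its final value. The caller must ensure that
 * [scratch_base, scratch_base + n) overlaps neither the old nor the
 * new IDs, and that pass1 and pass2 each have room for n entries. */
static void split_via_scratch(const struct id_pair *direct, size_t n,
                              uid_t scratch_base,
                              struct id_pair *pass1, struct id_pair *pass2)
{
    for (size_t i = 0; i < n; i++) {
        uid_t scratch = scratch_base + (uid_t)i;
        pass1[i].from = direct[i].from;
        pass1[i].to = scratch;
        pass2[i].from = scratch;
        pass2[i].to = direct[i].to;
    }
}
```

Each pass on its own has non-overlapping source and target IDs, so either pass can be rerun after an interruption without remapping a file twice.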

> With respect to that, it is surprising how easy the sqlite C API is to
> use (though I would still recommend Perl or Python), and equally
> surprising how *bad* the JOIN performance is. If you go with sqlite,
> denormalize *everything* as it’s collected. If that is too dirty for
> you, then just use MariaDB or something else.

Actually, thinking on it more, a generic C program that maps arbitrary UID/GIDs to new UID/GIDs is a really simple piece of code and should be nearly as fast as chown -R. It will be very slightly slower, as you have to look up the mapping for each file. Read the mappings from a CSV file into memory, then just use nftw/lchown calls to walk the file system and change the UID/GID as necessary.
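A rough sketch of that approach follows. It assumes a CSV of plain "old,new" number pairs with no header, uses a simple linear search for the lookup, and skips most error handling. All the names (load_map, lookup, visit, remap_tree, MAX_MAP) are invented for illustration; it has not been run against GPFS.

```c
#define _XOPEN_SOURCE 700   /* needed for nftw() and lchown() */
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_MAP 4096        /* assumed upper bound on mapping entries */

struct id_pair { unsigned long from, to; };

static struct id_pair uid_map[MAX_MAP], gid_map[MAX_MAP];
static size_t n_uid, n_gid;

/* Read "old,new" lines into map. Stops at the first malformed line.
 * Returns the number of pairs read, or -1 if the file cannot be opened. */
static long load_map(const char *path, struct id_pair *map, size_t max)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    size_t n = 0;
    while (n < max && fscanf(f, "%lu,%lu", &map[n].from, &map[n].to) == 2)
        n++;
    fclose(f);
    return (long)n;
}

/* Return the mapped ID, or the ID itself when there is no entry. */
static unsigned long lookup(const struct id_pair *map, size_t n,
                            unsigned long id)
{
    for (size_t i = 0; i < n; i++)
        if (map[i].from == id)
            return map[i].to;
    return id;
}

/* nftw() callback: re-own one object if its UID or GID is mapped. */
static int visit(const char *path, const struct stat *sb, int type,
                 struct FTW *ftw)
{
    (void)ftw;
    if (type == FTW_NS)     /* stat failed, sb is not usable */
        return 0;
    uid_t new_uid = (uid_t)lookup(uid_map, n_uid, sb->st_uid);
    gid_t new_gid = (gid_t)lookup(gid_map, n_gid, sb->st_gid);
    if (new_uid != sb->st_uid || new_gid != sb->st_gid)
        if (lchown(path, new_uid, new_gid) != 0)
            perror(path);   /* report and carry on */
    return 0;               /* 0 tells nftw() to keep walking */
}

/* Walk root without following symlinks (FTW_PHYS), so that lchown()
 * changes a symlink itself rather than whatever it points at. */
static int remap_tree(const char *root)
{
    return nftw(root, visit, 64, FTW_PHYS);
}
```

A main() would call load_map() once each for the UID and GID files, store the counts in n_uid and n_gid, and then call remap_tree() on the file system root.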

If you are willing to sacrifice some error checking on the input mapping file (not unreasonable to assume it is good) and have some hard-coded site settings (avoiding processing command line arguments), then 200 lines of C tops should do it. Depending on how big your input UID/GID ranges are, you could even use array indexing for the mapping. For example, on our system the UIDs start at just over 5000 and end just below 6000, with quite a lot of holes. Just allocate an array of 6000 ints, which is only ~24KB, and off you go with something like

        new_uid = uid_mapping[uid];

Nice super speedy lookup of mappings. If you need to manipulate ACLs then C is the only way to go anyway.
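A slightly fuller sketch of the array lookup, with a bounds check and a sentinel for the holes. The 6000 size matches the example above; the UNMAPPED sentinel and the function names are assumptions. POSIX chown/lchown leave the owner alone when passed (uid_t)-1, which makes it a convenient "no change" value.

```c
#include <stddef.h>
#include <sys/types.h>

#define MAP_SIZE 6000           /* site-specific: one slot per possible old UID */
#define UNMAPPED ((uid_t)-1)    /* chown()/lchown() treat -1 as "leave unchanged" */

static uid_t uid_mapping[MAP_SIZE];

/* Mark every slot as a hole before the real mappings are loaded. */
static void init_mapping(void)
{
    for (size_t i = 0; i < MAP_SIZE; i++)
        uid_mapping[i] = UNMAPPED;
}

/* Constant-time lookup. UIDs beyond the table or in a hole come back
 * as UNMAPPED, so the result can be passed straight to lchown(). */
static uid_t map_uid(uid_t uid)
{
    if (uid >= MAP_SIZE)
        return UNMAPPED;
    return uid_mapping[uid];
}
```

The same pattern works for GIDs with a second array.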

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
