Re: [gpfsug-discuss] Change uidNumber and gidNumber for billions of files

2020-06-09 Thread Aaron Knister
Lohit,

I did this while working @ NASA. I had two tools I used, one affectionately
known as "luke file walker" (to modify traditional unix permissions) and
the other known as the "milleniumfacl" (to modify posix ACLs). Stupid jokes
aside, there were some real technical challenges here.

I don't know if anyone from the NCCS team at NASA is on the list, but if
they are perhaps they'll jump in if they're willing to share the code :)

From what I recall, I used uthash and the GPFS APIs to store in memory a
hash of inodes and their uid/gid information. I then walked the filesystem
using the GPFS APIs and could look up the given inode in the in-memory hash
to view its ownership details. Both the inode traversal and directory walk
were parallelized/threaded. The way I actually executed the chown was
particularly security-minded. There is a race condition that exists if you
chown /path/to/file. All it takes is either a malicious user or someone
monkeying around with the filesystem while it's live to accidentally chown
the wrong file if a symbolic link ends up in the file path. My workaround
was to use openat() and fchown() (I think that was it, I played with this
quite a bit to get it right) and for every path to be chown'd I would walk
the hierarchy, opening each component with the O_NOFOLLOW flag to be sure
I didn't accidentally stumble across a symlink in the way. I also
implemented caching of open path component file descriptors since odds are
I would be chowning/chgrp'ing files in the same directory. That bought me
some speed up.
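
A minimal Python sketch of the symlink-safe chown described above follows.
It is not the original tool: the function names and the use of Python are
assumptions made purely for illustration. It opens each path component
relative to the previous one (openat() semantics via dir_fd) with
O_NOFOLLOW, then changes ownership relative to the final directory
descriptor without following a symlink in the last component. The
descriptor caching mentioned above is left out to keep it short.

```python
import os

def open_dir_nofollow(root_fd, rel_dir):
    """Open rel_dir below root_fd one component at a time, refusing symlinks."""
    fd = os.dup(root_fd)
    try:
        for part in rel_dir.split("/"):
            if part in ("", "."):
                continue
            if part == "..":
                raise ValueError("'..' is not allowed in the path")
            # With O_NOFOLLOW the open fails with OSError (ELOOP or ENOTDIR)
            # if this component has been swapped for a symlink.
            nxt = os.open(part, os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW,
                          dir_fd=fd)
            os.close(fd)
            fd = nxt
        return fd
    except BaseException:
        os.close(fd)
        raise

def chown_nofollow(root_fd, rel_path, uid, gid):
    """chown rel_path (relative to root_fd) without traversing any symlink."""
    dirname, _, name = rel_path.rpartition("/")
    dir_fd = open_dir_nofollow(root_fd, dirname)
    try:
        # follow_symlinks=False: if `name` is itself a symlink, the link is
        # changed rather than whatever it points at.
        os.chown(name, uid, gid, dir_fd=dir_fd, follow_symlinks=False)
    finally:
        os.close(dir_fd)
```

Usage would be along the lines of opening the filesystem root once with
os.open(root, os.O_RDONLY | os.O_DIRECTORY) and calling
chown_nofollow(root_fd, "proj/a/file", new_uid, new_gid) for each path.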

I opened up RFE's at one point, I believe, for gpfs API calls to do this
type of operation. I would ideally have liked a mechanism to do this based
on inode number rather than path which would help avoid issues of race
conditions.

One of the gotchas to be aware of is quotas. My wrapper script would clone
quotas from the old uid to the new uid. That's easy enough. However, keep
in mind, if the uid is over their quota your chown operation will
absolutely kill your cluster. Once a user is over their quota the
filesystem seems to want to quiesce all of its accounting information on
every filesystem operation for that user. I would check for adequate quota
headroom for the user in question and abort if there wasn't enough.

The ACL changes were much more tricky. There's no way, of which I'm aware,
to atomically update ACL entries. You run the risk that you could clobber a
user's ACL update if it occurs in the milliseconds between you reading the
ACL and updating it as part of the UID/GID update. Thankfully we were using
POSIX ACLs which were easier for me to deal with programmatically. I still
had the security concern over symbolic links appearing in paths to have
their ACLs updated either maliciously or organically. I was able to deal
with that by modifying libacl to implement ACL calls that used variants of
xattr calls that took file descriptors as arguments and allowed me to throw
nofollow flags. That code is here (
https://github.com/aaronknister/acl/commits/nofollow). I couldn't take
advantage of the GPFS API's here to meet my requirements, so I just walked
the filesystem tree in parallel if I recall correctly, retrieved every ACL
and updated if necessary.
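
As a rough illustration of the read, remap, write-back cycle for POSIX
ACLs on an already-open file descriptor, here is a Python sketch. It is
not the modified libacl linked above. It assumes the filesystem exposes
POSIX ACLs through the Linux "system.posix_acl_access" extended attribute
in the usual binary layout (a 32-bit version header followed by 8-byte
tag/perm/id entries); whether a given GPFS configuration does so is an
assumption to verify. The function names are hypothetical, and the update
is still not atomic, so the race described above remains.

```python
import errno
import os
import struct

ACL_XATTR = "system.posix_acl_access"   # directories may also carry
                                        # system.posix_acl_default
ACL_USER, ACL_GROUP = 0x02, 0x08        # named-user / named-group tags

def remap_acl_blob(blob, uid_map, gid_map):
    """Return a copy of a POSIX ACL xattr value with named IDs remapped."""
    (version,) = struct.unpack_from("<I", blob, 0)
    if version != 2 or (len(blob) - 4) % 8:
        raise ValueError("unexpected POSIX ACL xattr layout")
    out = bytearray(blob)
    for off in range(4, len(blob), 8):
        tag, perm, ident = struct.unpack_from("<HHI", blob, off)
        if tag == ACL_USER:
            ident = uid_map.get(ident, ident)
        elif tag == ACL_GROUP:
            ident = gid_map.get(ident, ident)
        struct.pack_into("<HHI", out, off, tag, perm, ident)
    return bytes(out)

def remap_acl_fd(fd, uid_map, gid_map):
    """Rewrite the access ACL on an open fd; return True if it changed."""
    try:
        old = os.getxattr(fd, ACL_XATTR)
    except OSError as exc:
        if exc.errno in (errno.ENODATA, errno.ENOTSUP):
            return False                # no extended ACL on this file
        raise
    new = remap_acl_blob(old, uid_map, gid_map)
    if new != old:
        os.setxattr(fd, ACL_XATTR, new)  # not atomic with the read above
    return new != old
```

Because the xattr calls take a file descriptor, this can be combined with
an O_NOFOLLOW component-by-component open so that no symlink is ever
traversed on the way to the file.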

If you're using NFS4 ACLs... I don't have an easy answer for you :)

We did manage to migrate UID numbers for several hundred users and half a
billion inodes in a relatively small amount of time with the filesystem
active. Some of the concerns about symbolic links can be mitigated if there
are no users active on the filesystem while the migration is underway.

-Aaron

On Mon, Jun 8, 2020 at 2:01 PM Lohit Valleru wrote:

> Hello Everyone,
>
> We are planning to migrate from LDAP to AD, and one of the best solutions
> was to change the uidNumber and gidNumber to what SSSD or Centrify would
> resolve.
>
> May I know, if anyone has come across a tool/tools that can change the
> uidNumbers and gidNumbers of billions of files efficiently and in a
> reliable manner?
> We could spend some time to write a custom script, but wanted to know if a
> tool already exists.
>
> Please do let me know, if any one else has come across a similar
> situation, and the steps/tools used to resolve the same.
>
> Regards,
> Lohit
> ___
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Change uidNumber and gidNumber for billions of files

2020-06-09 Thread Jonathan Buzzard

On 09/06/2020 14:57, Jonathan Buzzard wrote:

[SNIP]

> Thinking on it more, a generic C program for arbitrary UID/GID to
> UID/GID mapping is a really simple piece of code and should be nearly as
> fast as chown -R. It will be very slightly slower as you have to look the
> mapping up for each file. Read the mappings in from a CSV file into memory
> and just use nftw/lchown calls to walk the file system and change the
> UID/GID as necessary.

Because I was curious I thought I would have a go this evening coding 
something up in C.


It's standing at 213 lines of code but there is some extra fluff and 
some error checking and a large copyright comment. Updating ACL's would 
increase the size too. It would however be relatively simple I think. 
The public GPFS API documentation on ACL's is incomplete so some guess 
work and testing would be required.


It's stupidly fast on my laptop changing the ownership of the latest 
version of gcc untarred. However there is only one user in the map file 
and it's an SSD. Obviously if you have billions of files it is going to 
take longer :-)


JAB.

--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG


Re: [gpfsug-discuss] Change uidNumber and gidNumber for billions of files

2020-06-09 Thread Stephen Ulmer
Jonathan brings up a good point that you’ll only get one shot at this — if 
you’re using the file system as your record of who owns what.

You might want to use the policy engine to record the existing file names and 
ownership (and then provide updates using the same policy engine for the things 
that changed after the last time you ran it). At that point, you’ve got the 
list of who should own what from before you started.

You could even do some things to see how complex your problem is, like "how 
many directories have files owned by more than one UID?"

With respect to that, it is surprising how easy the sqlite C API is to use 
(though I would still recommend Perl or Python), and equally surprising how 
*bad* the JOIN performance is. If you go with sqlite, denormalize *everything* 
as it’s collected. If that is too dirty for you, then just use MariaDB or 
something else.
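
To make the "more than one UID per directory" question concrete, here is
a small Python sketch using the sqlite3 module with a single denormalized
table, so no JOIN is needed. The (directory, name, uid, gid) row layout
and the function name are assumptions for illustration; the real input
would be whatever list the policy engine run produces.

```python
import sqlite3

def mixed_owner_dirs(rows):
    """rows: iterable of (directory, filename, uid, gid) tuples."""
    db = sqlite3.connect(":memory:")
    # One flat table: everything needed for the question is in each row.
    db.execute("CREATE TABLE files "
               "(dir TEXT, name TEXT, uid INTEGER, gid INTEGER)")
    db.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", rows)
    db.execute("CREATE INDEX files_dir ON files (dir)")
    cur = db.execute(
        "SELECT dir, COUNT(DISTINCT uid) FROM files "
        "GROUP BY dir HAVING COUNT(DISTINCT uid) > 1 ORDER BY dir")
    return cur.fetchall()
```

Directories that come back from this query are the ones where a plain
"chown -R" per directory would not be safe.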


-- 
Stephen






Re: [gpfsug-discuss] Change uidNumber and gidNumber for billions of files

2020-06-09 Thread Jonathan Buzzard

On 09/06/2020 14:07, Stephen Ulmer wrote:
Jonathan brings up a good point that you’ll only get one shot at this — 
if you’re using the file system as your record of who owns what.


Not strictly true if my existing UID's are in the range 1-1 and 
my target UID's are in the range 5-9 for example then I get an 
infinite number of shots at it.


It is only if the target and source ranges have any overlap that there 
is a problem and that should be easy to work out in advance.


If it were me and there was overlap between input and output states I 
would go via an intermediate state where there is no overlap. Linux has 
had 32-bit UIDs for a very long time now (since the 2.4 kernels, from 
memory) so non-overlapping mappings are perfectly possible to arrange.
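
The overlap check and the detour through an intermediate state can be
sketched in a few lines of Python. The function names and the scratch_base
parameter are hypothetical, and the scratch range is assumed to be unused
on the filesystem, which has to be verified separately.

```python
def clashes(mapping):
    """IDs that are both remapped away and the target of another remap."""
    changed = {old: new for old, new in mapping.items() if old != new}
    return set(changed) & set(changed.values())

def two_phase(mapping, scratch_base):
    """Split old->new into old->scratch and scratch->new, with no overlap."""
    changed = sorted((old, new) for old, new in mapping.items() if old != new)
    scratch = range(scratch_base, scratch_base + len(changed))
    if scratch_base + len(changed) >= 2**32 - 1:
        raise ValueError("scratch range does not fit in 32-bit IDs")
    # The scratch IDs must not collide with any ID named in the mapping.
    if (set(mapping) | set(mapping.values())) & set(scratch):
        raise ValueError("scratch range overlaps IDs already in the mapping")
    phase1 = {old: tmp for (old, _), tmp in zip(changed, scratch)}
    phase2 = {tmp: new for (_, new), tmp in zip(changed, scratch)}
    return phase1, phase2
```

If clashes() returns an empty set the remap can be rerun as often as
needed; otherwise each of the two phases returned by two_phase() is
individually rerunnable.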


> With respect to that, it is surprising how easy the sqlite C API is to
> use (though I would still recommend Perl or Python), and equally
> surprising how *bad* the JOIN performance is. If you go with sqlite,
> denormalize *everything* as it’s collected. If that is too dirty for
> you, then just use MariaDB or something else.

Thinking on it more, a generic C program for arbitrary UID/GID to 
UID/GID mapping is a really simple piece of code and should be nearly as 
fast as chown -R. It will be very slightly slower as you have to look the 
mapping up for each file. Read the mappings in from a CSV file into memory 
and just use nftw/lchown calls to walk the file system and change the 
UID/GID as necessary.
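
The same idea can be sketched in Python rather than C, with os.walk and
os.lchown standing in for nftw and lchown. This is not the 213-line
program mentioned elsewhere in the thread; the two-column "old_id,new_id"
CSV layout, the function names and the dry_run switch are all assumptions
for illustration.

```python
import csv
import os

def load_map(csv_path):
    """Read rows of 'old_id,new_id' into a dict (hypothetical layout)."""
    with open(csv_path, newline="") as fh:
        return {int(old): int(new) for old, new in csv.reader(fh)}

def remap_tree(root, uid_map, gid_map, dry_run=False):
    """lchown everything under root per the maps; return entries changed."""
    changed = 0

    def visit(path):
        nonlocal changed
        st = os.lstat(path)                       # never follow symlinks
        new_uid = uid_map.get(st.st_uid, st.st_uid)
        new_gid = gid_map.get(st.st_gid, st.st_gid)
        if (new_uid, new_gid) != (st.st_uid, st.st_gid):
            if not dry_run:
                os.lchown(path, new_uid, new_gid)
            changed += 1

    visit(root)
    # os.walk does not descend into symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            visit(os.path.join(dirpath, name))
    return changed
```

Running it first with dry_run=True gives a count of what would change
without touching anything. Being path based, it does not protect against
a directory being swapped for a symlink while the walk is in progress.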


If you are willing to sacrifice some error checking on the input mapping 
file (not unreasonable to assume it is good) and have some hard coded 
site settings (avoiding processing command line arguments) then 200 
lines of C tops should do it. Depending on how big your input UID/GID 
ranges are you could even use array indexing for the mapping. For 
example on our system the UID's start at just over 5000 and end just 
below 6000 with quite a lot of holes. Just allocate an array of 6000 
int's which is only ~24KB and off you go with something like


new_uid = uid_mapping[uid];

Nice super speedy lookup of mappings. If you need to manipulate ACL's 
then C is the only way to go anyway.
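
For comparison, the array-indexed lookup looks like this as a Python
sketch (the helper names are made up for illustration). The holes are
pre-filled with the identity mapping so that an unmapped UID stays as it
is, and IDs beyond the end of the table are passed through unchanged.

```python
from array import array

def build_table(mapping, size):
    """Dense lookup table; unmapped slots map to themselves."""
    table = array("I", range(size))     # unsigned ints, one slot per ID
    for old, new in mapping.items():
        table[old] = new
    return table

def lookup(table, ident):
    """Indexed lookup with a bounds check for IDs outside the table."""
    return table[ident] if 0 <= ident < len(table) else ident
```

With 6000 slots of a 4-byte unsigned int this is the same ~24KB as the C
array described above.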


JAB.

--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG


Re: [gpfsug-discuss] Change uidNumber and gidNumber for billions of files

2020-06-09 Thread Jonathan Buzzard

On 08/06/2020 18:44, Lohit Valleru wrote:

> Hello Everyone,
>
> We are planning to migrate from LDAP to AD, and one of the best solutions
> was to change the uidNumber and gidNumber to what SSSD or Centrify would
> resolve.
>
> May I know, if anyone has come across a tool/tools that can change the
> uidNumbers and gidNumbers of billions of files efficiently and in a
> reliable manner?

Not to my knowledge.

> We could spend some time to write a custom script, but wanted to know if
> a tool already exists.

If you can be sure that all files under a specific directory belong to a 
specific user and you have no ACL's then a whole bunch of "chown -R" 
would be reasonable. That is you have a lot of user home directories for 
example.


What I do in these scenarios is use a small sqlite database with one row 
per directory that I want to chown, holding the directory, the target UID 
and GID and a status field. Initially I set the status field to -1 which 
indicates the entry has not been processed. The script sets the status 
field to -2 when it starts processing an entry and on completion sets the 
status field to the exit code of the command you are running. This way 
when the script is finished you can see any directory hierarchies that 
had a problem and if it dies early you can see where it got up to (that -2).


You can also do things like set all non-zero status codes back to -1 
and run again with a simple SQL update on the database from the sqlite CLI.
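
A minimal Python sketch of that status-tracking scheme is below. The
table and column names, and the injectable runner argument (which
defaults to subprocess.call), are assumptions for illustration rather
than the actual script.

```python
import sqlite3
import subprocess

PENDING, RUNNING = -1, -2

def init_db(db, jobs):
    """jobs: iterable of (directory, target_uid, target_gid)."""
    db.execute("CREATE TABLE IF NOT EXISTS jobs ("
               "dir TEXT PRIMARY KEY, uid INTEGER, gid INTEGER, "
               "status INTEGER NOT NULL DEFAULT -1)")
    db.executemany(
        "INSERT OR IGNORE INTO jobs (dir, uid, gid) VALUES (?, ?, ?)", jobs)
    db.commit()

def run_pending(db, runner=subprocess.call):
    """Run 'chown -R uid:gid dir' for every entry still at status -1."""
    rows = db.execute("SELECT dir, uid, gid FROM jobs WHERE status = ?",
                      (PENDING,)).fetchall()
    for path, uid, gid in rows:
        # Mark as in progress first, so a crash leaves a visible -2.
        db.execute("UPDATE jobs SET status = ? WHERE dir = ?", (RUNNING, path))
        db.commit()
        rc = runner(["chown", "-R", "%d:%d" % (uid, gid), path])
        db.execute("UPDATE jobs SET status = ? WHERE dir = ?", (rc, path))
        db.commit()

def requeue_failed(db):
    """Set every non-zero status back to -1 so it is retried."""
    db.execute("UPDATE jobs SET status = ? WHERE status != 0", (PENDING,))
    db.commit()
```

Committing after each status change is what makes the database a usable
record if the script dies part way through.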


If you don't need to modify ACL's but have mixed ownership under 
directory hierarchies then a script is reasonable but not a shell 
script. The overhead of execing chown billions of times on individual 
files will be astronomical. You need something like Perl or Python and 
make use of the builtin chown facilities of the language to avoid all 
those exec's. That said I suspect you will see a significant speed up 
from using C.


If you have ACL's to contend with then I would definitely spend some 
time and write some C code using the GPFS library. It will be a *LOT* 
faster than any script ever will be. Dealing with mmgetacl and mmputacl 
in any script is horrendous and you will have billions of exec's of each 
command.


As I understand it GPFS stores each ACL once and each file then points 
to the ACL. Theoretically it would be possible to just modify the stored 
ACL's for a very speedy update of all the ACL's on the 
files/directories. However I would imagine you need to engage IBM and 
bend over while they empty your wallet for that option :-)


The biggest issue to take care of IMHO is do any of the input UID/GID 
numbers exist in the output set??? If so life just got a lot harder as 
you don't get a second chance to run the script/program if there is a 
problem.


In this case I would be very tempted to remove such clashes prior to the 
main change. You might be able to do that incrementally before the main 
switch and update your LDAP to match.


Finally be aware that if you are using TSM for backup you will probably 
need to back every file up again after the change of ownership as far as 
I am aware.


JAB.

--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG


Re: [gpfsug-discuss] Introducing SSUG::Digital

2020-06-09 Thread Simon Thompson (Spectrum Scale User Group Chair)
First talk:

https://www.spectrumscaleug.org/event/ssugdigital-spectrum-scale-expert-talk-what-is-new-in-spectrum-scale-5-0-5/

What is new in Spectrum Scale 5.0.5?

18th June 2020. No registration required, just click the Webex link in the page 
above.

Simon


From:  on behalf of 
"ch...@spectrumscale.org" 
Reply to: "gpfsug-discuss@spectrumscale.org" 
Date: Wednesday, 3 June 2020 at 20:11
To: "gpfsug-discuss@spectrumscale.org" 
Subject: [gpfsug-discuss] Introducing SSUG::Digital

 

Hi All,

I'm happy that we can finally announce SSUG::Digital, which will be a series of 
online sessions based on the types of topic we present at our in-person events.

I know it’s taken us a while to get this up and running, but we’ve been 
working on trying to get the format right. So save the date for the first 
SSUG::Digital event which will take place on Thursday 18th June 2020 at 4pm BST. 
That’s:

San Francisco, USA at 08:00 PDT
New York, USA at 11:00 EDT
London, United Kingdom at 16:00 BST
Frankfurt, Germany at 17:00 CEST
Pune, India at 20:30 IST

We estimate about 90 minutes for the first session, and please forgive any 
teething troubles as we get this going!

(I know the times don’t work for everyone in the global community!)

Each of the sessions we run over the next few months will be a different 
Spectrum Scale Experts or Deep Dive session.

More details at:

https://www.spectrumscaleug.org/introducing-ssugdigital/

(We’ll announce the speakers and topic of the first session in the next few 
days …)

Thanks to Ulf, Kristy, Bill, Bob and Ted for their help and guidance in getting 
this going.

We’re keen to include some user talks and site updates later in the series, so 
please let me know if you might be interested in presenting in this format.

Simon Thompson

SSUG Group Chair



Re: [gpfsug-discuss] Change uidNumber and gidNumber for billions of files

2020-06-09 Thread Simon Thompson
> I presume there is a large technical blocker which is why you are looking at 
> remapping the filesystem?

Like anytime there is a corporate AD with mandated attributes? 

Though isn’t there an AD thing now for doing schema view type things, which 
allows you to inherit certain attributes and overwrite others?

Simon


Re: [gpfsug-discuss] Change uidNumber and gidNumber for billions of files

2020-06-09 Thread Jez Tucker
Hi Lohit (hey Jim & Christof),

  Whilst you _could_ trawl your entire filesystem, flip uids and work
out how to successfully replace ACL ids without actually pushing ACLs
(which could break defined inheritance options somewhere in your file
tree if you had not first audited your filesystem) the systems head in
me says:

"We are planning to migrate from LDAP to AD, and one of the best
solution was to change the uidNumber and gidNumber to what SSSD or
Centrify would resolve."
Here's the problem: to what SSSD or Centrify would resolve

I've done this a few times in the past in a previous life.  In many
respects it is easier (and faster!) to remap the AD side to the uids
already on the filesystem.
E.G. if user foo is id 1234, ensure user foo is 1234 in AD when you move
your LDAP world over.
Windows ldifde utility can import an ldif from openldap to take the
config across.
Automation or inline munging can be achieved with powershell or python.

I presume there is a large technical blocker which is why you are
looking at remapping the filesystem?

Best,

Jez



On 09/06/2020 03:52, Christof Schmitt wrote:
> If there are ACLs, then you also need to update all ACLs 
> (gpfs_getacl(), update uids and gids in all entries, gpfs_putacl()),
> in addition to the chown() call.
>  
> Regards,
>  
> Christof Schmitt || IBM || Spectrum Scale Development || Tucson, AZ
> christof.schm...@us.ibm.com  ||  +1-520-799-2469    (T/L: 321-2469)
>  
>  
>
> - Original message -
> From: Jim Doherty 
> Sent by: gpfsug-discuss-boun...@spectrumscale.org
> To: gpfsug main discussion list 
> Cc:
> Subject: [EXTERNAL] Re: [gpfsug-discuss] Change uidNumber and
> gidNumber for billions of files
> Date: Mon, Jun 8, 2020 5:57 PM
>  
>  
> You will need to do this with chown from the C library functions
> (could do this from perl or python). If you try to change this
> from a shell script you will hit the Linux chown command, which will
> have a lot more overhead. I had a customer attempt this using
> the shell and it ended up taking forever due to a brain damaged
> NIS service :-).
>  
> Jim 
>  


-- 
*Jez Tucker*
VP Research and Development | Pixit Media
e: jtuc...@pixitmedia.com 
Visit www.pixitmedia.com 
