On 14/02/2020 16:24, Sanchez, Paul wrote:
Some (perhaps obvious) points to consider:

- There are some corner cases (e.g. preserving hard-linked files or
sparseness) which require special options.

- Depending on your level of churn, it may be helpful to pre-stage
the sync before your cutover so that there is less data movement
required, and you're primarily comparing metadata.

- Files on the source filesystem might change (and become internally
inconsistent) during your rsync, so you should generally sync from a
snapshot on the source.
In my experience a file changing mid-transfer causes rsync to exit with a non-zero error code (see later as to why this is useful). The changed file will also likely have a different mtime, so it gets re-synced on a subsequent run; the final run should be done with the file system in a "read only" state. Not necessarily mounted read-only, but with nothing running that might change anything.
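
To make the exit-code point concrete, here is a minimal sketch in Perl (the paths and the exact flag set are just placeholder assumptions) of running a single rsync and capturing its return code:

#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical paths and flags: -a preserves ownership/times,
# -H hard links, -S sparse files -- adjust to what your data needs.
my $src = '/lustre/projects/foo/';
my $dst = '/gpfs/projects/foo/';

system('rsync', '-aHS', $src, $dst);
my $rc = $? >> 8;    # non-zero if anything went wrong, e.g. 24 when
                     # source files vanished mid-transfer
warn "rsync of $src returned $rc\n" if $rc != 0;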

[SNIP]


- If you decide to do a final "offline" sync, you want it to be fast
so users can get back to work sooner, so parallelism is usually a
must.  If you have lots of filesets, then that's a convenient way to
split the work.

This final "offline" sync is an absolute must, in my experience unless you are able to be rather woolly about preserving data.


- If you have any filesets with many more inodes than the others,
keep in mind that those will likely take the longest to complete.


Indeed. The last time we did an rsync like this, for an HPC system moving from the pit of woe that is Lustre to GPFS, there was huge mileage to be had from telling users that they would get onto the new system once their data was synced, that it would be done on a "per user" basis, and that priority would go to the users with a combination of the smallest amount of data and the smallest number of files. It did unbelievable wonders for getting users to clean up their files. One user went from over 17 million files to under 50 thousand! The amount of data needing syncing nearly halved, shrinking to ~60% of the pre-announcement size.

- Test, test, test.  You usually won't get this right on the first go
or know how long a full sync takes without practice.  Remember that
you'll need to employ options to delete extraneous files on the
target when you're syncing over the top of a previous attempt, since
files intentionally deleted on the source aren't usually welcome if
they reappear after a migration.


rsync has a --delete option for that.
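
For instance (placeholder paths again), a repeat pass might look something like the following; -n (--dry-run) with -i (--itemize-changes) is a cheap way to see what --delete would remove before letting it loose:

#!/usr/bin/perl
use strict;
use warnings;

# Placeholder paths; drop the 'n' once you are happy with what
# --delete is going to remove from the target.
my ($src, $dst) = ('/lustre/projects/foo/', '/gpfs/projects/foo/');
system('rsync', '-aHSni', '--delete', $src, $dst);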

I am going to add that if you do any sort of ILM/HSM, then an rsync is going to destroy your ability to identify old files that have not been accessed, as the rsync will update the atime of everything (don't ask how I know).

If you have a backup (of course you do), I would strongly recommend considering getting your first "pass" from a restore. Firstly, it won't impact the source file system while it is still in use, and secondly, it allows you to check that your backup actually works :-)

When rsyncing systems like this, I use a Perl script with an SQLite DB. It is basically a list of directories to sync; you can store both source and destination paths to make wonderful things happen if wanted, along with a flag field. The way I use that flag is: -1 means not yet synced, -2 means the folder in question is currently being synced, and anything else is the exit code of the rsync for that folder.

If you write the Perl script correctly you can start it on any number of nodes; just put the SQLite DB in a shared folder somewhere (either the source or destination file systems work well here). If you are doing it in parallel, record which node did each rsync as well; in my experience that can be useful in tracking down any issues.
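
A minimal sketch of that scheme, assuming DBD::SQLite is available; the table layout, paths and rsync flags are made up for illustration, and error handling is deliberately thin:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;
use Sys::Hostname;

# SQLite DB on a shared folder so every node sees the same work list.
my $dbfile = '/gpfs/migrate/sync.db';    # hypothetical location
my $node   = hostname();

my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile", '', '',
    { RaiseError => 1, AutoCommit => 1,
      sqlite_use_immediate_transaction => 1 });
$dbh->sqlite_busy_timeout(60_000);       # wait out lock contention from other nodes

# One row per directory to sync: flag is -1 (not synced), -2 (being
# synced) or the rsync exit code; node and seconds help post-mortems.
$dbh->do(q{
    CREATE TABLE IF NOT EXISTS dirs (
        id      INTEGER PRIMARY KEY,
        src     TEXT NOT NULL,
        dst     TEXT NOT NULL,
        flag    INTEGER NOT NULL DEFAULT -1,
        node    TEXT,
        seconds INTEGER
    )
});

while (1) {
    # Claim the next unsynced directory inside a write transaction so
    # several nodes running this script can't grab the same row.
    $dbh->begin_work;
    my $row = $dbh->selectrow_hashref(
        'SELECT id, src, dst FROM dirs WHERE flag = -1 LIMIT 1');
    unless ($row) {
        $dbh->commit;
        last;                            # nothing left to do
    }
    $dbh->do('UPDATE dirs SET flag = -2, node = ? WHERE id = ?',
             undef, $node, $row->{id});
    $dbh->commit;

    my $start = time;
    system('rsync', '-aHS', '--delete', "$row->{src}/", "$row->{dst}/");
    my $rc = $? >> 8;

    # Record the exit code and how long this directory took.
    $dbh->do('UPDATE dirs SET flag = ?, seconds = ? WHERE id = ?',
             undef, $rc, time - $start, $row->{id});
}

Populating the dirs table (one INSERT per directory or fileset) is not shown; the interesting part is the claim-then-record loop, which is what lets any number of nodes share the work safely.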

Once everything is done you can quickly check the SQLite DB for non-zero flag fields to find out what, if anything, has failed, which gives you confidence that your sync has completed accurately. Any flag fields less than zero show you it has not yet finished.
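
That check is then a single query against the same (hypothetical) table, along these lines:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Same made-up DB and schema as the sketch above.
my $dbh = DBI->connect('dbi:SQLite:dbname=/gpfs/migrate/sync.db', '', '',
                       { RaiseError => 1 });

# Non-zero flags are failed rsyncs; anything still negative never finished.
my $bad = $dbh->selectall_arrayref(
    'SELECT src, flag, node FROM dirs WHERE flag <> 0');
printf "%s flag=%d node=%s\n", $_->[0], $_->[1], $_->[2] // 'unclaimed'
    for @$bad;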

Finally, you might want to record the time each individual rsync took; it's handy for working out the ordering I mentioned earlier :-)

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG