On 14/02/2020 16:24, Sanchez, Paul wrote:
Some (perhaps obvious) points to consider:

- There are some corner cases (e.g. preserving hard-linked files or
sparseness) which require special options.

- Depending on your level of churn, it may be helpful to pre-stage
the sync before your cutover so that there is less data movement
required, and you're primarily comparing metadata.

- Files on the source filesystem might change (and become internally
inconsistent) during your rsync, so you should generally sync from a
snapshot on the source.
In my experience a file changing mid-transfer causes rsync to exit with a non-zero error code (see later as to why this is useful). The changed file will also likely have a different mtime, so it gets re-synced on a subsequent run; the final run should be done with the file system in a "read only" state. Not necessarily mounted read-only, but with nothing running that might change anything.
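
To make the exit-code point concrete, here is a minimal sketch in Perl (the paths and the exact flag set are just placeholder assumptions) of running a single rsync and capturing its return code:

#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical paths and flags: -a preserves ownership/times,
# -H hard links, -S sparse files -- adjust to what your data needs.
my $src = '/lustre/projects/foo/';
my $dst = '/gpfs/projects/foo/';

system('rsync', '-aHS', $src, $dst);
my $rc = $? >> 8;    # non-zero if anything went wrong, e.g. 24 when
                     # source files vanished mid-transfer
warn "rsync of $src returned $rc\n" if $rc != 0;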

[SNIP]


- If you decide to do a final "offline" sync, you want it to be fast
so users can get back to work sooner, so parallelism is usually a
must.  If you have lots of filesets, then that's a convenient way to
split the work.

This final "offline" sync is an absolute must, in my experience unless you are able to be rather woolly about preserving data.


- If you have any filesets with many more inodes than the others,
keep in mind that those will likely take the longest to complete.


Indeed. The last time we did an rsync like this, for an HPC system moving from the pit of woe that is Lustre to GPFS, there was huge mileage to be had from telling users that they would get onto the new system once their data was synced, that it would be done on a "per user" basis, and that priority would go to the users with a combination of the smallest amount of data and the smallest number of files. It did unbelievable wonders for getting users to clean up their files. One user went from over 17 million files to under 50 thousand! The amount of data needing syncing nearly halved, shrinking to ~60% of the pre-announcement size.

- Test, test, test.  You usually won't get this right on the first go
or know how long a full sync takes without practice.  Remember that
you'll need to employ options to delete extraneous files on the
target when you're syncing over the top of a previous attempt, since
files intentionally deleted on the source aren't usually welcome if
they reappear after a migration.


rsync has a --delete option for that.
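
For instance (placeholder paths again), a repeat pass might look something like the following; -n (--dry-run) with -i (--itemize-changes) is a cheap way to see what --delete would remove before letting it loose:

#!/usr/bin/perl
use strict;
use warnings;

# Placeholder paths; drop the 'n' once you are happy with what
# --delete is going to remove from the target.
my ($src, $dst) = ('/lustre/projects/foo/', '/gpfs/projects/foo/');
system('rsync', '-aHSni', '--delete', $src, $dst);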

I am going to add that if you do any sort of ILM/HSM, then an rsync is going to destroy your ability to identify old files that have not been accessed, as the rsync will update the atime of everything (don't ask how I know).

If you have a backup (of course you do), I would strongly recommend considering getting your first "pass" from a restore. Firstly, it won't impact the source file system while it is still in use, and secondly, it allows you to check that your backup actually works :-)

When rsyncing systems like this, I use a Perl script with an SQLite DB. It is basically a list of directories to sync; you can store both source and destination paths to make wonderful things happen if wanted, along with a flag field. The way I use that flag is: -1 means not yet synced, -2 means the folder in question is currently being synced, and anything else is the exit code of the rsync for that folder.

If you write the Perl script correctly you can start it on any number of nodes; just put the SQLite DB in a shared folder somewhere (either the source or destination file systems work well here). If you are doing it in parallel, record which node did each rsync as well; in my experience that can be useful in tracking down any issues.
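
A minimal sketch of that scheme, assuming DBD::SQLite is available; the table layout, paths and rsync flags are made up for illustration, and error handling is deliberately thin:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;
use Sys::Hostname;

# SQLite DB on a shared folder so every node sees the same work list.
my $dbfile = '/gpfs/migrate/sync.db';    # hypothetical location
my $node   = hostname();

my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile", '', '',
    { RaiseError => 1, AutoCommit => 1,
      sqlite_use_immediate_transaction => 1 });
$dbh->sqlite_busy_timeout(60_000);       # wait out lock contention from other nodes

# One row per directory to sync: flag is -1 (not synced), -2 (being
# synced) or the rsync exit code; node and seconds help post-mortems.
$dbh->do(q{
    CREATE TABLE IF NOT EXISTS dirs (
        id      INTEGER PRIMARY KEY,
        src     TEXT NOT NULL,
        dst     TEXT NOT NULL,
        flag    INTEGER NOT NULL DEFAULT -1,
        node    TEXT,
        seconds INTEGER
    )
});

while (1) {
    # Claim the next unsynced directory inside a write transaction so
    # several nodes running this script can't grab the same row.
    $dbh->begin_work;
    my $row = $dbh->selectrow_hashref(
        'SELECT id, src, dst FROM dirs WHERE flag = -1 LIMIT 1');
    unless ($row) {
        $dbh->commit;
        last;                            # nothing left to do
    }
    $dbh->do('UPDATE dirs SET flag = -2, node = ? WHERE id = ?',
             undef, $node, $row->{id});
    $dbh->commit;

    my $start = time;
    system('rsync', '-aHS', '--delete', "$row->{src}/", "$row->{dst}/");
    my $rc = $? >> 8;

    # Record the exit code and how long this directory took.
    $dbh->do('UPDATE dirs SET flag = ?, seconds = ? WHERE id = ?',
             undef, $rc, time - $start, $row->{id});
}

Populating the dirs table (one INSERT per directory or fileset) is not shown; the interesting part is the claim-then-record loop, which is what lets any number of nodes share the work safely.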

Once everything is done you can quickly check the SQLite DB for non-zero flag fields to find out what, if anything, has failed, which gives you confidence that your sync has completed accurately. Any flag fields less than zero show you it has not yet finished.
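
That check is then a single query against the same (hypothetical) table, along these lines:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Same made-up DB and schema as the sketch above.
my $dbh = DBI->connect('dbi:SQLite:dbname=/gpfs/migrate/sync.db', '', '',
                       { RaiseError => 1 });

# Non-zero flags are failed rsyncs; anything still negative never finished.
my $bad = $dbh->selectall_arrayref(
    'SELECT src, flag, node FROM dirs WHERE flag <> 0');
printf "%s flag=%d node=%s\n", $_->[0], $_->[1], $_->[2] // 'unclaimed'
    for @$bad;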

Finally, you might want to record the time each individual rsync took; it's handy for working out the ordering I mentioned earlier :-)

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG