On 16/11/2020 21:58, Skylar Thompson wrote:
When we did a similar (though larger, at ~2.5PB) migration, we used rsync
as well, but ran one rsync process per Isilon node, and made sure the NFS
clients were hitting separate Isilon nodes for their reads. We also didn't
have more than one rsync process running per client, as the Linux NFS
client (at least in CentOS 6) was terrible when it came to concurrent access.
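A minimal sketch of that fan-out, with hypothetical mount points (each /mnt/isilonN is an NFS mount pinned to a different Isilon node by hostname or IP; ideally each rsync would also run from a separate client):

```shell
# Hypothetical layout: /mnt/isilon1 .. /mnt/isilon4 are NFS mounts, each
# pointing at a different Isilon node so reads spread across the cluster.
# One rsync per mount, run in the background so they proceed in parallel.
for n in 1 2 3 4; do
    rsync -aH "/mnt/isilon$n/" "/gpfs/new/part$n/" &
done
wait    # block until all four transfers have finished
```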


The million-dollar question IMHO is the number of files and their sizes.

Basically, if you have a million 1KB files to move it is going to take much longer than a hundred 1GB files. That is, the overhead of dealing with each file is a real bitch and kills your attainable transfer speed stone dead.
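To put rough numbers on it (illustrative figures, not measurements: assume ~5 ms of per-file overhead for stat/open/attribute copy, and a ~1 GB/s effective link):

```shell
# Back-of-the-envelope comparison of per-file overhead vs payload time.
# The overhead and rate below are hypothetical round numbers.
awk 'BEGIN {
    overhead = 0.005          # seconds of per-file overhead (assumed)
    rate     = 1e9            # bytes/second effective throughput (assumed)
    small = 1000000 * (overhead + 1024 / rate)   # a million 1KB files
    big   =     100 * (overhead + 1e9  / rate)   # a hundred 1GB files
    printf "million 1KB files: ~%d s\n", small
    printf "hundred 1GB files: ~%d s\n", big
}'
```

Under these assumptions the million small files take dozens of times longer, despite carrying roughly 1/100th of the data (~1GB vs ~100GB) — the per-file cost completely dominates.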

One option I have used in the past is to take your last backup, restore it to the new system, then rsync in the changes. That way you don't impact the source file system, which is live.
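The catch-up pass might look roughly like this, with hypothetical paths (/gpfs/home already populated from the backup restore, /mnt/old_home the still-live source):

```shell
# Hypothetical paths: /gpfs/home was restored from the last backup;
# /mnt/old_home is the live source file system, mounted read-only if possible.
# -a        archive mode (permissions, ownership, times, symlinks)
# -H        preserve hard links
# --delete  drop files on the target that users removed since the backup
rsync -aH --delete /mnt/old_home/ /gpfs/home/
```

Typically you would run this once while the old system is still live, then repeat a final, much shorter pass during the cutover outage so the two copies converge.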

Another option I have used is to inform users in advance that data will be transferred in an order based on how many files and how much data they have: the less data and the fewer files, the sooner they get access to the new system once access to the old system is turned off.

It is amazing how much junk users clear up under this scenario. Last time I did this a single user went from over 17 million files to 11 thousand! In total many, many TB of data just vanished from the system (around half of the data went poof) as users actually got around to some housekeeping, LOL. Moving fewer files and less data is always less painful.

Whatever method you end up using, I can guarantee you will be much happier
once you are on GPFS. :)

Goes without saying :-)


JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
