As I expect from the smartest sysadmins on the planet, everyone has overanalyzed the issue... :)
Let's see if I can clarify.

Assume there are two clusters, clusterA and clusterB. Each cluster is 32 nodes and has 50TB of storage attached. The aggregate network bandwidth between the clusters is 800MB/sec.

The problem is that the per-node bandwidth on clusterB is 30MB/sec, so if I use a single node to copy the 20TB of data from clusterB, yes, it's going to take me over 7 days to copy everything.

I'd like to parallelize that across multiple nodes to drive the aggregate up. I was hoping someone would pop up and say, "hey, use this magical piece of software" (which I'm unable to locate).

On Fri, Mar 5, 2010 at 11:30 AM, kyron <[email protected]> wrote:
> On Fri, 05 Mar 2010 11:22:14 -0500, Mike Davis <[email protected]> wrote:
>> Michael Di Domenico wrote:
>>> How does one copy large (20TB) amounts of data from one cluster to
>>> another?
>>>
>>> Assuming that each node in the cluster can only do about 30MB/sec
>>> between clusters and i want to preserve the uid/gid/timestamps, etc
>>>
>> If the clusters are co-lo I wouldn't copy I would use shared storage. If
>> they are not co-located I would use patience.
>>
>> Seriously though, for a one time copy, I would consider copying to an
>> external system and then physically moving that system. To do this and
>> preserve ownerships you will need to duplicate accounts and groups.
>
> ...and we are all assuming non-compressibility; otherwise, use pbzip2 ;)

_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
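
For the numbers in this thread (20TB to move, 30MB/sec per node, 800MB/sec aggregate between the clusters), here is a rough back-of-envelope sketch of what spreading the copy over several nodes might buy. It is only an estimate under stated assumptions: decimal units (1TB = 1,000,000MB), every node sustaining its full 30MB/sec, and the data splitting cleanly by top-level directory. The function names and the example directory names are made up for illustration and do not come from any existing tool.

```python
import heapq
import math


def copy_time_days(data_tb, per_node_mb_s, nodes, aggregate_cap_mb_s):
    """Estimated days to move data_tb when `nodes` nodes copy in parallel."""
    # The achieved rate is limited by whichever is smaller: the sum of the
    # per-node rates or the aggregate link between the clusters.
    rate_mb_s = min(nodes * per_node_mb_s, aggregate_cap_mb_s)
    seconds = data_tb * 1e6 / rate_mb_s  # assumes 1 TB = 1e6 MB
    return seconds / 86400


def nodes_to_saturate(per_node_mb_s, aggregate_cap_mb_s):
    """Smallest node count whose combined rate reaches the aggregate cap."""
    return math.ceil(aggregate_cap_mb_s / per_node_mb_s)


def split_by_size(dirs, nodes):
    """Greedy largest-first split of (path, size) pairs into `nodes` buckets.

    Returns one list of paths per node, so each node can run its own
    ordinary copy over its share.
    """
    # Heap entries are (bytes assigned so far, node index, paths); the unique
    # node index breaks ties so the path lists are never compared.
    buckets = [(0, i, []) for i in range(nodes)]
    heapq.heapify(buckets)
    for path, size in sorted(dirs, key=lambda d: d[1], reverse=True):
        total, i, paths = heapq.heappop(buckets)  # least-loaded node so far
        paths.append(path)
        heapq.heappush(buckets, (total + size, i, paths))
    return [paths for _, _, paths in sorted(buckets, key=lambda b: b[1])]


if __name__ == "__main__":
    for n in (1, 8, 16, nodes_to_saturate(30, 800), 32):
        days = copy_time_days(20, 30, n, 800)
        print(f"{n:2d} nodes: {days:5.2f} days ({days * 24:6.1f} hours)")

    # Hypothetical top-level directories with sizes in GB.
    example = [("projA", 5000), ("projB", 4000), ("projC", 3000), ("projD", 2000)]
    print(split_by_size(example, 2))
```

Under these assumptions, adding nodes stops helping once their combined rate reaches the 800MB/sec link, which happens before all 32 nodes are in use. Each node would then copy only its own list of directories with whatever tool preserves uid/gid/timestamps; the split is by size only and ignores file counts, which can matter a lot for trees of small files.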
