Felix Rauch Valenti
Sun, 06 May 2007 20:35:43 -0700
On 04/05/07, Bill Broadley <[EMAIL PROTECTED]> wrote:
> Geoff Galitz wrote:
> > During an HPC talk some years ago, I recall someone mentioned a tool
> > which can copy large datasets across a cluster using a ring topology.
> > Perhaps someone here knows of this tool?
>
> Not sure about a ring topology, seems kinda silly...
Why would that be silly?

To clarify: the transmission through the ring happens in parallel, i.e., while a node n receives the data stream from node n-1, it writes the stream to disk and at the same time forwards it to node n+1. I have yet to see a tool that achieves better data rates in practice for reliable, high-speed, large-scale data distribution in clusters.
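The pipelining described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual tool: threads and in-memory queues stand in for nodes and network links, an in-memory buffer stands in for the disk, and the names `relay_node`, `distribute` and `CHUNK_SIZE` are made up for this example.

```python
import io
import queue
import threading

CHUNK_SIZE = 64 * 1024  # assumed chunk size, chosen only for illustration


def relay_node(inbox, outbox, disk):
    """Receive chunks from the previous node, forward them, and store them."""
    while True:
        chunk = inbox.get()
        if outbox is not None:
            outbox.put(chunk)   # pass the chunk (or end marker) on to node n+1
        if chunk is None:       # None is the end-of-stream marker
            break
        disk.write(chunk)       # "write to disk" (an in-memory buffer here)


def distribute(data, num_nodes):
    """Push data through a chain of num_nodes receivers; return their copies."""
    links = [queue.Queue(maxsize=8) for _ in range(num_nodes)]
    disks = [io.BytesIO() for _ in range(num_nodes)]
    threads = []
    for n in range(num_nodes):
        # The last node in the chain has nobody to forward to.
        outbox = links[n + 1] if n + 1 < num_nodes else None
        t = threading.Thread(target=relay_node, args=(links[n], outbox, disks[n]))
        t.start()
        threads.append(t)
    # The source feeds only the first node, one chunk at a time.
    for offset in range(0, len(data), CHUNK_SIZE):
        links[0].put(data[offset:offset + CHUNK_SIZE])
    links[0].put(None)
    for t in threads:
        t.join()
    return [d.getvalue() for d in disks]
```

Calling `distribute(payload, 5)` returns five identical copies of `payload`. The source sends each chunk only once, and every node except the last forwards each chunk exactly once, so the load on the source does not grow with the number of nodes.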
> > More to the point, we are pushing around datasets that are about
> > 1Gbyte. The datasets are pushed out to dozens of nodes all at once and
>
> How often? I just bit-torrented a 1GB file to 165 nodes in 3 minutes;
> 1.5 minutes of that was the lazy way I launched it (the last node didn't
> start until 1.5 minutes into the run). BTW, 140 or so of those nodes
> already had 1 job per CPU running.
A 1 GB file in 1.5 minutes translates to about 11 MB/s, which sounds a lot like Fast Ethernet (100 Mbit/s). By today's standards that's relatively slow, and it's quite likely that the network will be the bottleneck for almost any tool.
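For reference, the arithmetic behind that estimate, as a small Python sketch. It assumes 1 GB = 1000 MB and ignores protocol overhead; the Gigabit Ethernet figure is added here only for comparison and is not taken from the numbers quoted above.

```python
def throughput_mb_per_s(size_mb, seconds):
    """Average throughput in MB/s for size_mb megabytes moved in `seconds`."""
    return size_mb / seconds


def link_limit_mb_per_s(link_mbit_per_s):
    """Raw ceiling of a link in MB/s (8 bits per byte, no protocol overhead)."""
    return link_mbit_per_s / 8


observed = throughput_mb_per_s(1000, 90)   # 1 GB in 1.5 minutes
fast_ethernet = link_limit_mb_per_s(100)   # 100 Mbit/s
gigabit = link_limit_mb_per_s(1000)        # 1000 Mbit/s, for comparison only

print(f"observed: {observed:.1f} MB/s")                     # observed: 11.1 MB/s
print(f"Fast Ethernet ceiling: {fast_ethernet:.1f} MB/s")   # 12.5 MB/s
print(f"Gigabit Ethernet ceiling: {gigabit:.1f} MB/s")      # 125.0 MB/s
```

The observed rate sits just below the Fast Ethernet ceiling, which is why a 100 Mbit/s link looks like the limiting factor here.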
> There are various ways to maximize I/O with bit-torrent. Various seeders
> allow uploading each block only once (usually called super seeder mode).
> Assuming you have a few GB ram on the file server you could even prefetch
> the file before torrenting (i.e. dd if=file_to_server of=/dev/null) since
> the limit on bit-torrent bandwidth is often how quickly you can seek.
> Additionally you can make the chunk size larger to reduce the number of
> seeks. On the client side preallocation can greatly reduce the number of
> seeks.
More advantages of the ring topology: every node uploads every block exactly once, and no prefetching or seeks are required (if you replicate a whole partition or a single large file).

If you are interested in more details about the technology, such as models and performance measurements (somewhat old by now), check out the second paper in this list:
http://www.cs.inf.ethz.ch/cops/patagonia/#relmat

- Felix