That should work, but as I understand it there will be 4 threads running, and the bandwidth of the source server has to be shared between these threads. What I'm dreaming of is some kind of broadcasting or multicasting to 4 IP addresses to get maximum throughput. Maybe it's impossible, but it would be very efficient, wouldn't it?
Henk

On Sep 10, 2013, at 9:09 PM, James Burton <[email protected]> wrote:

> Henk,
>
> rsync will mostly do what you want it to do, but rsync doesn't support
> remote->remote copy.
>
> The way I do multi-node copies involves using a recursive copy algorithm that
> uses ssh to run rsync on the remote machines. On each pass, every node that
> has the source copies to a node that doesn't, which quickly copies the source
> to all the nodes in the list.
>
> Here is the pseudocode. Of course, rsync and ssh need to be set up correctly
> on all the nodes and you have to be sure you are using the right syntax for
> your application, but this is the basic idea of what to do.
>
> copyAll( nodes[] ):
>
>     # assume the source is at nodes[0]
>
>     len = nodes.length()
>
>     if len == 1: return
>
>     # copy the source to the node in the middle of the list
>     ssh user@nodes[0] "rsync -a /path/to/files user@nodes[len/2]:/path/to/files"
>
>     # partition the list and call recursively on separate threads
>
>     # this copies from nodes[0] -> nodes[len/4]
>     thread(copyAll(nodes[0:len/2]))
>
>     # this copies from nodes[len/2] -> nodes[3*len/4]
>     thread(copyAll(nodes[len/2:len]))
>
> Hope that helps.
>
> Jim
>
>
> On Tue, Sep 10, 2013 at 12:36 PM, Henk D. Schoneveld <[email protected]> wrote:
>
> On Sep 10, 2013, at 4:56 PM, James Burton <[email protected]> wrote:
>
> > Henk,
> >
> > I'm not sure what you are trying to do.
> >
> > Are you looking to copy data from one server to a series of servers?
> Yes
> > Is this a one-time copy for setup or will this be part of an ongoing system?
> It will be part of an ongoing system.
> >
> > Thanks,
> >
> > Jim
> >
> >
> > On Mon, Sep 9, 2013 at 4:25 PM, Henk D. Schoneveld <[email protected]> wrote:
> > Hi everybody,
> >
> > I'm thinking about installing 5 groups of 30 pvfs2 systems in a 100Mb/s
> > WAN. The reason for this setup is that if one group fails, the remaining
> > 4 groups will be able to serve the original amount of intended clients.
> > The I/O load would then be 5/4 of the original setup.
> >
> > All groups share one 5Gb/s connection to the internet.
> >
> > To get minimal data transferred from the server somewhere on the internet,
> > I'm thinking about the following scenario: copy a file at 30x100Mb/s = 3Gb/s
> > onto 1 group and then redistribute it in parallel to the remaining groups.
> >
> > Any ideas how to do this most efficiently? I know
> > tee < source > dest0 dest1 dest2 dest3 dest4 would do this, but it isn't
> > recursive and wildcards aren't accepted. rsync handles wildcards and
> > recursion, but how do I run it in parallel in a way that keeps the load on
> > the source group minimal?
> >
> > Suggestions very welcome
> >
> > Henk

_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
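For concreteness, here is a rough, untested Python rendering of the recursive fan-out Jim sketches above. It assumes password-less ssh from the machine running the script to every node and between the nodes themselves, rsync installed everywhere, and the data already present on the first node; the user name, host names and paths below are placeholders, not anything from the original mails.

#!/usr/bin/env python3
# Rough sketch of the recursive fan-out copy described in Jim's mail.
# Assumptions (not from the original mails): password-less ssh from this
# machine to every node and between the nodes themselves, rsync installed
# everywhere, and the data already present on nodes[0]. "user", the host
# names and the paths are placeholders.

import subprocess
import threading

USER = "user"
SRC = "/path/to/files/"   # trailing slash: sync the directory contents
DST = "/path/to/files/"

def copy_all(nodes):
    """On each pass the first node (which already has the data) pushes to
    the middle node, then both halves recurse in parallel threads."""
    n = len(nodes)
    if n <= 1:
        return

    mid = n // 2
    # Log in to nodes[0] and run rsync there, pushing to nodes[mid].
    subprocess.run(
        ["ssh", f"{USER}@{nodes[0]}",
         f"rsync -a {SRC} {USER}@{nodes[mid]}:{DST}"],
        check=True,
    )

    # Each half now starts with a node that holds the data; copy on in parallel.
    left = threading.Thread(target=copy_all, args=(nodes[:mid],))
    right = threading.Thread(target=copy_all, args=(nodes[mid:],))
    left.start()
    right.start()
    left.join()
    right.join()

if __name__ == "__main__":
    # Hypothetical head node of each of the 5 groups.
    copy_all(["group1-head", "group2-head", "group3-head",
              "group4-head", "group5-head"])

As far as I can tell from the pseudocode, the original source never uploads more than one stream at a time; it just uploads roughly log2(n) times in sequence, while the other transfers run over the other nodes' links, so all n nodes are reached after about log2(n) passes rather than n-1 sequential copies from the source.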
