On Sun, 28 Mar 2010, Dana Hudes wrote:

Why is rsync a problem? Where is the bottleneck in the protocol or the code 
implementing it?
Specifics!
SAR is antiquated and doesn't give the info you really need. Using a Linux
system? Use procallator and feed the resulting collected data to ORCA. Better
yet, use DTrace or at least truss.  Compile rsync with profiling code -- use
Sun Studio 12; it runs on Linux as well as Solaris and it's a free download.

Wow.  You kids and your new shiny toys...  Look, here's a nice little
specific example for you.  I run an rsync server that contains 8,700+ files
and directories.  Now, say I want to sync a mere thirty-two new files.
Making that request on my server causes the rsync daemon to stat the entire
hierarchy to the tune of 18,000+ fstats & lstats.  Per request.  Freaking ouch.
And that's a tolerable use-case in my mind for rsync.  That's a hell of a lot
of I/O generated for something that would take but a couple of stats to
retrieve via HTTP or FTP.  Assuming you knew what you needed already.
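
To make the comparison concrete, here is a minimal sketch, not a model of
rsync's actual implementation.  It only counts the explicit lstat calls needed
to examine every entry in a tree versus the calls needed when the wanted paths
are already known.  The tree layout and the names build_tree,
stats_for_full_walk, stats_for_known_paths and demo are all made up for
illustration.

```python
import os
import tempfile


def build_tree(root, dirs, files_per_dir):
    """Create a small throwaway hierarchy (hypothetical layout)."""
    for d in range(dirs):
        path = os.path.join(root, f"dir{d:03d}")
        os.mkdir(path)
        for f in range(files_per_dir):
            with open(os.path.join(path, f"file{f:03d}"), "w") as fh:
                fh.write("x")


def stats_for_full_walk(root):
    """Count explicit lstat calls when every entry has to be examined."""
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            os.lstat(os.path.join(dirpath, name))
            count += 1
    return count


def stats_for_known_paths(root, wanted):
    """Count lstat calls when the client already knows which paths it needs."""
    count = 0
    for rel in wanted:
        os.lstat(os.path.join(root, rel))
        count += 1
    return count


def demo():
    with tempfile.TemporaryDirectory() as root:
        build_tree(root, dirs=10, files_per_dir=20)
        wanted = [os.path.join("dir000", f"file{f:03d}") for f in range(3)]
        return stats_for_full_walk(root), stats_for_known_paths(root, wanted)


if __name__ == "__main__":
    walk, direct = demo()
    print(f"full walk: {walk} lstats, known paths: {direct} lstats")
```

The full walk grows with the size of the hierarchy, while the known-path
lookup grows only with the number of files requested.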

Now, when you add in a file set of sufficient size to exhaust filesystem
caching, plus a crap load of concurrent requests, my archaic SAR reports
written on stone tablets tend to say your I/O wait states start pushing the
load levels unacceptably high, not to mention the pages being thrashed from
memory's cache pool, high interrupts and excessive seeks on the drives, and
so on and so forth.  <sniff>  Cavemen are people, too.

Now, look at the size of CPAN with *hundreds* of thousands of files.  Can
you imagine that amount of I/O *per* request?!
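
As a rough back-of-the-envelope sketch: the figures above (18,000+ stats for
8,700+ entries) work out to about two stat calls per entry.  Assuming, purely
for illustration, that the cost scales linearly and that the tree holds
200,000 files (a placeholder for "hundreds of thousands", not a measured CPAN
figure), the per-request cost can be estimated like this.  The function name
is hypothetical.

```python
# Lower-bound figures quoted above for the small server.
OBSERVED_ENTRIES = 8_700
OBSERVED_STATS = 18_000


def estimated_stats(entries):
    """Scale the observed stats-per-entry ratio linearly (an assumption)."""
    return round(entries * OBSERVED_STATS / OBSERVED_ENTRIES)


if __name__ == "__main__":
    ratio = OBSERVED_STATS / OBSERVED_ENTRIES
    print(f"stats per entry: {ratio:.2f}")
    # 200_000 is an illustrative placeholder, not a real file count.
    print(f"estimated stats per request: {estimated_stats(200_000)}")
```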

From a network protocol perspective rsync is quite good. If your network
capacity is so large that it exceeds the bandwidth or IOPS of your disks, you
can probably afford better disks or a more efficient disk storage layout.
Are mirrors like nic.funet.fi running multiple gigabit WAN connections?  If so
they could surely demand more throughput than a bunch of SATA2 disks can
provide.

Without performance data it's a waste of time to argue against rsync.

And without having examined how rsync works on both ends, it should have been
just as much a waste of time to argue the merits of rsync.

        --Arthur Corliss
          Live Free or Die
