On 12/21/10 18:44, Randy Syring wrote:
> Did you do testing on the local machine only (i.e. do tests from one
> folder to another)? Doing comparisons on the older server and then the
> new server should show you if it's just the server to blame. If that
> proves solid, then I would look at network stuff. Maybe use a different
> NIC, etc. I know you said you did testing with --force that made it
> work faster and you therefore ruled out the network. But IMO, that
> isn't thorough enough, I want to remove the network completely from the
> equation. Your NIC could be having problems with something that only
> happens when doing the diff algorithm (or something else weird).

Hi Randy,

Ahem. I guess sometimes one should indeed go back to basics and ask
these questions. As it turns out, a local-to-local test shows no
problem: the new server outperforms the old server in both tests by
some 30-40%. Shame on us for not testing that, although the test was
done with a tiny directory, so the results may not be representative.

I ran other tests immediately: I wanted to find out whether an rsync of
the network data to local storage, followed by an rdiff-backup of that
local data, runs *faster* than an rdiff-backup of the network data
directly. If so, that could be a suitable workaround, and it would also
point to the problem. I'm still waiting for the results, but here is the
obvious reason we disregarded the network as a possible cause of the
slowdowns: the rsync of 125 GB of network data to local storage took no
more than 2 minutes 40 seconds (a sync to pre-existing data, of
course!). It is therefore understandable that we said to ourselves
"Okay, there's certainly no bottleneck there...".

I have the figures of these tests now.

First test ("data" consists of 125 GB worth of files):

  rsync of data over the network
  (target dir already populated):                      2m40.494s
  rdiff-backup of that local dir
  to a fresh _empty_ repo:                            58m17.936s
  rdiff-backup of that same dir
  to a preexisting/populated repo:                     2m25.840s

And, to be able to compare apples to apples:

  rsync copy of local src dir
  to an empty local dst dir:                          59m42.792s

So, I still have to reach hard conclusions, but some things are obvious:
rdiff-backup on local resources performs well. It is on par with rsync
for unpopulated target dirs, and very, very fast for existing repos.

A combination of running rsync in step #1 and rdiff-backup in step #2
would therefore get the job done in around 5 minutes instead of multiple
hours. Strange, but a result we can probably live with. We have more
than enough storage space (12 TB) to justify storing this data twice.
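
To make the two-step workaround concrete, here is a minimal sketch in
Python. It only builds the two command lines and adds up the two
steady-state timings quoted above; it does not run anything. The host
name, the paths and the rsync options (-a, --delete) are assumptions
for illustration, since the message does not say which options were
actually used.

```python
import re
import shlex

# Timings copied from the figures above (shell `time` output format).
MEASURED = {
    "rsync_network_to_populated_local": "2m40.494s",
    "rdiff_local_to_populated_repo": "2m25.840s",
    "rdiff_local_to_empty_repo": "58m17.936s",
    "rsync_local_to_empty_local": "59m42.792s",
}


def parse_duration(text):
    """Convert a `time`-style string such as '2m40.494s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def two_step_commands(remote_src, staging_dir, repo_dir):
    """Build the commands for step #1 (rsync) and step #2 (rdiff-backup).

    The rsync flags are an assumption: -a preserves metadata, --delete
    removes files from the staging copy that vanished from the source.
    The trailing slash makes rsync copy the directory's contents.
    """
    rsync_cmd = ["rsync", "-a", "--delete",
                 remote_src.rstrip("/") + "/", staging_dir]
    rdiff_cmd = ["rdiff-backup", staging_dir, repo_dir]
    return [rsync_cmd, rdiff_cmd]


def combined_seconds():
    """Sum of the two measured steady-state steps (populated targets)."""
    return (parse_duration(MEASURED["rsync_network_to_populated_local"])
            + parse_duration(MEASURED["rdiff_local_to_populated_repo"]))


if __name__ == "__main__":
    # Hypothetical host and paths, for illustration only.
    for cmd in two_step_commands("storage:/export/data",
                                 "/srv/staging/data",
                                 "/srv/rdiff/data"):
        print(" ".join(shlex.quote(part) for part in cmd))
    total = combined_seconds()
    print(f"steady-state total: {int(total // 60)}m{total % 60:.3f}s")
```

To actually run the steps, each command list could be passed to
subprocess.run(cmd, check=True), so that rdiff-backup only starts when
the rsync step has succeeded.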

> Maybe this isn't the right track at all, or maybe you have even tried
> this already, but it's an idea anyway. :)

As it turns out, it was a very, very good idea!

Thanks,
Maarten

> --------------------------------------
> Randy Syring
> Intelicom
> Direct: 502-276-0459
> Office: 502-212-9913
>
> For the wages of sin is death, but the
> free gift of God is eternal life in
> Christ Jesus our Lord (Rom 6:23)
>
>
> On 12/21/2010 12:25 PM, Maarten J H van den Berg wrote:
>> Hi there,
>>
>> I'm looking for help with a very severe performance problem using
>> rdiff-backup. Basically we've bought a new server with more and faster
>> resources to replace a 4-year-old one. However, rdiff-backup refuses to
>> perform on the new server.
>>
>> Various tests show that generic disk accesses are much faster on the new
>> server, and it has more memory and faster CPUs.
>> Nevertheless we see a very severe slowdown when rdiff-backup is making
>> an incremental backup: anywhere from four to ten or more times slower
>> than it used to be on the old server.
>>
>> I gathered some numbers but they differ wildly depending on the source
>> material / dir. Maybe it is therefore better to leave specific numbers
>> for what they are for now and focus on the big picture: our old server
>> did an rdiff-backup of a remote storage server, worth some 300 GB, in
>> typically under an hour or so. The new server running with the same
>> source dataset typically starts at night and is still running the next
>> morning, into the afternoon even(!).
>>
>> When we do trials on a tiny subset of the data we get varying results.
>> Some data takes eight times as long, some is within a +80%
>> margin. So that is not very dependable, alas.
>> Still, what is observable is that any initial backup run (with --force)
>> runs significantly faster on the new server. Any differential run
>> afterwards is slower than on the original server. I feel this proves
>> there are no performance bottlenecks in the network, disks, filesystems
>> etc. of the server.
>>
>> This is fully repeatable, and a real-time tail on the log file shows no
>> one file is to blame; it is just the overall speed that's slow.
>>
>> The new server runs rdiff-backup 1.2.8, the old one 1.0.5. Downgrading
>> the new server to 1.0.5 makes things a bit interesting: that speeds it
>> up a bit, but it is still a fair bit slower than the original.
>>
>> During investigation we experimented with different filesystems, testing
>> local versus remote backups, looking at compile flags and versions of
>> librsync and python, but we have had no success there.
>> All versions use librsync 0.9.7. All OSes are Gentoo, 32 bit.
>>
>> We did search for workarounds like spawning multiple parallel
>> rdiff-backup processes, each dealing with separate directories, so as to
>> fully use the eight CPU cores. Sadly even that speedup still does not
>> result in an acceptable overall speed. We compared compilation flags,
>> options and parameters but nothing obvious struck us in that regard.
>>
>>
>> I'm basically out of ideas. I was tearing my hair out over this until a
>> couple of days ago (yes, well, I'm bald now). I turn to this list as a
>> last resort. Can anyone help debug this strange problem, please?
>>
>> Regards, thanks for listening,
>>
>> Maarten

-- 
Maarten J H van den Berg
Kratz business solutions bv

Some people, when confronted with a problem, think "I know, I'll use
regular expressions." Now they have two problems.
    -- Jamie Zawinski

_______________________________________________
rdiff-backup-users mailing list at rdiff-backup-users@nongnu.org
http://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki