The real question is why your backups and your nightly runs are taking so long to complete.
One reason might be network bandwidth, in which case you're probably stuck. I'm gonna guess this isn't the problem, though; if you're on a LAN, it almost certainly isn't. It's easy to measure.

Another reason might be CPU on the backup server. This is easy enough to monitor, and I kind of doubt it's your problem either.

This leaves us with resource constraints on the backup clients. That's possible on busy file servers, but it doesn't explain why the nightly jobs take so long.

My bet is that your problem is disk I/O on the backup server. BackupPC's pooling scheme (very clever, a neat hack, and I love it) has a problem: all those hard links all over the place mean that your disk heads need to seek a lot. This is a real bummer and really slows down your disk I/O, especially with ATA (including most SATA) disks.

All parallel ATA and most serial ATA disks can only handle a single request at a time: they service a request, then get another and service it. This means that each and every request, even for a tiny block of data, has to wait for the disk to rotate into position (probably half a rotation on average?). This shouldn't make much of a difference during the actual data transfer stage of your backups, but it will make your BackupPC_link and BackupPC_nightly processes take forever.

SCSI disks with tagged command queuing (most of them, I think) and SATA disks that support Native Command Queuing (definitely not most of them, and your OS and maybe your SATA card need to support it too) can potentially service many more requests, because the disk can service the requests out of order. Seeks happen much faster than a single rotation of a disk, especially on 7200rpm (and 5400rpm, egads) disks.

Another possibility for slow I/O is RAID5. Writes to RAID5 are significantly slower than writes to RAID1 (or simple non-RAID disks). RAID10 is better than RAID5 but is probably not optimal either.
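To put rough numbers on the half-rotation estimate above, here is a minimal Python sketch. It assumes requests are serviced strictly one at a time and that each one waits half a rotation on average; it ignores seek time, caching, and command queuing entirely. The function names and the 5 million request count are made up for illustration and are not measurements from any real BackupPC pool.

```python
# Back-of-the-envelope estimate, not a benchmark: rotational latency only.
# Seek time, caching, and command queuing are deliberately ignored.


def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the sector to come around: half a rotation, in ms."""
    ms_per_rotation = 60_000.0 / rpm
    return ms_per_rotation / 2.0


def hours_for_requests(num_requests: int, rpm: float) -> float:
    """Hours spent waiting on rotation if requests are serviced one at a time."""
    total_ms = num_requests * avg_rotational_latency_ms(rpm)
    return total_ms / 3_600_000.0


if __name__ == "__main__":
    # 5 million requests is a hypothetical figure, purely for illustration.
    for rpm in (5400, 7200, 10000, 15000):
        print(
            f"{rpm:>5} rpm: {avg_rotational_latency_ms(rpm):.2f} ms avg wait, "
            f"{hours_for_requests(5_000_000, rpm):.1f} h for 5M requests"
        )
```

At 7200rpm that works out to about 4.2 ms per request, so a few million tiny requests add up to hours of waiting before any seek time is counted.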
For this kind of workload, you probably do not want RAID10; rather, you want multiple RAID1 devices concatenated together (what Linux md calls LINEAR and Solaris DiskSuite calls CONCAT). This allows the disk pairs to seek independently of one another. RAID10 is designed for high raw throughput, not to optimize seeks.

With all this in mind, if you think this is causing your problems, there are a few ways to improve your situation.

1) The most obvious is to get disks that support command queuing.

2) The next most obvious is to get rid of RAID5 or RAID10 and move to concatenated RAID1s.

   * Getting rid of RAID5 might be more bang for the buck than getting command queuing.

3) Even if you don't need the extra space, you may find that things improve just by adding extra RAID1 devices to your concatenated disk. This will spread the seeks out a bit. Ideally, ensure that all new stuff gets written to the new disk by doing something like this:

   - Fill the existing disk with empty files
   - Add the new disk at the end of your device
   - Grow your filesystem to match the size of your device

Of course, during this process, pray that it all works and that you don't whack your backup store. Back it up first if you can. Hint: standard backup tools like tar, rsync, or dump won't work worth a damn on the BackupPC store, for the same kind of reasons that I discuss above, but you can just dd the raw device to a backup disk or tape.

On Wed, Jan 31, 2007 at 04:55:12PM -0600, Jason Hughes wrote:
> James Ward wrote:
> > it looks like they're going to all get started at the same time again
> > due to waiting on the nightly process to complete after the longest
> > of these backups.
> >
> > Does version 3 get me away from this scenario?
> Yes. Version 3 doesn't need nightly processing to be mutually exclusive
> to backups. They should fire off whenever they're due to start.
> However, if you having trouble with one taking more than 24 hours, and
> the period between backups is less than this, it will pretty much always
> be backing up those machines and falling further behind. Since you
> mention having multiple backup servers, perhaps putting the largest file
> server hosts onto different backuppc servers would help?
>
> > And on my other
> > server that's backing up 200 machines (some remote), will it be able
> > to just backup 24x7 with version 3? Right now it spends most of
> > every day from the wee hours until the afternoon doing the nightly
> > cleanup.
> >
> Again, this should be alleviated in version 3. Even if the processing
> is still lengthy, it should not bunch up your backups anymore, so
> theoretically, the same server has greater capacity in version 3 than in
> version 2.
>
> JH
>
> -------------------------------------------------------------------------
> Using Tomcat but need to do more? Need to support web services, security?
> Get stuff done quickly with pre-integrated technology to make your job easier.
> Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
> _______________________________________________
> BackupPC-users mailing list
> BackupPC-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/backuppc-users
> http://backuppc.sourceforge.net/

danno
--
Dan Pritts, System Administrator
Internet2
office: +1-734-352-4953 | mobile: +1-734-834-7224