The real question is why your backups and your nightly runs are taking
so long to complete.

One reason might be network bandwidth, in which case you're probably
stuck.  I'm going to guess this isn't the problem, though; if you're on
a LAN it almost certainly isn't.  It's easy to measure.
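
For what it's worth, one quick way to measure it, assuming iperf is
installed on both ends (the hostname below is made up):

  # on the backup server
  iperf -s

  # on a backup client; reports achievable throughput to the server
  iperf -c backup-server.example.com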

Another reason might be CPU on the backup server.  This is easy enough to
monitor.  I kind of doubt this is your problem either.
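
If you want numbers, just watch it while a backup or nightly run is in
progress; something like this (standard procps tools, Linux assumed):

  # sample CPU usage every 5 seconds; look at the us/sy/id columns
  vmstat 5

  # or watch per-process usage, looking for BackupPC_dump / BackupPC_nightly
  top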

This leaves resource constraints on the backup clients - possible on
busy file servers, but it doesn't explain why the nightly jobs take so
long.

My bet is that your problem is disk I/O on the backup server.  
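
This is easy enough to confirm, too; on Linux with the sysstat package,
something like this while a nightly run or link phase is going:

  # extended per-device stats every 5 seconds; %util pegged near 100
  # with small average request sizes (avgrq-sz) means you're seek-bound
  iostat -x 5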

BackupPC's pooling scheme (very clever, a neat hack, and I love it) has
a downside: all those hard links scattered across the pool mean your
disk heads need to seek a lot.  That's a real bummer and really slows
down your disk I/O, especially with ATA (including most SATA) disks.

All parallel ATA and most serial ATA disks can only handle a single
request at a time: they service one request, then accept the next and
service that.  So each and every request, even for a tiny block of
data, pays the rotational latency of the disk (probably half a rotation
on average) before anything else can happen.

This shouldn't make much of a difference during the actual data
transfer stage of your backups, but it will make your BackupPC_link and
BackupPC_nightly processes take forever.

SCSI disks with tagged command queuing (most of them, I think) and SATA
disks that support Native Command Queuing (definitely not most of them,
and your OS and maybe your SATA controller need to support it too) can
service many more requests, because the drive can reorder the
outstanding requests and service them out of order.
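
On Linux you can usually tell whether NCQ is actually in play; a rough
check, with /dev/sda standing in for one of your pool disks:

  # does the drive advertise command queueing at all?
  hdparm -I /dev/sda | grep -i queue

  # is the kernel actually driving it with a queue depth greater than 1?
  cat /sys/block/sda/device/queue_depth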

Seeks happen much faster than a single rotation of the disk, especially
on 7200rpm (and 5400rpm, egads) disks, so a drive that can reorder its
queue gets a lot more small requests done per rotation.

Another possibility for slow I/O is RAID5.  Writes to RAID5 are
significantly slower than writes to RAID1 (or plain un-RAIDed disks),
because every small write turns into a read-modify-write cycle to keep
the parity current.

RAID10 is better than RAID5 but is probably not optimal either.  For
this kind of workload you don't really want RAID10; you want multiple
RAID1 devices concatenated together (what Linux md calls LINEAR and
Solaris DiskSuite calls a concat).  That lets the disk pairs seek
independently of one another.  RAID10 is designed for high raw
throughput, not to optimize seeks.
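
To make that concrete, a rough sketch with Linux md (device names are
invented; check the mdadm man page before doing this near real data):

  # build two independent mirror pairs
  mdadm --create /dev/md1 --level=raid1 --raid-devices=2 /dev/sda1 /dev/sdb1
  mdadm --create /dev/md2 --level=raid1 --raid-devices=2 /dev/sdc1 /dev/sdd1

  # concatenate them with LINEAR and put the pool filesystem on top
  mdadm --create /dev/md0 --level=linear --raid-devices=2 /dev/md1 /dev/md2
  mkfs.ext3 /dev/md0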


With all this in mind, if you think this is causing your problems,
there are a few ways to improve your situation.

1) The most obvious is to get disks that support command queuing.

2) The next most obvious is to get rid of RAID5 or RAID10 and move
to concatenated RAID1s.

 * Getting rid of RAID5 might be more bang for the buck than getting 
   command queuing.

3) Even if you don't need the extra space, you may find that things
improve just by adding extra RAID1 devices to your concatenated disk.
This will spread the seeks out a bit.  Ideally, ensure that all new
stuff gets written to the new disk by doing something like this (a
rough command-level sketch follows the list):

 - Fill the existing disk with empty files
 - add the new disk at the end of your device
 - grow your filesystem to match the size of your device.
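
Roughly, the last two steps with Linux md and ext3 look like this
(device names and mount point are made up, and it's worth double-checking
the grow syntax against your mdadm version first):

  # build the new mirror pair and append it to the LINEAR array
  mdadm --create /dev/md3 --level=raid1 --raid-devices=2 /dev/sde1 /dev/sdf1
  mdadm --grow /dev/md0 --add /dev/md3

  # grow the filesystem into the new space
  umount /var/lib/backuppc
  e2fsck -f /dev/md0
  resize2fs /dev/md0
  mount /var/lib/backuppc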

Of course, during this process, pray that it all works and that you
don't whack your backup store.  Back it up first if you can.  Hint:
standard backup tools like tar or rsync or dump won't work worth a damn
on the BackupPC store, for the same kinds of reasons I discuss above,
but you can just dd the raw device to a backup disk or tape.
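
That is, something like this, with the pool filesystem unmounted and
/dev/sdz standing in for a spare disk:

  # raw image of the whole pool device onto a scratch disk
  dd if=/dev/md0 of=/dev/sdz bs=1M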






On Wed, Jan 31, 2007 at 04:55:12PM -0600, Jason Hughes wrote:
> James Ward wrote:
> > it looks like they're going to all get started at the same time again  
> > due to waiting on the nightly process to complete after the longest  
> > of these backups.
> >
> > Does version 3 get me away from this scenario?  
> Yes.  Version 3 doesn't need nightly processing to be mutually exclusive 
> to backups.  They should fire off whenever they're due to start.  
> However, if you're having trouble with one taking more than 24 hours, and 
> the period between backups is less than this, it will pretty much always 
> be backing up those machines and falling further behind.  Since you 
> mention having multiple backup servers, perhaps putting the largest file 
> server hosts onto different backuppc servers would help?
> > And on my other  
> > server that's backing up 200 machines (some remote), will it be able  
> > to just backup 24x7 with version 3?  Right now it spends most of  
> > every day from the wee hours until the afternoon doing the nightly  
> > cleanup.
> >   
> Again, this should be alleviated in version 3.  Even if the processing 
> is still lengthy, it should not bunch up your backups anymore, so 
> theoretically, the same server has greater capacity in version 3 than in 
> version 2.
> 
> JH
> 


danno
--
Dan Pritts, System Administrator
Internet2
office: +1-734-352-4953 | mobile: +1-734-834-7224

