This was originally part of: Subject: Re: [BackupPC-users] FS and backuppc performance In-Reply-To: <49c29c96.4030...@gmail.com>
I am starting a new thread on this rather than hijacking the original thread.

On Thu, Mar 19, 2009 at 02:27:18PM -0500, Les Mikesell wrote:
> Carl Wilhelm Soderstrom wrote:
> >
> > Backuppc will use all the processor, ram, and disk speed you give it. I've
> > not had a box where they weren't all pegged. I tend to limit concurrent
> > backups to 2; maybe 3 or 4 on a really high-end box (multiple processors
> > and a proven fast disk array); to control disk-head thrashing.
>
> One thing I think is missing from backuppc that amanda has had for years
> is a concept of grouping (or excluding...) by network connectivity. I
> have a mix of local and remote targets and would like to be able to
> control concurrency to permit 1 or 2 local backups plus separate limits
> for each independent WAN path.

I would like this too. I currently use semaphore to create a set of
available slots, and lock a slot for the duration of the backup using
pre/post dump commands.

Most of our hosts are named after the site they are at:

  box1.site1.example.com
  box1.site2.example.com
  box2.site3.example.com

etc. With semaphore I create one resource pool for each remote site,
sized by how many parallel backups I am willing to allow from that site:

  Semaphore site1 has 20 resources.
  Resource 0 is available.
  Resource 1 is available.
  Resource 2 is available.
  Resource 3 is available.
  Resource 4 is available.
  Resource 5 is available.
  Resource 6 is available.
  Resource 7 is available.
  Resource 8 is available.
  Resource 9 is available.
  Resource 10 is available.
  Resource 11 is available.
  Resource 12 is available.
  Resource 13 is available.
  Resource 14 is available.
  Resource 15 is available.
  Resource 16 is available.
  Resource 17 is available.
  Resource 18 is available.
  Resource 19 is available.
  Semaphore site2 has 2 resources.
  Resource 0 is available.
  Resource 1 is taken by PID 29224.
  Semaphore site3 has 2 resources.
  Resource 0 is available.
  Resource 1 is available.
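
To make the host-to-pool mapping above concrete, here is a minimal sketch in
Python (not the ksh semaphore tool itself). It assumes the site is always the
second label of the hostname, as in the examples above; the function names
site_for_host and pool_for_host are hypothetical, and the pool sizes are
copied from the status listing.

```python
# Pool sizes as shown in the semaphore status listing above.
POOL_SIZES = {"site1": 20, "site2": 2, "site3": 2}


def site_for_host(hostname):
    """Return the site label, assumed to be the second DNS label.

    e.g. 'box1.site1.example.com' -> 'site1'
    """
    labels = hostname.split(".")
    if len(labels) < 2 or not labels[1]:
        raise ValueError("hostname has no site label: %r" % hostname)
    return labels[1]


def pool_for_host(hostname):
    """Return (site, pool_size); pool_size is None if the site has no pool."""
    site = site_for_host(hostname)
    return site, POOL_SIZES.get(site)


if __name__ == "__main__":
    for host in ("box1.site1.example.com", "box2.site3.example.com"):
        print(host, pool_for_host(host))
```

A host whose site has no pool comes back with a size of None, so the caller
can decide whether to run it unthrottled or refuse it.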
Using some home-written scripts (runUserCmds, CheckQueue), I set:

  $Conf{DumpPreUserCmd} = '/etc/BackupPC/bin/runUserCmds -t $type \
       -c $client -H $host -P $cmdType CheckQueue';
  $Conf{DumpPostUserCmd} = '/etc/BackupPC/bin/runUserCmds -t $type \
       -c $client -H $host -P $cmdType CheckQueue';

which locks one of the available semaphores if it's a PreUserCmd and
unlocks it if it's a PostUserCmd. If it can't lock a semaphore, it exits
with exit code 1, and because:

  $Conf{UserCmdCheckStatus} = 1;

is enabled in the config, the host is skipped for that cycle.

So it is doable in BackupPC without any core changes, and the upside of
this is that you can group by factors other than remote site. The
downside is that the log file shows:

  2009-03-02 08:50:07 DumpPreUserCmd returned error status 256... exiting

every time the host is scheduled to be backed up but is unable to
reserve a slot.

Also, backups can fail when they are starved for resources. For example,
one thing I have to watch is bandwidth usage. My plan for handling that
is to allocate bandwidth in 64KB/s (512Kb/s) chunks, and use the
CheckQueue script to determine what the bw limit is for the given host
(by scanning /etc/BackupPC/pc/hostname.pl or config.pl). Then I just
reserve the number of chunks needed to cover that bandwidth.

So if I have a site that is bw limited to 2Mb/s (approx 4 chunks), I will
allocate 4 resources in the pool for the site. If one of the hosts
(one_mb) at that site has a bwlimit of 1Mb/s, then it won't run unless
there are at least 2 free resources. So it can only start while no more
than 2 512Kb/s hosts are running.

Semaphore does support fair queuing, where nothing queued after one_mb
will run until one_mb has run. This guarantees that one_mb will get run
at some point. However, this doesn't work with BackupPC's queuing
mechanism. With

  $Conf{MaxBackups} = 8;

to keep the load on the system reasonable, any backup that has been
started and is queued waiting for a resource uses one of these 8 slots.
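
Returning to the bandwidth accounting described above, the chunk arithmetic
can be sketched as follows. This is a Python illustration under stated
assumptions, not the CheckQueue script: SitePool is an in-memory stand-in for
one semaphore pool, and the names chunks_needed, pre_dump and post_dump are
hypothetical. The return values mirror the exit codes described above (0 when
a reservation succeeds, 1 when the host should be skipped for this cycle).

```python
import math

CHUNK_KBPS = 512  # one pool resource = 512 Kb/s (64 KB/s), as in the text


def chunks_needed(bwlimit_kbps):
    """Number of pool resources a host must reserve for its bandwidth limit."""
    return max(1, math.ceil(bwlimit_kbps / CHUNK_KBPS))


class SitePool:
    """In-memory stand-in for one per-site semaphore pool (illustrative only)."""

    def __init__(self, total):
        self.total = total
        self.held = {}  # host -> number of chunks currently reserved

    def free(self):
        return self.total - sum(self.held.values())

    def pre_dump(self, host, bwlimit_kbps):
        """Reserve chunks for host; return 0 on success, 1 if it must be skipped."""
        need = chunks_needed(bwlimit_kbps)
        if host in self.held or need > self.free():
            return 1
        self.held[host] = need
        return 0

    def post_dump(self, host):
        """Release whatever the host reserved; always returns 0."""
        self.held.pop(host, None)
        return 0


if __name__ == "__main__":
    pool = SitePool(total=2048 // CHUNK_KBPS)  # 2 Mb/s site -> 4 chunks
    print(pool.pre_dump("one_mb", 1024), pool.free())
```

With a 4-chunk pool, one_mb takes 2 chunks, which leaves room for two
512Kb/s hosts and no more; in the other direction, one_mb is refused whenever
three such hosts are already running.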
So I could have 7 jobs waiting on a resource for site2 while backups for
site1 and site3 have plenty of resources available. The only way I can
see around this is to set:

  $Conf{MaxBackups} = 10000;

or some such number, and have an additional queue:

  Semaphore actual_running_backups has 8 resources.
  Resource 0 is available.
  Resource 1 is taken by PID 29224.
  Resource 2 is available.
  Resource 3 is available.
  Resource 4 is available.
  Resource 5 is available.
  Resource 6 is available.
  Resource 7 is available.

So basically BackupPC will queue a host that needs a backup, and the
control of how many are actually running is entirely external to
BackupPC. I haven't tried this yet, but I think it will work.

(BTW, semaphore is a ksh implementation of semaphores written by John
Spurgeon under the GPL. I have made a fixed copy available at:

  http://www.cs.umb.edu/~rouilj/shell_semaphore

as the magazine it was originally published in no longer has it
available.)

--
				-- rouilj

John Rouillard
System Administrator
Renesys Corporation
603-244-9084 (cell)
603-643-9300 x 111

_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/