Hello,

27.02.2010 19:08, Stan Meier wrote:
> * Arno Lehmann <a...@its-lehmann.de>:
>> 27.02.2010 14:46, Stan Meier wrote:
>>> 1. Keeping configuration sane: With more than 120 servers, we need to
>>> find a way to keep the configuration files readable. Our servers all
>>> follow some naming scheme; for example, we have "appserver01" through
>>> "appserver08" or "webcache01" through "webcache04". We think we should
>>> split client configurations for each server group, so the file
>>> "clientdefs/appserver.conf" would define all appserver0X clients.
>>> Furthermore, most of those servers will need a default job performed
>>> (/etc, /root, /opt and so on). While it's easy to reuse a "JobDefs"
>>> stanza to actually define all those jobs, isn't there any way to
>>> "group" those servers? Do we really have to define more than 120 jobs,
>>> one for each server?
>>
>> The way to go, in my opinion, is to create the actual configuration
>> dynamically - you can include script output into the configuration
>> *anywhere*. So use a script that generates the configuration from a
>> template into which you insert the client name.
>
> While you are right and creating a configuration based on scripts is
> quite easy (and has added benefits, for example that you can define
> one file pool per server group), we still have to deal with 120 backup
> jobs.
>
> But since you didn't jump on that part of my question, I presume
> there is no solution to that?
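To make the script-generation idea concrete, here is a minimal sketch of such a generator. Everything in it is illustrative: the host names, the "-fd" suffix, the "DefaultJob" JobDefs, the addresses and the password handling all need to be adapted to your site.

```python
#!/usr/bin/env python3
# Sketch: emit one Client and one Job stanza per host, so the director
# configuration can include the output of this script.  All resource
# names below are illustrative assumptions, not fixed Bacula names.

TEMPLATE = """\
Client {{
  Name = {host}-fd
  Address = {host}.example.com
  Password = "changeme"          # use per-client passwords in practice
}}

Job {{
  Name = backup-{host}
  Client = {host}-fd
  JobDefs = DefaultJob           # shared FileSet, Schedule, Pool, ...
}}

"""

def generate(hosts):
    """Return Bacula Client/Job stanzas for every host name given."""
    return "".join(TEMPLATE.format(host=h) for h in hosts)

if __name__ == "__main__":
    hosts = ["appserver%02d" % i for i in range(1, 9)]
    hosts += ["webcache%02d" % i for i in range(1, 5)]
    print(generate(hosts))
```

The director configuration can then pull the output in with Bacula's program-inclusion syntax, e.g. `@|"/etc/bacula/gen-clients.py"` (path illustrative), so the generated stanzas never have to be maintained by hand.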
In my opinion, there's no solution because there's no problem :-) 120
jobs - plus, in the worst case, 120 copy jobs - are no problem for
Bacula. You'll need to tweak concurrency, priorities, and all the (new)
directives managing concurrency of several instances of a single job a
bit, but basically, scheduling all your jobs at the same time, with the
same priority, and letting them run should work well. For the copy
jobs, I'd recommend (as others already did) using a different priority.
I'd also recommend using an SQL query as the selection scheme.

>>> 2. Backup availability: One plan would be to use a large part of the
>>> 24TB available as a FilePool (or several). Each job would then write
>>> its data to that pool. A Copy job could copy the data to tape later
>>> on - with the advantage that restores of recent data would be quite
>>> fast since they would still be sitting on disk. Before running the
>>> backup the next day, we would simply recycle those file volumes. Is
>>> that a reasonable strategy?
>>
>> Yes. Properly set up, that's a very reasonable approach. You'll need
>> to understand retention times and how to select jobs for migration in
>> detail.
>
> I see several things here which we will have to look at. Please
> correct me if I'm wrong or if I forgot anything:
>
> 1. Concurrency: We will need to investigate all the different places
> in Bacula where job concurrency, concurrent pool/storage usage and
> connection limits are defined and adjust them to "fit together" as
> well as optimize them for the I/O limits of our RAID storage.

Yes. This is probably easier than it looks right now, because you will
find that having one job per client, and as many jobs per storage
daemon as possible, will serve you best. So you only have to find out
what your SD can manage reasonably.

> 2. Scheduling: Ideally, the copy job would start as soon as all the
> backup jobs have finished. But since the Schedule resource does not
> allow references to Job names, we are pretty much screwed in that
> department and will probably have to resort to a fixed schedule.

Schedule it, for example, an hour after the backup jobs, and use a
different priority.

> 3. Job selection: You already pointed that one out. Would it be as
> easy as just selecting all uncopied jobs from a given pool?

As mentioned above, I prefer to use a hand-tailored SQL query. For
example, in my office, I'm doing monthly copies of the latest full
backups to tape. I select all successfully finished full backups that
were started less than four weeks ago. The regular full backups are
scheduled to happen during the first week of each month, and the copy
job is scheduled to run during the second week of the month. Works like
a charm, but I would refine things a bit if I had to manage more than
my 10-20 jobs.

> Since all
> volumes in the pool are recycled before backup starts (or while it is
> running), naturally, the only uncopied jobs would be those that were
> written to disk recently.

Sounds reasonable. Depending on the disk space available to you, you
might find that keeping as many jobs on disk as possible is more
convenient, and then you couldn't rely on the above assumption anymore.

> So, for now, our backup plan would be something like:
>
> 1. Start actual backup
> 1.1 Start backup jobs in parallel
> 1.2 (Possibly erase all used volumes in a given pool)
> 1.3 Get data from clients, write to File pools
> 1.4 Wait until all backup operations have finished
> 2. Copy all recent jobs to tape
> 3. Backup catalog
>
> I don't see how we can synchronize more than 120 backup jobs yet, to
> be honest. We could run backup at 9pm and the copy job at 10am to
> allow for a large margin of error, but that just doesn't feel like a
> proper solution.

Priorities... or, and that's probably even simpler, just start copy
jobs throughout the daytime (reserving nights for backups).
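For the archives, the pieces discussed here might fit together roughly like this in bacula-dir.conf. This is only a sketch: all resource names and numbers are illustrative, a Copy job still needs the usual (dummy) Client/FileSet/Storage settings that are omitted here, and the SQL shown assumes a MySQL catalog - adapt it for PostgreSQL.

```
# bacula-dir.conf sketch -- names and values are illustrative

Director {
  Name = backup-dir
  Maximum Concurrent Jobs = 40   # let many backup jobs run at once
  # ... remaining standard directives omitted ...
}

Storage {
  Name = FileStorage
  Maximum Concurrent Jobs = 40   # match what the SD's disks can sustain
  # ...
}

Pool {
  Name = FilePool                # nightly backups land here ...
  Pool Type = Backup
  Next Pool = TapePool           # ... and Copy jobs move them to tape
  Volume Retention = 2 days      # short retention so volumes recycle daily
  Recycle = yes
  AutoPrune = yes
}

JobDefs {
  Name = DefaultJob
  Type = Backup
  Pool = FilePool
  Priority = 10                  # backups run at the higher priority ...
}

Job {
  Name = copy-to-tape
  Type = Copy
  Pool = FilePool                # source pool; "Next Pool" picks the target
  Priority = 20                  # ... copies at the lower one
  # Simple variant: copy everything not yet copied out of the pool:
  Selection Type = PoolUncopiedJobs
  # Hand-tailored variant, e.g. "successful full backups started in the
  # last four weeks" (MySQL syntax):
  # Selection Type = SQLQuery
  # Selection Pattern = "SELECT JobId FROM Job WHERE Type='B' AND Level='F' AND JobStatus='T' AND StartTime > NOW() - INTERVAL 28 DAY"
}
```

With the priorities split this way, a copy job queued during the backup window simply waits until the higher-priority backups are done.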
If there are no jobs to copy, nothing happens. As soon as uncopied
jobs show up, they will be copied to tape.

However, if you want to start a copy as soon as possible after a job
has finished, I'd recommend doing this with a run script in each job.

Arno

> Stan
>
> ------------------------------------------------------------------------------
> Download Intel® Parallel Studio Eval
> Try the new software tools for yourself. Speed compiling, find bugs
> proactively, and fine-tune applications for parallel performance.
> See why Intel Parallel Studio got high marks during beta.
> http://p.sf.net/sfu/intel-sw-dev
> _______________________________________________
> Bacula-users mailing list
> Bacula-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-users

-- 
Arno Lehmann
IT-Service Lehmann
Sandstr. 6, 49080 Osnabrück
www.its-lehmann.de
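PS, for the archives: a run-after hook of the kind described above could look roughly like this. A sketch only - the job names and the bconsole invocation are illustrative and assume a working console configuration on the director host.

```
Job {
  Name = backup-appserver01
  JobDefs = DefaultJob
  # Kick off the copy job as soon as this backup finishes successfully.
  RunScript {
    RunsWhen = After
    RunsOnClient = No            # execute on the director, not the client
    RunsOnFailure = No
    Command = "/bin/sh -c 'echo \"run job=copy-to-tape yes\" | bconsole'"
  }
}
```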