On 30 Sep 2007 at 11:44, Ralf Gross wrote:

> Kern Sibbald wrote:
> > On Sunday 30 September 2007 11:05, Ralf Gross wrote:
> > > sorry if this may be more on topic on the users list, but I'd like
> > > to hear some developer opinions before I possibly create a feature
> > > request.
> > >
> > > I'm using spooling for all jobs. Now we are beginning to back up
> > > very large amounts of data (some TB) in a single job (I'm still
> > > looking for a good way to reduce the job size, but this seems not
> > > to be easy).
> > >
> > > These jobs will run for >24h, with spooling enabled it will even
> > > take longer. The current spooling implementation is good if many
> > > jobs are running concurrently, but I've only few jobs running in
> > > parallel, most of the time just one job.
> > >
> > > In this case it would help if spooling and despooling to tape
> > > could happen in parallel. With just one spool file per job this
> > > might be hard or even impossible to implement.
> > >
> > > What about multiple spool files? I can use 1 TB disk space for
> > > spooling, so bacula could use four 250 GB files for each job.
> > >
> > > This is what I'm thinking of:
> > >
> > > 1. spooling to file1
> > > 2. spooling to file2 and despooling the data from file1 to tape
> > >
> > > ...and so on. This will save time where the tape is idle because
> > > the job is spooling data.
> > >
> > > If spooling is much faster than despooling to tape and all 4
> > > spoolfiles are in use, the job just waits until the next (first)
> > > spoolfile can be used again.
> > >
> > > I'm not that familiar with the spooling implementation and
> > > spooling of attributes is also involved. Thus I don't know if this
> > > idea will result in a complete redesign of the spooling concept or
> > > if it might be possible to just be added to the current spooling
> > > implementation.
> > >
> > > Any opinions?
> > >
> > Some time ago, Eric and I discussed implementing the feature you
> > request because for users with really long running jobs like you, it
> > could give a significant performance enhancement in terms of total
> > runtime of the job.
> >
> > At first, I thought it would be rather trivial to implement, but it
> > is in fact a medium size project rather than something trivial. I
> > think it would be a very good idea to implement multiple spool
> > "directories" at the same time so that the spooling can be more
> > easily spread across several different disks for even more
> > performance improvements.
> >
> > The bottom line is that this is a project that is worthwhile, but
> > IMO the priority is much lower than a number of the other projects
> > which are critical to enterprise acceptance of Bacula. However, if
> > someone would like to work on this we would be happy to provide the
> > appropriate guidance to ensure that any patch developed would be
> > accepted.
>
> Ok, then it will be worth a feature request, even if it won't be on
> top of the projects list. Unfortunately I won't be the one that
> implements that feature, my C skills are not adequate for a project of
> that size.

Could not a similar result be achieved by using multiple Storage
Daemons on the same box? Or multiple storage devices, each spooling
to its own location, on the same box?

--
Dan Langille - http://www.langille.org/
Available for hire: http://www.freebsddiary.org/dan_langille.php
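
For reference, the rotation Ralf describes above (fill one spool file while the previous one is despooled, and wait when every spool file is in use) is essentially a bounded producer/consumer. The following is only a toy Python model of that idea, not Bacula code: `run_job`, `num_spool_files` and `blocks_per_file` are hypothetical names, spool files are modelled as in-memory lists, and attribute spooling is ignored entirely.

```python
import queue
import threading


def run_job(blocks, num_spool_files=4, blocks_per_file=2):
    """Toy model: return the blocks in the order they reach the 'tape'."""
    free = queue.Queue()   # spool file slots that may be written to
    full = queue.Queue()   # filled spool files waiting to be despooled
    for slot in range(num_spool_files):
        free.put(slot)

    spool = [None] * num_spool_files  # stand-in for the spool files on disk
    tape = []                         # stand-in for the tape volume

    def despooler():
        # Drain filled spool files to tape in the order they were filled.
        while True:
            slot = full.get()
            if slot is None:          # sentinel: the job has finished spooling
                return
            tape.extend(spool[slot])
            spool[slot] = None
            free.put(slot)            # the slot can be reused for spooling

    worker = threading.Thread(target=despooler)
    worker.start()

    # Spooling side: runs concurrently with the despooler thread.
    for start in range(0, len(blocks), blocks_per_file):
        slot = free.get()             # blocks while all spool files are in use
        spool[slot] = blocks[start:start + blocks_per_file]
        full.put(slot)

    full.put(None)
    worker.join()
    return tape
```

With `num_spool_files=1` this behaves like strictly alternating spool and despool phases, since the spooling side cannot start the next file until the single slot has been written to tape. With more slots the two sides overlap until the spooling side runs out of free files.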
