On 30 Sep 2007 at 11:44, Ralf Gross wrote:

> Kern Sibbald wrote:
> > On Sunday 30 September 2007 11:05, Ralf Gross wrote:
> > > sorry if this may be more on topic on the users list, but I'd like
> > > to hear some developer opinions before I possibly create a feature
> > > request.
> > >
> > > I'm using spooling for all jobs. Now we are beginning to back up
> > > very large amounts of data (some TB) in a single job (I'm still
> > > looking for a good way to reduce the job size, but this seems not
> > > to be easy).
> > >
> > > These jobs will run for >24h, and with spooling enabled they will
> > > take even longer. The current spooling implementation is good if
> > > many jobs are running concurrently, but I have only a few jobs
> > > running in parallel, and most of the time just one.
> > >
> > > In this case it would help if spooling and despooling to tape
> > > could happen in parallel. With just one spool file per job this
> > > might be hard or even impossible to implement.
> > >
> > > What about multiple spool files? I can use 1 TB of disk space for
> > > spooling, so Bacula could use four 250 GB files for each job.
> > >
> > > This is what I'm thinking of:
> > >
> > > 1. spooling to file1
> > > 2. spooling to file2 and despooling the data from file1 to tape
> > >
> > > ...and so on. This will save time where the tape is idle because
> > > the job is spooling data.
> > >
> > > If spooling is much faster than despooling to tape and all 4
> > > spool files are in use, the job just waits until the oldest (first)
> > > spool file has been despooled and can be used again.
> > >
> > > I'm not that familiar with the spooling implementation, and
> > > spooling of attributes is also involved. Thus I don't know whether
> > > this idea would require a complete redesign of the spooling concept
> > > or whether it could simply be added to the current spooling
> > > implementation.
> > >
> > > Any opinions?
> > 
> > Some time ago, Eric and I discussed implementing the feature you
> > request because for users with really long running jobs like you, it
> > could give a significant performance enhancement in terms of total
> > runtime of the job.
> > 
> > At first, I thought it would be rather trivial to implement, but it
> > is in fact a medium-sized project rather than something trivial.  I
> > think it would be a very good idea to implement multiple spool
> > "directories" at the same time so that the spooling can be more
> > easily spread across several different disks for even more
> > performance improvements.
> > 
> > The bottom line is that this is a project that is worthwhile, but
> > IMO the priority is much lower than a number of the other projects
> > which are critical to enterprise acceptance of Bacula.  However, if
> > someone would like to work on this we would be happy to provide the
> > appropriate guidance to ensure that any patch developed would be
> > accepted.
> 
> Ok, then it will be worth a feature request, even if it won't be at
> the top of the projects list. Unfortunately I won't be the one who
> implements that feature; my C skills are not adequate for a project of
> that size.

Could a similar result not be achieved by running multiple Storage
Daemons on the same box?  Or with multiple storage devices on the same
box, each spooling to its own location?
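
For anyone who wants to put rough numbers on the rotation Ralf
describes, here is a small timing sketch in Python. It is not Bacula
code and says nothing about how the Storage Daemon works internally. It
assumes fixed-size spool files, a constant time to fill and to despool
each one, and a single tape drive. The function names and parameters
are made up for illustration.

```python
def serial_time(chunks, spool_h, despool_h):
    """Total hours when each spool file is filled and then written to
    tape before the next one is started (tape idle while spooling)."""
    return chunks * (spool_h + despool_h)


def pipelined_time(chunks, spool_h, despool_h, spool_files):
    """Total hours when spooling chunk i may overlap with despooling
    earlier chunks, limited by the number of spool files available."""
    spool_done = []    # time at which chunk i is fully spooled
    despool_done = []  # time at which chunk i is fully on tape
    for i in range(chunks):
        # Spooling is sequential: wait for the previous chunk to finish.
        start = spool_done[i - 1] if i > 0 else 0.0
        # A spool file is reusable only after its old contents are on tape.
        if i >= spool_files:
            start = max(start, despool_done[i - spool_files])
        spool_done.append(start + spool_h)
        # One tape drive: despool after this chunk is spooled and the
        # previous chunk has finished despooling.
        tape_free = despool_done[i - 1] if i > 0 else 0.0
        despool_done.append(max(spool_done[i], tape_free) + despool_h)
    return despool_done[-1] if despool_done else 0.0
```

With four chunks that each take 1 h to spool and 2 h to despool, the
serial model gives 12 h, while the rotation gives 9 h with either two
or four spool files, because the tape drive is the bottleneck. With a
single spool file the rotation degenerates to the serial case.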

-- 
Dan Langille - http://www.langille.org/
Available for hire: http://www.freebsddiary.org/dan_langille.php



_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel
