On Tuesday 22 March 2005 22:00, Jeff McCune wrote:
> Roland Arendes wrote:
> > Hi
> >
> > This will speed up your dbcheck (and tree building before a restore)
> > drastically:
> >
> > Mysql
> >
> >>use bacula;
> >>ALTER TABLE File ADD INDEX (JobId, PathId, FilenameId);
> >
> > Wait for the index creation to finish (takes some time on a huge db).
>
> Index creation took 20 minutes, and dbcheck -f ran for 27 hours before I
> killed it and wiped the entire machine to upgrade to RHEL4.
>
> This brings me to an interesting question regarding bacula and
> scalability.  I've seen multiple people comment that 1.5 TB of data
> isn't a problem for bacula, but I have my doubts after this episode.
>
> Basically, the dbcheck program was operating as it should.  However, the
> operation I asked it to perform ran for over *three days* without sign
> of completing anytime soon.  After performing the proposed optimization,
> the operation still ran for 26+ hours.  This effectively makes the
> operation useless.  An operation that runs on a locked catalog for over
> 26 hours is not acceptable in a production backup system.

A couple of points here.  Normally, one should not have to run the dbcheck 
program. It is *not* something to run every day, but something to run when 
your database is screwed up.  If both Bacula and your underlying database 
program function correctly, then you should never have to run dbcheck. In 5 
years, I think I have run it twice on my db -- both times because of a Bacula 
bug which is now well behind us.

The other point is that I have made no attempt to optimize the
longest-running part of the dbcheck code. Some smart DB guy could
undoubtedly speed it up by at least a factor of 10, and possibly a factor
of 100.  The pruning algorithm in the Bacula core code has the potential
to run for equally long times, but it is much more sophisticated (i.e.,
lots more lines of code): it breaks the task up into smaller chunks, uses
*lots* of memory, and, if there is too much work to do, prunes only a part
each time it is called.
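
To give a rough idea of where the dbcheck time goes (this is a sketch of
the kind of query involved, not the literal dbcheck code): the
orphaned-record checks are joins of this general shape against the
standard catalog schema:

  -- Sketch only: find Filename rows that no File record references.
  -- Without an index on File.FilenameId, every probe degenerates into
  -- a scan of the whole File table.
  SELECT Filename.FilenameId
    FROM Filename LEFT OUTER JOIN File
         ON (Filename.FilenameId = File.FilenameId)
   WHERE File.FilenameId IS NULL;

I have not profiled it, but with tens of millions of File rows, queries of
this shape are the obvious suspect for the days of runtime you saw.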

>
> Currently, a Full backup of my /home volume is 480GB in just over 4
> million files.  That's just /home.  In addition, I have 15 other servers
> doing special things and about 40 workstations to backup.  I'll estimate
> a full backup of our site at 2500 to 3000 GB of data in 10 *million* files.
>
> We're testing our new NFS server which will replace our current 480 GB
> volume with a 1.5 TB volume.  I can reasonably expect the number of
> files I backup to at least double in the next few months, and eventually
> triple.
>
> If I want to keep 2 fulls in the catalog at any given time, I should
> expect *at least* 50 million file records at any given time.
>
> Am I naive trying to cram this much information into one MySQL database?
>   Should I be splitting this up across multiple catalogs?  

I cannot personally answer these questions, but I think you can by running
some tests.  One of the great strengths of Bacula is its use of an SQL
database -- something that, at least to my knowledge, no other backup
program does.  At the same time, SQL DB engines are not known for their
speed, so there are bound to be some growing pains until we figure out how
to configure and run such gigantic DBs. In the short run, you might think
seriously about how you can reduce your retention periods.
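
As a starting point for such tests, a few plain SQL measurements (my
suggestion only -- there is no Bacula tool for this) will tell you how the
catalog is actually growing:

  -- Total number of file records in the catalog.
  SELECT COUNT(*) FROM File;
  -- The biggest jobs, i.e. where the records come from.
  SELECT JobId, COUNT(*) AS files
    FROM File GROUP BY JobId ORDER BY files DESC LIMIT 10;
  -- Approximate on-disk size of the File table (MySQL).
  SHOW TABLE STATUS LIKE 'File';

Timing those on copies of the catalog at different sizes would give you
real numbers instead of guesses.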

>   Should I 
> investigate other optimizations to deal with this volume of information
> inside the catalog?

Yes, as well as optimizations to dbcheck if you really feel the need to
run it.
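
One such optimization worth trying (an untested suggestion on my part, not
something dbcheck itself asks for) is to create temporary indexes on the
columns its orphan checks join on, and drop them again afterwards, since
every extra index slows down the inserts done during backups:

  -- Hypothetical one-off indexes for a dbcheck run (MySQL syntax).
  CREATE INDEX tmp_file_pathid ON File (PathId);
  CREATE INDEX tmp_file_fnid ON File (FilenameId);
  -- ... run dbcheck -f ...
  DROP INDEX tmp_file_pathid ON File;
  DROP INDEX tmp_file_fnid ON File;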

>
> It may be that bacula isn't yet ready to manage a catalog of this
> volume, which is perfectly fine.  

Perhaps. I have always worried about this, but apparently a good number of
users have found workarounds -- one even backs up a million or more files
in each job!

The other end of this, which you have not mentioned, is the time the
restore command takes to build the in-memory tree -- that is something
that really needs more work.
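
For what it is worth, building that tree means pulling every path and file
name for the selected jobs out of the catalog in one pass -- roughly a
query of this shape (a sketch, not the exact statement the Director
issues):

  -- One row per file in the selected jobs, joined out to its path and
  -- file name text; the JobIds here are hypothetical.
  SELECT Path.Path, Filename.Name, File.FileIndex, File.JobId
    FROM File
    JOIN Path ON (Path.PathId = File.PathId)
    JOIN Filename ON (Filename.FilenameId = File.FilenameId)
   WHERE File.JobId IN (1234, 1235);

With millions of rows per job, both the query and the memory for the tree
grow with the file count, which is why Roland's index (JobId first) also
speeds up this step.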

> I'm hoping to get some feedback on the 
> topic of scalability, as I haven't really seen it mentioned much on this
> mailing list or in the documentation.  It seems like a big issue as
> bacula matures and becomes a viable enterprise solution.
>
> What are bacula's limitations in terms of long running operations versus
> catalog size?  Are they linearly related, exponentially, etc...?

If someone knows the answer to this, he knows a lot more than I do; I
would like to hear the answer, and to document it.

I think that you probably did what a lot of users did not do -- you read
the manual -- and unfortunately fell into the Multiple Connections
problem, and I am sorry for that. The documentation did say that the
directive had not been tested. In any case, I have now removed it from the
manual, along with the ability to enable it in the code.

-- 
Best regards,

Kern

