Hi,

23.08.2007 18:27, [EMAIL PROTECTED] wrote:
> 
> I've been following this thread with some interest, but with little to
> contribute directly. However, the thread does bring up some questions that
> I've had about the concept of pruning the bacula database.
> 
> It seems to me that the database should only be pruned when files are added
> (ie., when a backup succeeds), not merely because a volume was accessed for
> reading.

Yes. Note, though, that I don't currently know whether pruning happens
when volumes are needed for restores. I doubt it, but haven't
verified.

The current released version, by the way, uses a much less aggressive 
pruning method.

> In other realms, such as the algorithm that bacula employs to decide which
> volume to use, bacula tries very hard to keep data on the backup media (and in
> the catalog) for as long as possible. The idea that accessing a volume (ie.,
> doing a restore) can prune records seems to be inconsistent with the
> philosophy of keeping the data whenever possible.

Yes.

> In our environment, it's pretty common that a request to restore files from
> directory "X" is then followed by another request to restore files from
> directory "Y", when the user realizes that they didn't get all the files they 
> needed. If the first restore caused the database records for a multi-TB
> backup that spans 3 or 4 tapes to be pruned, then the second restore will be
> extremely slow and painful.
> 
> It seems that the disk space resource to store a large database catalog is
> much "cheaper" than the time resource for a system administrator to bscan
> tapes or the time resource for an end user to wait for a restore.

That depends, but obviously, you know your setup better than we do. 
Most of the time, when you back up user-generated data, especially 
when you have many restore requests, I would agree.

> What are the settings required to configure bacula so that the catalog
> retention period is the same as the data retention? In other words, as long
> as the data exists on the backup media, the database catalog records exist
> as well?

Quite simple: Make your job and file retention times at least as long 
as your volume retention times, and you've got what you need. Files 
and jobs will then only be pruned when volumes are recycled.
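
As a rough sketch of that rule (Python, purely illustrative: the function
name and the sample values are made up, and nothing here reads a real
Bacula configuration), the check you want to pass looks like this:

```python
from datetime import timedelta


def retentions_shorter_than_volume(file_retention, job_retention, volume_retention):
    """Return the names of the catalog retention settings that are shorter
    than the volume retention, i.e. the ones that would let catalog records
    be pruned while the data is still on the media."""
    too_short = []
    if file_retention < volume_retention:
        too_short.append("File Retention")
    if job_retention < volume_retention:
        too_short.append("Job Retention")
    return too_short


# Hypothetical sample values, not a recommendation.
problems = retentions_shorter_than_volume(
    file_retention=timedelta(days=60),
    job_retention=timedelta(days=180),
    volume_retention=timedelta(days=365),
)
print(problems)
```

An empty list means file and job records are kept at least as long as
the volumes themselves.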

> Obviously, the database size may expand a great deal.

That's definitely the case... "a great deal" could, depending on your 
equipment, mean that you need lots of new disks, or even a new storage 
system, a new storage server, a SAN, or simply have a few GB less space 
on your backup server's hard disk.
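
To get a feeling for the order of magnitude, here is a back-of-the-envelope
sketch in Python. Everything in it is an assumption: the function name, the
sample file counts, and above all the figure of roughly 150 bytes of catalog
space per saved file, which you should replace with a number measured on
your own catalog.

```python
def estimate_catalog_bytes(files_per_full, fulls_retained,
                           files_per_incremental, incrementals_retained,
                           bytes_per_file_record=150):
    """Rough catalog size: one file record per file per retained job.

    bytes_per_file_record is an assumed average, not a measured value.
    """
    records = (files_per_full * fulls_retained
               + files_per_incremental * incrementals_retained)
    return records * bytes_per_file_record


# Hypothetical site: 1,000,000 files per full backup, 12 fulls kept,
# 20,000 changed files per incremental, 300 incrementals kept.
size = estimate_catalog_bytes(1_000_000, 12, 20_000, 300)
print(f"{size / 1e9:.1f} GB")
```

Indexes and the job, path and filename tables come on top of that, so
treat the result as a lower bound.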

>  I know very, very little
> about databases, but would there be significant performance or database
> stability advantages to moving the old records into distinct tables?

Performance - typically not significant, because you could do that 
when the system is idle otherwise.
Stability - if your database became unstable in such a situation I'd 
look for a new one :-)

> In other
> words, instead of pruning database records (while the data still exists on the
> backup media), move those records to a separate table or separate database?
> This would keep the "hot" database of the most current records at a smaller
> size, while letting the database of older records grow. That distinction could
> give system administrators & DBAs the choice of what physical disks are used
> to store each database, etc.

Possible. The database gurus here will have a better answer, I 
think... my impression is that the overall gains will not be too 
impressive.
Basically, you still have to manage all the data - the "historical" 
catalog needs to be pruned when you recycle volumes, for example.
Most queries in backup operations would not benefit from such an 
operation, because you will still need some relationships between the 
"current" and the "historic" database.

It might help if, for informational queries or during restores, only a 
limited set of data needs to be accessed. But this can be achieved by 
using multiple catalogs, separated per client, today.

Of course I might be totally wrong...

> Does anyone know the practical database size limits for MySQL and Postgres?

MySQL can handle databases with many hundreds of GB. I don't have 
numbers or references I can share, though.

You need beefy hardware, of course.

PostgreSQL can manage tables with up to 32 TB, according to their web 
site. That should be enough for lots of files in the catalog :-)

> Would typical bacula installations (if there is such a thing) be in danger of
> reaching those limits if database records were retained as long as the data
> itself?

All the installations I know set some limits on the catalog size, 
typically by either keeping volumes only a limited time, or by pruning 
files from long-time archival jobs, so I don't know for sure.

At least I think it's safe to say that typical installations don't 
reach these limits, because otherwise that would have been discussed here.

Arno

> 
> Thanks,
> 
> Mark
> 
> ----
> Mark Bergman                      [EMAIL PROTECTED]
> System Administrator
> Section of Biomedical Image Analysis             215-662-7310
> Department of Radiology,           University of Pennsylvania
> 
> http://pgpkeys.pca.dfn.de:11371/pks/lookup?search=mark.bergman%40.uphs.upenn.edu
> 
> 
> 
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Splunk Inc.
> Still grepping through log files to find problems?  Stop.
> Now Search log events and configuration files using AJAX and a browser.
> Download your FREE copy of Splunk now >>  http://get.splunk.com/
> _______________________________________________
> Bacula-users mailing list
> Bacula-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-users

-- 
Arno Lehmann
IT-Service Lehmann
www.its-lehmann.de

