Hi,

23.08.2007 18:27, [EMAIL PROTECTED] wrote:
>
> I've been following this thread with some interest, but with little to
> contribute directly. However, the thread does bring up some questions that
> I've had about the concept of pruning the bacula database.
>
> It seems to me that the database should only be pruned when files are added
> (ie., when a backup succeeds), not merely because a volume was accessed for
> reading.

Yes. Note that I currently don't know whether pruning happens when volumes
are needed for restores. I doubt it, but haven't verified. The current
released version, by the way, uses a much less aggressive pruning method.

> In other realms, such as the algorithm that bacula employs to decide which
> volume to use, bacula tries very hard to keep data on the backup media (and
> in the catalog) for as long as possible. The idea that accessing a volume
> (ie., doing a restore) can prune records seems to be inconsistent with the
> philosophy of keeping the data whenever possible.

Yes.

> In our environment, it's pretty common that a request to restore files from
> directory "X" is then followed by another request to restore files from
> directory "Y", when the user realizes that they didn't get all the files
> they needed. If the first restore caused the database records for a
> multi-TB backup that spans 3 or 4 tapes to be pruned, then the second
> restore will be extremely slow and painful.
>
> It seems that the disk space resource to store a large database catalog is
> much "cheaper" than the time resource for a system administrator to bscan
> tapes or the time resource for an end user to wait for a restore.

That depends, but obviously you know your setup better than we do. Most of
the time, when you back up user-generated data, and especially when you have
many restore requests, I would agree.

> What are the settings required to configure bacula so that the catalog
> retention period is the same as the data retention? In other words, as long
> as the data exists on the backup media, the database catalog records exist
> as well?

Quite simple: Make your job and file retention times at least as long as
your volume retention times, and you've got what you need. Files and jobs
will then only be pruned when volumes are recycled.

> Obviously, the database size may expand a great deal.

That's definitely the case... "a great deal" could, depending on your
equipment, mean that you need lots of new disks, or even a new storage
system, a new storage server, or a SAN, or simply that you have a few GB
less space on your backup server's hard disk.

> I know very, very little about databases, but would there be significant
> performance or database stability advantages to moving the old records into
> distinct tables?

Performance - typically not significant, because you could do that when the
system is otherwise idle. Stability - if your database became unstable in
such a situation I'd look for a new one :-)

> In other words, instead of pruning database records (while the data still
> exists on the backup media), move those records to a separate table or
> separate database? This would keep the "hot" database of the most current
> records at a smaller size, while letting the database of older records
> grow. That distinction could give system administrators & DBAs the choice
> of what physical disks are used to store each database, etc.

Possible. The database gurus here will have a better answer, I think... my
impression is that the overall gains will not be too impressive. Basically,
you still have to manage all the data - the "historical" catalog needs to be
pruned when you recycle volumes, for example. Most queries in backup
operations would not benefit from such a split, because you will still need
some relationships between the "current" and the "historical" database. It
might help if, for informational queries or during restores, only a limited
set of data needs to be accessed. But this can already be achieved today by
using multiple catalogs, separated per client. Of course I might be totally
wrong...

> Does anyone know the practical database size limits for MySQL and Postgres?

MySQL can handle databases with many hundreds of GB. I don't have numbers or
references I can share, though. You need beefy hardware, of course.
PostgreSQL can manage tables of up to 32 TB, according to their web site.
That should be enough for lots of files in the catalog :-)

> Would typical bacula installations (if there is such a thing) be in danger
> of reaching those limits if database records were retained as long as the
> data itself?

All the installations I know set some limits to the catalog size, typically
by either keeping volumes only a limited time, or by pruning files from
long-time archival jobs, so I don't know for sure. At least I think it's
safe to say that typical installations don't reach these limits, because
otherwise that would have been discussed here.

Arno

> Thanks,
>
> Mark
>
> ----
> Mark Bergman    [EMAIL PROTECTED]
> System Administrator
> Section of Biomedical Image Analysis    215-662-7310
> Department of Radiology, University of Pennsylvania
>
> http://pgpkeys.pca.dfn.de:11371/pks/lookup?search=mark.bergman%40.uphs.upenn.edu

-- 
Arno Lehmann
IT-Service Lehmann
www.its-lehmann.de

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
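
For reference, the retention advice given earlier in this message (job and
file retention at least as long as volume retention) might look roughly like
the following bacula-dir.conf fragment. This is only a sketch, not a tested
configuration: the resource names and the 365-day values are placeholders
chosen for illustration, and the other directives each resource requires
(Address, Password, Catalog and so on) are left out.

```
# bacula-dir.conf (fragment; names and periods are illustrative assumptions)

Client {
  Name = example-fd            # hypothetical client name
  # Address, Password, Catalog etc. omitted from this sketch
  File Retention = 365 days    # keep file records at least as long as volumes
  Job Retention = 365 days     # keep job records at least as long as volumes
  AutoPrune = yes
}

Pool {
  Name = ExamplePool           # hypothetical pool name
  Pool Type = Backup
  Volume Retention = 365 days  # not longer than the two periods above
  AutoPrune = yes
  Recycle = yes
}
```

The periods are written in days throughout so that they can be compared
directly. The point is only the relationship between the three values, not
the specific numbers.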