On 10 Feb 2006 at 5:53, [EMAIL PROTECTED] wrote:

> While de-spooling attributes into my postgres database for a full backup
> takes about two hours, I noticed that postgres was able to dump the
> entire database and create a brand new one including indexes in just
> a couple of minutes. It seems to me there's some room for improving
> bacula's de-spooling speed.
Dumping the database and creating a new one is quite different from what
happens when attributes are de-spooled. One is simply the creation of a
database. When spooling attributes, you're not just throwing stuff into the
database. You're checking to see if said file has already been saved; if it
has, you use that reference. If not, you create that reference and then
record the file size, MD5, etc.

This involves putting something into the path table:

bacula=# select * from path limit 10;
 pathid |              path
--------+--------------------------------
      1 | /var/log/
      2 | /etc/
      3 | /etc/ppp/
      4 | /usr/local/etc/rc.d/
      5 | /usr/local/etc/
      6 | /usr/local/etc/netsaint/
      7 | /usr/local/etc/apache/ssl.crt/
      8 | /usr/local/etc/apache/ssl.crl/
      9 | /usr/local/etc/apache/ssl.csr/
     10 | /usr/local/etc/apache/ssl.key/
(10 rows)

bacula=#

It also involves putting something into the filename table:

bacula=# select * from filename limit 10;
 filenameid |      name
------------+-----------------
          1 | lpd-errs
          2 | lastlog
          3 | security
          4 | slip.log
          5 | adduser
          6 | dmesg.yesterday
          7 | userlog
          8 | setuid.today
          9 | mount.today
         10 | mount.yesterday
(10 rows)

bacula=#

And it also involves updating the file table:

bacula=# select fileid, fileindex, jobid, pathid from file limit 10;
  fileid  | fileindex | jobid | pathid
----------+-----------+-------+--------
 22103490 |         1 |  3028 |  18266
 22103491 |         2 |  3028 |  18157
 22103492 |         3 |  3028 |    212
 22103493 |         4 |  3028 |    212
 22103494 |         5 |  3028 |    212
 22103495 |         6 |  3028 |    212
 22103496 |         7 |  3028 |    212
 22103497 |         8 |  3028 |    212
 22103498 |         9 |  3028 |    212
 22103499 |        10 |  3028 |    212
(10 rows)

bacula=#

> Since copying a database involves the use of the COPY command, I
> was wondering what it did differently (i.e. locking the entire table,
> transactions, etc.) to allow this speed. Could we use the copy command
> when we de-spool the attributes? Could we do the same thing the copy
> command is doing for speed?

I don't think COPY will be useful:

  http://www.postgresql.org/docs/8.1/static/sql-copy.html

The data we are adding to the database does not already exist. There's
nothing to COPY.

> Cutting my postgres update time to minutes from hours would certainly
> make my backups run far smoother.

I think transactions are more important here: batching the inserts for a
whole job into one transaction, instead of committing each row, should cut
the overhead considerably. We need to look more closely at that. There is a
rough sketch of what I mean in the PS below.

--
Dan Langille : Software Developer looking for work
my resume: http://www.freebsddiary.org/dan_langille.php
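
PS: here is the sort of thing I mean, as you would type it by hand in psql.
This is only a sketch: the path and filename layout is taken from the output
above, but the extra columns on the file table (filenameid, lstat, md5) are
from memory and may not match the real schema exactly, and the values are
made up for illustration.

BEGIN;

-- look up the path
SELECT pathid FROM path WHERE path = '/etc/';
-- only if that SELECT returned no rows:
INSERT INTO path (path) VALUES ('/etc/');

-- same again for the filename
SELECT filenameid FROM filename WHERE name = 'dmesg.yesterday';
-- only if that SELECT returned no rows:
INSERT INTO filename (name) VALUES ('dmesg.yesterday');

-- record the file against the job, reusing the ids found above
INSERT INTO file (fileindex, jobid, pathid, filenameid, lstat, md5)
       VALUES (6, 3028, 2, 6, '<lstat>', '<md5>');

-- ... repeat the lookups and the file insert for every file in the job ...

COMMIT;

One COMMIT for the whole job, instead of one after every row, is where I
would expect most of the saving to come from.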