On 10 Feb 2006 at 5:53, [EMAIL PROTECTED] wrote:

> While de-spooling attributes into my postgres database for a full backup
> takes about two hours, I noticed that postgres was able to dump the
> entire database and create a brand new one including indexes in just
> a couple of minutes. It seems to me there's some room for improving
> bacula's de-spooling speed.

Dumping the database and creating a new one is quite different from 
what happens during attribute spooling.  The former is simply the 
creation of a database.  When spooling attributes, you're not just 
throwing stuff into the database.  You're checking to see whether 
each file has already been saved.  If it has, you use that 
reference.  If not, you create that reference and then record the 
file size, md5 etc.

This involves putting something into the path table:

bacula=# select * from path limit 10;
 pathid |              path
--------+--------------------------------
      1 | /var/log/
      2 | /etc/
      3 | /etc/ppp/
      4 | /usr/local/etc/rc.d/
      5 | /usr/local/etc/
      6 | /usr/local/etc/netsaint/
      7 | /usr/local/etc/apache/ssl.crt/
      8 | /usr/local/etc/apache/ssl.crl/
      9 | /usr/local/etc/apache/ssl.csr/
     10 | /usr/local/etc/apache/ssl.key/
(10 rows)

bacula=#

It also involves putting something into the filename table:

bacula=# select * from filename limit 10;
 filenameid |      name
------------+-----------------
          1 | lpd-errs
          2 | lastlog
          3 | security
          4 | slip.log
          5 | adduser
          6 | dmesg.yesterday
          7 | userlog
          8 | setuid.today
          9 | mount.today
         10 | mount.yesterday
(10 rows)

bacula=#

And also updating the file table:

bacula=# select fileid, fileindex, jobid, pathid from file limit 10;
  fileid  | fileindex | jobid | pathid
----------+-----------+-------+--------
 22103490 |         1 |  3028 |  18266
 22103491 |         2 |  3028 |  18157
 22103492 |         3 |  3028 |    212
 22103493 |         4 |  3028 |    212
 22103494 |         5 |  3028 |    212
 22103495 |         6 |  3028 |    212
 22103496 |         7 |  3028 |    212
 22103497 |         8 |  3028 |    212
 22103498 |         9 |  3028 |    212
 22103499 |        10 |  3028 |    212
(10 rows)

bacula=#
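
To make the per-file work above concrete, here is a minimal sketch of 
the look-up-or-create step.  It is not Bacula's actual code.  It uses 
Python with an in-memory SQLite database only so that it runs 
anywhere.  The path and filename columns follow the output above.  The 
filenameid column on the file table, the UNIQUE constraints, and the 
helper names get_or_create and record_file are assumptions made for 
illustration.

```python
import sqlite3


def get_or_create(cur, table, id_col, val_col, value):
    """Return the id for value, inserting a row only if none exists yet."""
    cur.execute(f"SELECT {id_col} FROM {table} WHERE {val_col} = ?", (value,))
    row = cur.fetchone()
    if row is not None:
        return row[0]  # already saved: reuse the existing reference
    cur.execute(f"INSERT INTO {table} ({val_col}) VALUES (?)", (value,))
    return cur.lastrowid  # newly created reference


def record_file(cur, jobid, fileindex, path, name):
    """Resolve path and filename ids, then add one row to the file table."""
    pathid = get_or_create(cur, "path", "pathid", "path", path)
    filenameid = get_or_create(cur, "filename", "filenameid", "name", name)
    cur.execute(
        "INSERT INTO file (fileindex, jobid, pathid, filenameid) "
        "VALUES (?, ?, ?, ?)",
        (fileindex, jobid, pathid, filenameid),
    )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Simplified schema (assumption): only the columns needed for the sketch.
cur.executescript("""
CREATE TABLE path (pathid INTEGER PRIMARY KEY, path TEXT UNIQUE);
CREATE TABLE filename (filenameid INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE file (fileid INTEGER PRIMARY KEY, fileindex INTEGER,
                   jobid INTEGER, pathid INTEGER, filenameid INTEGER);
""")

# Two files in the same directory share one path row.
record_file(cur, 3028, 1, "/var/log/", "lpd-errs")
record_file(cur, 3028, 2, "/var/log/", "lastlog")
conn.commit()
```

Each file costs up to two look-ups plus up to three inserts, which is 
why this is not comparable to restoring a dump.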

> Since copying a database involves the use of the COPY command, I
> was wondering what it did differently (ie locking the entire table,
> transactions, etc.) to allow this speed. Could we use the copy command
> when we de-spool the attributes? Could we do the same thing the copy
> command is doing for speed?

I don't think COPY will be useful.

   http://www.postgresql.org/docs/8.1/static/sql-copy.html

The data we are adding to the database does not already exist.  
There's nothing to COPY.

> Cutting my postgres update time to minutes from hours would certainly
> make my backups run far smoother.

I think transactions are more important here.  We need to look more 
closely at that.
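
As a rough illustration of what "transactions" means here, the sketch 
below contrasts committing after every row with wrapping the whole 
batch in a single transaction.  It is an assumption-laden stand-in, 
not Bacula's code: it uses Python with in-memory SQLite rather than 
PostgreSQL, a cut-down file table, and the hypothetical helper names 
make_db, insert_commit_each and insert_one_transaction.  No timings 
are claimed; you would need to measure against your own database.

```python
import sqlite3

INSERT_SQL = "INSERT INTO file (fileindex, jobid, pathid) VALUES (?, ?, ?)"


def make_db():
    """Create an in-memory database with a cut-down file table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE file (fileid INTEGER PRIMARY KEY, "
        "fileindex INTEGER, jobid INTEGER, pathid INTEGER)"
    )
    return conn


def insert_commit_each(conn, rows):
    """One transaction per row: a commit after every single insert."""
    for row in rows:
        conn.execute(INSERT_SQL, row)
        conn.commit()


def insert_one_transaction(conn, rows):
    """One transaction for the whole batch: a single commit at the end."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(INSERT_SQL, rows)


rows = [(i, 3028, 212) for i in range(1, 1001)]

db_a = make_db()
insert_commit_each(db_a, rows)

db_b = make_db()
insert_one_transaction(db_b, rows)

# Both approaches store the same data; only the commit pattern differs.
count_a = db_a.execute("SELECT COUNT(*) FROM file").fetchone()[0]
count_b = db_b.execute("SELECT COUNT(*) FROM file").fetchone()[0]
```

The end result is identical either way.  The difference is how often 
the database has to make the work durable.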

-- 
Dan Langille : Software Developer looking for work
my resume: http://www.freebsddiary.org/dan_langille.php

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users