For disaster recovery I'm pushing our catalog (along with our Bareos configs and keys) out to S3, using s3cmd in a Run After Job script.

My problem is that the catalog is large and takes about 10 hours to push to S3. What I want is for the push to S3 to happen in the background and release the Director to move on to other things. But it doesn't.

My script is like this:

#!/bin/zsh

path=(/usr/local/bin /usr/local/sbin $path)

## We do the work in a sub-shell that we push into the background
## so that we don't have Bareos waiting the 10ish hours it takes
## to push the PostgreSQL dump of the catalog to S3.

(
    # Do the S3 equivalent of an rsync of /usr/local/etc/bareos
    s3cmd --config /var/db/bareos/.s3cfg --delete-removed --no-progress sync \
        /usr/local/etc/bareos/ s3://dr.bareos.iteris.com/config/

    # In belt-and-suspenders fashion, also store a compressed tarball
    # of /usr/local/etc/bareos in S3
    tar --create --gzip --file=- --directory=/usr/local/etc/bareos . | \
        s3cmd --config /var/db/bareos/.s3cfg put - s3://dr.bareos.iteris.com/bareos-config.tgz

    # Now the catalog.

    # First compress the catalog because it is a huge SQL dump
    # and compresses very well. This takes a while.
    pbzip2 --best /var/db/bareos/bareos.sql

    # Now push the catalog out to S3
    time s3cmd --config /var/db/bareos/.s3cfg put --no-progress \
        /var/db/bareos/bareos.sql.bz2 \
        s3://dr.bareos.iteris.com/bareos.sql.bz2

    # Let's double-check the upload via MD5 signatures
    md5sum=$(s3cmd --config /var/db/bareos/.s3cfg info s3://dr.bareos.iteris.com/bareos.sql.bz2 2>/dev/null | \
                 awk '/MD5 sum:/ {print $3}')
    echo "S3 has a MD5 checksum of ${md5sum}, checking ..."
    if md5 -c "${md5sum}" /var/db/bareos/bareos.sql.bz2; then
        # And finally delete the catalog
        rm -f /var/db/bareos/bareos.sql.bz2
        exit 0
    else
        echo "==> Oh shit! The signatures didn't match."
        echo "==> Leaving /var/db/bareos/bareos.sql.bz2 in place."
        echo "==> The broken upload needs to be fixed by hand,"
        echo "==> and then the bareos.sql.bz2 file deleted."
        exit 1
    fi
) |& Mail -s "Offsite Bareos Catalog Sync" [email protected] &

But the Director gets stuck:

Running Jobs:
Console connected at 01-Dec-16 15:29
Console connected at 03-Dec-16 16:07
Console connected at 04-Dec-16 14:33
 JobId Level   Name                       Status
======================================================================
5792 Full catalog-offsite.2016-12-04_09.00.11_07 has terminated with warnings
====

And everything gets stuck on a <defunct> job. Once the push to S3 finishes, everything cleans up and we are good to go, just 10 hours later. Here is the ps listing:

UID    PID  PPID CPU PRI NI    VSZ   RSS MWCHAN STAT TT       TIME COMMAND
997   1106     1   0  20  0 110340 15936 uwait  Is   -   217:42.36 /usr/local/sbin/bareos-sd -u bareos -g bareos -
997   1112     1   0  20  0 179900 36876 sbwait Is   -    21:44.32 /usr/local/sbin/bareos-dir -u bareos -g bareos
997  72127  1112   0  37  0      0     0 -      Z    -     0:00.01 <defunct>
997  72131     1   0  30  5  17712  3364 pause  IN   -     0:00.00 /bin/zsh /usr/meridian/share/libexec/bareos/off
997  72132     1   0  25  5  12444  2068 piperd IN   -     0:00.00 Mail -s Offsite Bareos Catalog Sync root@i
997  72453 72131   0  25  5  99380 19548 select SN   -     1:11.52 /usr/local/bin/python2.7 /usr/local/bin/s3cmd -


Any thoughts on a better way of doing this?
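For what it's worth, my current guess is that the Director keeps waiting because the backgrounded subshell (and the Mail process it feeds) still hold the script's inherited stdout/stderr pipe open, so bareos-dir never sees EOF. The kind of detach I'm after would look something like this untested sketch — `sleep` stands in for the 10-hour s3cmd push, and the /tmp paths are placeholders for the real catalog and log locations:

```shell
#!/bin/sh
# Sketch: fully detach the slow push so the calling job can return.
# The key idea is to give the background job its own descriptors
# instead of the pipe inherited from the Director.
(
    sleep 1                                    # stands in for the ~10 h s3cmd put
    echo "push finished" > /tmp/offsite.status # placeholder result file
) < /dev/null > /tmp/offsite.log 2>&1 &        # no fds shared with the caller

echo "wrapper returns immediately"
```

With the descriptors redirected like this, the status mail would have to be sent from inside the detached subshell (e.g. by piping its own log file to Mail at the end) rather than from the outer `|& Mail ... &` pipeline.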

--
You received this message because you are subscribed to the Google Groups 
"bareos-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
For more options, visit https://groups.google.com/d/optout.