For disaster recovery I'm pushing our catalog (along with Bareos configs
& keys) out to S3. I'm doing this as a Run After Job script and pushing
the catalog to S3 via s3cmd.
My problem is that the catalog is large and takes about 10 hours to push
to S3. What I want to do is have the push to S3 happen in the
background and release the Director to move on to other things. But it
doesn't.
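My guess at why (an assumption on my part, not something I've confirmed in the Bareos source): the Director seems to wait for EOF on the script's output pipe, not just for the wrapper process to exit, so any backgrounded child that inherited that pipe keeps the Director waiting. A minimal sketch of that behavior in plain sh, with nothing Bareos-specific:

```shell
#!/bin/sh
# A backgrounded child inherits the parent's stdout; anything reading
# that pipe (here, a command substitution) sees EOF only when the
# child finally exits -- not when the parent returns.
demo() {
    ( sleep 2; echo "child done" ) &   # child holds the write end
    echo "parent returning"
}

start=$(date +%s)
out=$(demo)                 # blocks ~2 seconds waiting for EOF
elapsed=$(( $(date +%s) - start ))
echo "reader unblocked after ${elapsed}s: ${out}"
```

That matches what I see: the wrapper returns at once, but whatever is reading its output sits there until the background work is done.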
My script is like this:
#!/bin/zsh
path=(/usr/local/bin /usr/local/sbin $path)
## We do the work in a sub-shell that we push into the background
## so that we don't have Bareos waiting the 10ish hours it takes
## to push the PostgreSQL dump of the catalog to S3.
(
# Do the S3 equivalent of an rsync of /usr/local/etc/bareos
s3cmd --config /var/db/bareos/.s3cfg --delete-removed --no-progress \
    sync \
    /usr/local/etc/bareos/ s3://dr.bareos.iteris.com/config/
# In belt-and-suspenders fashion, store a compressed tarball of
# /usr/local/etc/bareos in S3
tar --create --gzip --file=- --directory=/usr/local/etc/bareos . | \
    s3cmd --config /var/db/bareos/.s3cfg put - \
    s3://dr.bareos.iteris.com/bareos-config.tgz
# Now the catalog.
# First compress the catalog because it is a huge SQL dump
# and compresses very well. This takes a while.
pbzip2 --best /var/db/bareos/bareos.sql
# Now push the catalog out to S3
time s3cmd --config /var/db/bareos/.s3cfg put --no-progress \
/var/db/bareos/bareos.sql.bz2 \
s3://dr.bareos.iteris.com/bareos.sql.bz2
# Let's double-check the upload via MD5 signatures
md5sum=$(s3cmd --config /var/db/bareos/.s3cfg info \
    s3://dr.bareos.iteris.com/bareos.sql.bz2 2>/dev/null | \
    awk '/MD5 sum:/ {print $3}')
echo "S3 has an MD5 checksum of ${md5sum}, checking ..."
if md5 -c ${md5sum} /var/db/bareos/bareos.sql.bz2; then
# And finally delete the catalog
rm -f /var/db/bareos/bareos.sql.bz2
exit 0
else
echo "==> Oh shit! The signatures didn't match."
echo "==> Leaving /var/db/bareos/bareos.sql.bz2 in place."
echo "==> The broken upload needs to be fixed by hand,"
echo "==> and then the bareos.sql.bz2 file deleted."
exit 1
fi
) |& Mail -s "Offsite Bareos Catalog Sync" [email protected] &
But the Director gets stuck:
Running Jobs:
Console connected at 01-Dec-16 15:29
Console connected at 03-Dec-16 16:07
Console connected at 04-Dec-16 14:33
JobId Level Name Status
======================================================================
5792 Full catalog-offsite.2016-12-04_09.00.11_07 has terminated with warnings
====
And everything gets stuck on a <defunct> job. Once the push to S3
finishes, everything cleans up and we are good to go, just 10 hours
later. Here is the ps list:
UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
997 1106 1 0 20 0 110340 15936 uwait Is - 217:42.36
/usr/local/sbin/bareos-sd -u bareos -g bareos -
997 1112 1 0 20 0 179900 36876 sbwait Is - 21:44.32
/usr/local/sbin/bareos-dir -u bareos -g bareos
997 72127 1112 0 37 0 0 0 - Z - 0:00.01 <defunct>
997 72131 1 0 30 5 17712 3364 pause IN - 0:00.00
/bin/zsh /usr/meridian/share/libexec/bareos/off
997 72132 1 0 25 5 12444 2068 piperd IN - 0:00.00 Mail
-s Offsite Bareos Catalog Sync root@i
997 72453 72131 0 25 5 99380 19548 select SN - 1:11.52
/usr/local/bin/python2.7 /usr/local/bin/s3cmd -
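What I suspect I need to do (again an untested assumption): give everything in the background pipeline its own stdin/stdout/stderr so nothing holds the Director's pipe open. The counterpart to the sketch above, where the child gets its own descriptors and the caller is released immediately:

```shell
#!/bin/sh
# Once the background child gets its own descriptors, the reader's
# pipe closes as soon as the parent returns.
demo() {
    ( sleep 2; echo "push finished" ) > /tmp/offsite_demo.log 2>&1 \
        < /dev/null &           # child no longer holds our stdout
    echo "parent returning"
}

start=$(date +%s)
out=$(demo)                 # returns at once: EOF arrives immediately
elapsed=$(( $(date +%s) - start ))
echo "reader unblocked after ${elapsed}s: ${out}"
```

Applied to my script, that would mean something like `( ... ) < /dev/null |& Mail -s "Offsite Bareos Catalog Sync" [email protected] > /dev/null 2>&1 &` followed by `disown`, so neither the subshell nor the Mail process inherits the Director's pipe.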
Any thoughts on a better way of doing this?