Package: backup-manager
Version: 0.7.9-3
Severity: wishlist
Tags: upstream patch

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Currently, when more than one job runs on the same day, backup-manager
uploads every archive of the day that exists in the repository on each job's
invocation. This results in extraneous uploads that consume bandwidth and
may also introduce errors later on, even though the archive was successfully
uploaded the first time.

It would be nice if backup-manager kept track of all successfully uploaded
archives in a central database and consulted it to filter out unnecessary
uploads.

The attached patch does exactly that, using a flat text file as the database.
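
To illustrate the idea, here is a minimal shell sketch of the flat-file
database the patch maintains: one line per archive, in the form
"<file> <host> [<host> ...]". The filenames and host are made up for the
example; the grep/sed commands mirror the ones the patch runs via system().

```shell
#!/bin/sh
# Sketch of the uploaded-archives flat-file database (example data only).
db=$(mktemp)
file="/var/archives/home-20110820.tar.gz"
host="backup.example.com"

# Record an upload: if the file already has a line, append the host to it;
# otherwise add a new "file host" line (mirrors appendto_uploaded_archives).
if grep -q "^$file " "$db"; then
    sed -i "s:^$file .*\$:& $host:" "$db"
else
    printf '%s %s\n' "$file" "$host" >> "$db"
fi

# Filter: skip an archive that is already recorded as uploaded
# (mirrors the check added to get_files_list_from_date).
if grep -q "$file" "$db"; then
    echo "skip $file"      # prints "skip /var/archives/home-20110820.tar.gz"
else
    echo "upload $file"
fi
rm -f "$db"
```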

In addition, such a database could facilitate other administrative tasks.
(I, for example, use dar with isolated catalogs. With the database in
place, I could set up a cron job to remove archives that have been uploaded
and replace them with symlinks to the catalogs, to save space.)
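
That cron job could look roughly like the sketch below. The repository
layout, archive names, and the "<name>.catalog" naming convention are all
assumptions for the example, not part of the patch; the sketch builds its
own sample data in a temporary directory so it can run standalone.

```shell
#!/bin/sh
# Hypothetical cron-job sketch: swap uploaded archives for symlinks to
# their isolated dar catalogs (assumed layout, example data only).
repo=$(mktemp -d)            # stands in for the archive repository
mkdir -p "$repo/catalogs"
echo "archive data" > "$repo/home-20110820.1.dar"
echo "catalog data" > "$repo/catalogs/home-20110820.1.dar.catalog"
printf '%s backup.example.com\n' "$repo/home-20110820.1.dar" \
    > "$repo/backup-uploaded.list"

# For every archive recorded as uploaded, replace it with a symlink
# to its catalog.
while read -r file hosts; do
    catalog="$repo/catalogs/$(basename "$file").catalog"
    if [ -f "$file" ] && [ -f "$catalog" ]; then
        rm "$file" && ln -s "$catalog" "$file"
    fi
done < "$repo/backup-uploaded.list"

ls -l "$repo"    # the archive is now a symlink to its catalog
rm -rf "$repo"
```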

The patch has been tested and is already in use on my system without
producing any errors so far.

regards
George Zarkadas

- -- System Information:
Debian Release: 6.0.2
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'proposed-updates'), (500, 'stable'), (450, 'testing-proposed-updates'), (450, 'testing'), (400, 'unstable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-5-amd64 (SMP w/4 CPU cores)
Locale: LANG=el_GR.utf8, LC_CTYPE=el_GR.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages backup-manager depends on:
ii  debconf [debconf-2.0]        1.5.36.1    Debian configuration management sy
ii  findutils                    4.4.2-1+b1  utilities for finding files--find,
ii  ucf                          3.0025+nmu1 Update Configuration File: preserv

backup-manager recommends no packages.

Versions of packages backup-manager suggests:
ii  anacron               2.3-14             cron-like program that doesn't go 
ii  backup-manager-doc    0.7.9-3            documentation package for Backup M
ii  dar                   2.3.10-1+b1        Disk ARchive: Backup directory tre
ii  dvd+rw-tools          7.1-6              DVD+-RW/R tools
ii  genisoimage           9:1.1.11-1         Creates ISO-9660 CD-ROM filesystem
ii  gettext-base          0.18.1.1-3         GNU Internationalization utilities
ii  libfile-slurp-perl    9999.13-1          single call read & write file rout
pn  libnet-amazon-s3-perl <none>             (no description available)
ii  openssh-client        1:5.5p1-6+squeeze1 secure shell (SSH) client, for sec
ii  perl                  5.10.1-17squeeze2  Larry Wall's Practical Extraction 
ii  wodim                 9:1.1.11-1         command line CD/DVD writing tool
ii  zip                   3.0-3              Archiver for .zip files

- -- debconf information excluded

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iQEcBAEBAgAGBQJOUt2XAAoJEJWXIVmJ5BwWHQ8IALobsYprBJu5FVBwcVOXQGkB
EmCgOxVd5ogiZQ2VVxKAQ9F5HBhX/JyjRxI3jxbc2gq3Dfn2FsyQFXpM6rGs9eIu
3yGTAIY1YtV2bHfeiUnl9hhQF3RQcuJ1nClLTw8cifWuGb+3qbmAWcSBicbJmYHV
KhDFncSiKHG7v4Q0uvGhMufAK31uJQQVachLHlex/9fTcZBPatHwFP39Jr6ZwZxA
j5/jwZx9SRDFAnlbOlHwfWjIYxdzrC5GLk95iF5rDYNv/J2+IUQMu6m6vywsskzI
kFXhDQoH9+Ki/rx+BvJ+unQK0TMXAbxyNwYH0gB7T22FJ4v+NoPhrGlCum5M/5E=
=oUkf
-----END PGP SIGNATURE-----
--- a/backup-manager-upload
+++ b/backup-manager-upload
@@ -105,6 +105,61 @@
        }
 }
 
+# The idea behind BM_UPLOADED_ARCHIVES is to keep a database of the archives
+# that have been uploaded so far. This allows multiple executions of upload
+# actions within a day without resending all of the day's archives each time.
+
+# Add one file,host pair to $BM_UPLOADED_ARCHIVES database.
+# Called immediately *after* successful uploading of an archive.
+sub appendto_uploaded_archives($$)
+{
+    my $file = shift;
+    my $host = shift;
+    unless ( defined $file and defined $host ) {
+        print_error "required args needed";
+        return FALSE;
+    }
+
+    my $upload_fname = $ENV{BM_UPLOADED_ARCHIVES};
+    unless ( defined $upload_fname ) {
+        # Uncomment the next line if you want to make BM_UPLOADED_ARCHIVES
+        # mandatory (i.e. always have it around).
+        #print_error "BM_UPLOADED_ARCHIVES is not defined";
+        return FALSE;
+    }
+
+    # If $file is already in the database, append the host to that line;
+    # else append a line "$file $host" at the end.
+
+    my $io_error = 0;
+    if ( ! system( "grep -q \"^$file \" $upload_fname" ) ) {
+        my $cmd = "sed -i \"s:^$file .*\$:\& $host:\" $upload_fname";
+        $io_error = system("$cmd");
+    }
+    elsif ( open(my $fh, ">>", $upload_fname) ) {
+        print($fh "$file $host\n") or $io_error = 1;
+        close $fh;
+    }
+    else {
+        $io_error = 2;
+    }
+    if ( $io_error ) {
+        print_error "IO error: did not update $upload_fname with '$file $host'";
+        return FALSE;
+    }
+
+    return TRUE;
+}
+
+# Get all files of the specified date; filter the list through
+# BM_UPLOADED_ARCHIVES if it is set in the environment.
+# NOTE: Filtering here implies that an archive is considered uploaded as
+# soon as a single upload to one host succeeds, even if uploads to other
+# hosts fail (in the multiple-host case).
+# To consider it uploaded only when all hosts succeed, the filtering must
+# be transferred to the individual upload subroutines (checking for the
+# existence of the file,host pair in the database).
+#
 sub get_files_list_from_date($)
 {
        my $date = shift;
@@ -129,8 +184,21 @@
         exit E_INVALID;
     }
 
-       while (<$g_root_dir/*$date*>) {
-        push @{$ra_files}, $_;
+    my $upload_fname = $ENV{BM_UPLOADED_ARCHIVES};
+    if ( defined $upload_fname ) {
+        # filter file list through the BM_UPLOADED_ARCHIVES database
+       while (<$g_root_dir/*$date*>) {
+            my $file = $_;
+            my $cmd = "grep -q '$file' $upload_fname";
+            if ( system ("$cmd") ) {
+                push @{$ra_files}, $file;
+            }
+        }
+    }
+    else {
+       while (<$g_root_dir/*$date*>) {
+            push @{$ra_files}, $_;
+        }
        }
 
        return $ra_files;
@@ -267,7 +335,12 @@
                print_error ("Unable to upload \"$file\". ".($! || $@ || $ret));
         return 0;
        }
-       return 1;
+    else {
+        # use same name in both cases (gpg encryption is done on the fly);
+        # continue if writing to uploaded archives file fails.
+        appendto_uploaded_archives($file, $host);
+    }
+    return 1;
 }
 
 # How to upload files with scp.
@@ -551,17 +624,24 @@
         # Put all the files over the connexion
         foreach my $file (@{$ra_files}) {
             chomp $file;
+            # continue if writing to uploaded archives file fails.
             if ($BM_UPLOAD_FTP_SECURE) {
-                unless (ftptls_put_file ($ftp, $file)) {
-                   print_error "Unable to transfer $file";
-                   return FALSE;
-               }
+                if (ftptls_put_file ($ftp, $file)) {
+                    appendto_uploaded_archives($file, $host);
+                }
+                else {
+                    print_error "Unable to transfer $file";
+                    return FALSE;
+                }
             }
             else {
-                unless (ftp_put_file ($ftp, $file)) {
-                   print_error "Unable to transfer $file: " . $ftp->message;
-                   return FALSE;
-               }
+                if (ftp_put_file ($ftp, $file)) {
+                    appendto_uploaded_archives($file, $host);
+                }
+                else {
+                    print_error "Unable to transfer $file: " . $ftp->message;
+                    return FALSE;
+                }
             }
         }
         print_info "All transfers done, loging out from $host\n";
@@ -727,6 +807,8 @@
                                );
                                $uploaded{$filename} = $file_length;
                        }
+            # For the S3 method, we assume success in any case.
+            appendto_uploaded_archives($file, $host);
                }
 
                # get a list of files and confirm uploads
--- a/backup-manager.conf.tpl
+++ b/backup-manager.conf.tpl
@@ -305,6 +305,20 @@
 # Where to put archives on the remote hosts (global)
 export BM_UPLOAD_DESTINATION=""
 
+# Uncomment the 'export ...' line below to activate the uploaded-archives
+# database.
+# Using the database avoids extraneous uploads to remote hosts when you
+# run more than one backup-manager job per day (such as when you use
+# different configuration files for different parts of your filesystem).
+# Note that when you upload to multiple hosts, a single successful upload
+# marks the archive as uploaded; upload errors to specific hosts will
+# then have to be resolved manually.
+# You can specify any filename, but it is recommended to keep the database
+# inside the archive repository. The variable's value has been preset to
+# that.
+#export BM_UPLOADED_ARCHIVES=${BM_REPOSITORY_ROOT}/${BM_ARCHIVE_PREFIX}-uploaded.list
+
 ##############################################################
 # The SSH method
 #############################################################
