Hi,
Tim Fletcher wrote on 2011-10-06 10:17:03 +0100 [Re: [BackupPC-users] Bad
md5sums due to zero size (uncompressed) cpool files - WEIRD BUG]:
> On Wed, 2011-10-05 at 21:35 -0400, Jeffrey J. Kosowsky wrote:
> > Finally, remember it's possible that many people are having this
> > problem but just don't know it,
perfectly possible. I was just saying what possible cause came to my mind (and
many people *could* be running with an almost full disk). As you (Jeffrey)
said, the fact that the errors appeared only within a small time frame may or
may not be significant. I guess I don't need to ask whether you are *sure*
that the disk wasn't almost full back then.
To be honest, I would *hope* that only you had these issues and everyone
else's backups are fine, i.e. that your hardware and not the BackupPC software
was the trigger (though it would probably need some sort of software bug to
come up with the exact symptoms).
> > since the only way one would know would be if one actually computed the
> > partial file md5sums of all the pool files and/or restored & tested ones
> > backups.
Almost.
> > Since the error affects only 71 out of 1.1 million files it's possible
> > that no one has ever noticed...
Well, let's think about that for a moment. We *have* had multiple issues that
*sounded* like corrupt attrib files. What would happen, if you had an attrib
file that decompresses to "" in the reference backup?
> > It would be interesting if other people would run a test on their
> > pools to see if they have similar such issues (remember I only tested
> > my pool in response to the recent thread of the guy who was having
> > issues with his pool)...
>
> Do you have a script or series of commands to do this check with?
Actually, what I would propose in response to what you have found would be to
test for pool files that decompress to zero length. That should be
computationally less expensive than computing hashes - in particular, you can
stop decompressing once you have decompressed any content at all. Sure, that
just checks for this issue, not for possible different ones. On the one hand,
having the *correct* content in the pool under an incorrect hash would not be
a *serious* issue - it wouldn't prevent restoring your data, it would just
make pooling not work correctly (for the files affected). On the other,
different instances of this problem might point toward a common cause. And I
guess it would be possible to have *truncated* data (i.e. not zero-length, but
incomplete just the same) in your files as well.
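A minimal sketch of that cheaper test, assuming BackupPC::FileZIO behaves as
it is used in the script appended below (open with a boolean compression
flag, read returning the number of bytes delivered). The name
classify_pool_file and the passed-in opener are made up for illustration, and
treating a negative or undefined return value as a read error is an
assumption:

```perl
use strict;
use warnings;

# Classify one pool file: 'empty' if it yields no data at all,
# 'data' if at least one byte comes out, 'error' otherwise.
# $open is a code ref returning a handle with a read (\$buf, $len)
# method (hypothetical interface, modelled on BackupPC::FileZIO).
sub classify_pool_file {
    my ($open, $path) = @_;
    my $fh = $open -> ($path);
    return 'error'
        unless defined $fh;
    my $buf;
    # one small read is enough: we stop as soon as any content appears
    my $bytes = $fh -> read (\$buf, 1024);
    $fh -> close ()
        if $fh -> can ('close');
    return 'error'              # assumption: negative/undef means failure
        if not defined $bytes or $bytes < 0;
    return $bytes == 0 ? 'empty' : 'data';
}
```

With the BackupPC libraries in @INC, the opener for cpool files could be
something like sub { BackupPC::FileZIO -> open ($_[0], 0, 1) }. Note that this
only finds files that decompress to nothing; truncated files would still need
the full digest check.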
You weren't asking me, but, yes, I wrote a script to check pool file contents
against the file names back in 2007. I'll append it here, but it would really
be interesting to add information on whether the file decompressed to
zero-length. I could easily add the decompressed file length to the output,
but it would make lines longer than 80 characters. Ok, I did that (and added
counting of zero-length files) - please make your terminals at least 93
characters wide :). I just scanned 1/16th of my pool and found various
mismatches, though none of them zero-length. Probably top-level attrib files.
Link counts might be interesting - I'll add them later.
> I have access to a couple of backuppc installs of various ages and sizes
> that I can test.
Try something like
    BackupPC_verifyPool -s -p
to scan the whole pool, or
    BackupPC_verifyPool -s -p -r 0
to test it on the 0/0/0 - 0/0/f pool subdirectories (-r takes a Perl
expression evaluating to an array of numbers between 0 and 255, e.g. "0",
"0 .. 255" (the default), or "0, 1, 10 .. 15, 5"; note the quotes to make your
shell pass it as a single argument). If you have switched off compression,
you'll have to add a '-u' (though I'm not sure this test makes much sense in
that case). You'll want either '-p' (progress) or '-v' (verbose) to see
anything happening. It *will* take time to traverse the pool, but you can
safely interrupt the script at any time and use the range parameter to resume
it later (though not at the exact place) - or just suspend and resume it (^Z).
You might need to change the 'use lib' statement near the top of the script to
match your distribution.
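To illustrate how a range expression given to -r maps to pool subdirectories,
here is a small standalone sketch that mirrors the character check, the eval
and the sprintf used in the script. The function name range_to_dirs is made up
for this example; it needs no BackupPC libraries:

```perl
use strict;
use warnings;

# Expand a range expression and return the two top directory levels
# (relative to pool/cpool) that each value selects.
sub range_to_dirs {
    my ($spec) = @_;
    die "Range '$spec' contains insecure characters\n"
        if $spec !~ /^[0-9a-fA-Fx.,\s]*$/;
    my @range = eval "($spec)";
    die "Range '$spec' is invalid: $@\n"
        if $@;
    die "Range '$spec' is empty\n"
        if @range == 0;
    die "Range '$spec' contains values outside (0 .. 255)\n"
        if grep { $_ < 0 or $_ > 255 } @range;
    return map { sprintf '%1x/%1x', int ($_ / 16), $_ % 16 } @range;
}

print join (' ', range_to_dirs ('0, 1, 10 .. 15, 5')), "\n";
# prints: 0/0 0/1 0/a 0/b 0/c 0/d 0/e 0/f 0/5
```

Each of these directories contains up to 16 subdirectories (0 - f), which is
why "-r 0" covers 0/0/0 - 0/0/f.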
Hope that helps.
Regards,
Holger
#!/usr/bin/perl
#============================================================= -*-perl-*-
#
# BackupPC_verifyPool: Verify pool integrity
#
# DESCRIPTION
#
# BackupPC_verifyPool tries to verify the integrity of the pool files,
# based on their file names, which are supposed to correspond to MD5
# digests calculated from the (uncompressed) length and parts of their
# (again uncompressed) contents.
# Needs to be run as backuppc user for access to the pool files and
# meta data.
#
# Usage: BackupPC_verifyPool [-v] [-p] [-u] [-r range] [-s]
#
# Options:
#
#   -v         Show what is going on. Without this flag, only errors found
#              are displayed, which might be very boring. Use '-p' for a
#              bit of entertainment without causing your tty to scroll so
#              much.
#   -p         Show more terse progress output.
#   -u         Check the pool (uncompressed files), not the cpool
#              (compressed files).
#   -r range   Specify the pool range to check (see below). 'range'
#              can be a Perl expression like '0 .. 255' or '0, 10, 30 .. 40'.
#              Only safe characters allowed [0-9a-fA-Fx.,\s].
#   -s         Show summary.
#
# The pool range was chosen as in BackupPC_nightly as means to divide
# the possibly lengthy operation of verifying the pool into smaller
# steps. The full pool goes from 0 to 255, that's all you really need to
# know. For more information, see BackupPC_nightly from the BackupPC
# distribution.
#
# AUTHOR
# Holger Parplies <wopp at parplies.de>
#
# VERSION
# $Id$
#
# COPYRIGHT
# Copyright (C) 2007 Holger Parplies
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
#
#========================================================================
use strict;
use lib '/usr/share/backuppc/lib'; # Debian; change to fit your needs
use BackupPC::Lib;
use BackupPC::FileZIO;
use Digest::MD5;        # for the MD5 object created below
use Getopt::Std;
use File::Find;
# $ENV {PATH} = '/bin:/usr/bin';
# chdir '.';
my %opts = (
    v => 0,             # verbose output
    p => 0,             # terse progress output
    u => 0,             # pool (1) or cpool (0)
    r => '0 .. 255',    # range
    s => 0,             # show summary
);
unless (getopts ('vpur:s', \%opts)) {
    die <<EOM;
Usage:   $0 [-v] [-p] [-u] [-r range] [-s]
Options: -v        Verbose output of files being checked and result
         -p        Terse output of what is happening (counter)
         -u        Check uncompressed pool (pool instead of cpool)
         -r range  Check subset of pool (default whole pool = 0 .. 255).
                   Use any valid Perl expression containing only numbers
                   (octal or hexadecimal is ok), dots, commas and whitespace
                   here.
         -s        Show summary.
EOM
}
# unbuffered output
$| = 1;
# some variables
my $bpc = new BackupPC::Lib             # BackupPC object
    or die "Can't create BackupPC object!\n";
my $pooldir = $bpc -> {TopDir} . '/' . ($opts {u} ? 'pool' : 'cpool') . '/';
                                        # Pool directory to check
my @range;                              # Pool range to check
my $mismatch_count = 0;                 # number of invalid files
my $zero_count = 0;                     # of these: number of zero length files
my $file_count = 0;                     # running file count for progress output
my $md5;                                # handle of MD5 object
# # untaint range specification; NOTE THAT THIS IS NOT SECURE!
# $opts {r} = $1
# if $opts {r} =~ /^(.*)$/;
# check range specification
die "Range specification '$opts{r}' contains insecure characters!\n"
    if $opts {r} !~ /^[0-9a-fA-Fx.,\s]*$/;
eval "\@range = ($opts{r});";
if ($@) {
    die "Range specification '$opts{r}' is invalid: $@\n";
} elsif (@range == 0) {     # 'defined @range' is a fatal error in current Perls
    die "Range specification '$opts{r}' is empty. Nothing to do.\n";
} elsif (grep { $_ < 0 or $_ > 255 } @range) {
    die "Range specification '$opts{r}' contains values outside (0 .. 255)\n";
}
# iterate over the specified part of the pool
$md5 = new Digest::MD5
    or die "Can't create MD5 object: $!\n";
foreach my $i (@range) {
    my $dir = sprintf '%s/%1x/%1x', $pooldir, int ($i / 16), $i % 16;
    find (\&validate_pool_dir, $dir)
        if -d $dir;
}
# Summary
if ($opts {s}) {
    # format string kept on one output line (it was broken by mail wrapping)
    printf "%d files in %d directories checked, %d had wrong digests, "
         . "of these %d zero-length.\n",
           $file_count, @range * 16, $mismatch_count, $zero_count;
} elsif ($mismatch_count > 0) {
    print "ERROR: $mismatch_count files in ", $opts {u} ? 'pool' : 'cpool',
          ", range ($opts{r}), seem to be corrupt!\n";
}
# Return code
exit $mismatch_count > 0 ? 1 : 0;
# actual verification process
sub validate_pool_dir {
    my ($name_md5) = ($_ =~ /^([0-9a-fA-F]{32})/);
    my $content_md5;
    return                      # ignore directories
        if -d $File::Find::name;
    $file_count ++;
    if ($opts {p} and not $opts {v}) {
        if (/^([0-9a-fA-F])([0-9a-fA-F])/) {
            print "[$1$2 $file_count]\r";
        } else {
            print "[ $file_count]\r";
        }
    }
    # since we are only reading the file, we can treat the 'compLevel' parameter
    # of BackupPC::FileZIO::open as a boolean
    my $fh = BackupPC::FileZIO -> open ($File::Find::name, 0, ! $opts {u});
    if (defined $fh) {
        my $buf;
        my $bytes = $fh -> read (\$buf, 1024 * 1024 + 100);
        if ($bytes > 1024 * 1024) {
            # read complete file for determining length. Keep first 1MB in $buf,
            # put total length into $bytes, read in 100KB chunks for lower mem usage
            my $buf2;
            my $new = 1;
            while ($new > 0) {
                $new = $fh -> read (\$buf2, 102400);
                $bytes += $new;
            }
        }
        $content_md5 = $bpc -> Buffer2MD5 ($md5, $bytes, \$buf);
        if ($content_md5 ne $name_md5) {
            # print ">$content_md5< != >$name_md5<\n";
            printf "[%5d] %-36.36s (%10d) != %-32.32s\n", $file_count, $_, $bytes,
                   $content_md5;
            $mismatch_count ++;
            $zero_count ++
                if $bytes == 0;
        } elsif ($opts {v}) {
            printf "[%5d] %-36.36s (%10d) ok\n", $file_count, $_, $bytes;
        }
    } else {
        # open failed, count as mismatch
        print "$_: BackupPC::FileZIO::open failed!\n";
        $mismatch_count ++;
    }
}