Hi,
another update from my side, for all who might be interested / future
generations ;-)
I set $Conf{PoolNightlyDigestCheckPercent} = 100
and ran BackupPC_refCountUpdate -m .
I get 104 ERRORs with md5sum d41d8cd98f00b204e9800998ecf8427e (= empty
file) instead of the expected digest. All of these files have 0 length,
see https://paste.ubuntu.com/p/RRBSsXvMWV/
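For future readers, a minimal Python sketch of how such files could be listed independently of BackupPC. It only assumes that a pool file's name starts with the MD5 digest of its contents; the function name and the directory walk are illustrative and not part of BackupPC, and it looks at the on-disk size only.

```python
import hashlib
import os

# MD5 of zero bytes of input; this is the digest quoted in the report above.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()


def zero_length_mismatches(pool_root):
    """Return sorted paths of 0-byte files whose name does not start with EMPTY_MD5.

    Hypothetical helper: pool_root is any directory tree to scan.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(pool_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # A 0-byte file named after a different digest cannot match its name.
            if os.path.getsize(path) == 0 and not name.startswith(EMPTY_MD5):
                hits.append(path)
    return sorted(hits)
```

Running this against a copy or a read-only mount of the pool would be the cautious choice, since it only reads file names and sizes.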
My observations until now:
A. The timestamps of these files fall on one of three dates, see
https://paste.ubuntu.com/p/FcY3gb5q8V/ (sorted via xargs ls -altr)
If I disregard the 2 files from 2017, these files first started to
appear when I upgraded rsync-bpc from 3.0.9.14 to 3.0.9.15
(I usually initiate a test backup right after an upgrade to validate
everything is functioning correctly)
/var/log/dpkg.log: 2020-08-12 22:46:22 upgrade rsync-bpc:amd64 3.0.9.14 3.0.9.15
B. I verified that there are no other pool files with the same digest
(so no digest* / digest + extension).
C. At least some of these pool files are related to attrib files, i.e.
BackupPC_attribPrint aaa.aaa.aaa/8238/f%2fdata%2fmail%2f/fspool/attrib_f1811600a7b6f6e3f2fcafea644e005b
BackupPC_attribPrint: cannot read attrib file mail.bhatia.eu/8238/f%2fdata%2fmail%2f/fspool/attrib_f1811600a7b6f6e3f2fcafea644e005b (f1811600a7b6f6e3f2fcafea644e005b)
D. Looking at the source of rsync-bpc, I find:
https://github.com/backuppc/rsync-bpc/blob/3.0.9/backuppc/bpc_poolWrite.c#L348
and
https://github.com/backuppc/rsync-bpc/commit/3eb30d1b4fe844d0a1409f8a090c518e4d713f40#diff-11e5a98fabcaf3b4938189c693342209
. Could this play a role here?
(I have to admit that I am not (yet?) able to wrap my head around the
pooling, so please excuse any error in my assumptions.)
-->
1. To me, this reads as if BackupPC might decide (or might have decided)
to zero out a file / create an empty pool file?
2. Would it make sense to handle this specific case (empty pool file) in
BackupPC_refCountUpdate when checking the pool, i.e. at
https://github.com/backuppc/backuppc/blob/master/bin/BackupPC_refCountUpdate#L821
Any further help to debug this problem would be appreciated!
Raoul
PS. Cross-posting to backuppc-devel mailing list, just in case.
On 2020-08-27 21:58, Raoul Bhatia wrote:
Hi Craig, Guillermo, all.
A quick update from my side:
I have a (complete?) list of (potentially) broken files as reported by
BTRFS read errors.
I then validated the md5sum from the (c)pool with a one-liner:
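for i in $(grep /cpool/ ~/broken_files.txt); do
  # digest of the uncompressed pool file contents
  M=$(/usr/share/backuppc/bin/BackupPC_zcat "$i" | md5sum | cut -d ' ' -f 1)
  # the digest must appear in the file's path, otherwise flag the file
  echo -n "$M: "; echo "$i" | grep --color "$M" || echo "$i ERR"
done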
--> No error, perhaps I am lucky? :-)
This leaves only one file that might be damaged:
backuppc/pc/abc.def.ghi/1989/refCnt/poolCnt.1.16 ?
I am now experimenting with running
/usr/share/backuppc/bin/BackupPC_refCountUpdate -m
(Until now: No error; exit code 0)
Reading the source I have the following questions:
1.
https://github.com/backuppc/backuppc/blob/master/bin/BackupPC_refCountUpdate#L819
does "rand(100) < $Conf{PoolNightlyDigestCheckPercent}"
My understanding is that each file is selected independently at random,
so I won't get a predictable 1% check on *each* run. On average, with a
sufficiently large pool and over a longer period of time, this might
work out, but not for just a few runs of BackupPC_refCountUpdate.
--> Perhaps there would be another implementation that gets a more
predictable result?
(i.e. create an array of all the files to check, sort them, and then
take the first $Conf{PoolNightlyDigestCheckPercent} of entries, but at
least 1?)
2. I also seem to have accumulated checksum errors over the past years.
--> How do I proceed with these files? If they still exist on the
source, I'd like to re-sync them to the backup.
3. For backuppc/pc/abc.def.ghi/1989/refCnt/poolCnt.1.16 , I do not
find a way to re-check only this one, old backup.
--> Would it be a good idea to add a "-n num" flag to
BackupPC_refCountUpdate to be able to operate on a particular backup?
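To illustrate question 1 above, here is a small Python sketch contrasting the two selection strategies. This is not BackupPC code: the function names are made up, the percentage is assumed to be an integer, and bernoulli_sample only mimics the "rand(100) < percent" test, while fixed_fraction follows the "sort, take the first N percent, but at least 1" idea.

```python
import random


def bernoulli_sample(files, percent, rng=random):
    """Keep each file independently with probability percent/100.

    The number of files kept varies from run to run.
    """
    return [f for f in files if rng.random() * 100 < percent]


def fixed_fraction(files, percent):
    """Sort, then take ceil(percent% of N) entries, but at least one.

    percent is assumed to be an integer between 0 and 100.
    """
    ordered = sorted(files)
    if not ordered:
        return []
    # Integer ceiling of len * percent / 100, with a floor of one entry.
    count = max(1, (len(ordered) * percent + 99) // 100)
    return ordered[:count]
```

One caveat with the second variant: with a fixed sort order it would pick the same entries on every run, so a real implementation would need some rotation between runs, which this sketch does not attempt.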
Thanks for your guidance,
Raoul
On 2020-08-24 20:39, Raoul Bhatia wrote:
Hi Craig and Guillermo,
On 2020-08-23 19:35, Craig Barratt via BackupPC-users wrote:
$Conf{PoolNightlyDigestCheckPercent} is in percent, so you should set
it to 100 to check all the pool files' MD5 digests against their file
names.
As Guillermo mentions, to check the pool MD5 digests, you can
temporarily set $Conf{PoolNightlyDigestCheckPercent} to 100 and
$Conf{PoolSizeNightlyUpdatePeriod} to 1.
When reading the documentation, I also came across these options.
However, I didn't dare to run backuppc / BackupPC_nightly, because
from the documentation:
Overnight, when BackupPC_nightly next runs,
all the unused pool files will be deleted and
this will recover the disk space used by the client's backups.
I didn't want to end up with an empty pool...
If you stop BackupPC, to check all the pool digests, run:
BackupPC_refCountUpdate -m
If you want to also regenerate all the host reference counts (which
will take a long time), you could run:
BackupPC_refCountUpdate -m -F
Meanwhile, with the kind help of the btrfs community, I figured out a
way to get the damaged files. This process is not finished yet;
however, I have a first list:
/mnt/backuppc/pc/abc.def.ghi/1989/refCnt/poolCnt.1.16
/mnt/backuppc/cpool/5c/c8/5cc9373a32e06baaa308a7b341db5ac9
/mnt/backuppc/cpool/b2/62/b3629c46481cb038682aea248c45b89f
/mnt/backuppc/cpool/20/c6/21c61013d40e644af734e28459df0a1a
/mnt/backuppc/cpool/8a/f8/8af935bc53f7199ed75a5695bfd57f26
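A side note on the four cpool paths above: in each of them, the two directory levels look like the first two bytes of the digest with the lowest bit cleared (e.g. 5cc9... lives under 5c/c8). The Python sketch below encodes that pattern. It is inferred only from these four examples and not checked against the BackupPC source, and both function names are hypothetical.

```python
def guessed_pool_subdirs(digest_hex):
    """Return the two directory names guessed from the first two digest bytes.

    Assumption from the listed paths: each byte is ANDed with 0xFE.
    """
    first = int(digest_hex[0:2], 16) & 0xFE
    second = int(digest_hex[2:4], 16) & 0xFE
    return f"{first:02x}", f"{second:02x}"


def path_matches_pattern(path):
    """True if the last two directories of path fit the guessed pattern."""
    parts = path.split("/")
    return tuple(parts[-3:-1]) == guessed_pool_subdirs(parts[-1])
```

If the pattern holds, a check like this could flag recovered files that ended up in the wrong directory, before any decision about moving files out of cpool.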
FYI: The most recent backup from host abc.def.ghi is 2291, so 1989 is
quite far in the past.
How should I proceed when I have a list of broken files?
Move them out from cpool and hope they will be re-synced by the next
(full?) backup?
Thanks,
Raoul
Craig
On Sun, Aug 23, 2020 at 6:45 AM Guillermo Rozas
<guille2...@gmail.com> wrote:
Hi Raoul,
are you using BackupPC v4? If yes, you can use a modification of the
script I posted here:
https://sourceforge.net/p/backuppc/mailman/message/37032497/
In the latest version (4.4.0) you also have the config option
$Conf{PoolNightlyDigestCheckPercent}, which checks the md5 digest of
this fraction of the pool files each night. You can probably set it
to 1 and wait a night for it to run.
Regards,
Guillermo
On Sun, Aug 23, 2020 at 5:38 AM Raoul Bhatia <ra...@bhatia.at> wrote:
Hi,
related to my previous email, it seems that the cause of my issues
was a file system corruption after a "power cut".
I managed to recover (most of?) the data and would now like to do a
thorough check of the data.
Is there any way to "fully verify" the integrity of my backuppc
installation, ideally in a nondestructive way ;-)
Thanks,
Raoul
PS. My backuppc process is stopped.
--
DI (FH) Raoul Bhatia MSc
E-Mail. ra...@bhatia.at
Tel. +43 699 10132530
_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: https://github.com/backuppc/backuppc/wiki
Project: https://backuppc.github.io/backuppc/