Hi there,

About a month ago I made the elementary mistake of upgrading my backup
server from Debian Jessie to Debian Stretch.  Just about everything broke.

The biggest concern was the backups themselves.  The upgrade changed
BackupPC from version 3.3.0, which seemed to have been working fine for
years, to version 3.3.1, which didn't work at all.  Wouldn't even start.

After a couple of days trying to get it to start I decided the best thing
to do would be to cut my losses and install a version 4.  So I installed
version 4.2.1.

A couple of days later I had what I thought was a working 4.2.1 and things
started to settle down, but 4.2.1 seemed to be eating a LOT more disc space
than V3.x ever did and I spent another couple of days moving everything I
could off the 3TB drive which holds the pool partition.  V3 used just over
1TB to back up 12TB of data on our various machines.  V4 wanted more like
2.6TB and it was still rising slowly four days after the initial backups
had completed.  Well, they'd stopped, if not exactly completed - there were
some 'partial' backups.

Still fearing "no space left on device" I used BackupPC_ls (it's lovely!)
to look around in the backups, and then took my courage in both hands and
tried using BackupPC_backupDelete to delete some cruft from home directories
which really doesn't need to be backed up; entire directories like '.cache'
and '.moonchild productions' from several home directories on several hosts.

That's when I think things really started to unravel.  AFAICT when I gave a
command to BackupPC_backupDelete to delete a directory from a backup, it in
fact deleted the entire backup.  More than once.  Here's some debug output
for an attempt to delete data from a backup for a remote host.  I've made a
couple of edits to shorten the bash prompt, split the long command line with
a backslash escaped newline, and redact the user's login name:

8<----------------------------------------------------------------------
tornado:[...]/kestrel# >>> /usr/share/backuppc/bin/BackupPC_backupDelete \
-h kestrel -n 271 -l -s Homes /xxxxxx/.cache '/xxxxxx/.moonchild productions'
__bpc_pidStart__ 20851
BackupPC_backupDelete: removing #271/Homes
__bpc_progress_state__ delete #271/Homes
BackupPC_backupDelete: No prior backup for merge
__bpc_progress_fileCnt__ 17116
__bpc_progress_fileCnt__ 17918
__bpc_progress_fileCnt__ 28734
bpc_attrib_backwardCompat: WriteOldStyleAttribFile = 0, KeepOldAttribFiles = 0
BackupPC_backupDelete: removing #271/Homes
__bpc_progress_state__ delete #271/Homes
BackupPC_backupDelete: No prior backup for merge
bpc_attrib_dirWrite: old attrib has same digest; no changes to ref counts
__bpc_pidStart__ 20861
BackupPC_refCountUpdate: computing totals for host kestrel
__bpc_progress_state__ sumUpdate 0/128
bpc_attrib_backwardCompat: WriteOldStyleAttribFile = 0, KeepOldAttribFiles = 0
__bpc_progress_state__ sumUpdate 8/128
__bpc_progress_state__ sumUpdate 16/128
__bpc_progress_state__ sumUpdate 24/128
__bpc_progress_state__ sumUpdate 32/128
__bpc_progress_state__ sumUpdate 40/128
__bpc_progress_state__ sumUpdate 48/128
__bpc_progress_state__ sumUpdate 56/128
__bpc_progress_state__ sumUpdate 64/128
__bpc_progress_state__ sumUpdate 72/128
__bpc_progress_state__ sumUpdate 80/128
__bpc_progress_state__ sumUpdate 88/128
__bpc_progress_state__ sumUpdate 96/128
__bpc_progress_state__ sumUpdate 104/128
__bpc_progress_state__ sumUpdate 112/128
__bpc_progress_state__ sumUpdate 120/128
__bpc_progress_state__ rename total
BackupPC_refCountUpdate: host kestrel got 0 errors (took 6 secs)
BackupPC_refCountUpdate total errors: 0
__bpc_pidEnd__ 20861
BackupPC_backupDelete: got 0 errors
__bpc_pidEnd__ 20851
tornado:/var/lib/backuppc/pc/kestrel# >>>
8<----------------------------------------------------------------------

After a couple of attempts to delete these directories, the next backup of
home directories on the _local_ machine seems to have started from scratch
five days ago and is still in progress.  The actual disc usage as reported
by the OS (df) isn't changing, but BackupPC appears to think that it has
now used 5TB of a 3TB disc and the claimed usage is growing.  If the graph
on the status page were to be believed (and clearly it is not) the V3 pool
is growing by hundreds of gigabytes per day, and the V4 pool is shrinking.
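
For anyone wanting to repeat the df check, here is a minimal sketch.  It
assumes the pool partition is mounted at or above /var/lib/backuppc (as on
this server); the function name fs_used_pct is made up for illustration and
is not part of BackupPC:

```shell
#!/bin/sh
# Print the "use%" figure that df reports for the filesystem holding a
# directory.  fs_used_pct is a hypothetical helper name, not a BackupPC tool.
fs_used_pct() {
    # -P forces the POSIX one-line-per-filesystem format, so the
    # capacity column is always field 5 of the second line.
    df -P "$1" | awk 'NR == 2 { print $5 }'
}

# Assumption: the pool partition is mounted at or above this path.
# Fall back to / so the sketch still runs on a machine without BackupPC.
top_dir=/var/lib/backuppc
[ -d "$top_dir" ] || top_dir=/
fs_used_pct "$top_dir"
```

Running it a few hours apart shows whether the OS-level figure really is
static while the status page figure climbs.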

Output from top:
8<----------------------------------------------------------------------
tornado:~# >>> top -b -n1 -u backuppc -c
top - 12:31:18 up 266 days, 23:46,  9 users,  load average: 0.60, 0.84, 0.94
Tasks: 243 total,   1 running, 241 sleeping,   0 stopped,   1 zombie
%Cpu(s):  7.2 us,  6.0 sy,  3.6 ni, 77.6 id,  4.6 wa,  0.0 hi,  1.0 si,  0.0 st
KiB Mem :  6129868 total,   329516 free,  1121492 used,  4678860 buff/cache
KiB Swap: 12582908 total, 12309444 free,   273464 used.  4750136 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
18621 backuppc  39  19   85504  24340   5144 S   0.0  0.4   0:00.55 /usr/bin/perl /usr/share/backuppc/bin/BackupPC_dump localhost
18841 backuppc  39  19       0      0      0 Z   0.0  0.0   0:02.49 [rsync_bpc] <defunct>
18843 backuppc  39  19   70964  10252    224 S   0.0  0.2   0:00.63 /usr/local/bin/rsync_bpc --bpc-top-dir /var/lib/backuppc --bpc-host-name localhost --bpc-share-name Homes --bpc-bkup-num +
19235 backuppc  39  19   57424   7568   2840 S   0.0  0.1   2:07.82 /usr/bin/perl /usr/share/backuppc/bin/BackupPC -d
8<----------------------------------------------------------------------

Cut'n'pasted from the Server Status page; note "Pool is 663.71+4946.35GiB":
8<----------------------------------------------------------------------
BackupPC Server Status

Currently Running Jobs

Host       Type  User      Start Time        Command                  PID    Xfer PID      Status                Count
localhost  incr  backuppc  2018-09-09 20:00  BackupPC_dump localhost  18621  18841, 18843  backup share "Homes"  25601

General Server Information

    The servers PID is 19235, on host tornado, version 4.2.1, started at 2018-09-04 16:48.
    This status was generated at 2018-09-14 11:15.
    The configuration was last loaded at 2018-09-04 16:48.
    PCs will be next queued at 2018-09-14 12:00.
    Other info:
        1 pending backup requests from last scheduled wakeup,
        0 pending user backup requests,
        0 pending command requests,
        Pool is 663.71+4946.35GiB comprising 1675170+7424490 files and 16512+21845 directories (as of 2018-09-10 01:25),
        Pool hashing gives 2220+9555 repeated files with longest chain 7+4,
        Nightly cleanup removed 88290+105 files of size 580.84+1.00GiB (around 2018-09-10 01:25),
        Pool file system was recently at 80% (2018-09-14 11:15), today's max is 80% (2018-09-14 01:00) and yesterday's max was 80%.
8<----------------------------------------------------------------------
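
To put the "Pool is 663.71+4946.35GiB" line in perspective, the two halves
can simply be added up.  This is only a sketch of the arithmetic; it makes no
assumption about what each half represents, and pool_total_gib is a made-up
helper name, not a BackupPC tool:

```shell
#!/bin/sh
# Add up the two halves of a status-page figure such as "663.71+4946.35GiB".
# pool_total_gib is a hypothetical helper name, not a BackupPC tool.
pool_total_gib() {
    # Strip the trailing unit, split on "+", and sum the halves with awk.
    printf '%s\n' "${1%GiB}" | awk -F'[+]' '{ printf "%.2f\n", $1 + $2 }'
}

pool_total_gib "663.71+4946.35GiB"
```

That prints 5610.06, i.e. well over 5TB claimed on a 3TB disc which df says
is 80% full.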

At this point I have no confidence in BackupPC version 4.

Given what I've done, is any of this to be expected?

Is there a way to go back to V3.x?

I see on github that there were some changes which might affect systems
with mixed V3/V4 backups, in particular

https://github.com/backuppc/backuppc/commit/8c2708c38cc4e6083e5e5345cff6d7a6b6790cee

Will 4.2.2 likely improve things?

Any other suggestions welcome.

--

73,
Ged.


