Tim Connors wrote at about 15:35:09 +1100 on Thursday, December 23, 2021:
 > Hi all,
 > 
 > Anyone got a way to delete a single path from all historical backups?
 > Doesn't have to be immediate of course - happy for the nightly job to take
 > out the trash.
 > 
 > It was trivial in backuppc v3 - just go to the encoded path in the cpool
 > and truncate the file to 0 to take care of all the hardlinks.  But I
 > suspect here it's a case of finding the mapping of the file from each
 > backup (that info is not in $machine/XferLOG.4218.z, so I don't know where
 > else it would be), and then truncate/remove (since no more hardlinks),
 > while also taking care of legacy v3 files.
 > 

To the extent that you want a hack to crudely remove the contents of
backed-up files without caring about corrupting the pc tree
database(s) or pool integrity, it is perhaps even easier in v4 in that
the (c)pool locations are encoded directly as md5sums of the original
file.

Use BackupPC_attribPrint to find the md5sum path of the backed-up
file. Or, if you already know the md5sum and are not worried about the
infinitesimal chance of an MD5 collision (which, given the crudity of
your v3 approach, you presumably don't care about), then go directly to
the pool entry.
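
To make the "go directly to the pool entry" step concrete, here is a
minimal Python sketch that computes a file's MD5 and builds the pool
path it would probably map to. The function names are hypothetical, and
the directory layout (two levels taken from the first two digest bytes
with the low bit cleared, then the full digest as the file name) is my
assumption about v4, as is the idea that the digest is taken over the
uncompressed contents even for the cpool. Check both against your own
pool before trusting the result; collision entries are not handled.

```python
import hashlib
from pathlib import Path


def md5_hex(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file's full contents."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def candidate_pool_path(top_dir, digest_hex, compressed=True):
    """Build the pool path a v4 install would be expected to use.

    ASSUMPTION: <top_dir>/<cpool|pool>/<b0 & 0xfe>/<b1 & 0xfe>/<digest>,
    where b0 and b1 are the first two digest bytes. Verify on your system.
    """
    raw = bytes.fromhex(digest_hex)
    pool = "cpool" if compressed else "pool"
    level1 = f"{raw[0] & 0xfe:02x}"
    level2 = f"{raw[1] & 0xfe:02x}"
    return Path(top_dir) / pool / level1 / level2 / digest_hex
```

For example, candidate_pool_path("/var/lib/backuppc", md5_hex("copy-of-file"))
would give the path to inspect, where /var/lib/backuppc is only a
placeholder for your TopDir and copy-of-file is an original,
uncompressed copy of the file in question.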

Then you can either *zero* out the (c)pool file as in v3 or *delete*
it. Both approaches create corruptions in the pool and pc trees but
presumably you don't care.

If you choose to delete the file, then on the next (complete) run of
BackupPC_nightly, the pool ref counts will be updated, deleting any
reference to the file in the pool.  However, these ref counts will be
inconsistent with the ref counts for the corresponding host and the
attrib file for the corresponding backup. But unlike your v3 method,
the pool will at least be consistent.

If you choose to zero out the file, the refcounts will be consistent
between the pool and pc tree.  However, technically, the pool will be
inconsistent in that the pool file md5sum name won't match the
contents. This could potentially mess up future backups of a file with
the same md5sum in that, unless '-c' is used in rsync, BackupPC will
assume that the file is still in the pool and not back it up. This can
be dangerous.
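
The name/content mismatch is easy to demonstrate. The Python sketch
below (the function name is hypothetical) checks whether a file's name
still equals the MD5 of its bytes. It assumes an uncompressed pool file
whose name begins with the 32-character hex digest; cpool files would
have to be decompressed first, which this does not attempt.

```python
import hashlib
from pathlib import Path

# MD5 of zero bytes: the digest any truncated file will have.
EMPTY_MD5 = "d41d8cd98f00b204e9800998ecf8427e"


def name_matches_content(pool_file):
    """Return True if the file name's first 32 hex chars equal the MD5
    of the file's current bytes.

    ASSUMPTION: the file is stored uncompressed and is named after the
    digest of its contents.
    """
    p = Path(pool_file)
    actual = hashlib.md5(p.read_bytes()).hexdigest()
    return p.name[:32].lower() == actual
```

After zeroing, every such file hashes to EMPTY_MD5 while keeping its old
name, so this check fails for it, yet a later backup of a file with the
original digest would still be matched to that pool entry.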

Note that the problems listed in the last paragraph existed with your
crude delete method for v3. Additionally, since partial file md5sum
collisions were common in v3 (since only the md5sum of the first and
last parts of the file were calculated), in your old method, it was
possible that you could break an md5sum chain in the pool.

For both deleting and zeroing out pool files, the "deleted" file will
still show as being backed up but it will have no content. Subsequent
incremental (non-full) backups may not act or display properly
regarding such files.

Many years ago, I wrote a somewhat complicated perl script to properly
delete individual files from v3 correcting both the pool partial
md5sum chains and the pc tree to delete the files and adjust both
prior and subsequent backups in each incremental/full chain.

Bottom line... Will it delete the contents? Yes. But so will using dd
to delete entries from your filesystem table or, for that matter,
smashing your hard drive with a sledgehammer.  But your approach will
corrupt the pool and pc trees with potentially unappreciated and
unknown consequences. Plus you will lose the integrity of your backups
if you ever want to check them later. If you are appropriately
paranoid about the quality of your backups, you will avoid creating
your own kludges to manipulate backup files atomically, especially
when you seemingly have little clue about how files are stored and
manipulated in BackupPC.


_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    https://github.com/backuppc/backuppc/wiki
Project: https://backuppc.github.io/backuppc/
