Tim Connors wrote at about 15:35:09 +1100 on Thursday, December 23, 2021:
> Hi all,
>
> Anyone got a way to delete a single path from all historical backups?
> Doesn't have to be immediate of course - happy for the nightly job to take
> out the trash.
>
> It was trivial in backuppc v3 - just go to the encoded path in the cpool
> and truncate the file to 0 to take care of all the hardlinks. But I
> suspect here it's a case of finding the mapping of the file from each
> backup (that info is not in $machine/XferLOG.4218.z, so I don't know where
> else it would be), and then truncate/remove (since no more hardlinks),
> while also taking care of legacy v3 files.
>
To the extent that you want a hack to crudely remove the contents of backed up files without caring about corrupting the pc tree database(s) or pool integrity, it is perhaps even easier in v4, in that the (c)pool locations are encoded directly as md5sums of the original file.

Use BackupPC_attribPrint to find the md5sum path of the backed up file. Or if you know it already and are not worried about the infinitesimal chance of an md5sum collision (which, given the crudity of your v3 approach, you presumably are not), then go directly to the pool entry.

Then you can either *zero* out the (c)pool file as in v3 or *delete* it. Both approaches create corruptions in the pool and pc trees but presumably you don't care.

If you choose to delete the file, then on the next (complete) run of BackupPC_nightly, the pool ref counts will be updated, deleting any reference to the file in the pool. However, these ref counts will be inconsistent with the ref counts for the corresponding host and the attrib file for the corresponding backup. But unlike your v3 method, the pool will at least be consistent.

If you choose to zero out the file, the ref counts will be consistent between the pool and pc tree. However, technically, the pool will be inconsistent in that the pool file md5sum name won't match the contents. This could potentially mess up future backups of a file with the same md5sum in that, unless '-c' is used in rsync, BackupPC will assume that the file is still in the pool and not back it up. This can be dangerous.

Note that the problems listed in the last paragraph existed with your crude delete method for v3. Additionally, since partial file md5sum collisions were common in v3 (since only the md5sum of the first and last parts of the file were calculated), in your old method it was possible that you could break an md5sum chain in the pool.

For both deleting and zeroing out pool files, the "deleted" file will still show as being backed up but it will have no content.
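
To make the v4 naming scheme concrete, here is a rough Python sketch of how a full-file md5sum might map to a (c)pool path, and why a zeroed-out pool file no longer matches its own name. Treat it as an illustration under assumptions, not as BackupPC's actual code: the helper names `pool_path` and `name_matches_contents` are made up, the top directory `/var/lib/backuppc` is only a common default, and the two-level directory layout (first two digest bytes with the low bit masked off) is my understanding of v4 and should be checked against your own install.

```python
import hashlib
from pathlib import PurePosixPath


def pool_path(digest_hex: str,
              top_dir: str = "/var/lib/backuppc",  # assumed TopDir, adjust to your install
              compressed: bool = True) -> PurePosixPath:
    """Map a hex md5 digest to its assumed v4 pool location (hypothetical helper)."""
    digest = bytes.fromhex(digest_hex)
    if len(digest) < 16:
        raise ValueError("expected at least a 16-byte md5 digest")
    # Assumed layout: two directory levels taken from the first two digest
    # bytes, each with the lowest bit cleared (128 x 128 directories).
    level1 = f"{digest[0] & 0xfe:02x}"
    level2 = f"{digest[1] & 0xfe:02x}"
    pool = "cpool" if compressed else "pool"
    return PurePosixPath(top_dir) / pool / level1 / level2 / digest_hex.lower()


def name_matches_contents(digest_hex: str, data: bytes) -> bool:
    """True if the (uncompressed) file contents hash to the pool file's name."""
    return hashlib.md5(data).hexdigest() == digest_hex[:32].lower()


if __name__ == "__main__":
    original = b"hello"
    digest = hashlib.md5(original).hexdigest()
    print(pool_path(digest))
    # The intact file matches its name; the zeroed-out file does not.
    print(name_matches_contents(digest, original))
    print(name_matches_contents(digest, b""))
```

The last two calls show the inconsistency in miniature: once the contents are truncated, the name in the pool still advertises the old md5sum, so anything that trusts the name will believe the original data is still there.
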
Subsequent non-full and/or incremental backups may not act or display properly regarding such files.

Many years ago, I wrote a somewhat complicated Perl script to properly delete individual files from v3, correcting both the pool partial md5sum chains and the pc tree to delete the files and adjust both prior and subsequent backups in each incremental/full chain.

Bottom line... Will it delete the contents? Yes. But so will using dd to delete entries from your filesystem table or, for that matter, smashing your hard drive with a sledgehammer. But your approach will corrupt the pool and pc trees with potentially unappreciated and unknown consequences. Plus you will lose the integrity of your backups if you ever want to check them later.

If you are appropriately paranoid about the quality of your backups, you will avoid creating your own kludges to manipulate backup files atomically, especially when you seemingly have little clue about how files are stored and manipulated in BackupPC.

_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: https://github.com/backuppc/backuppc/wiki
Project: https://backuppc.github.io/backuppc/