Gary,

Sorry about the delay in replying. I have a few moments now.

Do the files you want to back up exist on different hosts, or only on the one server? It sounds like they're only on the one server, but please let me know which it is.

Phil said:

> But if I were you, if this rsync backup set contains *important* files that no longer exist "in production", so to speak, I would sort out those critical files into a distinct designated archive of their own and add *that archive* to your backup set.

And I agree. However, that sorting process need not be manual. Consider using dupeGuru. It is open source software that compares two datasets (even if the folder structures aren't symmetrical) and finds duplicate files. You will need a GUI; it cannot run from the CLI alone.

The general process is as follows:

1. Double check the settings in dupeGuru to ensure it will be hashing every file you intend to compare.
2. On the main tab of dupeGuru, choose to compare by contents. In the bottom pane, add two paths: the path to the root folder of your live files and the path to the root folder of your rsync backups. For the live file path, mark it on the right as a "reference" dataset. dupeGuru won't take any action against a reference dataset.
3. Double check everything, then click compare. Wait for the scans to finish. No action will be taken until you make a decision in the software. Expect this to take some time.
4. Look at the comparison results in the right tab. It should show you filenames and full paths for original and duplicate files. Examine the results for sanity, and then take action. I recommend hiding all reference files, then re-checking that the list now shows only duplicates and that they are all in your rsync backup folder. In dupeGuru, select the option to move the detected duplicate files to another location on your system that is not among the live files or the rsync files.
5. In your file management tools (Dolphin, Nautilus, mc, ls, whatever), examine the rsync backup folder. Is it now much smaller, with many fewer files?
These are the deltas: the files that differed from those seen in the live files.
6. Presumably a lot of duplicate files have been moved somewhere else. Also, hopefully the space needs of the rsync files have been greatly reduced. At this point, if you're satisfied with the results, you can delete the duplicate files.

Be careful. Double and triple check everything. Read the manual. Here be dragons. Etc.

https://dupeguru.voltaicideas.net

Robert Gerber
402-237-8692
[email protected]

On Sat, Aug 23, 2025, 5:00 PM Gary Dale <[email protected]> wrote:

> On 2025-08-23 15:29, Rob Gerber wrote:
> > I don't have a lot of time right now, but my main question is "Given
> > enough time and effort, you almost certainly could do this, but should
> > you?"
> >
> > I don't mean to be a downer, but are you sure screwing around with
> > bacula and "faking" an initial backup condition is worth the risk that
> > you get something wrong and you've "tricked" bacula into thinking
> > things are ok, when they actually aren't and your backups are invalid?
> >
> > Got to run, sorry for lack of details.
> >
> Conversely, wouldn't I be able to find out if it worked fairly quickly?
> The initial backup would be the "fake" so if I can't restore a file from
> it after reverting to the real setup, then I'd know it didn't work. The
> second test would come the next day, if the backup didn't duplicate
> files already in the original and if I could still restore files.
>
> Your cautions are well taken but I've got a lot of files and trying to
> sift through them to find ones I may need in the future is a gargantuan
> task, while keeping a duplicate set of files is a large (2.7T) waste of
> space.
>
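
If a GUI isn't available on the server, the compare-by-contents idea from the dupeGuru steps above can be roughly sketched in Python. This is not how dupeGuru is implemented, only a stand-in under my own assumptions: the function names (hash_file, iter_files, find_duplicates) and the choice of SHA-256 are mine, and the live tree plays the role of the "reference" dataset. It only reports what it finds; it never moves or deletes anything.

```python
import hashlib
import os


def hash_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def iter_files(root):
    """Yield every regular file under root, skipping symlinks."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                yield path


def find_duplicates(reference_root, backup_root):
    """Split the files under backup_root into duplicates and deltas.

    A backup file is a duplicate if its content hash matches any file
    under reference_root, regardless of name or folder structure.
    Returns (duplicates, deltas), where duplicates is a sorted list of
    (backup_path, matching_reference_path) and deltas is a sorted list
    of backup paths with no match in the reference tree.
    """
    # Hash the reference (live) tree first; it is only ever read.
    reference_hashes = {}
    for path in iter_files(reference_root):
        reference_hashes.setdefault(hash_file(path), path)

    duplicates = []
    deltas = []
    for path in iter_files(backup_root):
        match = reference_hashes.get(hash_file(path))
        if match is not None:
            duplicates.append((path, match))
        else:
            deltas.append(path)
    return sorted(duplicates), sorted(deltas)
```

Calling find_duplicates with the live root and the rsync backup root gives two lists to review by hand before anything is moved: the duplicates, and the deltas that exist only in the backup. Hashing every file in both trees will take a while on a dataset of this size.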
_______________________________________________
Bacula-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-users
