So, you could do the thing you were discussing: temporarily map the
rsynced files onto the directory for the live files, run a bacula full
backup, reverse the mapping, and then run subsequent bacula backups. In
theory, at least.

Here's what I've been able to think of, regarding this:
1. Use 'mount --bind /source/dir/ /mount/point/' to mount your rsync files
on top of your live files. Better turn off anything that relies on the
live files first. The live files will still be in their original locations,
but will be concealed by the mountpoint. The rsync files will also still be
accessible to bacula in their original location. From the mount man page:
"After this call the same contents are accessible in two places."
2. Do something sort of suboptimal: configure your bacula fileset to
exclude the rsync files' directory. It doesn't make any sense to back up
these files twice, given your goal of saving space. The downside is that
you're expressly choosing not to back up your rsync files, so if something
goes wrong with this process, well, you won't have any backups of those
files.
3. Run your bacula full backup. Once it completes, run an incremental to
catch anything new that popped up after the full was launched.
4. Unmount the bind mount. Run an estimate operation for the bacula job.
Does it show a ton of files to capture, on a scale that suggests it's
trying to re-capture the entire live files directory?
5. If not, congrats, you probably did what you are trying to do. Probably.
6. run your incremental backup to capture the deltas.
7. Despite all my protestations, once the retention period kicks in, you're
going to have a new full backup in a few months, and these shenanigans will
fade into memory, and eventually be purged with old backups.
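
To make step 2 concrete: the exclusion would live in the job's FileSet
resource in bacula-dir.conf. Here's a rough sketch, with hypothetical
names and paths (/home for the live files, /home/rsync-archive for the
rsync copies) that you'd adjust to your actual layout:

```
FileSet {
  Name = "HomeNoRsync"
  Include {
    Options {
      # Checksums let bacula verify file contents, not just metadata.
      signature = MD5
    }
    File = /home
  }
  Exclude {
    # Keep the rsync copies out of the backup so they aren't stored twice.
    File = /home/rsync-archive
  }
}
```

For step 4, 'estimate job=YourJobName listing' in bconsole (job name
hypothetical) will report what bacula would capture without actually
running the job, which is a cheap way to sanity-check the file count
after you reverse the mapping.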

Please note that with the dupeguru option, you might find that duplicate
files account for the vast majority of the space used, with the remaining
delta files being fairly small in size and few in number. Beyond a lack of
aesthetic neatness and inherent elegance, it's a decent solution to the
problem.

I understand that you like the other idea more, however.

Regards,
Robert Gerber
402-237-8692
r...@craeon.net


On Fri, Aug 29, 2025 at 8:08 PM Gary Dale <g...@extremeground.com> wrote:

> On 2025-08-27 12:07, Rob Gerber wrote:
>
> Gary,
>
> Sorry about the delay in reply. I have a few moments now.
>
> Do the files you want to back up exist on different hosts, or only on the
> one server? It sounds like they're only on the one server, but please let
> me know which it is.
>
> All the files exist on the one server. However, I will need to log in
> locally as root to be able to umount /home. ssh is set to disallow
> connecting as root.
>
>
>
> Phil said:
> But if I were you, if this rsync backup set contains *important* files
> that no longer exist "in production", so to speak, I would sort out
> those critical files into a distinct designated archive of their own and
> add *that archive* to your backup set.
>
> And I agree. However, that sorting process need not be manual. Consider
> using dupeguru. It is open source software that compares two datasets
> (even if the folder structure isn't symmetrical) and finds duplicate
> files. You will need a GUI - it cannot run in CLI alone.
>
> The general process will be as follows:
> 1. Double-check the settings in dupeguru to ensure it will be hashing
> every file you intend to compare.
> 2. On the main tab of dupeguru, click to compare by contents. In the
> bottom pane, add two paths: the path to the root folder for your live
> files and the path to the root folder for your rsync backups. For the
> live file path, mark it on the right as a "reference" dataset. Dupeguru
> won't take any action against a reference dataset.
> 3. Double-check everything, then click compare. Wait for the scans to
> finish. No action will be taken until you make a decision in the
> software. Expect this to take some time.
> 4. Look at the comparison results. It should show you filenames and full
> paths for original and duplicate files. Examine the results for sanity,
> and then take action. I recommend hiding all reference files, then
> re-checking to ensure that it now shows only duplicates and that they
> are all in your rsync backup folder. In dupeguru, select the option to
> move the detected duplicate files to another location on your system
> that is not among the live files or the rsync files.
> 5. In your file management tools (dolphin, Nautilus, mc, ls, whatever),
> examine the rsync backup folder. Is it now much smaller with many fewer
> files? These are the deltas, the files that were different from those seen
> in the live files.
> 6. Presumably a lot of duplicate files have been moved somewhere else.
> Also, hopefully the space needs of the rsync files have been greatly
> reduced. At this point, if you're satisfied with the results, you can
> delete the duplicate files.
>
> Be careful. Double and triple check everything. Read the manual. Here be
> dragons. Etc.
>
>
> Sounds like more trouble than it's worth, as it would still mean that I
> would have some files that I wouldn't be able to back up with their
> original paths.
>
> I've started the process of installing and configuring Bacula on the
> server but it's not going well. However that's a separate issue.
> _______________________________________________
> Bacula-users mailing list
> Bacula-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-users
>