On 8/2/24 06:36, Frank Steinmetzger wrote:
On Tue, Jan 30, 2024 at 06:15:09PM -0000, Grant Edwards wrote:
I need to set up some sort of automated backup on a couple Gentoo
machines (typical desktop software development and home use). One of
them used rsnapshot in the past but the crontab entries that drove
that have vanished :/ (presumably during a reinstall or upgrade --
IIRC, it took a fair bit of trial and error to get the crontab entries
figured out).

I believe rsnapshot ran nightly and kept daily snapshots for a week,
weekly snapshots for a month, and monthly snapshots for a couple
years.

Are there other backup solutions that people would like to suggest I
look at to replace rsnapshot?  I was happy enough with rsnapshot (when
it was running), but perhaps there's something else I should consider?

In my early backup days I, too, used rsnapshot to back up my ~ and rsync
for my big media files. But that only covered my PC. My laptop was wholly
un-backed-up. I only synchronised much of my home and my audio collection
between the two with unison. At some point my external 3 TB drive became
free and then I started using borg to finally do proper backups.

Borg is very similar to restic; I actually used the two in parallel for a
while to compare them, but stayed with borg. One pain point was that I
couldn’t switch off restic’s own password protection. Since all my backup
disks are LUKSed anyway, I don’t need that.

Since borg is block-based, it deduplicates at no extra cost, and
it is suitable for big image files which don’t change much. I do full
filesystem backups of /, ~ and my media partition of my main PC and my
laptop. I have one repository for each of those three filesystems, and each
repo receives the data from both machines, so they are deduped. Since both
machines run Arch, their roots are binary identical. The same goes for my
unison-synced homes.
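
To illustrate why a second machine with mostly identical data adds little to a shared repository, here is a minimal Python sketch of chunk-level deduplication. It is not borg's implementation: borg cuts files into variable-size chunks with a content-defined chunker, whereas this toy uses fixed 4-byte chunks and plain SHA-256, and the Repo class and its methods are made up for the example.

```python
import hashlib

CHUNK_SIZE = 4  # absurdly small, so the example data stays readable


class Repo:
    """Toy chunk store: each distinct chunk is kept once, keyed by its hash."""

    def __init__(self):
        self.chunks = {}

    def backup(self, data):
        """Store data; return (list of chunk ids, number of newly stored chunks)."""
        ids, new = [], 0
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            cid = hashlib.sha256(chunk).hexdigest()
            if cid not in self.chunks:
                self.chunks[cid] = chunk
                new += 1
            ids.append(cid)
        return ids, new

    def restore(self, ids):
        """Reassemble the original bytes from the stored chunks."""
        return b"".join(self.chunks[cid] for cid in ids)


repo = Repo()
pc_ids, pc_new = repo.backup(b"AAAABBBBCCCCDDDD")          # stands in for the PC
laptop_ids, laptop_new = repo.backup(b"AAAABBBBCCCCEEEE")  # stands in for the laptop
print(pc_new, laptop_new, len(repo.chunks))
```

The second backup stores only the one chunk that differs. Note that the whole input still has to be read and hashed to find out which chunks are new.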

Borg has retention logic built-in. You can say I want to keep the latest
archive of each of the last 6 days/weeks/months/years, and it even goes down
to seconds. And of course you can combine those rules. The only thing is
they don’t overlap, meaning if you want to keep the last 14 days and the
last four weeks, those weekly retentions start after the last daily
snapshots.
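
The non-overlap can be modelled in a few lines of Python. This is a simplified sketch of the behaviour described above, not borg's code: it assumes one archive per day, handles only daily and weekly rules, and the function names are invented for the example.

```python
from datetime import date, timedelta


def period_key(day, rule):
    """Label the period a day falls into under the given rule."""
    if rule == "daily":
        return day.isoformat()
    if rule == "weekly":
        iso_year, iso_week, _ = day.isocalendar()
        return (iso_year, iso_week)
    raise ValueError(f"unsupported rule: {rule}")


def select_kept(days, rules):
    """Return {day: rule}; rules is a list of (rule, count), applied in order."""
    kept = {}
    for rule, count in rules:
        last_key, taken = None, 0
        for day in sorted(days, reverse=True):  # newest first
            if taken == count:
                break
            key = period_key(day, rule)
            if key == last_key:
                continue  # not the newest archive of its period
            last_key = key
            if day in kept:
                continue  # already kept by an earlier rule: does not count
            kept[day] = rule
            taken += 1
    return kept


# One archive per day for 60 days, ending Sunday 2024-02-04.
newest = date(2024, 2, 4)
days = [newest - timedelta(days=i) for i in range(60)]
kept = select_kept(days, [("daily", 14), ("weekly", 4)])
weekly = sorted(d for d, rule in kept.items() if rule == "weekly")
print([d.isoformat() for d in weekly])
```

In this model the 14 daily keeps reach back to 2024-01-22, and the four weekly keeps are the Sundays from 2023-12-31 to 2024-01-21, so the weekly rule only starts counting in the first week the daily rule did not cover.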

In summary, advantages:
+ fast dedup, built-in compression (different algos and levels configurable)
+ big data files allow for quick mirroring of repositories.
   I simply rsync my primary backup disk to two other external HDDs.
+ Incremental backups are quite fast because borg uses a cache to detect
   changed files quickly.
Disadvantages:
- you need borg to mount the backups
- it is not as fast as native disk access, especially during restore and
   when getting a total file listing due to lots of random I/O on the HDD.


As an example, I currently have 63 snapshots in my data partition repository:

# borg list data/
tp_2021-06-07           Mon, 2021-06-07 16:27:44 [5f9ebd9f24353c340691b2a71f5228985a41699d2e23473ae4e9e795669c8440]
kern_2021-06-07         Mon, 2021-06-07 23:58:56 [19c76211a9c35432e6a66ac1892ee19a08368af28d2d621f509af3d45f203d43]
[... 55 more lines ...]
kern_2024-01-14         Sun, 2024-01-14 20:53:23 [499ce7629e64cffb7ec6ec9ffbf0c595e4ede3d93f131a9a4b424b165647f645]
tp_2024-01-14           Sun, 2024-01-14 20:57:42 [ea2baef3e4bb49c5aec7cf8536f7b00b55fb27ecae3a80ef9f5a5686a1da30d5]
kern_2024-01-21         Sun, 2024-01-21 23:42:46 [71aa2ce6cf4021712f949af068498bfda7797b5d1c5ddc0f0ce8862b89e48961]
tp_2024-01-21           Sun, 2024-01-21 23:48:24 [45e35ed9206078667fa62d0e4a1ac213e77f52415f196101d14ee21e79fc393d]
kern_2024-02-04         Sun, 2024-02-04 23:16:43 [e1b015117143fad6b89cea66329faa888cffc990644e157b1d25846220c62448]
tp_2024-02-04           Sun, 2024-02-04 23:23:15 [e9b167ceec1ab9a80cbdb1acf4ff31cd3935fc23e81674cad1b8694d98547aeb]

The last “tp” (Thinkpad) snapshot contains 1 TB, “kern” (my PC) 809 GB.
And here you see how much space this actually takes on disk:

# borg info data/
[ ... ]
                  Original size   Compressed size    Deduplicated size
All archives:         56.16 TB          54.69 TB              1.35 TB

Obviously, compression doesn’t do much for media files. But it is very
effective in the repository for the root partitions:

# borg info arch-root/
[ ... ]
                  Original size   Compressed size    Deduplicated size
All archives:          1.38 TB         577.58 GB             79.41 GB
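
To put numbers on the two `borg info` outputs quoted above, the following Python sketch works out how much compression saves and how the original size compares to what ends up on disk. It assumes the sizes are decimal units (1 TB = 1000 GB); the variable and function names are just for illustration.

```python
def to_gb(value, unit):
    """Convert a size to GB, assuming decimal units (1 TB = 1000 GB)."""
    return value * {"GB": 1, "TB": 1000}[unit]


# (original, compressed, deduplicated) as reported by `borg info`
repos = {
    "data": ((56.16, "TB"), (54.69, "TB"), (1.35, "TB")),
    "arch-root": ((1.38, "TB"), (577.58, "GB"), (79.41, "GB")),
}

stats = {}
for name, (original, compressed, deduplicated) in repos.items():
    orig_gb = to_gb(*original)
    comp_gb = to_gb(*compressed)
    dedup_gb = to_gb(*deduplicated)
    stats[name] = {
        "compression_saving_pct": 100 * (1 - comp_gb / orig_gb),
        "overall_factor": orig_gb / dedup_gb,
    }
    print(f"{name}: compression saves {stats[name]['compression_saving_pct']:.1f} %, "
          f"original/deduplicated = {stats[name]['overall_factor']:.1f}x")
```

Under that assumption, compression saves about 2.6 % on the media repository and about 58 % on the root repository, while the original data is roughly 42 and 17 times larger, respectively, than the space actually used on disk.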

I would also like to add my +1 to borgbackup. I long ago lost the ability to use snapshots and full-size backups due to the sheer amount of data involved. Currently I use borg to back up multiple hosts to individual repos on a dedicated machine (low-power, ARM-based, 6 TB drive). I also back up the top-level directory in which all those repos are stored to another ARM system (2 TB drive), again using borg. As each first-level backup only adds or changes a few chunks per run, the second level takes only minutes, as against 30 minutes or so for some of the individual hosts. The second level adds redundancy if I lose the first-level backups, and it can be recreated at any time from the first level. This is working for ~15 hosts and VMs of various types involving hundreds of terabytes of original data.

The downside for VMs is that even a slight change to the image requires the whole image to be read and checksummed to identify the changes to be stored. For images hundreds of gigabytes in size (on my hardware/network) it's actually quicker to mount the image and back up the files inside it (camera images in my case) than to back up the VM image itself.

It is more complex than simple schemes, but I regularly restore from both the first- and second-level backups for disaster recovery, testing, rollbacks etc. There is a management package (borgmatic), but I have not tried it as I use my own scripts.

BillK


