On Mon, 15 Aug 2022 04:33:44 -0400,
Dale wrote:
>
> William Kenworthy wrote:
> >
> > On 15/8/22 06:44, Dale wrote:
> >> Howdy,
> >>
> >> With my new fiber internet, my poor disks are getting a workout, and
> >> also filling up. First casualty, my backup disk. I have one directory
> >> that is . . . well . . . huge. It's about 7TBs or so. This is where it
> >> is right now and it's still trying to pack in files.
> >>
> >>
> >> /dev/mapper/8tb 7.3T 7.1T 201G 98% /mnt/8tb
> >>
> >>
> >> Right now, I'm using rsync which doesn't compress files but does just
> >> update things that have changed. I'd like to find some way, software
> >> but maybe there is already a tool I'm unaware of, to compress data and
> >> work a lot like rsync otherwise. I looked in app-backup and there are a
> >> lot of options but I'm not sure which fits best for what I want to do.
> >> Again, backup a directory, compress and only update with changed or new
> >> files. Generally, it only adds files but sometimes a file gets replaced
> >> as well. Same name but different size.
> >>
> >> I was trying to go through the list in app-backup one by one but to be
> >> honest, most links included only go to github or something and usually
> >> don't tell anything about how it works or anything. Basically, as far
> >> as seeing if it does what I want, it's useless. It sort of reminds me of
> >> quite a few USE flag descriptions.
> >>
> >> I plan to buy another hard drive pretty soon. Next month is possible.
> >> If there is nothing available that does what I want, is there a way to
> >> use rsync and have it set to backup files starting with "a" through "k"
> >> to one spot and then backup "l" through "z" to another? I could then
> >> split the files into two parts. I use a script to do this now, if one
> >> could call my little things scripts, so even a complicated command could
> >> work, just may need help figuring out the command.
> >>
> >> Thoughts? Ideas?
> >>
> >> Dale
> >>
> >> :-) :-)
> >>
> > The questions you need to ask are how compressible the data is and how
> > much duplication is in there. Rsync's biggest disadvantage is that it
> > doesn't keep history, so if you need to restore something from last
> > week you are SOL. Honestly, rsync is not a backup program and should
> > only be used the way you do for data that you don't value, as an rsync
> > archive is a disaster waiting to happen from a backup point of view.
> >
> > Look into dirvish - it uses hard links to keep files current but safe
> > and is easy to restore (it looks like an exact copy, so you cp the
> > files back if needed). The downside is that it hammers the hard disk
> > and has no compression, so its only deduplication is via history (my
> > backups stabilised at about 2x original size for ~2yrs of history),
> > though you can use something like btrfs, which has filesystem-level
> > compression.
> >
> > My current program is borgbackup, which is very sophisticated in how it
> > stores data - it's probably your best bet, in fact. I am storing
> > literally tens of TB of raw data on a 4TB usb3 disk (going back years),
> > and yes, I do restore regularly, and not just for disasters but for
> > space-efficient long-term storage I access only rarely.
> >
> > e.g.:
> >
> > A single host:
> >
> > ------------------------------------------------------------------------------
> >
> >                 Original size   Compressed size   Deduplicated size
> > All archives:         3.07 TB           1.96 TB           151.80 GB
> >
> >                 Unique chunks   Total chunks
> > Chunk index:          1026085       22285913
> >
> >
> > Then there is my offline storage - it backs up ~15 hosts (in repos
> > like the above) + data storage like 22 years of email etc. Each host
> > backs up to its own repo, then the offline storage backs that up. The
> > deduplicated size is the actual on-disk size ... compression varies, as
> > it's whatever I used at the time the backup was taken ... currently I
> > have it set to "auto,zstd,11", but it can be mixed in the same repo (a
> > repo is a single backup set - you can nest repos, which is what I do -
> > so ~45TB stored on a 4TB offline disk). One advantage of a system
> > like this is that chunked data rarely changes, so it's only the
> > differences that are backed up (read the borgbackup docs - interesting).
> >
> > ------------------------------------------------------------------------------
> >
> >                 Original size   Compressed size   Deduplicated size
> > All archives:        28.69 TB          28.69 TB             3.81 TB
> >
> >                 Unique chunks   Total chunks
> > Chunk index:
> >
> >
> >
> >
>
>
> For the particular drive in question, it is 99.99% videos. I don't want
> to lose any quality but I'm not sure how much they can be compressed to
> be honest. It could be they are already as compressed as they can be
> without losing resolution etc. I've been lucky so far. I don't think
> I've ever needed a file back after a backup had already overwritten the
> good copy with whatever went wrong on the working copy. Example: I
> update a video only to find the newer copy is corrupt, and I want the
> old one back. I've done that a time or two but I tend to find it before
> I do backups. Still, it is a downside and something I've thought about
> before. I figure when it does happen, it will be something hard to
> replace. Just letting the devil have his day. :-(
>
> For that reason, I find the version type backups interesting. It is a
> safer method. You can have a new file but also have an older file as
> well just in case the new file takes a bad turn. It is an interesting
> thought. It's one not only I should consider but anyone really.
>
> As I posted in another reply, I found a 10TB drive that should be here
> by the time I do a fresh set of backups. This will give me more time to
> consider things. Have I said this before a while back??? :/
>
zfs would solve your problem of corruption, even without versioning.
If you run a scrub at short intervals, you would at least know when a
file has been corrupted. Of course, redundancy such as mirroring is
better, since zfs can then repair the damage instead of only reporting
it. Backups also take a very short time, because when sending from one
zfs to another it knows exactly what data to send.
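
On the earlier question of splitting the rsync run so that names
starting with "a" through "k" go to one disk and everything else goes
to another, here is a minimal sketch of how the two commands could be
built. It is not a tested backup script. The function name and the
example paths (apart from /mnt/8tb) are hypothetical, and it assumes
the split is done on the top-level names inside the source directory,
using rsync's anchored --include/--exclude patterns. The second run
takes everything the first one skips, so names starting with digits or
other characters are not left out.

```python
# Pattern anchored at the root of the transfer: top-level names
# beginning with a through k (either case).
FIRST_HALF = "/[a-kA-K]*"


def rsync_split_commands(source, dest_first, dest_rest):
    """Return two rsync argument lists that together cover `source`.

    The first copies only top-level entries matching FIRST_HALF; the
    second copies every top-level entry that does not match it.
    """
    # A trailing slash makes rsync copy the directory's contents
    # rather than the directory itself.
    src = source.rstrip("/") + "/"
    base = ["rsync", "-a"]
    # Include a-k at the top level, then exclude every other
    # top-level entry. Contents of included directories still copy.
    first = base + ["--include=" + FIRST_HALF, "--exclude=/*", src, dest_first]
    # Exclude a-k at the top level; everything else is copied.
    rest = base + ["--exclude=" + FIRST_HALF, src, dest_rest]
    return first, rest


if __name__ == "__main__":
    # Hypothetical paths; this only prints the commands for review.
    for cmd in rsync_split_commands(
        "/home/dale/videos", "/mnt/8tb/videos", "/mnt/10tb/videos"
    ):
        print(" ".join(cmd))
```

Printing the commands first makes it easy to try them by hand with
rsync's -n (dry run) option before letting a script run them.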
--
Your life is like a penny. You're going to lose it. The question is:
How do you spend it?
John Covici wb2una
[email protected]