Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-27 Thread Stuart Henderson
On 2023/11/26 11:36, Crystal Kolipe wrote:
> On Sun, Nov 26, 2023 at 01:52:22PM -, Stuart Henderson wrote:
> > On 2023-11-24, Crystal Kolipe  wrote:
> > > At the end of last year, I did a comprehensive write-up about using blu-ray
> > > recordable on OpenBSD, and as part of that I checked around 100 BD-R discs
> > > that had been written about 10 years previously and verified as good at the
> > > time.  Ten years later, I found exactly ZERO bad discs.  All data was
> > > readable from every single disc, (and returned the correct checksums).
> > 
> > Anyone know whether USB BD-R drives are likely to work on OpenBSD?
> 
> From a software point of view you are just writing to the /dev/rcd* device, so
> any standards-compliant USB BD-R drive should work.
> 
> Like any USB peripherals, there are probably some stupid ones that don't or
> would require tweaks to the kernel to be recognised, etc.
> 
> Having said that, for maximum reliable operation you need a sustained data
> rate up to ~50 Mb/sec, and I wouldn't entirely trust USB for that.
> 
> Any reason why you particularly want a USB one?  If I wanted a reliable
> external BD-R drive that I could move between machines, I'd probably put one
> of the Pioneer ones in an external eSATA enclosure.

I don't think I have any hardware with eSATA (and most of my boxes
are either laptops or mini PCs - I think there's only one machine
where I could add a card with eSATA).

> The one Asus drive I tested had a few quirks, BTW, so I wouldn't be
> inclined to invest in one of those.

Thanks for the hints.



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-26 Thread Crystal Kolipe
On Sun, Nov 26, 2023 at 04:36:43PM -0500, Geoff Steckel wrote:
> On 2023-11-24, Crystal Kolipe  wrote:
> >>At the end of last year, I did a comprehensive write-up about using blu-ray
> >>recordable on OpenBSD, and as part of that I checked around 100 BD-R discs
> >>that had been written about 10 years previously and verified as good at the
> >>time.  Ten years later, I found exactly ZERO bad discs.  All data was
> >>readable from every single disc, (and returned the correct checksums).

> Do you have any data about blu-ray double layer lifetime?

Most of the discs in that test that I mentioned were single layer, I think
there were probably six or eight that were dual layer.  But out of that small
sample, I had no issues with reading them several years later.

Dual layer recording _is_ more picky and error prone in general.  I have had
more difficulty finding good combinations of drive and brand of media when
using BD-R DL than with single layer.  But almost always, any problems are
repeatable with the same combination of drive and brand of media, whereas
other known good combinations work fine the vast majority of the time.

Reading and writing BD-R DL is somewhat slower than single layer, so for
smaller datasets I've generally preferred the 25 GB discs.

Just to be clear: these are discs that verified successfully when first
written.  What I'm saying is that I've not seen degradation _over time_.

Those that read correctly the first time have never given me problems years
later, which is very different to my experiences with CD-R and DVD-R.



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-26 Thread Geoff Steckel

On 2023-11-24, Crystal Kolipe wrote:

> At the end of last year, I did a comprehensive write-up about using blu-ray
> recordable on OpenBSD, and as part of that I checked around 100 BD-R discs
> that had been written about 10 years previously and verified as good at the
> time.  Ten years later, I found exactly ZERO bad discs.  All data was
> readable from every single disc, (and returned the correct checksums).

Do you have any data about blu-ray double layer lifetime?
Thanks!



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-26 Thread Geoff Steckel

On 11/26/23 08:52, Stuart Henderson wrote:

> Anyone know whether USB BD-R drives are likely to work on OpenBSD?

I've used several. The XD08UMB-S works for reading; I haven't tried writing yet.
Earlier ones worked for both reading and writing.



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-26 Thread Nowarez Market


Crystal Kolipe  wrote:

> The one Asus drive I tested had a few quirks, BTW, so I wouldn't be
> inclined to invest in one of those.


Here an Asus player too..



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-26 Thread Crystal Kolipe
On Sun, Nov 26, 2023 at 01:52:22PM -, Stuart Henderson wrote:
> On 2023-11-24, Crystal Kolipe  wrote:
> > At the end of last year, I did a comprehensive write-up about using blu-ray
> > recordable on OpenBSD, and as part of that I checked around 100 BD-R discs
> > that had been written about 10 years previously and verified as good at the
> > > time.  Ten years later, I found exactly ZERO bad discs.  All data was
> > readable from every single disc, (and returned the correct checksums).
> 
> Anyone know whether USB BD-R drives are likely to work on OpenBSD?

From a software point of view you are just writing to the /dev/rcd* device, so
any standards-compliant USB BD-R drive should work.

Like any USB peripherals, there are probably some stupid ones that don't or
would require tweaks to the kernel to be recognised, etc.

Having said that, for maximum reliable operation you need a sustained data
rate up to ~50 Mb/sec, and I wouldn't entirely trust USB for that.

Any reason why you particularly want a USB one?  If I wanted a reliable
external BD-R drive that I could move between machines, I'd probably put one
of the Pioneer ones in an external eSATA enclosure.

The one Asus drive I tested had a few quirks, BTW, so I wouldn't be
inclined to invest in one of those.



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-26 Thread Stuart Henderson
On 2023-11-24, Crystal Kolipe  wrote:
> At the end of last year, I did a comprehensive write-up about using blu-ray
> recordable on OpenBSD, and as part of that I checked around 100 BD-R discs
> that had been written about 10 years previously and verified as good at the
> time.  Ten years later, I found exactly ZERO bad discs.  All data was
> readable from every single disc, (and returned the correct checksums).

Anyone know whether USB BD-R drives are likely to work on OpenBSD?




Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-25 Thread Stuart Henderson
On 2023-11-24, Stephen Wiley  wrote:
> I was messing with blu-ray a couple years ago for archiving. Last I checked
> it's pretty marginal in terms of cost when compared with SSDs.

SSDs are absolutely not suitable for long term archival. They need to be
kept powered to avoid losing data over the medium term, and even then
are still subject to problems.

Magnetic HDDs are better for medium term but if you want to be sure things
are readable in say 30 years you should probably look elsewhere too.


-- 
Please keep replies on the mailing list.



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-24 Thread Mihai Popescu
> Another interesting thread.

Not at all. Hollywood movies provide much more interesting ideas
about humans trying to obtain immortality.
Also, midnight psychotherapy threads are not very interesting.



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-24 Thread deich...@placebonol.com



On November 24, 2023 2:48:06 PM MST, Crystal Kolipe wrote:
>On Fri, Nov 24, 2023 at 04:01:11PM -0500, Stephen Wiley wrote:
>> I was messing with blu-ray a couple years ago for archiving. Last I checked
>> it's pretty marginal in terms of cost when compared with SSDs.
>
>Archiving to SSD?  You can't be serious.  I've seen more spurious unreported
>bit flips from SSDs than just about any other storage medium.  SSD would be
>beyond my last choice for long term storage of anything I cared about.


Another interesting thread.

I would never consider anything that uses semiconductor technology for
long-term archival storage.  Over 40 years ago I worked on an Intel research
project looking at the effects of alpha particles on memory cells.  Forty years
later, alpha particles are much bigger relative to current architectures.  We
used to measure in microns; now we are at or close to sub-nanometer structures.




Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-24 Thread Crystal Kolipe
On Fri, Nov 24, 2023 at 04:01:11PM -0500, Stephen Wiley wrote:
> I was messing with blu-ray a couple years ago for archiving. Last I checked
> it's pretty marginal in terms of cost when compared with SSDs.

Archiving to SSD?  You can't be serious.  I've seen more spurious unreported
bit flips from SSDs than just about any other storage medium.  SSD would be
beyond my last choice for long term storage of anything I cared about.

And is cost really the main or even most important factor when deciding on a
backup strategy?

> I don't think the larger capacity disks I bought are all that
> high quality either. I haven't checked on them lately but I suspect they won't
> be readable in ten years.

Why not?  (Unless you're using LTH discs, of course)

Did you verify them at the time of writing?  Manually verify, that is, not
relying on the automatic read after write that blu-ray usually does if you
have sparing enabled?

At the end of last year, I did a comprehensive write-up about using blu-ray
recordable on OpenBSD, and as part of that I checked around 100 BD-R discs
that had been written about 10 years previously and verified as good at the
time.  Ten years later, I found exactly ZERO bad discs.  All data was
readable from every single disc, (and returned the correct checksums).

This is not the same technology as DVD-R.  Good quality HTL blu-ray media does
not, in my experience, deteriorate in the way organic dye based optical media
deteriorates.

> When you add to that the complexity of the whole multisession recording thing

Huh?  You can just write a _tar archive_ to the bd-r and read it back as if it
was a tape.  Where is the complexity in that?
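
To illustrate the tar-as-tape idea, here is a minimal sketch.  The function
names are hypothetical, /dev/rcd0c is only an assumed device name, and TARGET
defaults to a plain file so the commands can be tried without a drive.  Any
preparation a particular drive or disc needs before writing is not shown.

```sh
#!/bin/sh
# Sketch only.  TARGET defaults to a regular file standing in for the drive.
# On a real system it might be something like /dev/rcd0c (an assumption:
# check which device your drive actually attaches as).
TARGET=${TARGET:-/tmp/bd-r-standin.tar}

# Write directory $1 to TARGET as a single tar archive.
write_archive() {
    tar -cf "$TARGET" -C "$1" .
}

# List the archive straight from TARGET, as you would from a tape.
list_archive() {
    tar -tf "$TARGET"
}

# Extract the archive from TARGET into the existing directory $1.
restore_archive() {
    tar -xf "$TARGET" -C "$1"
}

# Manual verification: restore to a scratch directory and compare the
# result with the source directory $1.
verify_archive() {
    scratch=$(mktemp -d) || return 1
    restore_archive "$scratch" && diff -r "$1" "$scratch" >/dev/null
    status=$?
    rm -rf "$scratch"
    return $status
}
```

Running verify_archive straight after writing is the kind of manual check
mentioned earlier in this message, as opposed to trusting the drive's own
read-after-write.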

> I'm just not sure it's an improvement over hosting whatever disk is common at
> the current time and periodically running rsync via some mechanism to keep fresh
> copies of your archives.

So your 'archived' data is actually locally on-line and re-writable at any
time?  Silently overwritten by a faulty disk controller, for example?  That
sounds like a very bad choice to me.



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-24 Thread Stephen Wiley
I was messing with blu-ray a couple years ago for archiving. Last I checked
it's pretty marginal in terms of cost when compared with SSDs. It's just hard
to compete with the progress everyone's been making with semiconductor
manufacturing. I don't think the larger capacity disks I bought are all that
high quality either. I haven't checked on them lately but I suspect they won't
be readable in ten years.

When you add to that the complexity of the whole multisession recording thing
I'm just not sure it's an improvement over hosting whatever disk is common at
the current time and periodically running rsync via some mechanism to keep fresh
copies of your archives.

On Wed, Nov 22, 2023 at 04:49:33PM -0300, Crystal Kolipe wrote:
> On Wed, Nov 22, 2023 at 08:23:40PM +0100, i...@tutanota.com wrote:
> > > Once data is no longer "work in progress", archive it to write-only
> > > media and take it out of the regular backup loop.
> > 
> > What kind of write-only media do you use/recommend?
> 
> It depends on quite a few factors including the quantity of data you need to
> back up, and how much you are prepared to spend on equipment and media.
> 
> For a home or small office user, the most accessible in terms of cost, and
> useful in terms of capacity WORM device is probably a bluray disc recorder.
> 
> There are certainly other options, including, (much), more expensive optical
> disc formats such as Archival Disc, and certain LTO tapes which are not really
> WORM in the strictest sense but for most purposes behave like it.
> 
> But if you just want to "dip your toes" in to keeping physical copies of
> valuable data on a disc that can't be overwritten by software and isn't
> subject to the same hazards as magnetic media, then BD-R is probably the best
> way in to that.
> 
> And speaking from experience, it's _much_ more reliable than DVD-R or CD-R as
> long as the discs are correctly written in the first place.
> 
> If you search around the internet, you'll easily find a lot of negative
> commentary about BD-R from people who _don't use it_.  In my experience it
> works quite well, and certainly can be used on OpenBSD machines with little
> difficulty.
> 
> (BD-RW can even be written as a regular block device, and doesn't require
>  special writing software, but that's not WORM media.)
> 
> Oh, and punched aluminised tape is also quite a good choice for small
> files.  That'll outlast practically anything else.
> 



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-24 Thread tetrosalame

On 22/11/2023 04:16, i...@tutanota.com wrote:

> Ever since I read a post on @misc from Nick Holland to someone asking
> about running a large filesystem on OpenBSD, in which Nick wrote:

[...]

> Then for every important big file use something like par2cmdline to
> create parity data.

[...]

> Of course backup is essential, it's not about that.
>
> Running a script that checks all checksums is a "poor man's" version of
> ZFS scrubbing. If bit rot is found, repair the file with par2 parity.



You already got many good pointers. Let me add one: if you're after
*long term* archiving, establish a process first. Understand what you
need, what your goals are and how much hassle you can bear (err...afford).
Write it down, on paper (in stone would be a better choice). Double
check you're not working with data subject to regulations. If not, read
some of them anyway; it's interesting.

When the process is nailed down and you/your organization are willing to
follow it, then look for tools.
Someone in this thread correctly observed that papyrus and friends
lasted centuries: right, because they were simple.


Good luck!
--
f



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-23 Thread Crystal Kolipe
On Fri, Nov 24, 2023 at 08:32:20AM +1000, Stuart Longland VK4MSL wrote:
> On 22/11/23 18:25, Crystal Kolipe wrote:
> >1. Once data is no longer "work in progress", archive it to write-only
> >media and take it out of the regular backup loop.  In most cases this
> >drastically reduces the volume of data you need to manage.  Feel free
> >to keep a local on-line copy on a regular disk too for faster access.
> 
> If it's "write only", how does one read it?

You try to write each bit a second time, and if the write succeeds then you
know that it wasn't already written the first time around.

Haven't you ever used magnetic core memory?



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-23 Thread Stuart Longland VK4MSL

On 22/11/23 18:25, Crystal Kolipe wrote:

> 1. Once data is no longer "work in progress", archive it to write-only
>    media and take it out of the regular backup loop.  In most cases this
>    drastically reduces the volume of data you need to manage.  Feel free
>    to keep a local on-line copy on a regular disk too for faster access.


If it's "write only", how does one read it?
--
Stuart Longland (aka Redhatter, VK4MSL)

I haven't lost my mind...
  ...it's backed up on a tape somewhere.



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-23 Thread Nowarez Market
Geoff Steckel :

> Of course, there's one storage medium verified to last for centuries.
> Good ink on rag paper stored dry. Papyrus is good for millennia.
> Not entirely a joke.

This needs any bullet-proof..



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-23 Thread Geoff Steckel

On 11/22/23 20:31, j...@bitminer.ca wrote:

> For long-term storage, you have other risks to manage, not the
> simple technical risk of "will my portable-USB disk be readable in
> 2038?".


Interfaces die - IDE interface cards? Even if you have one the ISA bus
might not be available. Parallel SCSI, parallel PCI etc.

Archiving data needs regular copy and verify to newer
formats/media/attachments and multiple copies stored multiple places.
I have files from 1982 which have survived many disasters.

Of course, there's one storage medium verified to last for centuries.
Good ink on rag paper stored dry. Papyrus is good for millennia.
Not entirely a joke.



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-22 Thread j


> And speaking from experience, it's _much_ more reliable than DVD-R or CD-R as
> long as the discs are correctly written in the first place.



For long-term storage, you have other risks to manage, not the
simple technical risk of "will my portable-USB disk be readable in
2038?".

If you are a home-based user, or sole practitioner, or lone-gunman
archivist, you should consider the possibility that in 20 years you
will no longer be able to remember how to process old disks and
files.  Writing yourself some instructions would be essential.  On
paper. And, too, regularly practicing on old media.

In the small-business case, your technology, media, even corporate
culture can result in unexpected destruction of "important" media
by unaware individuals who will make some unbelievable decisions.
Like "throw out any media smaller than 5TB as it's obsolete."  "Toss
those old DLT-4000 drives as nobody uses them anymore."  "Nobody
needs this box of discs..."

(No shit, it happened to me.)

In larger corporate cultures, for example with contract commitments
of decades, well, that's out of scope for this discussion.  But
it's fun to imagine "How do I support WordPerfect 2.3 in 2039?".

J



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-22 Thread Crystal Kolipe
On Wed, Nov 22, 2023 at 08:23:40PM +0100, i...@tutanota.com wrote:
> > Once data is no longer "work in progress", archive it to write-only
> > media and take it out of the regular backup loop.
> 
> What kind of write-only media do you use/recommend?

It depends on quite a few factors including the quantity of data you need to
back up, and how much you are prepared to spend on equipment and media.

For a home or small office user, the most accessible in terms of cost, and
useful in terms of capacity WORM device is probably a bluray disc recorder.

There are certainly other options, including, (much), more expensive optical
disc formats such as Archival Disc, and certain LTO tapes which are not really
WORM in the strictest sense but for most purposes behave like it.

But if you just want to "dip your toes" in to keeping physical copies of
valuable data on a disc that can't be overwritten by software and isn't
subject to the same hazards as magnetic media, then BD-R is probably the best
way in to that.

And speaking from experience, it's _much_ more reliable than DVD-R or CD-R as
long as the discs are correctly written in the first place.

If you search around the internet, you'll easily find a lot of negative
commentary about BD-R from people who _don't use it_.  In my experience it
works quite well, and certainly can be used on OpenBSD machines with little
difficulty.

(BD-RW can even be written as a regular block device, and doesn't require
 special writing software, but that's not WORM media.)

Oh, and punched aluminised tape is also quite a good choice for small
files.  That'll outlast practically anything else.



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-22 Thread iio7
> Once data is no longer "work in progress", archive it to write-only
> media and take it out of the regular backup loop.

What kind of write-only media do you use/recommend?



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-22 Thread Stefan Kreutz
On Wed, Nov 22, 2023 at 09:49:53AM +0100, Maja Reberc wrote:
> Does anyone recommend FAT32-formatted 1 TB external HDDs for
> OS-portable backups (using archive splitting to bypass the 4 GB limit)?
> I've heard FAT32 is very inefficient with big partitions. I currently
> have a mess of ext4 for Linux, ZFS (yes ...) for FreeBSD, and nothing
> yet for OpenBSD (sadly, my favourite OS does not support redshift on my
> Nvidia card, and that is a requirement for my eyes).

I use a 2TB FAT32-formatted USB HDD for portable backups. The archives
are split, scrypt-encrypted, gzipped GNU tarballs so I can read them on
virtually any machine. I use GNU tar because of extra long filenames I
don't control, otherwise I would prefer the POSIX ustar format.

It works just fine for my purposes.
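
For anyone wanting to try the archive-splitting part, a minimal sketch
follows.  The function names, the part-file prefix and the default chunk size
are illustrative assumptions, and the scrypt encryption step is deliberately
left out.

```sh
#!/bin/sh
# Sketch only.  FAT32 cannot hold a single file of 4 GB or more, so each
# part is kept well below that.  CHUNK uses split(1) size syntax.
CHUNK=${CHUNK:-2000m}

# Archive directory $1 into gzipped tar parts named ${2}aa, ${2}ab, ...
make_split_backup() {
    tar -czf - -C "$1" . | split -b "$CHUNK" - "$2"
}

# Reassemble the parts (the shell expands "$1"* in sorted order) and
# extract them into the existing directory $2.
restore_split_backup() {
    cat "$1"* | tar -xzf - -C "$2"
}
```

For example, make_split_backup "$HOME/docs" /mnt/usb/docs.tgz. would leave
docs.tgz.aa, docs.tgz.ab and so on under /mnt/usb (both paths hypothetical).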



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-22 Thread Maja Reberc
On Wed, 22 Nov 2023 06:47:11 -0300
Crystal Kolipe  wrote:

> I don't want to encourage people to just copy and paste some random
> scripts that were written to meet our needs but most likely don't
> exactly meet theirs.
> 
> But as a _starting point for writing your own_, the following script
> will let you create and verify checksums, as well as identify files
> which don't yet have a checksum recorded.

Thanks a lot for sharing!

As recommended, I will use it to get to know the process and craft my
own tools.


Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-22 Thread Crystal Kolipe
On Wed, Nov 22, 2023 at 09:49:53AM +0100, Maja Reberc wrote:
> Would you mind sharing the scripts you mentioned for us newbies?

I don't want to encourage people to just copy and paste some random scripts
that were written to meet our needs but most likely don't exactly meet theirs.

But as a _starting point for writing your own_, the following script will let
you create and verify checksums, as well as identify files which don't yet
have a checksum recorded.

All it does is recurse down the directory structure looking for files called
'checksums' in each directory.  If it finds one then it verifies the checksums
it contains and if there are any files which are not listed then it prints a
message to the console with the filename.

So if you wanted to use it to monitor changes to your home directory, you
would just do 'touch checksums' in $HOME, and any subdirectories that you also
wanted to include.  Then invoke the script the first time with 'a' as an
argument to populate those checksum files.

Then, you can just run it with no arguments in $HOME, and it will tell you if
there are any new files, (which you can add by running the script with any
argument other than 'i'), or any changed files, (they will display a FAILED
message).

If you just want to add new files and skip verifying the existing checksums
for speed, the 'a' option will do that.  Likewise, 'i' will create a new
checksums file in a directory that didn't already have one.

Once again, this is intended as an example to get you started writing your
own better version.  I literally wrote and tested this just now in 15 minutes.
It's not what we actually use here.

Note that if a file has changed and fails the checksum, the script still
prints, 'All files have entries in the checksum file'.  This is intentional,
because the changed file is not _new_, it was already known about.  It's just
changed.

#!/bin/ksh
# Needs ksh: the !(pattern) glob below is a ksh extension, not plain POSIX sh.
# 'i' creates an empty checksums file in the current directory.
if [ "$1" == "i" ] ; then touch checksums ; fi
# Visit every directory below here that contains a 'checksums' file.
for i in `find . | grep '/checksums$'` ;
do (
if [ "$1" == "a" ] ; then echo -n "Not v" ; else echo -n "V" ; fi
echo "erifying checksums in directory ${i%/checksums}";
cd "${i%/checksums}";
# Verify the recorded checksums, unless 'a' (add only) was given.
if [ "$1" != "a" ] ; then sha512 -cq checksums; fi
let flag=0;
# Check every file except the checksums files themselves.
for j in !(checksums|checksums.bak) ;
do
if [ ! -d "$j" ] ; then
grep "($j)" checksums > /dev/null || {
if [ -z "$1" ] ; then
echo "$j is not in the checksums file!" ; let flag=1 ;
else
echo "Adding $j to checksums file" ; sha512 "$j" >> checksums ;
fi ; }
fi ;
done ;
if [ $flag -eq 1 ] ; then
echo "Run $0 with any command line arguments to add missing entries to the checksums file.";
else
echo "All files have entries in the checksum file.";
fi ;
 );
done



Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-22 Thread Maja Reberc
On Wed, 22 Nov 2023 05:25:22 -0300
Crystal Kolipe  wrote:
> We have been doing "something similar", in fact much simpler, on
> OpenBSD and other unix-like systems for > 25 years.
> 
> It's trivially simple to protect your data, and you along with
> 99.999% of other people seem to be over thinking it.
> 
> 1. Once data is no longer "work in progress", archive it to write-only
>    media and take it out of the regular backup loop.  In most cases
>    this drastically reduces the volume of data you need to manage.  Feel
>    free to keep a local on-line copy on a regular disk too for faster
>    access.
>
> 2. Write scripts to copy data that matters elsewhere automatically.
>    This can be another drive in the local machine, or even another
>    partition on the same disk.  This takes the place of your "RAID-1 or
>    RAID-5", and is much, much less error-prone because it's just copying
>    files around.
>
> 3. Write a script to verify the copy with the original version and
>    highlight changes.  (Ours is 18 lines of shell script.)
>
> 4. Write a script to create and verify a file of checksums in the
>    current directory.  (Also not complicated - ours is 15 lines of shell
>    script.)
> 
> We have kept many TB of data intact and free of bitrot for decades
> using simple methods like this.  No need for fancy filesystems or
> command line parity tools, just use tar, sha256 and crucially a
> little bit of intelligence, and the problem is solved.
> 
> And yes, we have certainly seen bitflips when reading from disk,
> reading from SSDs, (which overall seem _worse_ for random unreported
> bit flipping), and also bad system RAM which causes data in the
> buffer cache to be corrupted.  All of these threats are easily
> mitigated with tar and sha256, and the aforementioned application of
> some intelligence to the problem.
> 
> The only problem is that it doesn't have a flashy name like "ZFS".

Thank you for this super interesting answer! I am very much for
functional simplicity over complexity one does not understand.

Presently, I use a script, utilising rsync, for fast backups and sync
diffs, but I'd like a more long-term reliable and checksummed solution.

Would you mind sharing the scripts you mentioned for us newbies?

Some additional portability rant:

Does anyone recommend FAT32-formatted 1 TB external HDDs for
OS-portable backups (using archive splitting to bypass the 4 GB limit)?
I've heard FAT32 is very inefficient with big partitions. I currently
have a mess of ext4 for Linux, ZFS (yes ...) for FreeBSD, and nothing
yet for OpenBSD (sadly, my favourite OS does not support redshift on my
Nvidia card, and that is a requirement for my eyes).


Re: OpenBSD alternative setup to ZFS on Linux or FreeBSD

2023-11-22 Thread Crystal Kolipe
On Wed, Nov 22, 2023 at 04:16:00AM +0100, i...@tutanota.com wrote:
> Running disks in RAID1 or RAID5 (pick your poison) with softraid.
> 
> Then for every important big file use something like par2cmdline to
> create parity data.
> 
> par2cmdline can be used to verify and re-create files.
> 
> I would perhaps also create simple checksums for files as well, because
> that's faster to run through a script, checking all files, than
> par2verify.
> 
> For smaller files, perhaps put them into a version control system with
> integrity checking and parity rather than the above.
> 
> Of course backup is essential, it's not about that.
> 
> Running a script that checks all checksums is a "poor man's" version of
> ZFS scrubbing. If bit rot is found, repair the file with par2 parity.
> 
> For send/receive, if needed, I think rsync is adequate as it also uses
>  checksums to validate the transfer of files.
> 
> Any feedback? Do you do something similar on OpenBSD?

We have been doing "something similar", in fact much simpler, on OpenBSD
and other unix-like systems for > 25 years.

It's trivially simple to protect your data, and you along with 99.999% of
other people seem to be over thinking it.

1. Once data is no longer "work in progress", archive it to write-only
   media and take it out of the regular backup loop.  In most cases this
   drastically reduces the volume of data you need to manage.  Feel free
   to keep a local on-line copy on a regular disk too for faster access.

2. Write scripts to copy data that matters elsewhere automatically.  This
   can be another drive in the local machine, or even another partition
   on the same disk.  This takes the place of your "RAID-1 or RAID-5", and
   is much, much less error-prone because it's just copying files around.

3. Write a script to verify the copy with the original version and
   highlight changes.  (Ours is 18 lines of shell script.)

4. Write a script to create and verify a file of checksums in the current
   directory.  (Also not complicated - ours is 15 lines of shell script.)
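
As a rough illustration of step 3, and not the script referred to above, a
copy can be checked against the original with find and cmp.  The function
name verify_copy and its output format are assumptions made for this sketch,
and it does not handle filenames containing newlines.

```sh
#!/bin/sh
# Sketch only.  Compare every regular file under the original tree $1 with
# its counterpart under the copy $2.  Prints one line per problem found and
# returns 1 if there were any, 0 otherwise.
verify_copy() {
    orig=$1
    copy=$2
    status=0
    list=$(mktemp) || return 1
    # Collect paths relative to the original tree.
    (cd "$orig" && find . -type f) > "$list"
    while IFS= read -r f; do
        if [ ! -f "$copy/$f" ]; then
            echo "MISSING: $f"
            status=1
        elif ! cmp -s "$orig/$f" "$copy/$f"; then
            echo "CHANGED: $f"
            status=1
        fi
    done < "$list"
    rm -f "$list"
    return $status
}
```

This only looks for files that are missing from, or differ in, the copy.
Files that exist only in the copy are not reported.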

We have kept many TB of data intact and free of bitrot for decades using
simple methods like this.  No need for fancy filesystems or command line
parity tools, just use tar, sha256 and crucially a little bit of
intelligence, and the problem is solved.

And yes, we have certainly seen bitflips when reading from disk, reading
from SSDs, (which overall seem _worse_ for random unreported bit flipping),
and also bad system RAM which causes data in the buffer cache to be
corrupted.  All of these threats are easily mitigated with tar and sha256,
and the aforementioned application of some intelligence to the problem.
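
A minimal sketch of the tar plus sha256 combination might look like the
following.  The function names and the ".sha256" sidecar file are assumptions
for illustration, not the scripts mentioned above.

```sh
#!/bin/sh
# Sketch only.  Print the SHA-256 of file $1.  OpenBSD ships sha256(1);
# systems with GNU coreutils have sha256sum(1) instead.
hash_file() {
    if command -v sha256 >/dev/null 2>&1; then
        sha256 -q "$1"
    else
        sha256sum "$1" | cut -d ' ' -f 1
    fi
}

# Pack directory $1 into archive $2 and record the archive's checksum
# next to it in $2.sha256.
make_archive() {
    tar -cf "$2" -C "$1" . && hash_file "$2" > "$2.sha256"
}

# Succeeds only if archive $1 still matches its recorded checksum.
check_archive() {
    [ -s "$1.sha256" ] && [ "$(hash_file "$1")" = "$(cat "$1.sha256")" ]
}
```

Running check_archive periodically against every copy, on every medium, is
what catches a flipped bit while a good copy still exists somewhere else.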

The only problem is that it doesn't have a flashy name like "ZFS".