On 5/3/24 04:26, Marc SCHAEFER wrote:
On Mon, Apr 08, 2024 at 10:04:01PM +0200, Marc SCHAEFER wrote:
For off-site long-term offline archiving, no, I am not using RAID.
Now, as I had to think a bit about ONLINE integrity, I found this
comparison:
https://github.com/t13a/dm-integrity-benchmarks
Contenders are btrfs, zfs, and notably ext4+dm-integrity+dm-raid.
I tend to have a bias favoring UNIX layered solutions over
"all-in-one" solutions, and it seems that performance-wise,
the layered approach is also quite good.
I wrote this script to convince myself that the
ext4+dm-integrity+dm-raid layered approach auto-corrects errors.
Thank you for devising a benchmark and posting some data. :-)
FreeBSD also offers a layered solution. From the top down:
* UFS2 file system, which supports snapshots (requires partitions with
  soft updates enabled).
* gpart(8) for partitions (volumes).
* graid(8) for redundancy and self-healing.
* geli(8) providers with continuous integrity checking.
AFAICT the FreeBSD stack is mature and production quality, which I find
very appealing. But the feature set is not as sophisticated as that of
ZFS, which leaves me wanting. Notably, I have not found a way to replicate
UFS snapshots directly -- the best I can dream up is synchronizing a
snapshot to a backup UFS2 filesystem and then taking a snapshot with the
same name.
I am coming to the conclusion that the long-term survivability of data
requires several components -- a good live file system, good backups, good
archives, continuous internal integrity checking with self-healing,
periodic external integrity checking (e.g. mtree(1)) with some form of
recovery (e.g. manual), etc. If I get the other pieces right, I could
go with OpenZFS for the live and backup systems, and worry less about
data corruption bugs.
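
To make the "periodic external integrity checking" piece concrete, here is a minimal sketch in Python, assuming a plain SHA-256 manifest per directory tree is good enough. It is not mtree(1) and does not use its specification format; the function names build_manifest and compare_manifests are hypothetical, chosen only for illustration.

```python
import hashlib
import os


def build_manifest(root):
    """Map each regular file's path (relative to root) to its SHA-256 hex digest.

    Hypothetical helper: symlinks and non-regular files are skipped.
    """
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            digest = hashlib.sha256()
            with open(full, "rb") as fh:
                # Read in 1 MiB chunks so large files do not fill memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(full, root)] = digest.hexdigest()
    return manifest


def compare_manifests(old, new):
    """Return (missing, added, changed) as sorted lists of relative paths."""
    missing = sorted(set(old) - set(new))
    added = sorted(set(new) - set(old))
    changed = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return missing, added, changed
```

The idea would be to store the manifest somewhere other than the file system being checked, rebuild it periodically, and treat any "changed" entry for a file that should not have changed as a prompt for manual recovery from backup or archive.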
David