Re: [gentoo-user] Re: Suggestions for backup scheme?

2024-02-09 Thread Peter Humphrey
On Friday, 9 February 2024 15:48:45 GMT Wols Lists wrote:

> ... And I'm not worried about a double failure - yes it could happen,
> but ...
> 
> Given that my brother's ex-employer was quite happily running a raid-6
> with maybe petabytes of data, over a double disk failure (until an
> employee went into the data centre and said "what are those red
> lights"), I don't think my 20TB of raid-5 is much :-)

[OT - anecdote]

I used to work in power generation and transmission (CEGB, for those with long 
memories), in which every system was required to be fault tolerant - one fault 
at a time. As Wol says, that's fine until your one fault has appeared and not 
been noticed. Then another fault appears - and the reactor shuts down! 
Carpeting comes next...

Oh, frabjous day!

[/OT]

-- 
Regards,
Peter.






Re: [gentoo-user] Re: Suggestions for backup scheme?

2024-02-09 Thread Wols Lists

On 09/02/2024 12:57, J. Roeleveld wrote:

> > I don't understand it exactly, but what I think happens is when I create
> > the snapshot it allocates, let's say, 1GB. As I write to the master
> > copy, it fills up that 1GB with CoW blocks, and the original blocks are
> > handed over to the backup snapshot. And when that backup snapshot is
> > full of blocks that have been "overwritten" (or in reality replaced),
> > lvm just adds another 1GB or whatever I told it to.
>
> That works with a single snapshot.
> But, when I last used LVM like this, I would have multiple snapshots. When I
> change something on the LV, the original data would be copied to the snapshot.
> If I had 2 snapshots for that LV, both would grow at the same time.
>
> Or has that changed in recent versions?


Has what changed? As I understand it, the whole point of LVM is that 
everything is COW. So any individual block can belong to multiple snapshots.


When you write a block, the original block is not changed. A new block 
is linked in to the current snapshot to replace the original. The 
original block remains linked in to any other snapshots.


So disk usage basically grows by the number of blocks you write. Taking 
a snapshot will use just a couple of blocks, no matter how large your LV is.
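
A toy model of the reference-counted scheme described above might look like
the sketch below. It is an illustration only, not LVM or ZFS code: BlockPool,
Volume and their methods are made-up names, and whether LVM snapshots really
behave like this is exactly the open question in this thread.

```python
class BlockPool:
    """Toy pool of reference-counted blocks (hypothetical, not a real API)."""

    def __init__(self):
        self.data = {}     # block id -> contents
        self.refs = {}     # block id -> number of volumes/snapshots linking it
        self._next_id = 0

    def alloc(self, contents):
        bid = self._next_id
        self._next_id += 1
        self.data[bid] = contents
        self.refs[bid] = 1
        return bid

    def link(self, bid):
        self.refs[bid] += 1

    def unlink(self, bid):
        # Drop one reference; hand the block back once nobody uses it.
        self.refs[bid] -= 1
        if self.refs[bid] == 0:
            del self.refs[bid]
            del self.data[bid]

    def used(self):
        return len(self.data)


class Volume:
    def __init__(self, pool, contents):
        self.pool = pool
        self.table = [pool.alloc(c) for c in contents]
        self.snapshots = {}  # snapshot name -> frozen copy of the block table

    def snapshot(self, name):
        # No data is copied; the snapshot only links the existing blocks.
        for bid in self.table:
            self.pool.link(bid)
        self.snapshots[name] = list(self.table)

    def write(self, index, contents):
        # The original block is left alone; a new block replaces it in the
        # live volume, and snapshots keep pointing at the old one.
        old = self.table[index]
        self.table[index] = self.pool.alloc(contents)
        self.pool.unlink(old)

    def delete_snapshot(self, name):
        for bid in self.snapshots.pop(name):
            self.pool.unlink(bid)


pool = BlockPool()
vol = Volume(pool, ["a", "b", "c", "d"])
vol.snapshot("mon")
vol.snapshot("tue")
vol.write(0, "A")
print(pool.used())          # 5: one extra block, shared by both snapshots
vol.delete_snapshot("mon")
print(pool.used())          # 5: "tue" still links the old block
vol.delete_snapshot("tue")
print(pool.used())          # 4: the old block goes back to the pool
```

In this model one write costs one block no matter how many snapshots exist,
which is the behaviour being claimed above.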



> > So when I delete a snapshot, it just goes through those few blocks,
> > decrements their use count (if they've been used in multiple snapshots),
> > and if the use count goes to zero they're handed back to the "empty" pool.
>
> I know this is how ZFS snapshots work. But am not convinced LVM snapshots work
> the same way.
>
> > All I have to do is make sure that the sum of my snapshots does not fill
> > the lv (logical volume). Which in my case is a raid-5.
>
> I assume you mean PV (Physical Volume)?


Quite possibly. VG, PV, LV. I know which one I need (by reading the 
docs), but I don't particularly remember which is which off the top of my head.


> I actually ditched the whole idea of raid-5 when drives got bigger than 1TB. I
> currently use Raid-6 (or specifically RaidZ2, which is the ZFS "equivalent")

Well, I run my raid over dm-integrity so, allegedly, I can't suffer disk 
corruption. My only fear is a disk loss, which raid-5 will happily 
recover from. And I'm not worried about a double failure - yes it could 
happen, but ...


Given that my brother's ex-employer was quite happily running a raid-6 
with maybe petabytes of data, over a double disk failure (until an 
employee went into the data centre and said "what are those red 
lights"), I don't think my 20TB of raid-5 is much :-)


Cheers,
Wol




Re: [gentoo-user] Re: Suggestions for backup scheme?

2024-02-09 Thread J. Roeleveld
On Thursday, February 8, 2024 6:44:50 PM CET Wols Lists wrote:
> On 08/02/2024 06:38, J. Roeleveld wrote:
> > ZFS doesn't have this "max amount of changes", but will happily fill up
> > the
> > entire pool keeping all versions available.
> > But it was easier to add zpool monitoring for this on ZFS than it was to
> > add snapshot monitoring to LVM.
> > 
> > I wonder, how do you deal with snapshots getting "full" on your system?
> 
> As far as I'm concerned, snapshots are read-only once they're created.
> But there is a "grow the snapshot as required" option.
> 
> I don't understand it exactly, but what I think happens is when I create
> the snapshot it allocates, let's say, 1GB. As I write to the master
> copy, it fills up that 1GB with CoW blocks, and the original blocks are
> handed over to the backup snapshot. And when that backup snapshot is
> full of blocks that have been "overwritten" (or in reality replaced),
> lvm just adds another 1GB or whatever I told it to.

That works with a single snapshot.
But, when I last used LVM like this, I would have multiple snapshots. When I 
change something on the LV, the original data would be copied to the snapshot.
If I had 2 snapshots for that LV, both would grow at the same time.

Or has that changed in recent versions?
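
The behaviour described here, where every snapshot keeps its own copy of each
overwritten block, can be sketched as below. This is a toy model under that
assumption; Origin and its methods are hypothetical names, not an LVM
interface.

```python
class Origin:
    """Toy origin LV where each snapshot has a private store of old blocks."""

    def __init__(self, contents):
        self.blocks = list(contents)
        self.snapshots = {}  # name -> {block index: preserved old contents}

    def snapshot(self, name):
        # A new snapshot starts empty and only grows when the origin changes.
        self.snapshots[name] = {}

    def write(self, index, contents):
        # Before overwriting, copy the old block into EVERY snapshot that
        # has not saved it yet, so all snapshots grow together.
        for store in self.snapshots.values():
            if index not in store:
                store[index] = self.blocks[index]
        self.blocks[index] = contents

    def read_snapshot(self, name, index):
        # Unchanged blocks are still read from the origin itself.
        return self.snapshots[name].get(index, self.blocks[index])

    def snapshot_usage(self):
        return {name: len(store) for name, store in self.snapshots.items()}


lv = Origin("abcd")
lv.snapshot("s1")
lv.snapshot("s2")
lv.write(0, "A")
lv.write(1, "B")
print(lv.snapshot_usage())         # {'s1': 2, 's2': 2}
print(lv.read_snapshot("s1", 0))   # a
```

With two snapshots, two overwritten blocks cost four saved copies in total,
which is why monitoring each snapshot's fill level matters in this model.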

> So when I delete a snapshot, it just goes through those few blocks,
> decrements their use count (if they've been used in multiple snapshots),
> and if the use count goes to zero they're handed back to the "empty" pool.

I know this is how ZFS snapshots work. But am not convinced LVM snapshots work 
the same way.

> All I have to do is make sure that the sum of my snapshots does not fill
> the lv (logical volume). Which in my case is a raid-5.

I assume you mean PV (Physical Volume)?

I actually ditched the whole idea of raid-5 when drives got bigger than 1TB. I 
currently use Raid-6 (or specifically RaidZ2, which is the ZFS "equivalent")

--
Joost





Re: [gentoo-user] Re: Suggestions for backup scheme?

2024-02-09 Thread J. Roeleveld
On Thursday, February 8, 2024 6:36:56 PM CET Wols Lists wrote:
> On 08/02/2024 06:32, J. Roeleveld wrote:
> >> After all, there's nothing stopping *you* from combining Linux and ZFS,
> >> it's just that somebody else can't do that for you, and then give you
> >> the resulting binary.
> > 
> > Linux (kernel) and ZFS can't be merged. Fine.
> 
> But they can.

Not if you want to release it.

> > But, Linux (the OS, as in, kernel + userspace) and ZFS can be merged
> > legally.
> Likewise here, they can.
> 
> The problem is, the BINARY can NOT be distributed. And the problem is
> the ZFS licence, not Linux.

You can distribute the binary of both, just not embedded into a single binary.

> What Linus, and the kernel devs, and that crowd *think* is irrelevant.

It is relevant, as they are actively working on removing API calls that 
filesystems like ZFS actually need and hiding them behind a GPL wall.

> What matters is what SUSE, and Red Hat, and Canonical et al think. And
> if they're not prepared to take the risk of distributing the kernel with
> ZFS built in, because they think it's a legal minefield, then that's
> THEIR decision.

I'm not talking about distributing ZFS embedded into the kernel. It's 
perfectly fine to distribute a distribution with ZFS as a kernel module. The 
issue is caused by the linux kernel devs blocking access to (previously 
existing and open) API calls and limiting them to GPL only.

> That problem doesn't apply to gentoo, because it distributes the linux
> kernel and ZFS separately, and combines them ON THE USER'S MACHINE. But
> the big distros are not prepared to take the risk of combining linux and
> ZFS, and distributing the resulting *derived* *work*.

I would class Ubuntu as a big distribution, and Proxmox is also used a lot.
Both have ZFS support.

--
Joost