Re: [systemd-devel] systemd-tmpfiles subvolume handling vs. changing default btrfs root

2018-07-13 Thread Lennart Poettering
On Fri, 29.06.18 21:04, Ignaz Forster (ifors...@suse.de) wrote:

> Reordered the quotes below for better reading flow.
> 
> On 28.06.2018 at 10:52, Lennart Poettering wrote:
> > > > But quite frankly I don't grok the problem at hand, i.e. what you are
> > > > trying to do, even.
> > > 
> > > Was this explanation any better?
> > 
> > Not really. What I still don't grok is what precisely a "system snapshot"
> > in suse terms is actually supposed to entail. Is it supposed to
> > contain only the vendor RPMs, i.e. only /usr?
> 
> That's the general idea, yes.*
> 
> Everything which contains variable or user data (i.e. which is not supposed
> to be rolled back, like databases or files created by the user) will be put
> onto its own subvolume or partition.
> 
> For reference, here's what this looks like on openSUSE Leap 15 again:
> ID   parent  top lvl  path
> --   ------  -------  ----
> 257  5       5        /@
> 258  257     257      /@/var
> 259  257     257      /@/usr/local
> 260  257     257      /@/tmp
> 261  257     257      /@/srv
> 262  257     257      /@/root
> 263  257     257      /@/opt
> 264  257     257      /@/home
> 265  257     257      /@/boot/grub2/x86_64-efi
> 266  257     257      /@/boot/grub2/i386-pc
> 267  257     257      /@/.snapshots
> 411  267     267      /@/.snapshots/138/snapshot
> 412  267     267      /@/.snapshots/139/snapshot
> 
> 
> *) Some packages will still use /bin, /lib and the like, and those will be
> part of the snapshot; on the other hand distribution RPMs may also contain
> files or directories in e.g. /var, which will not be part of the snapshot.
> Because of that I'd prefer the term "static / read-only / unmodifiable part
> of the root file system" instead of "vendor RPMs".
> 
> > or everything except
> > /home, /srv, /var, /tmp?
> 
> Everything except the directories listed above, because those contain
> variable data which one usually doesn't want to reset just because e.g. a
> new kernel doesn't boot.
> That won't prevent the user from creating his own snapshots of these
> subvolumes of course.
> 
> > > > systemd will never create disassociated subvolumes for you.
> > > 
> > > That's the problem - it will create subvolumes which will just disappear
> > > from the system when switching to the next snapshot.
> > 
> > Well, no, if snapshots are done recursively they wouldn't, they would
> > be switched at the same time.
> 
> I don't think it's relevant for this discussion, but you have repeatedly
> brought up recursive snapshots now; as far as I'm aware, however, btrfs is
> not capable of doing that. I've found a patchset on
> https://www.spinics.net/lists/linux-btrfs/msg29205.html, but it seems the
> relevant parts for snapshot creation weren't added upstream.
> 
> So how are those recursive btrfs snapshots supposed to work?

So, systemd's btrfs code supports doing recursive snapshots (which is
exposed through "machinectl clone" or "systemd-nspawn
--ephemeral"). If the upstream btrfs tools don't support them, please
work with them to fix that. There's nothing too magic about them, it's
a pity that this isn't supported yet.

> > tmpfiles won't create any subvolumes for you — except if they are
> > missing. tmpfiles can't guess the complex mappings you applied to your
> > tree, it can't know that you don't want to allow recursive snapshots,
> > but place them all in the same dir and bind mount them. Also, if I
> > understand correctly the way suse sets this up always *requires*
> > additions to fstab for any subvol created, which is clearly out of
> > focus for tmpfiles.
> 
> I agree that it's next to impossible to programmatically find out what a
> user intended to do with a specific layout.
> However in my opinion it would be preferable to create at least a working,
> though maybe not optimal configuration compared to a configuration which is
> known to break in several cases (independent of the distribution).
> 
> Instead of adding fstab entries (which I also have a bellyache with) it may
> be an alternative to create a mount unit instead. But yes, something would
> have to be done to mount those subvolumes on boot.

I am very much convinced that tmpfiles should not change mount
configuration. It's a tool to adjust file system objects on disk, and
it should remain that.

I think the much nicer approach is the one I suggested, i.e. where
subvol trees are always cloned in full, recursively, and it is solely
/usr and whatever else shall be disconnected from them that is mounted
into each tree.

> I'm wondering if just refusing to create a subvolume on a snapshot would be
> another option... That way the problem would be given back to the user or
> distribution.

My recommendation: if you really want to go with the design you
proposed, go ahead, but make sure you create the bind mounts early
enough; tmpfiles won't change them then. After all, tmpfiles will only
make changes if something is missing; it will never change
anything that already exists into a subvol.


Re: [systemd-devel] systemd-tmpfiles subvolume handling vs. changing default btrfs root

2018-06-29 Thread Ignaz Forster

Reordered the quotes below for better reading flow.

On 28.06.2018 at 10:52, Lennart Poettering wrote:
> > > But quite frankly I don't grok the problem at hand, i.e. what you are
> > > trying to do, even.
> >
> > Was this explanation any better?
>
> Not really. What I still don't grok is what precisely a "system snapshot"
> in suse terms is actually supposed to entail. Is it supposed to
> contain only the vendor RPMs, i.e. only /usr?

That's the general idea, yes.*

Everything which contains variable or user data (i.e. which is not
supposed to be rolled back, like databases or files created by the user)
will be put onto its own subvolume or partition.

For reference, here's what this looks like on openSUSE Leap 15 again:
ID   parent  top lvl  path
--   ------  -------  ----
257  5       5        /@
258  257     257      /@/var
259  257     257      /@/usr/local
260  257     257      /@/tmp
261  257     257      /@/srv
262  257     257      /@/root
263  257     257      /@/opt
264  257     257      /@/home
265  257     257      /@/boot/grub2/x86_64-efi
266  257     257      /@/boot/grub2/i386-pc
267  257     257      /@/.snapshots
411  267     267      /@/.snapshots/138/snapshot
412  267     267      /@/.snapshots/139/snapshot

*) Some packages will still use /bin, /lib and the like, and those will
be part of the snapshot; on the other hand distribution RPMs may also
contain files or directories in e.g. /var, which will not be part of the
snapshot. Because of that I'd prefer the term "static / read-only /
unmodifiable part of the root file system" instead of "vendor RPMs".

> or everything except
> /home, /srv, /var, /tmp?

Everything except the directories listed above, because those contain
variable data which one usually doesn't want to reset just because e.g.
a new kernel doesn't boot.
That won't prevent the user from creating his own snapshots of these
subvolumes of course.

> > > systemd will never create disassociated subvolumes for you.
> >
> > That's the problem - it will create subvolumes which will just disappear
> > from the system when switching to the next snapshot.
>
> Well, no, if snapshots are done recursively they wouldn't, they would
> be switched at the same time.

I don't think it's relevant for this discussion, but you have repeatedly
brought up recursive snapshots now; as far as I'm aware, however, btrfs
is not capable of doing that. I've found a patchset on
https://www.spinics.net/lists/linux-btrfs/msg29205.html, but it seems
the relevant parts for snapshot creation weren't added upstream.

So how are those recursive btrfs snapshots supposed to work?

> tmpfiles won't create any subvolumes for you, except if they are
> missing. tmpfiles can't guess the complex mappings you applied to your
> tree, it can't know that you don't want to allow recursive snapshots,
> but place them all in the same dir and bind mount them. Also, if I
> understand correctly the way suse sets this up always *requires*
> additions to fstab for any subvol created, which is clearly out of
> focus for tmpfiles.

I agree that it's next to impossible to programmatically find out what a
user intended to do with a specific layout.
However in my opinion it would be preferable to create at least a
working, though maybe not optimal configuration compared to a
configuration which is known to break in several cases (independent of
the distribution).

Instead of adding fstab entries (which I also have a bellyache with) it
may be an alternative to create a mount unit instead. But yes, something
would have to be done to mount those subvolumes on boot.

> Also, tmpfiles won't actually create any subvols below /usr (unless a
> user dropped something in to do that on its own), it will only do so
> in the root dir for precisely /var, /tmp, /home and /srv. All others
> are created below /var. Which means your rule of "don't create subvols
> below system directories" isn't actually touched, because the
> read-only OS is monopolized in /usr anyway... Or maybe I am still not
> getting what you are trying to say?

The rule would be "don't create subvols below snapshots", and the
read-only OS is not exactly monopolized in /usr either (not only because
of /bin, /lib etc, but also because of /boot - see last paragraph of the
mail), but apart from that, that nails it.

The issue was originally discovered when upgrading systemd on an older
openSUSE machine which did not have a unified /var subvolume, so
/var/lib/machines got attached to the root subvolume.
This may happen again in the future for us, but as said we are not the
only ones using this mechanism. Seeing the default Fedora and Ubuntu
btrfs layouts it's even more likely to happen if anybody is using
pattern 3 there. Apart from that I'd prefer systemd-tmpfiles to work
even if a user threw in something unexpected.

I'm wondering if just refusing to create a subvolume on a snapshot would
be another option... That way the problem would be given back to the
user or distribution.



The assumption systemd-tmpfiles makes 

Re: [systemd-devel] systemd-tmpfiles subvolume handling vs. changing default btrfs root

2018-06-28 Thread Lennart Poettering
On Wed, 27.06.18 23:25, Ignaz Forster (ifors...@suse.de) wrote:

> 3) Set the default btrfs subvolume to the backup snapshot subvolume (or a
> copy of it)
> 
> I'm talking about case *3* here. Whenever one wants to roll back to a
> certain snapshot one would just call e.g.
>   btrfs subvolume set-default /.snapshots/123/snapshot/
> and reboot.
> 
> In contrast to case 1 and 2 there is no dedicated static "/" subvolume
> (which also implies this is not a "Nested", but a "Flat" or "Mixed" layout
> as outlined on
> https://btrfs.wiki.kernel.org/index.php/SysadminGuide#Layout), but the
> default btrfs subvolume will always point to the snapshot of the last
> rollback. (On openSUSE this mechanism is also used for read-only systems
> where each update will switch to a new snapshot.)
> 
> 
> I'd sum this up as: subvolumes for system directories should never be
> created as children of a snapshot as snapshots are a variable thing.

tmpfiles won't create any subvolumes for you — except if they are
missing. tmpfiles can't guess the complex mappings you applied to your
tree, it can't know that you don't want to allow recursive snapshots,
but place them all in the same dir and bind mount them. Also, if I
understand correctly the way suse sets this up always *requires*
additions to fstab for any subvol created, which is clearly out of
focus for tmpfiles.

Also, tmpfiles won't actually create any subvols below /usr (unless a
user dropped something in to do that on its own), it will only do so
in the root dir for precisely /var, /tmp, /home and /srv. All others
are created below /var. Which means your rule of "don't create subvols
below system directories" isn't actually touched, because the
read-only OS is monopolized in /usr anyway... Or maybe I am still not
getting what you are trying to say?

> > The assumption systemd-tmpfiles makes is always that the subvolumes
> > it implicitly creates for you if they are missing are associated
> > with the subvolume they are created below, and that this means they
> > are snapshotted, removed and otherwise managed along with them.
> 
> Keeping this logic more or less assumes that snapshots will always be used
> as static backups and pattern 3 from above must not be used.

I don't see that at all. I mean, this all depends on how you want to
associate /var with /. My assumption is that they belong together, but
I figure that's not what you have in mind? You want to keep using the
same /var even though you switch back and forth to different /?

I am not sure I follow fully, but I think the model should be the
other way round: keep the root file system in one subvolume, and keep
/usr completely separate from that, and only combine the two through
bind mounts when you want to go for one specific version. In that
mode, all subvolumes systemd generates would be children of the root
subvolume, as they should be, but /usr would be separate.

> > systemd will never create disassociated subvolumes for you.
> 
> That's the problem - it will create subvolumes which will just disappear
> from the system when switching to the next snapshot.

Well, no, if snapshots are done recursively they wouldn't, they would
be switched at the same time.

> > But quite frankly I don't grok the problem at hand, i.e. what you are
> > trying to do, even.
> 
> Was this explanation any better?

Not really. What I still don't grok is what precisely a "system snapshot"
in suse terms is actually supposed to entail. Is it supposed to
contain only the vendor RPMs, i.e. only /usr? Or everything except
/home, /srv, /var, /tmp? Or the inverse of that?

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] systemd-tmpfiles subvolume handling vs. changing default btrfs root

2018-06-27 Thread Ignaz Forster

On 27.06.2018 at 16:34, Lennart Poettering wrote:
> On Wed, 27.06.18 15:50, Ignaz Forster (ifors...@suse.de) wrote:
>
> > > By recursive snapshots I really mean recursive snapshots, i.e. if you
> > > have a subvolume called `/foobar` and there's a subvolume below it
> > > called `/foobar/var`, and you'd make a snapshot of `/foobar` and call
> > > it `/foobar2`, then this would implicitly also have the effect of
> > > snapshotting `/foobar/var` and calling it `/foobar2/var`, so that each
> > > snapshot is always "complete".
> >
> > Ah, I see - no, that's not the problem here.
> > The subvolumes are there because we do *not* want to snapshot them.
> >
> > I guess it's best to just ignore the second bullet point - it's a follow
> > up problem, but it isn't really important for the main point: Attaching a
> > new subvolume to a snapshot.
>
> I still don't grok this. What's the precise problem then?


The problem is that the subvolume layout may not always be the way 
systemd-tmpfiles expects it to be. This is caused by different possible 
ways of handling rollbacks to a previous snapshot.


I'm aware of three common patterns on how to restore a previous snapshot 
of the system (i.e. the root file system):

1) Use 'mv' to move the backup to the original location
2) Delete the root file system snapshot and recreate it using the backup 
snapshot as a parent
3) Set the default btrfs subvolume to the backup snapshot subvolume (or 
a copy of it)


I'm talking about case *3* here. Whenever one wants to roll back to a 
certain snapshot one would just call e.g.

btrfs subvolume set-default /.snapshots/123/snapshot/
and reboot.

In contrast to case 1 and 2 there is no dedicated static "/" subvolume 
(which also implies this is not a "Nested", but a "Flat" or "Mixed" 
layout as outlined on 
https://btrfs.wiki.kernel.org/index.php/SysadminGuide#Layout), but the 
default btrfs subvolume will always point to the snapshot of the last 
rollback. (On openSUSE this mechanism is also used for read-only systems 
where each update will switch to a new snapshot.)
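As a rough illustration of pattern 3, here is a minimal Python sketch of which subvolumes end up visible below "/" after the default subvolume is switched. It is only a model of the layout discussed in this thread, not btrfs or systemd code; the table contents, the FSTAB mapping and the function name visible_mount_points are assumptions made for the example.

```python
# Hypothetical subvolume table: ID -> (parent ID, path relative to the top level).
SUBVOLUMES = {
    257: (5, "@"),
    258: (257, "@/var"),
    260: (257, "@/.snapshots/2/snapshot"),
    261: (257, "@/.snapshots/3/snapshot"),
    263: (261, "@/.snapshots/3/snapshot/opt"),  # created below snapshot 3
}

# Hypothetical fstab entries that name a subvolume explicitly:
# mount point -> subvolume path.
FSTAB = {"/var": "@/var"}


def visible_mount_points(default_id, subvolumes, fstab):
    """Return the paths below "/" that are backed by a subvolume after boot."""
    root_path = subvolumes[default_id][1]
    visible = {"/"}
    # Subvolumes nested below the default subvolume show up at their
    # path relative to it.
    for _parent, path in subvolumes.values():
        if path.startswith(root_path + "/"):
            visible.add(path[len(root_path):])
    # Explicit fstab entries are mounted regardless of the default subvolume.
    visible.update(fstab)
    return sorted(visible)


print(visible_mount_points(261, SUBVOLUMES, FSTAB))  # ['/', '/opt', '/var']
print(visible_mount_points(260, SUBVOLUMES, FSTAB))  # ['/', '/var']
```

In this model /opt is only reachable while snapshot 3 is the default subvolume; after switching to snapshot 2 it is still on disk but no longer part of the tree.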



I'd sum this up as: subvolumes for system directories should never be 
created as children of a snapshot as snapshots are a variable thing.



> The assumption systemd-tmpfiles makes is always that the subvolumes
> it implicitly creates for you if they are missing are associated
> with the subvolume they are created below, and that this means they
> are snapshotted, removed and otherwise managed along with them.


Keeping this logic more or less assumes that snapshots will always be 
used as static backups and pattern 3 from above must not be used.


However not only *SUSE or snapper are using this pattern, but several 
websites also suggest this workflow - that's why I'm interested in 
upstream support.



> systemd will never create disassociated subvolumes for you.


That's the problem - it will create subvolumes which will just disappear 
from the system when switching to the next snapshot.



> If you want that, use some other tools, but tmpfiles is not really
> supposed to do complex stuff like that.


The added complexity is the reason why I brought this to the list. 
However I'd (obviously) still prefer compatibility with a larger array 
of btrfs layouts by default.


Finding the subvolume and making sure it's mounted on the next boot 
(e.g. by adding an fstab entry or a mount unit) would be the most 
complex part about this.



> But quite frankly I don't grok the problem at hand, i.e. what you are
> trying to do, even.


Was this explanation any better?

Ignaz
--
Ignaz Forster 
Research Engineer
SUSE Linux GmbH, Maxfeldstr. 5, D-90409 Nürnberg
Tel: +49-911-74053-281;  https://www.suse.com/
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard,
Graham Norton, HRB 21284 (AG Nürnberg)


Re: [systemd-devel] systemd-tmpfiles subvolume handling vs. changing default btrfs root

2018-06-27 Thread Lennart Poettering
On Wed, 27.06.18 15:50, Ignaz Forster (ifors...@suse.de) wrote:

> > By recursive snapshots I really mean recursive snapshots, i.e. if you
> > have a subvolume called `/foobar` and there's a subvolume below it
> > called `/foobar/var`, and you'd make a snapshot of `/foobar` and call
> > it `/foobar2`, then this would implicitly also have the effect of
> > snapshotting `/foobar/var` and calling it `/foobar2/var`, so that each
> > snapshot is always "complete".
> 
> Ah, I see - no, that's not the problem here.
> The subvolumes are there because we do *not* want to snapshot them.
> 
> I guess it's best to just ignore the second bullet point - it's a follow
> up problem, but it isn't really important for the main point: Attaching a
> new subvolume to a snapshot.

I still don't grok this. What's the precise problem then?

The assumption systemd-tmpfiles makes is always that the subvolumes
it implicitly creates for you if they are missing are associated
with the subvolume they are created below, and that this means they
are snapshotted, removed and otherwise managed along with them.

systemd will never create disassociated subvolumes for you. If you
want that, use some other tools, but tmpfiles is not really supposed to
do complex stuff like that.

But quite frankly I don't grok the problem at hand, i.e. what you are
trying to do, even. 

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] systemd-tmpfiles subvolume handling vs. changing default btrfs root

2018-06-27 Thread Ignaz Forster

On 27.06.2018 at 15:37, Lennart Poettering wrote:
> On Wed, 27.06.18 15:09, Ignaz Forster (ifors...@suse.de) wrote:
>
> > On 27.06.2018 at 13:39, Lennart Poettering wrote:
> > > On Wed, 27.06.18 13:02, Ignaz Forster (ifors...@suse.de) wrote:
> > >
> > > > Hello,
> > > >
> > > > when using systemd-tmpfiles' feature to create subvolumes it will always
> > > > create the new subvolume as a child of the subvolume of the given path. This
> > > > however may not always be the expected parent, especially when using btrfs
> > > > snapshots to switch between various system states.
> > > >
> > > > Example layout:
> > > > ===
> > > >
> > > > Let's assume the following subvolume layout (a simplified openSUSE layout):
> > > >
> > > > ID   parent  top level  path
> > > > --   ------  ---------  ----
> > > > 257  5       5          /@
> > > > 258  257     257        /@/var
> > > > 259  257     257        /@/.snapshots/1/snapshot
> > > > 260  257     257        /@/.snapshots/2/snapshot
> > > > 261  257     257        /@/.snapshots/3/snapshot
> > > >
> > > > A corresponding /etc/fstab could look like this:
> > > >
> > > > /dev/sdx  /     btrfs  defaults      0  0
> > > > /dev/sdx  /var  btrfs  subvol=@/var  0  0
> > > >
> > > > with the default btrfs subvolume set to "261".
> > > > The third snapshot would thus be the root file system, with /var mounted on
> > > > top of it.
> > > >
> > > > The problem:
> > > >
> > > > Creating "/var/test" would create a new entry like
> > > > 262  258     258        @/var/test
> > > > as expected.
> > > > However creating "/opt" would create an entry similar to the following:
> > > > 263  261     261        @/.snapshots/3/snapshot/opt
> > > >
> > > > This is not good, as two things will happen now:
> > > > * When changing the snapshot (e.g. by reverting back to an old snapshot or
> > > > creating a new one) /opt won't be visible any more (without manually
> > > > mounting it), as it is not nested into the existing structure any more
> > > > * The third snapshot cannot be deleted without removing the
> > > > subvolume first
> > >
> > > I am not sure I follow here fully, but isn't this just a shortcoming because
> > > you are not doing recursive snapshots? Why not just fix that?
> >
> > With "recursive snapshots" I assume you mean putting the snapshot below the
> > original root file system?
>
> By recursive snapshots I really mean recursive snapshots, i.e. if you
> have a subvolume called `/foobar` and there's a subvolume below it
> called `/foobar/var`, and you'd make a snapshot of `/foobar` and call
> it `/foobar2`, then this would implicitly also have the effect of
> snapshotting `/foobar/var` and calling it `/foobar2/var`, so that each
> snapshot is always "complete".

Ah, I see - no, that's not the problem here.
The subvolumes are there because we do *not* want to snapshot them.

I guess it's best to just ignore the second bullet point - it's a
follow-up problem, but it isn't really important for the main point:
attaching a new subvolume to a snapshot.


Ignaz
--
Ignaz Forster 
Research Engineer
SUSE Linux GmbH, Maxfeldstr. 5, D-90409 Nürnberg
Tel: +49-911-74053-281;  https://www.suse.com/
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard,
Graham Norton, HRB 21284 (AG Nürnberg)


Re: [systemd-devel] systemd-tmpfiles subvolume handling vs. changing default btrfs root

2018-06-27 Thread Lennart Poettering
On Wed, 27.06.18 15:09, Ignaz Forster (ifors...@suse.de) wrote:

> On 27.06.2018 at 13:39, Lennart Poettering wrote:
> > On Wed, 27.06.18 13:02, Ignaz Forster (ifors...@suse.de) wrote:
> > 
> > > Hello,
> > > 
> > > when using systemd-tmpfiles' feature to create subvolumes it will always
> > > create the new subvolume as a child of the subvolume of the given path. This
> > > however may not always be the expected parent, especially when using btrfs
> > > snapshots to switch between various system states.
> > > 
> > > Example layout:
> > > ===
> > > 
> > > Let's assume the following subvolume layout (a simplified openSUSE layout):
> > > 
> > > ID   parent  top level  path
> > > --   ------  ---------  ----
> > > 257  5       5          /@
> > > 258  257     257        /@/var
> > > 259  257     257        /@/.snapshots/1/snapshot
> > > 260  257     257        /@/.snapshots/2/snapshot
> > > 261  257     257        /@/.snapshots/3/snapshot
> > > 
> > > A corresponding /etc/fstab could look like this:
> > > 
> > > /dev/sdx  /     btrfs  defaults      0  0
> > > /dev/sdx  /var  btrfs  subvol=@/var  0  0
> > > 
> > > with the default btrfs subvolume set to "261".
> > > The third snapshot would thus be the root file system, with /var mounted on
> > > top of it.
> > > 
> > > 
> > > The problem:
> > > 
> > > 
> > > Creating "/var/test" would create a new entry like
> > > 262   258 258 @/var/test
> > > as expected.
> > > However creating "/opt" would create an entry similar to the following:
> > > 263   261 261 @/.snapshots/3/snapshot/opt
> > > 
> > > This is not good, as two things will happen now:
> > > * When changing the snapshot (e.g. by reverting back to an old snapshot or
> > > creating a new one) /opt won't be visible any more (without manually
> > > mounting it), as it is not nested into the existing structure any more
> > > * The third snapshot cannot be deleted without removing the
> > > subvolume first
> > 
> > I am not sure I follow here fully, but isn't this just a shortcoming because
> > you are not doing recursive snapshots? Why not just fix that?
> 
> With "recursive snapshots" I assume you mean putting the snapshot below the
> original root file system?

By recursive snapshots I really mean recursive snapshots, i.e. if you
have a subvolume called `/foobar` and there's a subvolume below it
called `/foobar/var`, and you'd make a snapshot of `/foobar` and call
it `/foobar2`, then this would implicitly also have the effect of
snapshotting `/foobar/var` and calling it `/foobar2/var`, so that each
snapshot is always "complete".
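To make the /foobar example concrete, here is a small Python sketch that only computes which (source, target) snapshot pairs a recursive snapshot would have to cover. It is a plain path calculation under the assumptions of the example above, not the systemd or btrfs implementation; the function name recursive_snapshot_plan and the extra /other subvolume are made up for illustration.

```python
from pathlib import PurePosixPath


def recursive_snapshot_plan(subvolumes, source, target):
    """List (source, target) pairs for `source` and every subvolume below it."""
    src = PurePosixPath(source)
    dst = PurePosixPath(target)
    plan = []
    # Sorting puts a parent path before the subvolumes nested below it.
    for path in sorted(subvolumes):
        p = PurePosixPath(path)
        if p == src or src in p.parents:
            plan.append((path, str(dst / p.relative_to(src))))
    return plan


plan = recursive_snapshot_plan(
    ["/foobar", "/foobar/var", "/other"], "/foobar", "/foobar2"
)
for old, new in plan:
    print(f"snapshot {old} -> {new}")
# snapshot /foobar -> /foobar2
# snapshot /foobar/var -> /foobar2/var
```

The unrelated /other subvolume is left alone, while /foobar/var follows its parent, so that the result is "complete" in the sense described above.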

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] systemd-tmpfiles subvolume handling vs. changing default btrfs root

2018-06-27 Thread Ignaz Forster

On 27.06.2018 at 13:39, Lennart Poettering wrote:
> On Wed, 27.06.18 13:02, Ignaz Forster (ifors...@suse.de) wrote:
>
> > Hello,
> >
> > when using systemd-tmpfiles' feature to create subvolumes it will always
> > create the new subvolume as a child of the subvolume of the given path. This
> > however may not always be the expected parent, especially when using btrfs
> > snapshots to switch between various system states.
> >
> > Example layout:
> > ===
> >
> > Let's assume the following subvolume layout (a simplified openSUSE layout):
> >
> > ID   parent  top level  path
> > --   ------  ---------  ----
> > 257  5       5          /@
> > 258  257     257        /@/var
> > 259  257     257        /@/.snapshots/1/snapshot
> > 260  257     257        /@/.snapshots/2/snapshot
> > 261  257     257        /@/.snapshots/3/snapshot
> >
> > A corresponding /etc/fstab could look like this:
> >
> > /dev/sdx  /     btrfs  defaults      0  0
> > /dev/sdx  /var  btrfs  subvol=@/var  0  0
> >
> > with the default btrfs subvolume set to "261".
> > The third snapshot would thus be the root file system, with /var mounted on
> > top of it.
> >
> > The problem:
> >
> > Creating "/var/test" would create a new entry like
> > 262  258     258        @/var/test
> > as expected.
> > However creating "/opt" would create an entry similar to the following:
> > 263  261     261        @/.snapshots/3/snapshot/opt
> >
> > This is not good, as two things will happen now:
> > * When changing the snapshot (e.g. by reverting back to an old snapshot or
> > creating a new one) /opt won't be visible any more (without manually
> > mounting it), as it is not nested into the existing structure any more
> > * The third snapshot cannot be deleted without removing the
> > subvolume first
>
> I am not sure I follow here fully, but isn't this just a shortcoming because
> you are not doing recursive snapshots? Why not just fix that?


With "recursive snapshots" I assume you mean putting the snapshot below 
the original root file system?


If so that's not how the btrfs subvolumes are organized in this case: 
The "@" subvolume itself is almost empty and only contains further 
subvolumes.
During *SUSE setup everything will be installed to 
"@/.snapshots/1/snapshot" instead, and this subvolume will be set as the 
default btrfs subvolume (which would be equivalent to using the mount 
options 'subvol=@/.snapshots/1/snapshot' for '/').


After installation a tool called Snapper (default on *SUSE, but also 
available on several other distributions) will take care of snapshot 
management (e.g. by creating a new snapshot on system changes).
Now if a user wants to do a rollback, the default btrfs subvolume will 
just be set to that specific snapshot, which makes that one the new root 
file system.


This design is intentional: it relies on btrfs' feature to change the 
default subvolume, and thus imho works as designed.

--
Ignaz Forster 
Research Engineer
SUSE Linux GmbH, Maxfeldstr. 5, D-90409 Nürnberg
Tel: +49-911-74053-281;  https://www.suse.com/
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard,
Graham Norton, HRB 21284 (AG Nürnberg)


Re: [systemd-devel] systemd-tmpfiles subvolume handling vs. changing default btrfs root

2018-06-27 Thread Lennart Poettering
On Wed, 27.06.18 13:02, Ignaz Forster (ifors...@suse.de) wrote:

> Hello,
> 
> when using systemd-tmpfiles' feature to create subvolumes it will always
> create the new subvolume as a child of the subvolume of the given path. This
> however may not always be the expected parent, especially when using btrfs
> snapshots to switch between various system states.
> 
> Example layout:
> ===
> 
> Let's assume the following subvolume layout (a simplified openSUSE layout):
> 
> ID   parent  top level  path
> --   ------  ---------  ----
> 257  5       5          /@
> 258  257     257        /@/var
> 259  257     257        /@/.snapshots/1/snapshot
> 260  257     257        /@/.snapshots/2/snapshot
> 261  257     257        /@/.snapshots/3/snapshot
> 
> A corresponding /etc/fstab could look like this:
> 
> /dev/sdx  /     btrfs  defaults      0  0
> /dev/sdx  /var  btrfs  subvol=@/var  0  0
> 
> with the default btrfs subvolume set to "261".
> The third snapshot would thus be the root file system, with /var mounted on
> top of it.
> 
> 
> The problem:
> 
> 
> Creating "/var/test" would create a new entry like
> 262   258 258 @/var/test
> as expected.
> However creating "/opt" would create an entry similar to the following:
> 263   261 261 @/.snapshots/3/snapshot/opt
> 
> This is not good, as two things will happen now:
> * When changing the snapshot (e.g. by reverting back to an old snapshot or
> creating a new one) /opt won't be visible any more (without manually
> mounting it), as it is not nested into the existing structure any more
> * The third snapshot cannot be deleted without removing the
> subvolume first

I am not sure I follow here fully, but isn't this just a shortcoming because
you are not doing recursive snapshots? Why not just fix that?

Lennart

-- 
Lennart Poettering, Red Hat


[systemd-devel] systemd-tmpfiles subvolume handling vs. changing default btrfs root

2018-06-27 Thread Ignaz Forster

Hello,

when using systemd-tmpfiles' feature to create subvolumes it will always 
create the new subvolume as a child of the subvolume of the given path. 
This however may not always be the expected parent, especially when 
using btrfs snapshots to switch between various system states.


Example layout:
===

Let's assume the following subvolume layout (a simplified openSUSE layout):

ID   parent  top level  path
--   ------  ---------  ----
257  5       5          /@
258  257     257        /@/var
259  257     257        /@/.snapshots/1/snapshot
260  257     257        /@/.snapshots/2/snapshot
261  257     257        /@/.snapshots/3/snapshot

A corresponding /etc/fstab could look like this:

/dev/sdx  /     btrfs  defaults      0  0
/dev/sdx  /var  btrfs  subvol=@/var  0  0

with the default btrfs subvolume set to "261".
The third snapshot would thus be the root file system, with /var mounted 
on top of it.



The problem:
===


Creating "/var/test" would create a new entry like
262 258 258 @/var/test
as expected.
However creating "/opt" would create an entry similar to the following:
263 261 261 @/.snapshots/3/snapshot/opt

This is not good, as two things will happen now:
* When changing the snapshot (e.g. by reverting back to an old snapshot 
or creating a new one) /opt won't be visible any more (without manually 
mounting it), as it is not nested into the existing structure any more

* The third snapshot cannot be deleted without removing the subvolume first
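The behaviour described above can be modelled in a few lines of Python. This is only a sketch of the example layout: it assumes that a new subvolume becomes a child of whatever subvolume is mounted at the nearest enclosing mount point, and the MOUNTS table and the function name containing_subvolume are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical view of the example system:
# mount point -> ID of the subvolume mounted there.
MOUNTS = {"/": 261, "/var": 258}


def containing_subvolume(path, mounts):
    """Return the ID of the subvolume that holds the parent directory of `path`."""
    parent = PurePosixPath(path).parent
    # Check the parent directory and each of its ancestors, deepest first.
    for candidate in [parent, *parent.parents]:
        if str(candidate) in mounts:
            return mounts[str(candidate)]
    raise LookupError(f"no mount found for {path}")


print(containing_subvolume("/var/test", MOUNTS))  # 258
print(containing_subvolume("/opt", MOUNTS))       # 261
```

With the default subvolume set to 261, anything created directly below "/" therefore lands inside the third snapshot, which matches the table entries shown above.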


Possible solutions:
===

Let's have a look at the default btrfs layouts of some distributions.

Fedora 28 (Server):
ID   parent  top lvl  path
--   ------  -------  ----
257  5       5        /root

Ubuntu 18.04:
257  5       5        /@
258  5       5        /@home

openSUSE Leap 15:
257  5       5        /@
258  257     257      /@/var
259  257     257      /@/usr/local
260  257     257      /@/tmp
261  257     257      /@/srv
262  257     257      /@/root
263  257     257      /@/opt
264  257     257      /@/home
265  257     257      /@/boot/grub2/x86_64-efi
266  257     257      /@/boot/grub2/i386-pc
267  257     257      /@/.snapshots
411  267     267      /@/.snapshots/138/snapshot
412  267     267      /@/.snapshots/139/snapshot


Option 1:
-
Walk back up the path and check for a subvolume whose parent ID is "5". 
This would work for the default btrfs layout of these three 
distributions and is easy to implement, but would break in the 
hypothetical case that someone created snapshot directories directly 
as children of ID 5, e.g.

257 5   5   /@
258 5   5   /@home
259 5   5   /@snapshot1
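A minimal Python sketch of Option 1, assuming the subvolume table is available as a mapping from ID to (parent ID, path); the tables and the function name ancestor_below_top_level are hypothetical and only mirror the examples in this mail.

```python
TOP_LEVEL_ID = 5

# Hypothetical subvolume tables: ID -> (parent ID, path).
SIMPLIFIED = {
    257: (5, "/@"),
    258: (257, "/@/var"),
    261: (257, "/@/.snapshots/3/snapshot"),
}
FLAT = {
    257: (5, "/@"),
    258: (5, "/@home"),
    259: (5, "/@snapshot1"),
}


def ancestor_below_top_level(subvol_id, table):
    """Walk up the parent chain until a subvolume whose parent is ID 5 is found."""
    current = subvol_id
    while table[current][0] != TOP_LEVEL_ID:
        current = table[current][0]
    return current


# Nested layout: starting from snapshot 3 (ID 261) the walk ends at /@ (ID 257).
print(ancestor_below_top_level(261, SIMPLIFIED))  # 257
# Flat layout: the snapshot itself is a child of ID 5, so it is returned as-is.
print(ancestor_below_top_level(259, FLAT))        # 259
```

The second call shows the failure mode mentioned above: with snapshots created directly below ID 5, the walk cannot tell the snapshot apart from a regular top-level subvolume.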

Option 2:
-
A variant would be to keep the current behaviour except when the parent 
would be a snapshot (by checking if a parent UUID is set?); in this 
case the new subvolume would be created as a child of ID 5.


I can't think of any case where automatically creating a subvolume below 
a snapshot is a good idea, so in theory this sounds like the best approach.
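Option 2 could be sketched in Python roughly as follows. The sketch assumes that a set parent UUID is a usable indicator for "this subvolume is a snapshot", which is exactly the open question raised above; the Subvolume class, its fields and the function name parent_for_new_subvolume are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

TOP_LEVEL_ID = 5


@dataclass
class Subvolume:
    subvol_id: int
    path: str
    # Assumption: None for a regular subvolume, set for a snapshot.
    parent_uuid: Optional[str]


def parent_for_new_subvolume(containing: Subvolume) -> int:
    """Pick the parent ID for a subvolume that would be created inside `containing`."""
    if containing.parent_uuid is not None:
        # The containing subvolume looks like a snapshot: attach to ID 5 instead.
        return TOP_LEVEL_ID
    # Otherwise keep the current behaviour.
    return containing.subvol_id


var = Subvolume(258, "/@/var", None)
snap = Subvolume(261, "/@/.snapshots/3/snapshot", "example-uuid")
print(parent_for_new_subvolume(var))   # 258
print(parent_for_new_subvolume(snap))  # 5
```

A subvolume attached to ID 5 this way is not part of the mounted tree, so it would still need an fstab entry or a mount unit to become visible.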



In any case a corresponding implementation would mean additional 
handling for generating an fstab entry and mounting the subvolume if the 
subvolume is not created as a subdirectory of an existing subvolume.



Would this be an approach that would be acceptable upstream? Please note 
that I'm not a btrfs expert, so I may be missing something.

--
Ignaz Forster 
Research Engineer
SUSE Linux GmbH, Maxfeldstr. 5, D-90409 Nürnberg
Tel: +49-911-74053-281;  https://www.suse.com/
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard,
Graham Norton, HRB 21284 (AG Nürnberg)